
How To Handle Missing Values in Categorical Features 

Krish Naik
981K subscribers
114K views

Hello all, here is a video that gives a detailed explanation of how we can handle missing values in categorical features.
You can buy my book on Finance with Machine Learning and Deep Learning from the URL below:
Amazon URL: www.amazon.in/Hands-Python-Fi...
Buy the best book on Machine Learning and Deep Learning with Python, sklearn, and TensorFlow from the URL below:
Amazon URL: www.amazon.in/Hands-Machine-L...
Connect with me here:
Twitter: / krishnaik06
Facebook: / krishnaik06
Instagram: / krishnaik06
Subscribe to my unboxing channel
/ @krishnaikhindi
Below are the various playlists created on ML, Data Science, and Deep Learning. Please subscribe and support the channel. Happy learning!
Deep Learning Playlist: • Tutorial 1- Introducti...
Data Science Projects playlist: • Generative Adversarial...
NLP playlist: • Natural Language Proce...
Statistics Playlist: • Population vs Sample i...
Feature Engineering playlist: • Feature Engineering in...
Computer Vision playlist: • OpenCV Installation | ...
Data Science Interview Question playlist: • Complete Life Cycle of...
🙏🙏🙏🙏🙏🙏🙏🙏
YOU JUST NEED TO DO 3 THINGS to support my channel: LIKE, SHARE & SUBSCRIBE TO MY YouTube CHANNEL

Published: 18 Aug 2019

Comments: 117
@aksontv 4 years ago
Finally found the right man to learn data science and ML from. Thank you, sir!
@mohitupadhayay1439 2 years ago
This was such an amazing life saver. I didn't even know I had this question and the video just popped up. Didn't find this tutorial anywhere else.
@duvanmartinez8586 4 years ago
Great work, you're awesome, you're the best YouTuber I've found.
@gabrielburgos2533 1 year ago
You are the MVP, when no one has the answer, you do.
@soumikchakraborty90 4 years ago
You are just awesome, bro. Please make a video on AIC, AUC, and the ROC curve.
@abinashkumarsinha8958 2 years ago
This helped me a lot in my project work. Very useful and very well explained.
@keshavbansal5148 4 years ago
Started this playlist today, loving it.
@pallabsaha4098 4 years ago
Very well explained. If you could show the same on a dataset with code, that would be very helpful. Thank you, sir, for your videos. Love them all.
@doop9134 1 year ago
I was stuck for days trying to figure out how to predict missing data using ML. This helped me understand so so so much better! 😍 Thank you so much!! 🙏💚
@shivambhayre5056 4 years ago
I have no words, just thanks 🙏
@tumul1474 4 years ago
Thank you, sir! Amazing video as always.
@hv3300 4 years ago
Excellent video, as usual.
@andyjackson4563 1 year ago
Thanks for explaining these methods.
@Geethu_Mohan_DA 1 year ago
Easy to understand. Thank you.
@sandeepnallala48 2 years ago
Doing great work, Krish. Thanks a lot. Loved your videos : )
@amedyasar9468 3 years ago
It was quite a short explanation with nice points to understand. Thanks!
@itsmoolya 4 years ago
This is a good explanation!
@Susa270 2 years ago
Hello @Krish Naik, hope you are doing well 🙂 First of all, I would like to thank you for such knowledgeable videos. Most of the time your videos are really a beam of hope. Can you please let me know where I can check the actual code for the concepts mentioned above? It is a little difficult to apply them in a live scenario. Please guide, a humble request.
@fahimekheradmand5880 4 years ago
Excellent, thank you.
@mohiuddinshojib2647 1 year ago
That is really informative.
@ankurbanerji6605 3 years ago
Great explanation, sir! Can you explain how to handle missing values in multiple columns of a dataset?
@daniellazarolazaro1033 3 years ago
Thank you so much, this video actually helps a lot when you've just got started like me hahaha. As I was saying, thank you so much for this great great great work!!!
@ZUBINABRAHAM 3 years ago
Thanks for the video, it was informative. Can we use KNN?
@abdulhakeem4715 1 month ago
Clean explanation.
@anandacharya9919 4 years ago
Thank you for this video. Please also make a video on how to handle missing values and outliers in continuous variables.
@Nursin-rg1ey 1 year ago
Thanks very much, sir.
@madunishant6052 4 years ago
Thanks! 😊
@AmitYadav-ig8yt 4 years ago
One more question: in some datasets we find columns with many categories, e.g. a car name column will have many car names. In such a case, if we use this unsupervised technique to create clusters, won't there be too many clusters?
@AmitYadav-ig8yt 4 years ago
Sir, can we get the code for the "create a classifier algorithm" method for missing values?
@another_hindu 3 years ago
Hello sir, maybe I am here too late, but I still hope that you will acknowledge this question as it might be of immense value. I have a disputed question which basically revolves around the KNN imputer, scaling, and the concept of data leakage. As the KNN imputer works on the same principles as the KNN algorithm, it shares the pros and cons of the KNN algorithm, right? So won't it be better to simply scale the data first? Also, in case I am separating out the train and test data in order to avoid data leakage, should I split the data and then scale and impute? Or should I impute and then split and scale? In case I split first, which is the most common preference, which stats should I use for the user input? And lastly, how should I handle label-encoded columns, if any? Nobody is discussing this when it is one of the most important problems a person would likely face. Can you please make a video on this?
@divyaharshad9985 5 months ago
For technique 3, will it lead to multicollinearity in the data?
@thatguyadarsh 3 years ago
Amazing!! Use an ML model to predict the NaN values. That is clever, sir.
@muzamilshah8028 4 years ago
Let's say I want to predict the value for f1 in row 2, as you mentioned. But what if we also have missing values in f2 and f3, though not in the same row? What do we do in that scenario?
@MegaJaivardhan 4 years ago
Love you, bro. Could you make a video on AUC and the ROC curve?
@chandrasekarank8583 4 years ago
Sir, what if I label encode the data and then use a simple imputer, which will replace the NaN values with the mean or median as I want? Sir, please tell me whether this is a valid way to do it.
@hindajjouri9151 5 months ago
Thank you.
@aronpollner 1 year ago
Is there a multivariate imputer implementation for categorical values, like a class from sklearn?
@madhurchaudhary5109 3 years ago
Hi Krish, this is well explained!! I have an ID column which has unique values, but for some records the ID is null. How can I handle this type of data?
@lukaszmichalak9985 4 years ago
Don't you increase the correlation between features with those methods? If so, what will that bring to the output model, to the prediction?
@amitjajoo9510 4 years ago
Sir, thanks for making the feature engineering playlist.
@user-vy4jo3lt2v 9 months ago
If we want to apply the classifier algorithm to multiple columns, is that possible?
@shaileshsahu9551 4 years ago
Please add a video in the Data Science and ML playlist on how to create our own predictor or estimator classifier algorithm to predict both categorical and continuous variables.
@ashokpalivela311 3 years ago
Thank you 😍
@ele_wings7521 4 years ago
Thank you, sir...
@12tlittle 4 years ago
Thanks, Krish.
@abhipraydumka8587 4 years ago
Can you tell me how to assign a unique category, let's say U (undefined), to missing categorical data?
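A minimal sketch of what this question describes, in plain Python. It assumes missing entries are stored as `None`; the function name `fill_with_category` and the sample data are hypothetical and do not come from the video.

```python
# Assumption: a missing entry is represented by None.
def fill_with_category(values, placeholder="U"):
    """Replace every missing entry with a dedicated placeholder category."""
    return [placeholder if v is None else v for v in values]


# Hypothetical sample column with two missing entries.
gender = ["M", "F", None, "M", None]
filled = fill_with_category(gender)
print(filled)
```

With a pandas column, `Series.fillna("U")` should give the same result.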
@anuragmishra6262 4 years ago
Can you please show a practical implementation of the same? Thanks 😊
@nasiksami2351 3 years ago
Amazing!
@tahamansoor599 4 years ago
It's great. It would be better if you showed us a hands-on example with a dataset.
@Saikrishna-lx9it 4 years ago
Hi bro, can you make an end-to-end chatbot video using Rasa NLU? It would be useful for everyone interested in NLP.
@Raja-tt4ll 4 years ago
Very nice video.
@theoutlet9300 3 years ago
Since we are using the output to predict our feature and then the feature to predict our output, wouldn't it cause problems in prediction?
@bismeetsingh352 4 years ago
What do you do when you have missing values in textual data?
@pankajkar2008 4 years ago
Pure concepts.
@kumarraju2923 4 years ago
How are the initial clusters selected for missing values?
@clivefernandes5435 4 years ago
Is method 3 widely used? Never heard of it.
@raghavkumar8333 4 years ago
Sir, I have a student attrition dataset where I need to predict the reasons for students who were admitted in the 1st year dropping out in the 2nd year. A year consists of 2 terms, and I have the students' grades (a, b, c, d) in 6 different courses in the 1st and 2nd terms. Most of the grade columns for the 6 courses in the 2nd term are missing. Intuitively, I think this could be a reason for dropping out. My questions are: 1) Should I impute missing values in this case? It is possible that the values are not really missing because those students had already dropped out. So should I create dummy variables instead? 2) If I impute the missing values, what technique should I use for those missing categorical variables?
@saurabhpathare4157 3 years ago
I am always reluctant to delete or use the mode for categorical values. This video explains a lot. Good approach! In technique 3, which classifier do you recommend for best efficiency?
@riteshmukhopadhyay6922 2 years ago
KNN. There is no particular way as such; it depends on the dataset.
@sachinborgave8094 4 years ago
Excellent, sir. Can you please provide Python source code, i.e. how to fill missing category data using logistic regression?
@sadikbilal5149 2 years ago
Nice. Please, do you have code to implement these techniques?
@AutitsicDysexlia 3 years ago
This is what I did in DAX, but I did it in a more complex way... because I was using DAX. But it's effectively a RandomForest method that I used.
@sachinborgave8094 4 years ago
Hello sir... Please make a video on how to fill missing categories using logistic regression...
@ommehta4501 2 years ago
If we have a date categorical feature with some missing values, please tell me how to deal with it.
@sandyjust 4 years ago
Great explanation of the concept. With the unsupervised technique, we might be in a situation where both male and female fall under group 2. Then what would our approach be?
@kaustabhmandal7483 4 years ago
I also observed that in this video. You can assign the category with the maximum frequency in that cluster.
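The suggestion in the reply above can be sketched as follows. This is only an illustration: it assumes the cluster labels have already been produced by some clustering step on the other features (not shown), that missing categories are stored as `None`, and that at least one category is known. The function name and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def impute_by_cluster_mode(categories, cluster_ids):
    """Fill missing categories (None) with the most frequent known
    category among the rows that share the same cluster label."""
    counts = defaultdict(Counter)
    for cat, cid in zip(categories, cluster_ids):
        if cat is not None:
            counts[cid][cat] += 1

    # Fallback for a cluster with no known category at all: overall mode.
    overall = Counter(c for c in categories if c is not None)

    filled = []
    for cat, cid in zip(categories, cluster_ids):
        if cat is None:
            source = counts[cid] if counts[cid] else overall
            cat = source.most_common(1)[0][0]
        filled.append(cat)
    return filled


# Hypothetical data: cluster 1 is mostly "M", cluster 2 is mostly "F".
gender = ["M", "M", "F", None, "F", "F", "M", None]
clusters = [1, 1, 1, 1, 2, 2, 2, 2]
imputed = impute_by_cluster_mode(gender, clusters)
print(imputed)
```

A mixed cluster, as in the question, simply resolves to whichever category is more frequent inside it.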
@192Kiran 4 years ago
Krish, could you please do this with datasets?
@AmitYadav-ig8yt 4 years ago
Just a request: could you please upload the code for this as well? I saw that in many videos the code for the techniques is missing. It would be very helpful if you provided us with the code. Thanks a lot.
@preetnandeshwar5331 3 years ago
Which missing categorical value method suits which dataset, and why? Or do we just have to use trial and error? Please, anyone, help me. I am a beginner.
@aditya_baser 4 years ago
Here, you only had one categorical column. What if you have multiple categorical columns, how do you go about with the missing value treatment in that case?
@mitultank7872 2 years ago
If I have missing values in a numerical column and I want to fill them based on another categorical column, how can I handle that?
@VikasSharma-ye7pu 4 years ago
Hi Krish... Please make a video explaining 2 Kaggle competition projects...
@RAJI11000 4 years ago
Sir, how can I impute if the feature value is like "100 mbps"?
@napoleonx5259 1 year ago
Well done, Krishna ❤
@chinmaybhat9636 4 years ago
Can you demonstrate the same thing using a dataset?
@chirumadderla8129 2 years ago
If there are several missing values in solar radiation data during the night and early morning hours, how should I handle them? The dataset I am considering covers one year.
@ashwanikumar-zh1mq 3 years ago
How to handle this in a regression problem?
@sriraj8392 2 years ago
Sir, will you teach offline classes?
@ashwinkrishnan4285 3 years ago
If we apply a classifier algorithm to predict whether the Gender feature is male or female using the other features, including the output feature, in the training dataset, and fill in the missing values of the Gender feature (the test dataset), then when we finally train the model to predict the output, I suspect it would be influenced, or data leakage would have happened, since we used the output to fill the missing column values. Please clarify this point, Krish.
@chirathabey7729 3 years ago
It won't matter much, because even though we train with the output feature included, it is used ONLY for predicting the missing samples, and there are far fewer missing samples than complete ones. If the number of missing samples is considerably high and they occur in many other features, then it will certainly create a bias in the final prediction.
@sandipansarkar9211 2 years ago
Finished watching.
@jaiminshah143 3 years ago
How to handle missing (NaN) values in a column with binary values, i.e. just 0 or 1?
@janinajochim1843 4 years ago
Thank you for the video! Would you happen to know what to do in cases where the value is "missing by design"? I have a case where I am using the variable "Father's reaction to pregnancy". It has missing values for participants who did not know the father of the child, because they weren't asked this question :/
@sawradipsaha5377 4 years ago
Maybe you can consider that as a separate category.
@RajaKumar-ne9bt 2 years ago
Why are we skipping the output when doing clustering?
@RK-un6ou 3 years ago
Why do we fill NaN values with the mean or median? And why doesn't it affect the dataset? Can you explain this a bit?
@1a17890 3 months ago
Sirji, can you kindly show how it's done?
@Analystmind 1 year ago
What if my model's missing values are not categorical but numeric?
@shivambhayre5056 4 years ago
If it is a quantitative variable, we can replace missing values with the mean.
@AmitYadav-ig8yt 4 years ago
Is it a question? If yes, then yep, you can use the mean to replace quantitative missing values.
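For the quantitative case mentioned in these replies, a minimal sketch using only the standard library could look like this. Missing entries are assumed to be `None`, and the function name and sample values are made up for illustration.

```python
from statistics import mean


def fill_with_mean(values):
    """Replace missing numeric entries (None) with the mean of the known ones."""
    known = [v for v in values if v is not None]
    column_mean = mean(known)
    return [column_mean if v is None else v for v in values]


# Hypothetical numeric column with two missing entries.
ages = [20, None, 30, 40, None]
filled = fill_with_mean(ages)
print(filled)
```

Swapping `mean` for `statistics.median` gives the median variant, which is less sensitive to outliers.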
@analistaremoto 3 years ago
Niiiiiice!
@AmitYadav-ig8yt 4 years ago
You said to create a classifier to predict the missing values. What should we do if we have a linear regression problem with missing values? Should we create a classifier for that too? Please respond.
@chirathabey7729 3 years ago
Yes, if you are trying to predict a missing value that belongs to a categorical variable. When you are predicting the missing value, your output variable is the variable with the missing value, and the rest of the variables become the input variables. You can think of it as solving an entirely independent problem.
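The idea in the reply above, treating the column with missing values as the target and the other columns as inputs, can be sketched like this. The classifier here is a tiny hand-written k-nearest-neighbours vote chosen only to keep the example self-contained; the video does not prescribe a specific classifier. The function name, the features, and the data are all hypothetical, and a real dataset would usually need feature scaling first.

```python
from collections import Counter
from math import dist


def knn_impute(features, target, k=3):
    """Predict each missing target value (None) by majority vote among
    the k nearest rows whose target value is known."""
    # Rows with a known target act as the training set.
    train = [(x, y) for x, y in zip(features, target) if y is not None]

    filled = []
    for x, y in zip(features, target):
        if y is None:
            nearest = sorted(train, key=lambda row: dist(row[0], x))[:k]
            y = Counter(label for _, label in nearest).most_common(1)[0][0]
        filled.append(y)
    return filled


# Hypothetical toy data: (height_cm, weight_kg) used to predict gender.
features = [(180, 80), (175, 78), (182, 85), (160, 55),
            (158, 52), (165, 58), (178, 82), (161, 54)]
gender = ["M", "M", "M", "F", "F", "F", None, None]
imputed = knn_impute(features, gender)
print(imputed)
```

In practice a library classifier such as scikit-learn's `KNeighborsClassifier` or `LogisticRegression` would normally be fitted on the complete rows instead.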
@AmitYadav-ig8yt 4 years ago
Sir, you took a dataset which has missing values in just one column. You talked about predicting the missing values by using the other columns as the training set. Let's say we have a dataset in which every column has some missing values. In such a case, which columns should be used to predict the missing values?
@kannadarecipes-6626 4 years ago
Following
@habilmohammed5127 4 years ago
Following
@leilafakhraei78 4 years ago
Following
@barnadipdey8486 4 years ago
Yes Amit, I have the same query. If you have solved this, please DM me.
@mohammadarif8057 4 years ago
Sir, can you provide a practical approach with a complex dataset? That would be great, thank you.
@akshayvilayatkar7985 4 years ago
How can we handle alphanumeric missing values in a dataset? I cannot get out of this problem. Please help, Krish.
@shaikhkashif9973 1 year ago
Sir, should we handle outliers first or fill null values first? Please answer.
@Justme-dk7vm 1 month ago
Sir, why do you have the same voice as my college chairman? 😩💓
@archanapereira1333 4 years ago
How to identify dependent and independent variables in a dataset?
@chirathabey7729 3 years ago
It depends on the problem description, which states what the problem is. Your output (dependent) variable gives the answer to your problem. The rest of the features become your independent variables.
@nhprml6324 4 years ago
We can replace missing values with the corresponding feature's mean value.
@cutyoopsmoments2800 4 years ago
Bro, I want to make my career in machine learning. Kindly guide...
@dineshkumar-kc7vt 4 years ago
I'm unable to overcome this problem. I initially applied get_dummies to the dataset, and now I want to handle the missing values, but I'm getting this error: TypeError: '(slice(None, None, None), slice(0, 2, None))' is an invalid key. Please help me.
@chirathabey7729 3 years ago
Do the missing value treatment first, before you apply one-hot encoding.
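The ordering recommended in the reply above (impute first, then one-hot encode) can be illustrated with a small plain-Python sketch. It is not a diagnosis of the specific `TypeError` in the question. Both helper functions and the sample column are hypothetical, and missing entries are assumed to be `None`.

```python
from collections import Counter


def impute_mode(values):
    """Replace missing entries (None) with the most frequent known category."""
    mode = Counter(v for v in values if v is not None).most_common(1)[0][0]
    return [mode if v is None else v for v in values]


def one_hot(values):
    """Return the sorted category names and one 0/1 row per input value."""
    categories = sorted(set(values))
    return categories, [[int(v == c) for c in categories] for v in values]


# Hypothetical column with one missing entry.
colour = ["red", None, "blue", "red"]

# Step 1: missing value treatment. Step 2: encoding.
categories, encoded = one_hot(impute_mode(colour))
print(categories)
print(encoded)
```

Doing the imputation first means the encoder only ever sees complete categories, so no extra or ambiguous column is created for the missing entries.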
@junaidlatif2881 1 year ago
But how to apply it?
@vasusharma1773 4 years ago
Sir, if you could just show this in code, it would be very helpful.
@arjyabasu1311 4 years ago
Sir, please upload the implementation of these methods!!
@harshtiwari8765 3 years ago
Can you send me the notes for feature engineering which were given by Krish Naik? Help is appreciated.
@jaypatil4786 4 years ago
I have one easy question, but I can't remember the answer now: please tell me how to view how many missing values there are in a dataset.
@saravananm2280 4 years ago
dataset.isnull().sum()
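The one-liner in the reply above is the pandas way, where `dataset` is a DataFrame. To show what it computes, here is a rough plain-Python equivalent for a list of row dictionaries. The function name and sample rows are hypothetical, and missing entries are assumed to be `None`.

```python
def count_missing(rows):
    """Count missing entries (None) per column in a list of row dicts."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            counts[column] = counts.get(column, 0) + (value is None)
    return counts


# Hypothetical rows with missing entries in both columns.
rows = [
    {"age": 25, "gender": "M"},
    {"age": None, "gender": "F"},
    {"age": 31, "gender": None},
    {"age": None, "gender": "F"},
]
missing = count_missing(rows)
print(missing)
```

Like the pandas call, this reports one count per column, which helps decide which columns need imputation at all.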
@martinlyuba5105 9 months ago
Great tutorial. Your email, please.