
Live-Feature Engineering-All Techniques To Handle Missing Values- Day 1 

Krish Naik
1M subscribers
132K views

github link: github.com/kri...
Join the Ineuron Affordable course
ineuron1.viewp...
Please donate if you want to support the channel
Gpay: krishnaik06@okicici
Please join as a member of my channel to get additional benefits such as Data Science materials, live streams for members, and much more
/ @krishnaik06
Please subscribe to my other channel too
/ @krishnaikhindi
Connect with me here:
Twitter: / krishnaik06
Facebook: / krishnaik06
instagram: / krishnaik06

Published: 12 Jul 2020

Comments: 136
@nikhilparmar9 4 years ago
Very helpful. When you said you are putting these useful tutorials in the public domain so that people can learn and get help finding a job during this Covid situation... Man, you just earned a lot of respect and blessings. Thank you Krish. 🤟🏻🙏🏻
@harshsinghal1087 3 years ago
Superb, thanks for the knowledge. The session starts at 7:40; noting it here to save others time.
@adamsmohammed4499 4 years ago
God bless Krish Naik for everything he has done so far on his channel. The channel is a gold mine. You are indeed God-sent.
@KeDeng-fm3bs 4 months ago
I cannot thank you enough for all your videos and efforts, sir. You are an AMAZING teacher, and I wish you all the best all the time. Many thanks and blessings to you! Thank you!!!
@sandipaghosh8850 2 years ago
Sir, I am watching this video today, and I really want to thank you. You are an amazing person, teacher, and a great motivator. Your YouTube channel is a gold mine, really, sir. If I succeed in my life one day, a lot of the credit will be yours, sir 🙏🏻🙏🏻. Please bless us as our guru, and God bless you for doing such an amazing job every day for all the students in need.
@sivasambhulenka3086 4 years ago
The man is very honest; one can learn a lot here. Very helpful.
@shilashm5691 2 years ago
MCAR (missing completely at random): the missingness has no relationship with either the observed or the unobserved values; it is purely random. E.g. we visit a library, go through the entry book, and notice that 5% of the records are missing. Those 5% could be any records in the book, and there is no particular reason why they are missing, so the probability of being missing is the same for every record in the dataset.
MNAR (missing not at random): the missingness is related to the unobserved (missing) values themselves, not to the observed values. E.g. we conduct a survey on depression. Each individual chooses whether to respond, so the missingness depends on something about the individual that we never observe.
MAR (missing at random): the missingness is related to the observed values, not to the unobserved ones, and it can be concentrated in a subgroup of the observed data. E.g. we conduct a salary survey in which most men and only some women do not report their salary, so the missing salary values are related to gender.
I hope this clears up the doubt. Don't rely on a single source; nobody knows everything, so don't trust anyone 🤣
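The three mechanisms described above can be simulated on toy data. This is only a sketch: the survey columns, the probabilities, and the helper name `missing_rate_by_gender` are all invented for illustration and do not come from the video.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 10_000

# Toy survey data (entirely made up for illustration).
df = pd.DataFrame({
    "gender": rng.choice(["F", "M"], size=n),
    "salary": rng.normal(50_000, 15_000, size=n),
})

# MCAR: every record has the same 5% chance of being missing.
mcar_mask = rng.random(n) < 0.05

# MAR: the chance of a missing salary depends on an observed column (gender).
p_mar = np.where(df["gender"] == "M", 0.40, 0.10)
mar_mask = rng.random(n) < p_mar

# MNAR: the chance of a missing salary depends on the salary itself.
p_mnar = np.where(df["salary"] > 65_000, 0.50, 0.05)
mnar_mask = rng.random(n) < p_mnar


def missing_rate_by_gender(mask):
    """Share of missing salaries within each gender group."""
    return pd.Series(mask).groupby(df["gender"]).mean()


print("MCAR:", missing_rate_by_gender(mcar_mask).round(3).to_dict())
print("MAR: ", missing_rate_by_gender(mar_mask).round(3).to_dict())

# Under MNAR the salaries that remain observed are biased downward.
true_mean = df["salary"].mean()
observed_mean_mnar = df.loc[~mnar_mask, "salary"].mean()
print("true mean:", round(true_mean), "observed mean under MNAR:", round(observed_mean_mnar))
```

Under MCAR the missing rate is about the same in both groups, under MAR it differs by gender, and under MNAR the problem is not visible from the observed columns at all, which is what makes it the hardest case to detect.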
@mahalerahulm 3 years ago
This is how you should learn things. Agreed!!
@gunjantoora863 2 years ago
You teach better than most professors at universities in the US
@Mon_isha09 4 years ago
It's very helpful. You are an inspiration to me: whenever I get stuck on a topic, I first think of you and your videos, and that restores my confidence. Thanks a lot.
@yahya89able 4 years ago
Your channel is disturbingly fantastic
@sumankumari-gl3ze 1 year ago
I really like all your videos, sir. Your way of teaching is very good. Thank you, sir.
@Maverick-ld2xc 4 years ago
Bless you brother. All nice instructive videos
@vkasrajpurohit1614 3 years ago
You enumerate so well.
@simplytech222 3 years ago
Love your streams; they keep me motivated to keep learning every day and get better.
@fun_fin3704 3 years ago
This content helps me a lot, and I must say your way of teaching is amazing ❣️
@geekyprogrammer4831 3 years ago
Very comprehensive!! I loved each and every bit of it
@tanujsharma5492 2 years ago
I'm learning DS from different playlists on your channel, sir! Such a nice experience. But this is the first time I learned that you speak Hindi so well 😁❤
@pavithrad9543 3 years ago
Great job, Krish. Your sessions are very useful for many people.
@sameerpandey5561 3 years ago
Thanks for such wonderful sessions.
@victorhenostroza1871 2 years ago
Another disadvantage: you don't account for the impact of the other X variables, so I prefer multivariate imputation techniques like MICE. Thanks for your video.
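As a rough sketch of the multivariate approach mentioned above, scikit-learn's `IterativeImputer` can be compared with a plain column mean. Note the assumptions: `IterativeImputer` is MICE-inspired rather than a full multiple-imputation procedure, it is still marked experimental in scikit-learn, and the two-column dataset below is invented for illustration.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (required to enable the import below)
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(0)

# Made-up data: the second column is roughly twice the first.
x1 = rng.normal(size=200)
x2 = 2 * x1 + rng.normal(scale=0.1, size=200)
X = np.column_stack([x1, x2])

# Hide 40 values of the second column.
X_missing = X.copy()
missing_rows = rng.choice(200, size=40, replace=False)
X_missing[missing_rows, 1] = np.nan

# Multivariate imputation: each missing value is predicted from the other column.
imputer = IterativeImputer(max_iter=10, random_state=0)
X_mice = imputer.fit_transform(X_missing)

# Univariate baseline: fill every gap with the column mean.
col_mean = np.nanmean(X_missing[:, 1])

err_mean = np.abs(col_mean - X[missing_rows, 1]).mean()
err_mice = np.abs(X_mice[missing_rows, 1] - X[missing_rows, 1]).mean()
print("mean imputation error:", round(err_mean, 3))
print("iterative imputation error:", round(err_mice, 3))
```

The gain here comes entirely from the strong relationship between the two columns; on data where the other variables carry little information about the missing one, the two approaches would end up much closer.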
@shivanggoyal7508 4 years ago
I love you, sir... how well you teach... I really enjoyed it... my interest in data science has grown even more.
@lilyfullery4779 1 year ago
Hi Krish, I appreciate your work. Thank you for this video.
@marijatosic217 3 years ago
Had finals at my Uni, so happy to be back :D
@shikhar_anand 3 years ago
Krish... Thank you for a wonderful video and free knowledge...
@ashishanand1466 3 years ago
Thanks for your contribution; it means a lot to us.
@parthsonagara9562 3 years ago
Thank You Krish, for providing such tutorials.
@anikasingh2464 3 years ago
Loved it ❤️
@gurdeepsinghbhatia2875 4 years ago
Very, very nice, sir. You are doing a great job. Huge respect, sir.
@srishtikumari6664 3 years ago
Amazing! You are doing great work.
@jothiramsanjeevi6469 4 years ago
Thanks 😊 for the video!
@soulfood_12 3 years ago
Really good videos. I am learning a lot from your tutorials.
@alihaiderabdi9939 2 years ago
Thanks a ton, Krish sir! Very informative.
@shubhamyeole2881 2 years ago
Very helpful lecture, sir.
@maheshvangala8472 4 years ago
@Krish Naik what are your views on MATLAB? I heard that MATLAB helps you understand ML algorithms by implementing them yourself, while Python comes with built-in functions.
@MdRakibHosen 3 years ago
Awesome tutorial. Thanks bro.
@ushirranjan6713 4 years ago
Amazing Video Sir!! Thanks!!
@suryaprakash6564 2 years ago
Thank you so much; this tutorial is very helpful.
@laythherzallah3493 2 years ago
Great, thank you.
@dra.talwar4592 3 years ago
Nice... much needed. By the way, that Hindi punchline was spot on: "you can find everything on Google" 😁
@benedict6695 3 years ago
Thanks Krish!
@priyankgupta3931 3 years ago
Wonderful session Sir!
@akshajshah4040 4 years ago
Great, great job. Hats off.
@rajeshgaikwad9343 4 years ago
@19:00 - I think there is no such thing as discrete continuous data. Data can be either discrete or continuous. Correct me if I'm wrong.
@abhishekvarshney9961 4 years ago
I think he meant to write Quantitative data instead of Continuous data.
@affanazam209 3 years ago
Continuous data can be discrete.
@pankajkumarbarman765 4 years ago
Wonderful session sir 🔥💖 thank you very very much
@manishsingh278 4 years ago
Thank you sir, it's a great learning experience.
@mukeshkumar-kh2fh 2 years ago
Sir, can we replace the NaN values of a column with a mean computed only from rows where another column falls in a particular range? For example, if the BMI column has a NaN and that person's age is 45, we first find the mean BMI of people aged 40 to 50 and impute with that. Similarly, for every other person with a NaN BMI, we check that person's age, pick the matching age interval, compute the mean, and replace.
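The idea in the question above can be sketched in pandas with `pd.cut` and a group-wise `transform`. The column names, the 10-year bands, and the values are assumptions made up for illustration.

```python
import numpy as np
import pandas as pd

# Toy data (invented): two people have no BMI recorded.
df = pd.DataFrame({
    "age": [42, 45, 48, 41, 23, 27, 25, 29],
    "bmi": [27.0, np.nan, 29.0, 28.0, 21.0, 22.0, np.nan, 23.0],
})

# Put each person into a 10-year age band: [20, 30), [30, 40), [40, 50), ...
df["age_band"] = pd.cut(df["age"], bins=list(range(0, 101, 10)), right=False)

# Mean BMI of each band, broadcast back to every row of that band.
band_mean = df.groupby("age_band", observed=True)["bmi"].transform("mean")

# Fill gaps with the band mean; fall back to the overall mean
# in case a band has no observed BMI at all.
df["bmi_imputed"] = df["bmi"].fillna(band_mean).fillna(df["bmi"].mean())

print(df[["age", "bmi", "bmi_imputed"]])
```

The 45-year-old gets the mean of the 40 to 50 band and the 25-year-old gets the mean of the 20 to 30 band. The same pattern works with any categorical column as the grouping key, and with `"median"` in place of `"mean"`.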
@mohitpatidar8880 3 years ago
Thank you so much, it is very helpful
@ShaneZarechian 1 month ago
You should make a more distilled version of this video
@venkatasaikumarb706 3 years ago
Thank you so much, sir.
@soujanyapanasa6662 2 years ago
Thanks, very useful.
@raj-nq8ke 3 years ago
A million likes for this video.
@shankarprabhur1813 1 year ago
Hi sir, if a unique ID is missing, how do we handle that? For example, in a customer data set where the customer ID is missing, what should we do? (Constraint: we should not delete rows based on the customer ID column.)
@user-jc9nv6lj9u 4 years ago
Hey Krish: Can you do a video on how to model outliers for time series analysis?
@vaibhavshukla9777 4 years ago
Thank you so much Sir ❤️🌟
@rehanbaig71 4 years ago
For the most frequent occurrence we calculate the mode, while the median is the central value of a set of values. Sir, you said we would replace NaN values with the most frequent value of the variable, but after that part of the video you started calculating the median. If I am confused, please correct me. Thanks.
@ShoaibKhan-sd1sr 2 years ago
+1
@sujanb3513 3 years ago
Doubt: the rows with NaN in Embarked have the cabin value 'B28'. Is there any relationship between Cabin and Embarked?
@dallalstreet1775 4 years ago
Thank you so much Krish!!!
@sandipansarkar9211 2 years ago
Finished watching.
@manojrangera5955 3 years ago
Mean/median/mode imputation is meant for data that is MCAR, but Age is not MCAR, so I didn't get why we use it there. Can anybody explain? I am confused.
@siccasim1520 1 year ago
@KrishNaik So, to help me understand better: the variables Cabin and Age are missing not at random because both are missing for the same reason, namely that the passengers did not survive? Is this correct?
@amanjyotiparida5818 3 years ago
54:02 we also studied recursion.
@jainitafulwadwa8181 3 years ago
Mean imputation is not robust to outliers. Only median and mode are.
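A small standard-library sketch of the point above; the ages and the outlier are invented for illustration.

```python
from statistics import mean, median, mode

# Made-up ages, then the same list with one data-entry error appended.
ages = [22, 24, 24, 26, 28, 30, 32]
ages_with_outlier = ages + [400]

print(mean(ages), median(ages), mode(ages))

# One bad value drags the mean to 73.25, while the median only moves
# from 26 to 27.0 and the mode stays at 24.
print(mean(ages_with_outlier), median(ages_with_outlier), mode(ages_with_outlier))
```

Whichever statistic is used as the fill value inherits this behaviour, which is one reason the median is a common default for skewed numeric columns.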
@paragsonawane3685 3 years ago
Very helpful
@asawanted 3 years ago
What was the purpose of adding Cabin_null? Why didn't we just do df['Cabin'].isnull().mean()? Edited: Yes, got it. So if you do mean * 100, you get the percentage.
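A minimal sketch of the two options discussed above, on a tiny stand-in for the Titanic data. The rows are invented, and the column name `cabin_null` is assumed from the comments rather than taken from the notebook.

```python
import numpy as np
import pandas as pd

# Tiny stand-in for the Titanic data (values invented for illustration).
df = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 0, 1],
    "Cabin": [np.nan, "C85", np.nan, np.nan, "E46", "C123"],
})

# Option 1: fraction of missing cabins straight from the boolean mask.
frac_missing = df["Cabin"].isnull().mean()
print("percent missing:", frac_missing * 100)

# Option 2: store the same information as a 0/1 indicator column...
df["cabin_null"] = np.where(df["Cabin"].isnull(), 1, 0)

# ...which can then be compared across another column.
by_survival = df.groupby("Survived")["cabin_null"].mean()
print(by_survival)
```

Both options give the same overall fraction. Keeping the indicator as a column is mainly useful for the group-wise comparison, which reports a missing rate per group, so a survivor with a missing cabin is consistent with the non-survivor group simply having a higher rate.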
@asawanted 3 years ago
Does imputation result in overfitting? For example, in this case there are a lot of NaN values and they are all replaced by the median, so when plotted, a lot of values will sit on or close to the median.
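The distortion described above is easy to see on simulated data. This sketch uses invented ages and hides about 20% of them completely at random; it only shows the change in spread and says nothing about whether a given model would overfit.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Made-up ages with roughly 20% of the values removed at random.
age = pd.Series(rng.normal(30, 14, size=1000))
age_missing = age.copy()
age_missing[rng.random(1000) < 0.2] = np.nan

# Fill every gap with the median of the observed values.
age_median = age_missing.fillna(age_missing.median())

std_before = age_missing.std()  # NaN values are skipped
std_after = age_median.std()
print("std of observed values:", round(std_before, 2))
print("std after median imputation:", round(std_after, 2))
```

The imputed column has a smaller standard deviation and a spike at the median, so the more values are missing, the more the original distribution is distorted.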
@jaiminshah143 3 years ago
How do we handle missing (NaN) values in a column with binary data, i.e. just 0 or 1?
@manavshah2119 3 years ago
Sir, how is Embarked related to the missing values in Cabin and Age? Can you give a brief explanation? I am not clear on some points.
@ShoaibKhan-sd1sr 2 years ago
Sir, in this video at 1:06:16: if the mean/median/mode method means replacing NaN values with the most frequently occurring value, why are we using the median? Why not the mode, since the mode is what finds the most frequent value in a set? Waiting for your reply.
@eminembts2832 2 years ago
Because there might be outliers, and who knows, the outliers could even be the most frequent values, so I think the median is the best way to get things done.
@PoojaKumari-kz7iz 4 years ago
How do we build code so that text that has already been predicted is not processed again, and prediction runs only on newly added text? Can you please suggest some ideas for implementing this?
@satyakidhar2058 4 years ago
Hello Sir, I am getting a bit confused while analysing the percentage of survival and death using groupby and cabin_null at 48:22. In the Survived column there are also rows with Survived = 1 where Cabin is NaN, e.g. the 3rd and 9th rows. Can you please clear up my doubt? Thank you, you are a true mentor.
@rashedin6356 10 months ago
The missing values show up randomly, irrespective of whether the passengers survived or not.
@jitenkumarsahoo667 4 years ago
Hi sir, can we handle missing values by using the mean/mode of the corresponding category of another feature, instead of the mean of the whole column?
@chaitu037 4 years ago
We get the notification partway through the video, after 30 minutes or so. Is there anything that could be done so that we get the notification ahead of time?
@nayanmehta9552 2 years ago
Hi Krish, I am new to your videos. Can you please tell me which playlist I should start with?
@stuttzzzi 2 years ago
Love your videos. If I get a job, I'll give my first salary to your channel/you.
@detacreations1999 3 years ago
Instead of using the median to fill the NaNs in Age, can we use the mode?
@dugeshwarify 7 months ago
Main content starts from 22:15
@abrahammathew9783 2 years ago
Hi Krish, I have a couple of doubts. 1. Here the Age feature is not MCAR but rather MNAR, right? It is missing because those passengers died. So why have we used mean/median imputation? 2. With median imputation, most of the data ends up at the center, so the impact is greater on kurtosis than on spread/variance. Could you clarify?
@shilashm5691 2 years ago
Yes, the Age feature is MNAR, because it has a relationship with Survived and the missing values are not due to randomness.
@muhammadbilalhaneefqureshi48
He considered only three variables (Age, Fare, and Survived) from the dataset, not the whole dataset. Now you have only three features, irrespective of the relationships we found in the full dataset. You have to determine MNAR, MCAR, or MAR from these three features alone. As we have seen, Fare and Survived had no null values, so Age is now treated as MCAR; that's why he used mean/median/mode for Age.
@ocean2738 1 year ago
But if it also depends on Survived (observed data), shouldn't it be MAR?
@ocean2738 1 year ago
Then why is this imputation used?
@ocean2738 1 year ago
Explain this thing.
@gauravfamily2209 2 years ago
Confusion about the mean/median/mode technique. It's not clear: where do we use mean/median/mode under MCAR?
@muhammadbilalhaneefqureshi48
He considered only three variables (Age, Fare, and Survived) from the dataset, not the whole dataset. Now you have only three features, irrespective of the relationships we found in the full dataset. You have to determine MNAR, MCAR, or MAR from these three features alone. As we have seen, Fare and Survived had no null values, so Age is now treated as MCAR; that's why he used mean/median/mode for Age.
@rahulbhardwaj6487 3 years ago
When we compute the mean or standard deviation of a feature that has missing values (NaN), is the NaN ignored or replaced by zero during the computation?
@raj-nq8ke 3 years ago
It is ignored.
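A quick check of the answer above, assuming the computation is done with pandas, as in the video. The three values are invented.

```python
import numpy as np
import pandas as pd

s = pd.Series([10.0, np.nan, 30.0])

# pandas skips NaN by default: (10 + 30) / 2
mean_default = s.mean()

# What the result would be if NaN were treated as zero instead: (10 + 0 + 30) / 3
mean_zero_filled = s.fillna(0).mean()

# Skipping can be switched off, in which case the result is NaN.
mean_no_skip = s.mean(skipna=False)

# Plain NumPy does not skip NaN unless the nan-aware function is used.
numpy_mean = np.mean(s.to_numpy())
numpy_nanmean = np.nanmean(s.to_numpy())

print(mean_default, mean_zero_filled, mean_no_skip, numpy_mean, numpy_nanmean)
```

The same default applies to `std()`, `median()`, and the other pandas reductions, so statistics computed for imputation are based on the observed values only.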
@sreigurushyam 4 years ago
Hi Krish, we have to apply mean/median/mode to MCAR variables, not MNAR ones, right? And as per the video, Age is MNAR in the Titanic dataset, right? Not sure if I have missed something.
@anjalynair 3 years ago
I have the same doubt.
@swethabeeram6106 3 years ago
I have the same doubt too. Can anyone please help?
@muhammadbilalhaneefqureshi48
He considered only three variables (Age, Fare, and Survived) from the dataset, not the whole dataset. Now you have only three features, irrespective of the relationships we found in the full dataset. You have to determine MNAR, MCAR, or MAR from these three features alone. As we have seen, Fare and Survived had no null values, so Age is now treated as MCAR; that's why he used mean/median/mode for Age.
@detacreations1999 3 years ago
When I run the subplot part, it says the module is not callable. Why?
@rahalmehdiabdelaziz8121 3 years ago
The Day 1 notebook is not on GitHub.
@spicytuna08 2 years ago
How is cabin data missing? When someone boards a ship, I would assume a cabin number is assigned, so that data should not be missing. Each name should have a cabin number associated with it, IMO.
@manishchauhan5625 2 years ago
This data was collected after the accident happened. Most of it was gathered by talking to survivors, and some came from stored records. So when people are dead and we don't have their data in the records, we can't get it by talking to them either; hence it is missing.
@rahulpandey735 2 years ago
Krish, I want to join your DS membership but am unable to because of a technical issue. Can you provide the GPay account?
@kaviarasan4999 4 years ago
Can anyone tell me, please: can I follow these live videos with only the earlier feature engineering videos? Please tell me if there is any prerequisite video in the playlist.
@kolukuluriaditya2284 3 years ago
Follow Krish Naik's feature engineering playlist.
@ManishSharma-tp3eb 3 years ago
high bias high variance cause overfitting
@utkarshkathuria2931 2 years ago
You said that MCAR is when there is no relationship between the missing data and any other values in the dataset. But when explaining mean/median/mode you used Age, and Age is related to Cabin. Sorry, I didn't understand the logic here.
@eminembts2832 2 years ago
But he didn't include Cabin in that part of the dataset; he took Age, Fare, and Survived, and I think Fare and Survived don't have NaN values.
@muhammadbilalhaneefqureshi48
He considered only three variables (Age, Fare, and Survived) from the dataset, not the whole dataset. Now you have only three features, irrespective of the relationships we found in the full dataset. You have to determine MNAR, MCAR, or MAR from these three features alone. As we have seen, Fare and Survived had no null values, so Age is now treated as MCAR; that's why he used mean/median/mode for Age.
@bhaveshchiplunkar8437 3 years ago
Not able to find this notebook in the GitHub repository.
@RashmiKumari-kz5zt 3 years ago
Hi Krish, how do I get a membership for your channel?
@amishbhatnagar2976 2 years ago
How do I get your membership?
@user-oz2ng6ne1x 1 year ago
Video Starts around 7 minutes in
@sapawar007 3 years ago
Hello Sir, I am unable to buy the plan. Is it still available?
@bharathbn9225 4 months ago
watching the playlist in 2024
@the-ghost-in-the-machine1108
Bro, can you add English subtitles to your videos? Just some feedback!
@sweetyscandor1113 4 years ago
If Age is MNAR, why did you use an MCAR method to replace the Age column?
@pratikbhansali4086 3 years ago
Did you get the answer?
@122arvind 4 years ago
I have 15+ years of experience in sysadmin and am learning ML and DL quickly. Could I succeed? Will someone consider me? I like DS a lot now. Should I switch to this field?
@saurabh3614 3 years ago
First ask yourself why you want to switch to DS. Is your current field not giving you enough learning to survive in IT? Salary must not be the only reason to get into DS (I don't understand what you mean by "I like DS a lot"; it's not a sweet or a pretty girl or something). The DS field is not a piece of cake: you should have analytical thinking plus knowledge of math, statistics, linear algebra, and probability. With 15+ years of experience, you have to compete not only with fresh graduates but also with people who have 2+ years of experience in DS. The only way to differentiate yourself is by showcasing managerial experience, client-facing experience, solution design, and good storytelling (mainly soft skills), on top of everything a fresh college graduate brings in DS/ML. So I hope you get the intuition here. Most importantly, your learning rate also matters, which mainly depends on the model inside your brain and how quickly you adapt to DS; it should be neither too small nor too big. All of the above is the null hypothesis with a significance value of 50%. Enjoy.
@122arvind 3 years ago
@@saurabh3614 Thanks for the feedback. How long have you been in the DS field?
@rohitkamra1628 4 years ago
What is the Telegram group name? Can someone share it?
@nisamlc4685 4 years ago
@krishnaik06
@nisamlc4685 4 years ago
First of all, go and try, bro.
@rohitkamra1628 4 years ago
@@nisamlc4685 thanks bro🙂
@swethabeeram6106 3 years ago
@@nisamlc4685 How do I join?
@nisamlc4685 3 years ago
In Telegram, search for "Discussion on ML and DL by Krish".
@shaunakchadha4204 3 years ago
This is us, this is Krish sir, and this is our party going on.
@lokeshladdha4520 3 years ago
56:45