One Hot Encoding | Handling Categorical Data | Day 27 | 100 Days of Machine Learning 

CampusX
228K subscribers
100K views
Published: 22 Aug 2024

Comments: 114
@pankajbeldar9799 1 year ago
You deserve billions of subscribers. You are the best teacher in the entire world for me.
@yashwanthyash1382 1 year ago
True ❤❤ he deserves it.
@Eraj_Poetry_Official6 11 months ago
Yes, exactly.
@ajaykushwaha-je6mw 3 years ago
I don't have words to express my appreciation: awesome teaching, awesome content, awesome explanation. Thank you so much for such an informative video.
@Garrick645 4 months ago
Absolutely spoon-fed content, loved it. This is the first time since school that I've had teaching like this.
@Alive-Ness 1 year ago
No doubt he is one of the best teachers if you want to learn ML 🙌
@sarmadali5110 1 year ago
Can someone explain why we don't call fit on the test data and only use transform?
@Alive-Ness 1 year ago
@sarmadali5110 If we call fit on the test data, the model also learns from the test data, so it will overfit to the test set and we won't be able to tell whether the model is good enough on unseen data.
@hritikroshanmishra3630 1 year ago
@Alive-Ness thanks
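The fit-versus-transform point in the reply above can be sketched as follows. This is a minimal illustration on made-up toy data, assuming scikit-learn's OneHotEncoder; the variable names are hypothetical and not taken from the video. Besides the leakage concern, it also shows a second practical reason: refitting on the test set can produce a different number of columns.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Toy data, made up for illustration only
X_train = pd.DataFrame({"fuel": ["Petrol", "Diesel", "CNG", "Petrol"]})
X_test = pd.DataFrame({"fuel": ["Diesel", "Petrol"]})

ohe = OneHotEncoder()
# fit_transform learns the category list from the training data only
X_train_enc = ohe.fit_transform(X_train).toarray()
# transform reuses that learned list, so test columns line up with train columns
X_test_enc = ohe.transform(X_test).toarray()

# Refitting a fresh encoder on the test set sees only two categories,
# so it would yield two columns instead of three
refit_cols = OneHotEncoder().fit_transform(X_test).toarray().shape[1]

print(ohe.categories_)
print(X_test_enc.shape, refit_cols)
```

Calling `.toarray()` on the sparse result avoids depending on the `sparse` / `sparse_output` argument, whose name differs between scikit-learn versions.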
@kumarmanishpradhan 3 months ago
Someone who went deep into the subject and took his students into that depth too. Love you @CampusX
@RandomAIDude 11 days ago
OneHotEncoder(min_frequency=100) will automatically detect infrequent categories and combine them into one. Thanks for your effort ♥
@prakharagarwal9448 3 years ago
Great series, learning so much, and in Hindi too. After machine learning, please bring series on deep learning, NLP, and OpenCV as well.
@zkhan2023 3 years ago
Yes sir, please make videos on deep learning too.
@hasanrants 1 month ago
Thank you, sir, for the valuable content. Completed on 21st July 2024, 4:30 PM.
@sneharj2036 2 years ago
Wow. Amazing video, wonderful explanation. Thank you so much. CampusX is a really good channel.
@tathagatasharma 1 year ago
This channel is a gold mine.
@subhashdixit5167 1 year ago
Amazing, really enjoyed it... Awesome content, sir. I wish I had known about this channel earlier; I could have done wonders. Thanks. This content is much, much better than a paid course.
@11aniketkumar 1 year ago
Gender is nominal data, yet if I treat it like ordinal data I get a single column of 0s and 1s. Now suppose I split the 'gender' column into two columns using OHE; since the two columns are dependent on each other, I drop one. For 'gender' I now have a single column of 0s and 1s, so I have indirectly treated nominal data like ordinal data: the end result is the same in both cases.
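The observation above can be checked directly. This is a minimal sketch on made-up data, assuming scikit-learn's OneHotEncoder and OrdinalEncoder; it only demonstrates the two-category case. With three or more categories the two encodings are no longer the same.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# Made-up binary column
X = pd.DataFrame({"gender": ["Male", "Female", "Female", "Male"]})

# One-hot encoding with the first category dropped leaves a single indicator column
one_hot = OneHotEncoder(drop="first").fit_transform(X).toarray()
# Ordinal encoding maps the sorted categories to 0, 1, ...
ordinal = OrdinalEncoder().fit_transform(X)

same = bool((one_hot == ordinal).all())
print(same)
```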
@kindaeasy9797 8 months ago
In this car selling price dataset, I think the brand, fuel type, and owner columns consist of ordinal categorical data. One might not consider fuel type ordinal, but practically speaking fuel type also affects the selling price of a car: for example, electric cars are comparatively expensive, and there is a similar trend with other fuel types as well, depending on the company.
@Hammadisteachingchemistry 4 months ago
Brother, what a smart point you've made.
@Hammadisteachingchemistry 4 months ago
You should be made prime minister.
@kindaeasy9797 3 months ago
@Hammadisteachingchemistry Hmm, chemistry folks don't usually have that much brains.
@mohankumar-cw5lw 2 years ago
Very simple. Very informative. Very clear.
@MuhammadAsif-hu3id 7 months ago
World's best teacher and channel for ML
@user-sg8ld4lq3k 2 months ago
Your teaching style gave me a very important understanding.
@BlackRock_07 1 year ago
Sir, I have seen a lot of data science videos, but I only understand well from your channel. Thank you very much, sir.
@musiclover-xy8ii 1 year ago
It's so easy to understand, thank you brother 🥰🥰🥰
@saumyashah6622 3 years ago
Sir, one suggestion for day 28: please include OHE on the most frequent categories using scikit-learn. Here you have done it using pandas.
@campusx-official 3 years ago
Can't be done using sklearn
@saumyashah6622 3 years ago
@campusx-official ok. Got it 👍
@arun5351 3 years ago
@campusx-official What if we first replace the categories whose frequency is below some threshold with an 'others' category, commit this change to our DataFrame, and then use OHE from scikit-learn?
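The idea in the last reply can be sketched roughly as below. This is an illustration on made-up data, not the method shown in the video; the threshold value and variable names are hypothetical. In a real project the list of rare categories would presumably be computed on the training split only and then reused for the test split.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Made-up brand column with two rare brands
df = pd.DataFrame({"brand": ["Maruti"] * 5 + ["Hyundai"] * 4 + ["Jeep", "Volvo"]})

threshold = 3  # hypothetical cut-off for "rare"
counts = df["brand"].value_counts()
rare = counts[counts < threshold].index

# Keep frequent brands as they are, replace rare ones with "others"
df["brand"] = df["brand"].where(~df["brand"].isin(rare), "others")

# The reduced column can now go through the regular encoder
encoded = OneHotEncoder().fit_transform(df[["brand"]]).toarray()

remaining = sorted(df["brand"].unique())
print(remaining)
print(encoded.shape)
```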
@mandarmore.9635 1 year ago
You are an amazing teacher, thank you for making this video.
@viral_fight0 12 days ago
Fuel should be ordinal, because p >> d >> ... when it comes to price.
@neerajtadhiyal3152 1 year ago
Not gonna lie, this is the first video I am watching on your channel, and at 4:06 I subscribed.
@math_section 2 months ago
Best, if I had to say it in one word. From Bangladesh.
@shaktikantpatra 9 months ago
Great way of teaching. Keep it up
@debasissahoo7559 10 months ago
I believe you are the best teacher in the world. If I ever get the chance to meet you, I will be truly grateful.
@AnkurSingh-kj9wu 1 year ago
What an explanation!!! Superb..
@ng2530 1 year ago
Best instructor!!
@amarAK47khan 1 year ago
Thanks so much for this, bro. Love from across the border :)
@basavarajangadi2043 1 year ago
Your explanation is very nice and easy to understand. Please keep posting more videos related to data.
@Manishkumar-iw1cy 1 year ago
Thank you for the easy explanation 😃
@MohitSingh-jb9tb 8 months ago
Amazing explanation..
@rishabhkapoor5105 1 year ago
Bro, truly awesome explanation style, great videos!
@rivupangas2735 1 year ago
Sir, I have a doubt: why are we using OHE for the owner column instead of an ordinal encoder?
@Hammadisteachingchemistry 4 months ago
Same question
@sonal008 1 year ago
Only one thing to say: 'Where do you get so much knowledge from?' 😂 I only knew one-hot encoding, but here I learnt so many other approaches apart from it.
@karankantyadav4400 1 year ago
Sir, can you please tell us how to do the same most-frequent-categories process in a pipeline?
@pavangoyal6840 1 year ago
Excellent!!!
@sameer9045 1 year ago
Thanks for saying that this workaround won't be needed, because I got confused there. BTW, I'm on your 100 Days of ML series.
@littlemeow1562 1 year ago
Thanks sir, this video really helped me 💙💙
@yashjain6372 1 year ago
awesome as always
@bikashthapa8622 8 months ago
Thank you so much
@SACHINKUMAR-px8kq 1 year ago
Thank you so much, sir
@alimuiz5328 4 months ago
Amazing video, sir. Just wanted to ask: why did you not do one-hot encoding before splitting the data?
@hanishche 1 year ago
Hi sir, a quick doubt. Let's say I applied OHE to the training data (Jan-April data), did the train/test/validation split, and everything worked well. Then I took another dataset (June-Aug) to predict with the algorithm trained above, but a column does not have the same number of categories, and OHE threw an error saying it expected 30 columns but got only 25. How should I approach such scenarios? One more question, on missing data: suppose that in the same unseen data (June-Aug) one column is entirely NaN, whereas in the training data it had values and we label-encoded it. What should I impute the missing values of the unseen data with? @CampusX
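One common way to avoid the column-count mismatch described above is to fit a single encoder on the training data and reuse it on every later batch, with `handle_unknown="ignore"` so that unseen categories do not raise an error. The sketch below uses made-up city names and assumes scikit-learn's OneHotEncoder; it does not address the all-NaN column part of the question.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Made-up training batch and a later batch with different categories
train = pd.DataFrame({"city": ["Delhi", "Mumbai", "Pune"]})
new = pd.DataFrame({"city": ["Mumbai", "Kolkata"]})  # Kolkata was never seen in training

# Fit once on the training data; unseen categories become all-zero rows
ohe = OneHotEncoder(handle_unknown="ignore").fit(train)
new_enc = ohe.transform(new).toarray()

# The column count always matches what was learned from the training data
print(new_enc.shape)
print(new_enc)
```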
@Michael-yd9zc 11 months ago
Subtitles would be grand!
@prashantmarathe6515 3 years ago
Great explanation!!!!
@sandipansarkar9211 1 year ago
finished watching and coding
@purushottammitra1258 3 years ago
At 22:03 only transform is used with the ohe object. Why isn't ohe.fit_transform used for the new X_test?
@talkswithRishabh 2 years ago
Thanks sir 🙏🙏
@krithwal1997 2 years ago
Awesome explanation bro ❤
@HarshKumar-qy4im 23 days ago
You add brand and km_driven to X_train, while you do not add brand and km_driven in testing.
@PriyankaSingh-gm6we 2 years ago
Thanks for the great content...
@yesminani848 1 month ago
Should I use one-hot encoding for the prediction dataset?
@barshabanik7212 2 years ago
Sir, what about dropping the first column in the case of one-hot encoding with top categories? There can also be a multicollinearity problem, right?
@TusharMishra-bt1fn 4 months ago
Why did he do the train-test split at 18:30, before applying one-hot encoding, and then apply one-hot encoding only to X_train and not X_test? This will be a problem.
@noone0978 3 months ago
Sir, can't we treat owner as ordinal categorical data and use ordinal encoding, since we can rank it: first owner gets first priority, second owner second priority, and third owner last priority?
@jiteshsingh6030 2 years ago
Superb, superb 👌🔥
@zkhan2023 3 years ago
Thanks sir
@prata7143 11 months ago
1:42 According to feminists, male-female encoding shall be done with ordinal encoding with females being 1.
@jorgesisco981 2 years ago
Fun fact: I noticed you were speaking Hindi at minute 14 😂. Just watching what you do was still helpful, thanks! EDIT: you are switching languages, LOL 😂. When I noticed the Hindi, I thought: hmm, weird, I hadn't noticed. Then at minute 18 you switched back to English; it drove me insane for no reason. Good video anyway, keep it up!
@campusx-official 2 years ago
Sorry 😂 We Indians do this quite a bit. We even have a term for it: we call it Hinglish. Hope you don't mind.
@jorgesisco981 2 years ago
@campusx-official No problem at all. Of course I am missing some explanations, but my priority is to see the code. If you don't mind me asking: I have categorical features where some have about 50 unique values and others about 7,000. I can't afford to consider only the 10 or 30 most frequent values, because I need to be able to predict any target based on those unique values. Do you think it's OK to use one-hot encoding for this?
@acharjyaarijit 1 year ago
Sir, is it a good idea to use 'owner' as ordinal data?
@renurenu7629 1 day ago
After using hstack, how can we see our whole dataset?
@jroamindia1754 8 months ago
Pandas doesn't remember, as in? After converting into dummies and storing the result in a variable, we can use it whenever required. Can't we do that?
@MRAgundli 3 months ago
done
@awsstudentacademy8210 1 year ago
If I have a dataset for predicting the energy consumption of bikes and its output is not categorical, how can I convert that to numerical data?
@kanzafatima173 11 months ago
Sir, in previous videos you said that we fit and transform on the training data. So my question is: is the pd.get_dummies method, which is used to apply one-hot encoding, also applied to the training data, or just to the whole DataFrame?
@maddybuddy7013 1 year ago
For the 32 different categories of brands, why are we not using ordinal encoding, with the most frequent brand as 0 and the least frequent as 31?
@11aniketkumar 1 year ago
After performing one-hot encoding on the car brands, we should remove the first column, right?
@vinitpatidar5617 1 year ago
How do we get column names for the encoded FUEL and OWNER columns when using OneHotEncoder(drop='first')? The encoded columns come back as NumPy arrays.
@sarmadali5110 1 year ago
Can someone explain why we don't call fit on the test data and only use transform?
@tejaskamble8731 6 months ago
❤🔥🔥
@AryanGuleria-kj1gt 6 months ago
What is values used for?
@abhisheksharda459 1 year ago
Hello sir, what if my target variable is also a categorical (nominal) feature? Do I need to encode that as well before giving it to the ML model?
@deveshtyagi2996 1 year ago
Sir, do we remove one column in one-hot encoding only for linear algorithms, or for all ML algorithms?
@ashoksuthar 2 years ago
Why are we not using this for the output?
@Star-xk5jp 7 months ago
day2-date:10/1/24
@chessfreak8813 2 years ago
Thanks, bro. Please do one on ROC and AUC.
@waqarjoiya2540 4 months ago
❤❤❤
@mohmmedshahrukh8450 1 year ago
Bro, why didn't you remove the first column in the car-name encoding at the end? This will increase the collinearity, right?
@highflyer30 1 year ago
Sir, I'm not able to download the file from GitHub for the last two videos. Please help.
@saumyamishra1148 2 years ago
How do we bring the output that has been converted to numeric into an Excel sheet?
@saurabhbarasiya4721 3 years ago
TypeError Traceback (most recent call last)
in
----> 1 ohe = OneHotEncoder(drop="first",sparse=False)
      2 ohe.fit_transform(X_train[["fuel","owner"]])
TypeError: __init__() got an unexpected keyword argument 'drop'
Please help me solve this issue.
@Rider-jn6zh 2 years ago
Please share the dataset link; I'm not able to download it from GitHub.
@adarsh_kumar_sharma_8638 1 month ago
Sir, can you please provide me with these notes or the file?
@MuhammadYahyaKhan-r2m 28 days ago
I'm getting True/False instead of 0 and 1.
@user-xr9wm8ft5z 11 months ago
I am getting the output as True/False. Why, sir? How do I get the output as 1 and 0?
@user-xr9wm8ft5z 11 months ago
Please help me, sir 🙏
@sisami2109 1 year ago
Man, I have no idea what he's saying, I'm just stealing the code.
@tanmayshinde7853 2 years ago
In the train-test split, why is 'X' capital and 'y' small?
@pritamdas4441 2 years ago
Convention; you can use any.
@siddhartharaja9413 2 years ago
Because y is a column vector (one column only) and X has multiple columns. X is capitalized and y is lowercase just to signify this.
@rahulpathak8415 5 months ago
Sir, my columns are still coming out as True/False.
@aditya_yadav_01 3 months ago
Pass the dtype=int parameter when calling the function.
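A short sketch of the `dtype=int` suggestion above, assuming the function in question is `pd.get_dummies`. In pandas 2.0 and later, `get_dummies` returns boolean columns by default, which is why several commenters see True/False where the video shows 0/1. The data is made up.

```python
import pandas as pd

# Made-up fuel column
df = pd.DataFrame({"fuel": ["Petrol", "Diesel", "CNG"]})

# Default output: boolean columns on pandas 2.0+, 0/1 integers on older versions
dummies_default = pd.get_dummies(df, columns=["fuel"])

# Asking for integers explicitly gives 0/1 regardless of the pandas version
dummies_int = pd.get_dummies(df, columns=["fuel"], dtype=int)

print(dummies_default.dtypes)
print(dummies_int)
```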
@ajaykushwaha-je6mw 2 years ago
xyz = np.hstack((car[['brand','km_driven']].values, car_train_new))
a = pd.DataFrame(xyz)
Sir, I have a doubt: if the dataset is small, this approach is fine. How can we get the column names and add them to the DataFrame a, so that we can see the transformed data with column names?
@EmohGame 2 years ago
--------------------------------------------------------------------------- TypeError Traceback (most recent call last) in ----> 1 counts[counts
@beyondanalysis8915 2 years ago
I am getting the same error; please let me know if you have fixed this code.
@JustPython 11 months ago
@as8401 1 year ago
You deserve billions of subscribers. You are the best teacher in the entire world for me.