How to impute missing data in categorical features (using MICE) 

Selva Prabhakaran (ML+)
31K subscribers
2.1K views

Welcome to the tenth video in the series "Build Your First Machine Learning Project". In this video, we'll see how to impute missing values in categorical features using MICE in Python.
The video explains the approach in depth and walks through the Python code step by step.
Let's get started.
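As a rough illustration of the workflow this video covers, here is a minimal sketch. It assumes scikit-learn's IterativeImputer (a MICE-inspired imputer whose default estimator is BayesianRidge) together with LabelEncoder; the toy DataFrame and its column names are hypothetical and are not the dataset used in the video.

```python
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (required to expose IterativeImputer)
from sklearn.impute import IterativeImputer
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy data, for illustration only.
df = pd.DataFrame({
    "age":    [22, 35, 58, 41, 29, 63, 47, 31],
    "income": [28, 52, 90, 61, 40, 95, 70, 45],
    "gender": ["M", "F", np.nan, "F", "M", np.nan, "M", "F"],
})

# 1. Label-encode only the observed categories so that NaN stays NaN.
encoder = LabelEncoder()
observed = df["gender"].notna()
encoded = df[["age", "income"]].astype(float)
encoded["gender"] = np.nan
encoded.loc[observed, "gender"] = encoder.fit_transform(df.loc[observed, "gender"])

# 2. Impute the missing codes from the other columns.
imputer = IterativeImputer(max_iter=10, random_state=0)
imputed = imputer.fit_transform(encoded)

# 3. The imputer returns continuous values, so round to the nearest
#    valid code before mapping back to the original labels.
codes = np.clip(np.round(imputed[:, 2]), 0, len(encoder.classes_) - 1).astype(int)
df["gender_imputed"] = encoder.inverse_transform(codes)

print(df)
```

Rounding is needed because the regression-based imputer predicts a number, not a class. With only two categories the rounded value is always a valid code.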
Chapters
0:00 - 0:33 Intro
0:34 - 4:52 How to impute categorical data using MICE
4:53 - 8:24 Begin MICE imputation
8:24 - 8:40 Conclusion
To get the most out of this, please watch the series in playlist order. Build Your First ML Model Playlist: • Build Your FIRST Machi...
Check out the complete Machine Learning Plus self-paced online courses here:
edu.machinelearningplus.com/s...
Join the ML+ membership for exclusive data science content
Previous Lesson:
Impute Missing Values using MICE : • Multiple Imputation by...
Earlier Lessons:
1. Build your first ML Project: • Build Your FIRST Machi...
2. How to Formulate ML Problem: • Build Your First ML Pr...
3. Setup Python Environment: • Setup Python Environme...
4. Jupyter Notebook Tutorial: • Jupyter Notebook Tutor...
5. What is ML Modeling: • What is ML Modeling? (...
6. Reduce the size of Pandas Dataframe: • Reduce the memory size...
7. What is EDA: • Exploratory Data Analy...
8. How to impute missing Data: • How to handle missing ...
Let me know in the comments section if you have any questions!
🤝 Like, Share, Subscribe for more!
Follow us on social media for all updates, events and live sessions:
✅ Instagram: / machinelearningplus
✅ LinkedIn: / machine-learning-plus
✅ YouTube: / @machinelearningplus
✅ Twitter: / r_programming
✅ Website: www.machinelearningplus.com/
If you enjoyed this video, be sure to throw it a like and make sure to subscribe to not miss any future videos!
Thanks for watching!
#mlmodeling, #python, #machinelearning, #artificialintelligence, #pandas, #datascience

Science

Published: 13 Jul 2024

Comments: 13
@machinelearningplus 2 months ago
I teach complete ML Mastery Roadmap (self paced courses) to master Data Science from scratch: edu.machinelearningplus.com/s/pages/ds-career-path
@ekpenyongokpo4900 3 months ago
Thank you for the great tutorial
@saurabhsonawane7110 4 months ago
Life saver!
@machinelearningplus 11 months ago
1. It uses the MICE algorithm to impute missing data. Consider checking out the previous video on MICE: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-BjyUbk258o4.html 2. Mostly yes
@sabalunax 5 months ago
You are a genius! Your video helped me a lot! :)
@machinelearningplus 5 months ago
Happy to help :)
@ketanbutte3497 11 months ago
Great video. Some doubts: 1. What's the intuition behind the Bayesian approach? How were the genders assigned for the missing places? 2. Can this be safely used for any categorical data that has missing values? Again, great work.
@user-wu4lc2ix1h 5 months ago
Thank you, great video! But shouldn't you use one-hot encoding instead of label encoding, because it is a nominal categorical variable?
@machinelearningplus 5 months ago
The idea is to convert it to a numeric column, which can be done using LabelEncoder itself. Since there are only 2 categories in gender, it shouldn't really matter if you want to use one-hot encoding instead.
@ssffyy9401 5 months ago
@machinelearningplus Thank you for the clarification. I have a question regarding the use of encoding techniques in machine learning. Typically, we opt for one-hot encoding over label encoding for nominal categorical variables to avoid implying any inherent order or hierarchy to the model during prediction tasks. However, in a scenario where the dataset includes a nominal categorical column with more than two classes and the purpose is not ML prediction but imputation of missing values, would using label encoding to prepare the dataset for the imputer potentially mislead the imputation process?
@machinelearningplus 4 months ago
Thanks for the question. Yes, I would think so, especially since more than 2 categories are involved.
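The concern in this thread can be sketched in code. With more than two nominal categories, label codes such as 0, 1, 2 imply an order that a regression-based imputer will treat as real. One possible workaround, shown here purely as an assumption and not as the video's method, is to one-hot encode the column, blank out the dummy values for rows where the category is missing, impute, and then pick the dummy column with the highest imputed score. The data and column names are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (required to expose IterativeImputer)
from sklearn.impute import IterativeImputer

# Hypothetical toy data with a three-class nominal column.
df = pd.DataFrame({
    "age":    [22, 35, 58, 41, 29, 63, 47, 31, 52, 26],
    "income": [28, 52, 90, 61, 40, 95, 70, 45, 82, 33],
    "city":   ["Pune", "Delhi", np.nan, "Delhi", "Pune",
               "Mumbai", np.nan, "Pune", "Mumbai", "Pune"],
})

missing = df["city"].isna()

# One-hot encode; get_dummies gives all-zero rows for NaN, so mark
# those rows as missing in every dummy column instead.
dummies = pd.get_dummies(df["city"], dtype=float)
dummies.loc[missing, :] = np.nan

features = pd.concat([df[["age", "income"]].astype(float), dummies], axis=1)
imputed = IterativeImputer(max_iter=10, random_state=0).fit_transform(features)

# For each row, choose the category whose dummy column scores highest.
# Observed rows keep their original category because their dummies are exactly 0/1.
scores = imputed[:, 2:]
categories = np.asarray(dummies.columns)
df["city_imputed"] = categories[scores.argmax(axis=1)]

print(df)
```

This avoids the implied ordering but still relies on a regression estimator for a categorical target, so it should be read as a sketch of the idea, not a recommendation.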
@aminvahdati9476 10 months ago
What is the difference between this video's approach and mode imputation? It did the same thing with longer code.
@shankars4384 9 months ago
Simulation studies suggest that mean imputation is possibly the worst missing-data handling method available; this is from research papers. The MICE method is like a one-size-fits-all approach and is much better. Mode imputation messes with bias and variance and screws up the model.
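For comparison, mode imputation as discussed in this thread can be sketched in a few lines of pandas. The series below is hypothetical. The sketch only shows the mechanical effect of filling every gap with the most frequent value; it does not measure model quality.

```python
import numpy as np
import pandas as pd

# Hypothetical column with two missing entries.
gender = pd.Series(["M", "F", np.nan, "F", "M", np.nan, "M", "M"])

# Mode imputation: every missing value gets the most frequent category,
# regardless of what the other features say about that row.
mode_value = gender.mode(dropna=True)[0]
filled = gender.fillna(mode_value)

# Share of the majority class among observed values vs. after filling.
share_before = (gender.dropna() == mode_value).mean()
share_after = (filled == mode_value).mean()

print(mode_value, round(share_before, 3), round(share_after, 3))
```

Here the majority class goes from 4 of 6 observed values to 6 of 8 after filling, which is the kind of distortion the comment refers to. A MICE-style imputer instead predicts each missing value from the other columns.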