
Scikit-Learn Model Pipeline Tutorial 

Greg Hogg
27K views

Thank you for watching the video!
Learn Python, SQL, & Data Science for free at mlnow.ai/ :)
Subscribe if you enjoyed the video!
Best Courses for Analytics:
---------------------------------------------------------------------------------------------------------
IBM Data Science (Python): bit.ly/3Rn00ZA
Google Analytics (R): bit.ly/3cPikLQ
SQL Basics: bit.ly/3Bd9nFu
Best Courses for Programming:
---------------------------------------------------------------------------------------------------------
Data Science in R: bit.ly/3RhvfFp
Python for Everybody: bit.ly/3ARQ1Ei
Data Structures & Algorithms: bit.ly/3CYR6wR
Best Courses for Machine Learning:
---------------------------------------------------------------------------------------------------------
Math Prerequisites: bit.ly/3ASUtTi
Machine Learning: bit.ly/3d1QATT
Deep Learning: bit.ly/3KPfint
ML Ops: bit.ly/3AWRrxE
Best Courses for Statistics:
---------------------------------------------------------------------------------------------------------
Introduction to Statistics: bit.ly/3QkEgvM
Statistics with Python: bit.ly/3BfwejF
Statistics with R: bit.ly/3QkicBJ
Best Courses for Big Data:
---------------------------------------------------------------------------------------------------------
Google Cloud Data Engineering: bit.ly/3RjHJw6
AWS Data Science: bit.ly/3TKnoBS
Big Data Specialization: bit.ly/3ANqSut
More Courses:
---------------------------------------------------------------------------------------------------------
Tableau: bit.ly/3q966AN
Excel: bit.ly/3RBxind
Computer Vision: bit.ly/3esxVS5
Natural Language Processing: bit.ly/3edXAgW
IBM Dev Ops: bit.ly/3RlVKt2
IBM Full Stack Cloud: bit.ly/3x0pOm6
Object Oriented Programming (Java): bit.ly/3Bfjn0K
TensorFlow Advanced Techniques: bit.ly/3BePQV2
TensorFlow Data and Deployment: bit.ly/3BbC5Xb
Generative Adversarial Networks / GANs (PyTorch): bit.ly/3RHQiRj

Published: Oct 5, 2024

Comments: 50
@GregHogg 1 year ago
Take my courses at mlnow.ai/!
@TheCsePower 1 year ago
Thanks Greg. This made me realise how non-standard my code is. I learnt:
- Use copy or deepcopy and not assignment.
- Always perform preprocessing on the train and test separately.
- sklearn pipelines have nothing to do with ETL pipelines from Data Engineering.
- sklearn transformers have nothing to do with NLP Transformers.
- sklearn estimators have nothing to do with Statistics estimators.
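A minimal sketch of the first takeaway above, using plain Python containers as a stand-in for the tutorial's data (the variable names and values are hypothetical). For pandas, the analogous distinction is that `df2 = df` binds a second name to the same DataFrame, while `df.copy()` returns a copy.

```python
import copy

# Hypothetical rows standing in for a training table.
train_rows = [{"rooms": 3, "price": 250.0}, {"rooms": 5, "price": 410.0}]

alias = train_rows                 # assignment: a second name for the SAME list
shallow = copy.copy(train_rows)    # new outer list, but the inner dicts are shared
deep = copy.deepcopy(train_rows)   # fully independent copy

alias.append({"rooms": 2, "price": 180.0})  # also visible through train_rows
shallow[0]["price"] = 0.0                    # mutates the dict shared with train_rows
deep[1]["price"] = -1.0                      # leaves train_rows untouched
```

After these three mutations, only the change made through `deep` leaves the original data unaffected.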
@GregHogg 1 year ago
Super glad you got some useful pointers!!
@crepantherx 2 years ago
Keep posting, Greg. I am a Data Analyst by profession, and your videos certainly help a lot.
@GregHogg 2 years ago
That's awesome! Thank you 😄
@hansenmarc 2 years ago
Great stuff! I’m curious why you used FunctionTransformer instead of ColumnTransformer, which could run the two scalers in parallel? Also, since FunctionTransformer is stateless, the documentation says that fit just checks the input rather than actually fitting the scaling parameters. Doesn’t that lead to data leakage since applying transform to test data won’t use parameters learned from fitting on the training data?
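The ColumnTransformer approach raised above can be sketched as follows. This is not the video's code: the data is synthetic and the column indices are hypothetical. It only illustrates that scalers placed inside a Pipeline learn their parameters during `fit` on the training data and reuse them when the test data is transformed.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Synthetic data (an assumption; the tutorial's dataset is not reproduced here).
rng = np.random.default_rng(0)
X_train = rng.normal(loc=5.0, scale=2.0, size=(100, 3))
y_train = X_train @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=100)
X_test = rng.normal(loc=5.0, scale=2.0, size=(20, 3))

# Hypothetical column assignment: standardize columns 0 and 1, min-max scale column 2.
preprocess = ColumnTransformer(
    transformers=[
        ("standard", StandardScaler(), [0, 1]),
        ("minmax", MinMaxScaler(), [2]),
    ]
)
model = Pipeline(steps=[("preprocess", preprocess), ("regressor", LinearRegression())])

model.fit(X_train, y_train)               # scaling parameters are learned here only
test_predictions = model.predict(X_test)  # test data is scaled with the training parameters

# The fitted scaler holds statistics computed from the training columns.
fitted_standard = model.named_steps["preprocess"].named_transformers_["standard"]
train_means = X_train[:, [0, 1]].mean(axis=0)
```

Comparing `fitted_standard.mean_` with `train_means` confirms that the stored means come from the training set, which is what prevents test-set statistics from leaking into preprocessing.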
@kyleGrealis 2 months ago
Thanks, Greg. Really good explanation and structured example. This makes it easy to create a template for reuse!
@AmitabhSuman 1 year ago
A very practical video on Pipelines. Thank you for this video!
@GregHogg 1 year ago
Awesome, that's great to hear. You're very welcome ☺️☺️
@JJGhostHunters 2 years ago
Great tutorial! I use the MinMaxScaler with the option to scale from -1 to 1 instead of 0 to 1 when I am dealing with values that can be positive and negative. Seems to be fine, but I may need to reconsider going forward. I have never noticed any issues though.
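The option described above corresponds to the `feature_range` parameter of `MinMaxScaler`. A small sketch with made-up values follows. One thing it shows is that the raw value 0 does not generally map to 0 after scaling, so the sign of a value is not necessarily preserved.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Made-up single feature with both negative and positive values.
X_train = np.array([[-4.0], [0.0], [2.0], [6.0]])
X_test = np.array([[1.0], [8.0]])

scaler = MinMaxScaler(feature_range=(-1, 1))
X_train_scaled = scaler.fit_transform(X_train)  # training min -> -1, training max -> 1
X_test_scaled = scaler.transform(X_test)        # reuses the training min and max

# X_train_scaled is [[-1.0], [-0.2], [0.2], [1.0]]: the raw 0.0 becomes -0.2.
# X_test_scaled is [[0.0], [1.4]]: 8.0 exceeds the training max, so it lands outside [-1, 1].
```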
@JJGhostHunters 2 years ago
I would love to see a tutorial that covers using pipelines with multilayer perceptron models (MLPs), CNNs and LSTMs.
@alexrook5604 1 year ago
I understand what you are doing here, but I have two questions that I think would be helpful and would make it easier to follow along and replicate your steps. 1) Where did you get the data? I can't find the california_housing dataset already in train/test form. 2) Why not use scikit-learn tooling rather than doing it yourself? You could have used train/test split or pipelines (or column transformer... or similar stuff). That just has me confused.
@rahiiqbal1294 8 months ago
This was very helpful, thank you :)
@Nadia-db6nb 1 year ago
Thanks for the great tutorial. Can you make a video on how to combine multiple feature selection methods and feature extraction using Python?
@lythien390 2 years ago
Thank you Greg! It's a great video!
@GregHogg 2 years ago
Glad to hear it!
@brandonn8166 1 year ago
Just out of curiosity, is there a reason you don't use train_test_split to get X and y values?
@NikitaShilyaev 10 months ago
Yes, why does he use X_train for train_predictions instead of a separate dataset such as X_valid?
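The two questions above can be illustrated with a short sketch: split one table with `train_test_split`, fit on the training part, and evaluate on a held-out validation part as well as on the training part. The data here is synthetic and the name `X_valid` follows the comment; none of this is taken from the video's notebook.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic regression data (an assumption, used in place of the housing data).
rng = np.random.default_rng(42)
X = rng.normal(size=(300, 4))
y = X @ np.array([3.0, -1.0, 0.0, 2.0]) + rng.normal(scale=0.5, size=300)

# Hold out 20% of the rows for validation.
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = make_pipeline(StandardScaler(), LinearRegression())
model.fit(X_train, y_train)

# Training error shows how well the model fits data it has seen;
# validation error estimates how it behaves on unseen data.
train_mse = mean_squared_error(y_train, model.predict(X_train))
valid_mse = mean_squared_error(y_valid, model.predict(X_valid))
```

A large gap between `train_mse` and `valid_mse` would suggest overfitting, which predictions on `X_train` alone cannot reveal.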
@marcofogale9719 7 months ago
Perfect explanation. Thanks a lot
@GregHogg 7 months ago
Very welcome 😁
@nabanitadasgupta 10 months ago
Thank you for the video!
@ilanyutsis9653 2 months ago
When you do the StandardScaler().fit on the dataframe, what is the meaning of this operation? What is happening?
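As a hedged answer to the question above: `fit` computes and stores one mean and one standard deviation per column, and `transform` then subtracts the stored mean and divides by the stored standard deviation. The sketch below uses a small made-up NumPy array in place of the DataFrame.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up data: two columns on very different scales.
X_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 60.0]])

scaler = StandardScaler().fit(X_train)  # only computes and stores statistics
column_means = scaler.mean_             # per-column mean
column_stds = scaler.scale_             # per-column standard deviation (population form, ddof=0)

# transform applies (x - mean) / std using the stored statistics.
scaled = scaler.transform(X_train)
manual = (X_train - X_train.mean(axis=0)) / X_train.std(axis=0)
```

Here `scaled` and `manual` agree, and each scaled column has mean 0 and standard deviation 1.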
@TheFrankyguitar 10 months ago
Thanks for this amazing video! Would that work also with a statsmodels model?
@GregHogg 10 months ago
Thanks so much!! And I'm not sure, haven't tried :)
@00SeijiHan00 10 months ago
TYSM bro really appreciate this
@GregHogg 10 months ago
Very welcome!!
@allanmachado2011 6 months ago
Thank you!
@talyb7383 2 years ago
Thanks for the great tutorial! What do I need to change to create a pipeline for an image classification model, like the cifar10 model?
@GregHogg 2 years ago
Well, everything. You probably won't be using scikit for that. And you're very welcome!
@talyb7383 2 years ago
@GregHogg I didn't explain myself clearly... I want to create a pipeline that takes a trained cifar10 model and also preprocesses the data set. So I can't use your approach?
@junaidlatif2881 1 year ago
How do I transform the y variable and then fit the model? And afterwards, how do I reverse the transform for the scatter plot?
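One possible approach to the question above is scikit-learn's `TransformedTargetRegressor`, which transforms y before fitting and applies the inverse transform to predictions. The sketch below assumes a log transform and synthetic data; whether a log transform suits a particular target is a separate modelling decision.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic positive target whose logarithm is exactly linear in X (an assumption).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, 2))
y = np.exp(1.0 + 2.0 * X[:, 0] - X[:, 1])

model = TransformedTargetRegressor(
    regressor=make_pipeline(StandardScaler(), LinearRegression()),
    func=np.log,          # applied to y before fitting
    inverse_func=np.exp,  # applied to predictions before they are returned
)
model.fit(X, y)

# Predictions come back on the original scale of y, ready for a scatter plot against y.
predictions = model.predict(X)
```

Because the inverse transform is applied inside `predict`, `predictions` and `y` can be plotted against each other directly without a manual reverse step.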
@tareq8109 2 years ago
Bro, can you show how to make a YouTube (or any video) downloader in Python?
@krzysztofzaucha3592 6 months ago
nice video Greg
@GregHogg 6 months ago
Thanks so much!!
@adriandiaz5688 1 year ago
Great Video!
@GregHogg 1 year ago
Thank you Adrian!
@juampaaa90 1 year ago
awesome ty
@fabio336ful 2 years ago
Did you say pipelines doesn't function for classification problems? Min: 1:07
@GregHogg 2 years ago
Does, not doesn't
@fabio336ful 2 years ago
@GregHogg thanks 🙏🏼
@Supernyv 11 months ago
Awesome!
@GregHogg 11 months ago
Thank you!
@m18293 1 year ago
Can you share this notebook?
@GregHogg 1 year ago
dang i think i lost it, sorry
@AceOnBase1 8 months ago
Bro you literally just copied this out of a textbook lmao but I respect the grind.
@MrAhsan99 2 years ago
you are ❤
@GregHogg 2 years ago
❤️
@johnspivack 9 months ago
Too confusing. Too many tangents, doesn't cover the main idea clearly. Downvoted.
@GregHogg 9 months ago
Well I upvoted it to counter you
@n8trh 3 days ago
What tangents? This video was not only to the point from the start, but it also went into depth with useful examples. If you thought those were tangents, I recommend watching again, maybe with more care this time.