Build Your First Machine Learning Project [Full Beginner Walkthrough] 

Dataquest

We'll learn how to build an end-to-end machine learning project. We'll cover the main steps in building a machine learning project, then walk you through writing the Python code to create the project.
In the project, we'll try to predict how many medals each country will win in the Olympics using a linear regression model.
At the end, you'll have a full machine learning project that you can continue working on.
You can find the README and code here: github.com/dataquestio/projec...
Chapters
00:00 Introduction
00:40 7-step project process
10:15 Loading the data
12:10 Data exploration
18:05 Building our model
22:30 Measuring error
26:30 Is the model good?
34:20 Wrap-up and next steps
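
The workflow outlined in the description (load data, split it, fit a linear regression, measure error) can be sketched roughly as below. This is a minimal illustration, not the video's code: the data is synthetic, and the column names `athletes` and `prev_medals`, the 2012 split year, and the use of mean absolute error are assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

# Synthetic stand-in data (NOT real Olympic figures); column names are assumed.
rng = np.random.default_rng(0)
years = np.repeat([2004, 2008, 2012, 2016], 25)
athletes = rng.integers(1, 400, size=years.size)
prev_medals = rng.integers(0, 60, size=years.size)
noise = rng.normal(0, 2, size=years.size)
medals = np.round(0.05 * athletes + 0.7 * prev_medals + noise).clip(min=0)
teams = pd.DataFrame({
    "year": years,
    "athletes": athletes,
    "prev_medals": prev_medals,
    "medals": medals,
})

predictors = ["athletes", "prev_medals"]
target = "medals"

# Train on earlier Games and test on later ones (split year is an assumption).
train = teams[teams["year"] < 2012].copy()
test = teams[teams["year"] >= 2012].copy()

reg = LinearRegression()
reg.fit(train[predictors], train[target])

# Medal counts cannot be negative or fractional, so clip and round.
test["predictions"] = reg.predict(test[predictors])
test["predictions"] = test["predictions"].clip(lower=0).round()

mae = mean_absolute_error(test[target], test["predictions"])
print(f"Mean absolute error: {mae:.2f}")
```

Splitting by year rather than at random keeps later Games out of the training data, which is closer to how the model would be used in practice.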
---------------------------------
Join 1M+ Dataquest learners today!
Master data skills and change your life.
Sign up for free: bit.ly/3O8MDef

Published: Jul 10, 2024

Comments: 91
@Fakipo 1 year ago
I was just studying the concepts for so long and getting overwhelmed, this video definitely helped to get the bigger picture.
@kumelachewmaru2225 1 year ago
love the simplicity of your step by step method. I am absorbing a lot in just one pass. Thank you and well done.
@michaelmitchell155 9 months ago
A very comprehensive and well explained intro into the workings of the project. I got a lot out of it. Thank you.
@allahjoseph 8 months ago
Thank you for providing such a great resource and making ML so digestible! YOU are who introduced me to machine learning, and I love it. I'm looking forward to applying everything I learn to my own projects!!!
@hiteshallakki1740 2 years ago
Great video. Really liked the way you explained it first instead of diving straight into the code. Thanks.
@Prathmesh_salve 3 months ago
First person I've seen who explains things just perfectly, in a way a student can understand. Thanks, and keep it up, sir.
@sm-pz8er 4 months ago
Perfect. Best video I've found so far: valuable and easy to understand. Thank you.
@maxivy 1 year ago
You are a very good teacher and deserve more subs.
@bumohamed624 1 year ago
Thanks a lot, it helps to understand ML with basic steps.
@DEDE-ix9lg 10 months ago
Amazing. This was simple and great. Very, very, very well done!
@josearmandovivero408 1 year ago
Thanks! This video is exactly what I needed 😀
@rosemaryonondje7953 2 years ago
A great video! This answered some of my questions. Thanks
@Dataquestio 2 years ago
Glad it was helpful!
@chessconfused6528 1 year ago
Your courses are awesome!!!
@sushantshankar8477 3 months ago
Loved it! superb explanation 😍
@pluderr3947 4 months ago
At 12:14, teams.corr()["medals"] didn't work for me, so I ran corr = teams.drop(["team", "country"], axis=1).corr()["medals"] and then print(corr). Posting this for anyone running into the same issue :)
@kayo5011 3 months ago
It worked thanks
@user-kj6vz1qo4h 2 months ago
Or you can use teams.corr(numeric_only=True)["medals"]
@rodo2220 2 months ago
@@user-kj6vz1qo4h thank you!
@JoshuaStorm-zi1wy 1 month ago
@@user-kj6vz1qo4h Thanks!
@kavinesh4470 1 month ago
Thanks🙏🙇
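
The two workarounds suggested in this thread can be compared side by side. The sketch below uses a made-up four-row table (the values are toy numbers, not real results) and assumes a pandas version that supports the `numeric_only` argument of `DataFrame.corr`. In recent pandas releases, calling `corr()` on a frame that still contains text columns raises the "could not convert string to float" error reported above.

```python
import pandas as pd

# Toy data with two text columns, mirroring the columns named in the thread.
teams = pd.DataFrame({
    "team": ["AFG", "ALB", "USA", "FRA"],
    "country": ["Afghanistan", "Albania", "United States", "France"],
    "athletes": [5, 7, 650, 400],   # made-up values
    "medals": [0, 0, 110, 40],      # made-up values
})

# Option 1: ask pandas to consider numeric columns only.
corr_numeric = teams.corr(numeric_only=True)["medals"]

# Option 2: drop the text columns explicitly before correlating.
corr_dropped = teams.drop(columns=["team", "country"]).corr()["medals"]

print(corr_numeric)
```

Both options give the same correlations here. Option 1 is shorter, while Option 2 makes it explicit which columns are excluded.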
@user-wp6lj6xl5z 11 months ago
excellent video sir ji.... thanks a lot for such concepts... and your English fluency is amazing Indian
@scarlettran- 8 months ago
thank you so much!!! you are a really good teacher
@viewpoint8976 1 day ago
This video really shows how things are done.
@TT-oy8bq 3 months ago
Incredible Teaching !
@IsoAktiv 11 months ago
Albania was in the Olympics in 1992, and I guess the other countries in that CSV were too. They just did not win any medals; that's why there are missing values. So setting them to zero instead of dropping them is actually more accurate. In theory you would prefer first- or second-party data; in this case you would have to do some research to clarify the reason for the missing values in the data set.
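
The difference between dropping rows and filling missing values with zero can be sketched as follows. The column name `prev_medals` and all values are assumptions for illustration; whether zero is the right fill depends on why the value is missing, which is the point raised in the comment above.

```python
import numpy as np
import pandas as pd

# Toy table with one missing value (column name and numbers are hypothetical).
teams = pd.DataFrame({
    "team": ["ALB", "USA", "FRA"],
    "year": [1992, 1992, 1992],
    "prev_medals": [np.nan, 10.0, 3.0],
})

# Count the missing values per column first.
print(teams.isna().sum())

# Option A: drop every row that has a missing value (loses the ALB row).
dropped = teams.dropna()

# Option B: keep the row and treat the missing count as zero.
filled = teams.fillna({"prev_medals": 0})

print(len(dropped), len(filled))
```

Option B keeps more training rows, but it is only appropriate if a missing value really means "no medals" rather than "did not take part".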
@ranahuzaifa147 10 months ago
Thank you for the video.
@sabuein 1 year ago
Thank you.
@willII0522 1 year ago
I like how you showed to use the later data to test the model, but do you have a video that shows how to use the data to predict the future Olympics?
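
One possible way to approach the question above is to refit on all available Games and then build an input row per team for the next Games. The sketch below is an assumption-heavy illustration, not the video's method: the team codes, numbers, and column names are made up, and it assumes the next squad size equals the most recent one.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Made-up history for two hypothetical teams.
teams = pd.DataFrame({
    "team": ["AAA", "AAA", "AAA", "BBB", "BBB", "BBB"],
    "year": [2008, 2012, 2016, 2008, 2012, 2016],
    "athletes": [100, 110, 120, 20, 25, 22],
    "prev_medals": [10, 12, 15, 1, 0, 2],
    "medals": [12, 15, 16, 0, 2, 1],
})
predictors = ["athletes", "prev_medals"]

# Fit on everything, since there is no later data to hold out.
reg = LinearRegression().fit(teams[predictors], teams["medals"])

# Take each team's most recent row and roll it forward by four years.
latest = teams.sort_values("year").groupby("team").tail(1)
future = pd.DataFrame({
    "team": latest["team"].to_numpy(),
    "year": latest["year"].to_numpy() + 4,
    "athletes": latest["athletes"].to_numpy(),   # assumption: same squad size
    "prev_medals": latest["medals"].to_numpy(),  # latest result becomes "previous"
})

future["predictions"] = reg.predict(future[predictors]).clip(min=0).round()
print(future)
```

Because no actual results exist yet for the future rows, the error of these predictions cannot be measured; the held-out test from earlier Games is the only guide to how far off they might be.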
@luqmanjuzaili5213 1 month ago
Thank you for the amazing video! However, when I tried running teams.corr()["medals"], I got a ValueError. This seems to be because the "team" and "country" columns contain strings, which makes it impossible to compute a correlation for them. So I removed them just to obtain the corr values. But it seems to work for you without filtering out the string columns. Any ideas why?
@tajinjahan7446 1 year ago
hi... how to sort the excel data into integer values?
@crispineda4630 1 year ago
Is there a reason why the Plots disappear after running the code a second time on Jupyter notebook? They don't show anything anymore.
@hanazhafirahhanifah8175 1 year ago
Thank you for such a nice video! I have a question, though, about the error_ratio. You said countries like FRA, CAN, and RUS get a lot of medals in the Olympics, and it was shown that their error ratio is low. What should I compare the value of error_ratio with?
@1622roma 1 year ago
what a good question! I hope he responds back to you.
@yayasssamminna 1 year ago
why do you want to compare it?
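
One way to give an error ratio some context is to compare each team's ratio with the other teams' ratios and with an overall figure. The sketch below assumes the ratio is defined as a team's mean absolute error divided by its mean medal count; that definition, the `predictions` column name, and all numbers are assumptions for illustration.

```python
import pandas as pd

# Toy test set with predictions already attached (all values are made up).
test = pd.DataFrame({
    "team": ["FRA", "FRA", "CAN", "CAN", "ALB", "ALB"],
    "medals": [40, 44, 20, 24, 1, 0],
    "predictions": [38, 45, 22, 21, 0, 2],
})

# Absolute error per row, then averaged per team.
errors = (test["medals"] - test["predictions"]).abs()
error_by_team = errors.groupby(test["team"]).mean()
medals_by_team = test["medals"].groupby(test["team"]).mean()

# Error relative to how many medals the team typically wins.
error_ratio = error_by_team / medals_by_team

# Two possible reference points: the overall ratio and the median team ratio.
overall_ratio = errors.mean() / test["medals"].mean()
print(error_ratio.sort_values())
print("overall:", overall_ratio, "median:", error_ratio.median())
```

In this toy data the teams with many medals have small ratios, while the team with very few medals has a large one even though its absolute error is similar. This is why a ratio is more informative when read against the other teams than on its own.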
@rayr268 4 months ago
Would love a math course that relates directly to ML and that I can take to get up to speed, for someone who is self-taught in tech with only a high school education.
@user-cb6dm1qd4v 8 months ago
Great video! What coding software did you end up using for this (I haven't seen this python software before which is why I ask)?
@enyelgomezmoya9682 7 months ago
That's Jupyter
@muradbayr9900 22 days ago
Please help me, guys: I got stuck on the first step. I cannot import the CSV; it's some kind of problem with pandas.
@ShivendraParmar-dp3rp 14 days ago
Guys, why is the size of the table wrong after test['predictions'] = predictions? Instead of 405x8 it comes out as 405x413. Can anyone help me out with this?
@raanonyms7926 6 months ago
My like turned this to 2K 😊
@user-lw8zw5lq8l 9 months ago
Sir, that was really simple, very well explained, and excellently organised. Yet I struggled at one point: I couldn't convert string (team) to float while computing the correlation. If you see this, I hope you reply.
@SiddheshRajale 8 months ago
Did you find the solution?
@darrentan271 4 months ago
@@SiddheshRajale df.corr(numeric_only=True)
@nagrotte 7 months ago
best
@deepakkumaracid4529 4 months ago
Where can I get the data?
@haythamroshdy4189 1 year ago
I love your English Your English is so perfect as indian
@raja.57 11 months ago
It would be good practice to use x_train, y_train, x_test, y_test instead of predictors and target, and also to use x and y for the independent and dependent variables instead of test, and so on.
@baeche 8 months ago
Great video. What python interpreter are you using?
@eduardtoronto 8 months ago
Maybe you meant IDE (integrated development environment)? Python only has one interpreter; it's built in, and it compiles/interprets the code. I'm pretty sure the IDE he is using in the video is Project Jupyter (an interactive development environment), which is pretty much a standard environment in machine learning, data analytics, statistical analysis, etc.
@baeche 8 months ago
@@eduardtoronto Sorry, of course I meant IDE. What differs from mine (PyCharm) is that the code gets executed immediately and the results are shown. I have to use the print command for that. Or is the video just edited?
@eduardtoronto 8 months ago
@@baeche In Jupyter, ENTER inserts a new line and SHIFT+ENTER executes the code. Everything gets executed immediately. Whether you see output depends on what he is calling: e.g. copy() gets executed but won't print any output, whereas something like 'shape' will output the result to the console, like print.
@baeche 7 months ago
@@eduardtoronto Thank you very much. I moved to Google Colab, where the result of the last command gets printed too. I find Google Colab handy as I can work in the browser. Where I run into problems is accessing a SQL Server (not SQLite or MySQL). Any idea where I can look for help? ChatGPT could not help.
@prashanthbabu1397 11 months ago
Hi, I really loved your video. I was trying to follow along, but got an error and can't move forward. I would love it if you could help me fix it. I got an error for predictions = reg.predict(test[predictors]). It kept saying: ValueError: The feature names should match those that were passed during fit. Feature names unseen at fit time: - age - country - medals - team - year. What do I do?
@moyinoluwaanoma 8 months ago
Hello, Did you get this resolved yet? Having the same issue now.
@bumohamed624 1 year ago
It gives an error when I run the correlation step, complaining about the data type of team. How can I handle this?
@Mynamegeoph 1 year ago
I have this too. Were you able to fix it?
@lalithsai5392 1 year ago
@@Mynamegeoph Use this: teams[teams.columns[2:]].corr()["medals"]
@noisysod7330 11 months ago
@@lalithsai5392 Thanks lalithsai5392, would have been stuck without you!
@gmfPimp 8 months ago
I bet this is an issue with doing it locally and not using a Jupyter Notebook, because I had this problem as well. The best way around this is: teams.corr(numeric_only=True)["medals"]. That will only generate values for numeric fields.
@allahjoseph 8 months ago
@@lalithsai5392 Code community!!
@paaviethranjayabalan6735 8 months ago
Why seaborn and not matplotlib?
@bhu0091 5 months ago
You can use whatever you like, it's all about experimenting ;)
@ShortLessonsDaily 2 months ago
Do it in Telugu, bro.
@praskatti 7 months ago
Great video. Thanks for sharing your knowledge and expertise. I ran into an issue in the corr() step: teams.corr()["medals"] raises ValueError: could not convert string to float: 'AFG'. Maybe I can remove this column before calling corr().
@h4ytham268 5 months ago
I had the same issue. What did you do to solve it?
@hongyangtan9897 5 months ago
This code works: teams.drop(["country", "team"], axis=1).corr()["medals"]
@abhijeet800 4 months ago
Add numeric_only=True to the call: corr(numeric_only=True)["medals"]
@user-ew4jp1fk3p 3 months ago
@@hongyangtan9897 Thank you!
@daudisraf5564 3 months ago
There seems to be a problem when I run teams.corr()["medals"]. It keeps throwing the error "ValueError: could not convert string to float: 'AFG'". I checked the unique values and NaNs. Confused!
@liewkangzhen157 1 month ago
I faced the same problem as well, but managed to solve it. The error is due to some columns in teams being non-numerical, like team and country, so I created a new table, i.e. teams = teams.drop(columns=['team', 'country']), and it should work. Hope this helps.
@vanshikatripathi2579 4 months ago
teams = pd.read_csv("teams.csv") This line is giving me a huge error. How do I correct it, or what did I do wrong?
@user-ew4jp1fk3p 3 months ago
Go to the file you downloaded, copy its path, and put that in place of teams.csv.
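
A common reason for `pd.read_csv("teams.csv")` to fail is that the file is not in the directory the notebook or script is running from, which fits the advice above. The sketch below wraps the read in a small helper that reports the working directory when the path is wrong. The helper name `load_csv` is hypothetical, and the demo writes a throwaway two-row file rather than using the project's real data.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_csv(path):
    """Read a CSV, raising a more descriptive error if the path is wrong."""
    path = Path(path)
    if not path.exists():
        raise FileNotFoundError(
            f"{path} not found; current working directory is {Path.cwd()}"
        )
    return pd.read_csv(path)


# Demo with a temporary file standing in for the downloaded teams.csv.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "teams.csv"
    demo_path.write_text("team,medals\nAFG,0\nALB,0\n")
    teams = load_csv(demo_path)
    print(teams.shape)
```

In practice, either move the downloaded file next to the notebook or pass its full path to `pd.read_csv`.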
@user-jg1bk9sd4r 5 months ago
I like how you showed to use the later data to test the model, but do you have a video that shows how to use the data to predict the future Olympics?