
Let’s Write a Decision Tree Classifier from Scratch - Machine Learning Recipes #8 

Google for Developers
2.4M subscribers
534K views

Hey everyone! Glad to be back! Decision Tree classifiers are intuitive, interpretable, and one of my favorite supervised learning algorithms. In this episode, I’ll walk you through writing a Decision Tree classifier from scratch, in pure Python. I’ll introduce concepts including Decision Tree Learning, Gini Impurity, and Information Gain. Then, we’ll code it all up. Understanding how to accomplish this was helpful to me when I studied Machine Learning for the first time, and I hope it will prove useful to you as well.
You can find the code from this video here:
goo.gl/UdZoNr
goo.gl/ZpWYzt
Books!
Hands-On Machine Learning with Scikit-Learn and TensorFlow goo.gl/kM0anQ
Follow Josh on Twitter: / random_forests
Check out more Machine Learning Recipes here: goo.gl/KewA03
Subscribe to the Google Developers channel: goo.gl/mQyv5L

Science

Published: 30 Jul 2024

Comments: 232
@dabmab2624 4 years ago
Why can't all professors explain things like this? My professor: "Here is the idea for decision tree, now code it"
@exoticme4760 4 years ago
agreed!
@pauls60r 4 years ago
I realized years after graduation that many professors either have received no training in teaching or have little interest in it, undergrads in particular. I can't say I've learned more on YouTube than I did in college, but I have a whole lot of "OOOOOH, that's what my professor was talking about!" moments when watching videos like this. This stuff would've altered my life 20 years ago.
@carol8099 3 years ago
Same! I really wish they could dig more into the coding part, but they either don't cover it or don't teach coding well.
@avijitmandal9124 3 years ago
Hey, can someone share a link on how to do pruning?
@Skyfox94 3 years ago
Whilst I definitely agree, I have to say that, in order to understand algorithms like this one, you'll have to just work through them. No matter how many interesting and well thought out videos you watch, it'll always be most effective if you afterwards try and build it yourself. The fact that you're watching this in your free time shows that you are interested in the topic. That's also worth a lot. Sometimes you'll only be able to appreciate what professors taught you, after you get out of college/uni and realize how useful it would have been.
@nbamj88 6 years ago
In nearly 10 minutes, he explained the topic extremely well. Amazing job.
@learnsharegrowwithgh2181 4 years ago
right
@betaga2261 5 months ago
Because he knows how to write an explanation tree
@cbrtdgh4210 5 years ago
This is the best single resource on decision trees that I've found, and it's a topic that isn't covered enough considering that random forests are a very powerful and easy tool to implement. If only they released more tutorials!
@donking6996 3 years ago
I am crying tears of joy! How can you articulate such complex topics so clearly!
@FacadeMan 6 years ago
Thanks a lot, Josh. To a very basic beginner, every sentence you say is a gem. It took me half hour to get the full meaning of the first 4 mins of the video, as I was taking notes and repeating it to myself to grasp everything that was being said. The reason I wanted to showcase my slow pace is to say how important and understandable I felt in regard to every sentence. And, it wasn't boring at all. Great job, and please, keep em coming.
@Leon-pn6rb 4 years ago
I'm curious, how did your career pan out? Still in ml?
@learnsharegrowwithgh2181 4 years ago
you are right he is
@georgevjose 6 years ago
Finally, after a year! Please continue this course.
@WilloftheWinds 6 years ago
Welcome back Josh, thought we would never get another awesome tutorial, thanks for your good work.
@anupam1 6 years ago
Thanks, was really looking for this series...nice to see you back
@TomHarrisonJr 5 years ago
One of the clearest and most accessible presentations I have seen. Well done! (and thanks!)
@BlueyMcPhluey 6 years ago
loving this series, glad it's back
@shreyanshvalentino 6 years ago
a year later, finally!
@sundayagu5755 4 years ago
As a beginner, this work has given me hope to pursue a career in ML. I had read and understood the concepts of decision trees, but the code was a mountain, and now it has been leveled. Josh, thank you my brother, and may God continue to increase you 🙏.
@mindset873 3 years ago
I've never seen any other channels like this. So deep and perfect.
@falmanna 6 years ago
Please keeps this series going. It's awesome!
@riadhsaid3548 5 years ago
Even though it took me more than 30 minutes to complete and understand the video, I can't tell you how amazing this explanation is! This is how we calculate the impurity: G = Σ P(i) * (1 - P(i)) for i ∈ {Apple, Grape, Lemon} = 2/5*(1 - 2/5) + 2/5*(1 - 2/5) + 1/5*(1 - 1/5) = 0.4*0.6 + 0.4*0.6 + 0.2*0.8 = 0.24 + 0.24 + 0.16 = 0.64
@senyaisavnina 4 years ago
or 1 - (2/5)^2 - (2/5)^2 - (1/5)^2
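Both versions in this thread give the same number; a quick runnable sketch (the rows mirror the toy fruit dataset from the video, but the variable names are mine):

```python
from collections import Counter

def gini(rows):
    """Gini impurity: the chance of mislabeling a random row if labels
    were assigned at random according to class proportions (1 - sum p^2)."""
    counts = Counter(row[-1] for row in rows)
    total = len(rows)
    return 1.0 - sum((n / total) ** 2 for n in counts.values())

# The toy fruit set from the video: 2 apples, 2 grapes, 1 lemon.
fruits = [["Green", 3, "Apple"], ["Yellow", 3, "Apple"],
          ["Red", 1, "Grape"], ["Red", 1, "Grape"],
          ["Yellow", 3, "Lemon"]]
print(gini(fruits))  # 0.64, matching the hand calculation above
```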
@vardhanshah8843 3 years ago
Thank you very much for this explanation. I came to the comment section to ask this question, but you answered it very nicely.
@gautamgadipudi8213 4 years ago
Thank you Josh! This is my first encounter with machine learning and you made it very interesting.
@debanjandhar6395 6 years ago
Awesome video, helped me a lot... I was struggling to understand exactly this. Looking forward to the rest of the course.
@BestPromptHub 6 years ago
You have no idea how much your videos helped me on my Machine Learning journey. Thanks a lot, Josh, you are awesome.
@johnstephen399 6 years ago
This was awesome. Please continue this series.
@lenaara4569 6 years ago
You explained it so well. I had been struggling to get it for 2 days. Great job!!
@guccilover2009 5 years ago
amazing video!!! Thank you so much for the great lecture and showing the python code to make us understand the algorithm better!
@alehandr0s 4 years ago
In the most simple and comprehensive way. Great job!
@BreakPhreak 6 years ago
Started to watch the series 2 days ago, you are explaining SO well. Many thanks! More videos on additional types of problems we can solve with Machine Learning would be very helpful. Few ideas: traveling salesman problem, generating photos while emulating analog artefacts or simple ranking of new dishes I would like to try based on my restaurants' order history. Even answering with the relevant links/terminology would be fantastic. Also, would be great to know what problems are still hard to solve or should not be solved via Machine Learning :)
@tymothylim6550 3 years ago
Thank you very much for this video! I learnt a lot on how to understand Gini Coefficient and how it is used to pick the best questions to split the data!
@stefanop.6097 6 years ago
Please continue your good work! We love you!
@JulitaOtusek 5 years ago
I think you might be confusing information gain and the Gini index. Information gain is a reduction of entropy, not a reduction of Gini impurity. I almost made a mistake in my engineering paper because of this video, but luckily I noticed a different definition of information gain in another source. Maybe it's just naming, but it can mislead people who are new to this subject :/
@liuqinzhe508 2 years ago
Yes. Information gain and Gini index are not really related to each other when we generate a decision tree. They are two different approaches. But overall still a wonderful video.
@leonelp9593 1 year ago
thanks for clarifying this!
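For readers landing on this thread: both criteria measure "impurity reduction" from a split; the video computes it with Gini impurity, while information gain in the strict sense uses entropy. A hedged sketch comparing the two (function names are mine, not from the video's repo):

```python
import math
from collections import Counter

def proportions(labels):
    counts = Counter(labels)
    return [n / len(labels) for n in counts.values()]

def gini(labels):
    # CART's default criterion: 1 - sum of squared class proportions.
    return 1.0 - sum(p * p for p in proportions(labels))

def entropy(labels):
    # ID3/C4.5's criterion; "information gain" strictly means a drop in this.
    return -sum(p * math.log2(p) for p in proportions(labels))

def impurity_reduction(parent, left, right, impurity):
    # Parent impurity minus the size-weighted impurity of the children.
    w = len(left) / len(parent)
    return impurity(parent) - w * impurity(left) - (1 - w) * impurity(right)

parent = ["Apple", "Apple", "Grape", "Grape", "Lemon"]
left, right = ["Apple", "Apple", "Lemon"], ["Grape", "Grape"]
print(impurity_reduction(parent, left, right, gini))     # Gini-based gain
print(impurity_reduction(parent, left, right, entropy))  # entropy-based gain
```

Both versions rank this split as an improvement; they just report different numbers, which is why the naming matters.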
@hbunyamin 4 years ago
I already knew the concept; however, when I had to translate the concept into code, I found it quite difficult, and this video explains that smoothly. Thank you so much for the explanation!
@techteens694 4 years ago
The same case here man
@learnsharegrowwithgh2181 4 years ago
Hmm, he is a great teacher.
@AyushGupta-kp9xf 3 years ago
So much value in just 10 mins, this is Gold
@AbdulRahman-jl2hv 4 years ago
thank you for such a simple yet comprehensive explanation.
@congliulyc 6 years ago
best and most helpful tutorial ever seen! Thanks!
@sajidbinmahamud2414 6 years ago
Long time! I've been waiting for so long.
@huuhieupham9059 5 years ago
Thanks for your sharing. You made it easy to understand for everybody
@Sanchellios 6 years ago
OH MYYYYYYYYY!!!! You're back! I'm SOSOOOOOOSOSOSOSOSOSOOO happy!
@sarrakharbach 5 years ago
That was suuuuper amazing!! Thanks for the video!
@adampaxton5214 3 years ago
Great video and such clear code to accompany it! I learned a lot :)
@rodrik1 6 years ago
best video on decision trees! super clear explanation
@andrewbeatty5912 6 years ago
Brilliant explanation !
@dcarter666 5 years ago
Ty
@ryanp9441 2 years ago
so INSTRUCTIVE. thank you so much for your clear & precise explanation
@gorudonu 6 years ago
Was waiting for the next episode! Thank you!
@learnsharegrowwithgh2181 4 years ago
yes
@msctube45 3 years ago
Thank you, Josh, for preparing and explaining this presentation as well as the software, to help the understanding of the topics. Great job!
@avijitmandal9124 3 years ago
Do you have a link on how to do pruning?
@Xiaoniana 4 years ago
Thanks, it was very informative. It took me hours to understand what was meant. Keep going!
@MW2ONLINEGAMER100 5 years ago
Thank you so much, beautifully written code too.
@dinasamir2778 4 years ago
It is a great course. I hope you continue and make videos for all machine learning algorithms. Thanks a lot.
@leiverandres 6 years ago
Great explanation, thank you so much!
@muratcan__22 5 years ago
perfect video on the implementation and the topic
@mrvzhao 6 years ago
At first glance this almost looks like Huffman coding. Thanks for the great vid BTW!
@user-qh5qo2tr7l 4 years ago
I like your video, man. Its real simple and cool.
@gautambakliwal826 6 years ago
You have saved weeks of work. So short yet so deep. Guys, first try to understand the code, then watch the video.
@elliottgermanovich3081 4 years ago
This was awesome. Thanks!
@adityagawhale 5 years ago
Please cover the ID3 algorithm; the explanation of CART was great!
@adamtalent3559 4 years ago
Thanks for your lovely lecture. How do you categorize more than 2 prediction classes at the same time?
@jaydevparmar9876 6 years ago
great to see you back
@sergior.m.5694 5 years ago
Best explanation ever, thank you sir
@omarsherif88 1 year ago
Awesome tutorial, many thanks!
@xavierk99 5 years ago
That's a really good video. Very enlightening, thanks =)
@RajChauhan-hd9hu 5 years ago
If the training_data in the code you showed is very large, what changes are needed to get the same output?
@aryamanful 5 years ago
I don't generally comment on videos but this video has so much clarity something had to be said
@bhuvanagrawal1323 5 years ago
Could you make a similar video on fuzzy decision tree classifiers or share a good source for studying and implementing them?
@Yaxoi 6 years ago
Great series!
@dragolov 5 years ago
Thanks for sharing. Respect!
@christospantazopoulos8049 6 years ago
Excellent explanation keep it up!
@leoyuanluo 4 years ago
best video about decision tree thus far
@doy2001 5 years ago
Impeccable explanation!
@houjunliu5978 6 years ago
Yaaaay! You're back!
@venkateshkoka8508 6 years ago
Do you have the code/video for changing the DecisionTreeClassifier into a DecisionTreeRegressor?
@erikslatterv 6 years ago
You’re back!!!
@user-ib2jb1bi8d 6 years ago
After such a long time!
@njagimwaniki4321 5 years ago
How come at 6:20 he calls it an average but doesn't divide by 2? Also, in a Stack Overflow question the same quantity seems to be called entropy. Is this correct?
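On the question above: the "average" at that point in the video is a weighted average — each child's impurity is weighted by the fraction of rows it received, so nothing is divided by 2 (and it is still Gini impurity, not entropy). A minimal sketch, with function names of my own choosing:

```python
from collections import Counter

def gini(labels):
    counts = Counter(labels)
    return 1.0 - sum((c / len(labels)) ** 2 for c in counts.values())

def weighted_child_impurity(left, right):
    # Weight each branch by the fraction of rows it holds; a plain
    # mean (dividing by 2) would overvalue tiny pure branches.
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

left, right = ["Grape", "Grape"], ["Apple", "Apple", "Lemon"]
print(weighted_child_impurity(left, right))  # 0.6 * gini(right), about 0.267
```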
@erviveker 6 years ago
Where can I find code for a decision tree in TensorFlow? Is there any repo?
@ricardohincapie1537 4 years ago
Such a good video! It's all very clear to me now.
@macfa7355 5 years ago
Thanks for the video. I hope to learn more machine learning. Would you make a Korean transcript?
@qwertybrain 4 years ago
Thanks! Well done!
@supriyakarmakar1111 5 years ago
I got lots of ideas, thanks sir. But my question to you: if the dataset is too large, what should I do?
@mingzhu8093 5 years ago
Question about calculating impurity: if we do probability, we first draw a data item, which gives us a probability of 0.2, then we draw a label, which gives us another 0.2. Shouldn't the impurity be 1 - 0.2*0.2 = 0.96?
@uditarpit 5 years ago
It is easy to find the best split if the data is categorical. How does the split happen in a time-optimized way when a variable is continuous, unlike color or just 2 values of diameter? Should I just run through the values from min to max? Can the median be used here? Please suggest!
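One common answer to the question above (not from the video, so treat it as a sketch): for a numeric feature, only the boundaries between sorted distinct values can change the partition, so after one O(n log n) sort you test at most n-1 candidate thresholds. The median alone is not enough, since the best split need not sit at the median.

```python
def candidate_thresholds(values):
    """Midpoints between adjacent sorted distinct values -- the only
    thresholds that can produce different partitions of the data."""
    distinct = sorted(set(values))
    return [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]

print(candidate_thresholds([3, 1, 3, 1, 3]))  # [2.0]
```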
@browneealex288 3 years ago
At 8:41 he says "Now the previous call returns and this node becomes a decision node." What does that mean? How is it possible to return to the root node (the false-branch line above) after executing the final return of the function? Please share your thoughts; it will help me a lot.
@farahiyahsyarafina2183 6 years ago
thank you! you're amazing.
@panlis6243 6 years ago
I don't get one thing here: how do we determine the number for the question? I understand that we try out different features to see which gives us the most information, but how do we choose the number and the condition for it?
@aryamanful 5 years ago
I have a follow-up question: how did we come up with the questions? As in, how did we know to ask whether the diameter is > 3, and not whether it is > 2?
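On the two questions above: the tree never guesses a number — CART tries every observed value of every feature as a candidate question and keeps the one with the highest impurity reduction. A sketch under that assumption (helper names are mine, not the repo's):

```python
from collections import Counter

def gini(rows):
    counts = Counter(row[-1] for row in rows)
    return 1.0 - sum((c / len(rows)) ** 2 for c in counts.values())

def best_numeric_split(rows, col):
    """Try every observed value as a ">=" threshold and keep the one
    with the largest impurity reduction."""
    best_gain, best_value = 0.0, None
    parent = gini(rows)
    for value in sorted({row[col] for row in rows}):
        left = [row for row in rows if row[col] >= value]
        right = [row for row in rows if row[col] < value]
        if not left or not right:  # a split must send rows both ways
            continue
        w = len(left) / len(rows)
        g = parent - w * gini(left) - (1 - w) * gini(right)
        if g > best_gain:
            best_gain, best_value = g, value
    return best_value, best_gain

fruits = [["Green", 3, "Apple"], ["Yellow", 3, "Apple"],
          ["Red", 1, "Grape"], ["Red", 1, "Grape"],
          ["Yellow", 3, "Lemon"]]
print(best_numeric_split(fruits, 1))  # "diameter >= 3" wins on this data
```

On the toy dataset, only diameters 1 and 3 occur, so "diameter >= 3" is the only real candidate and it wins; a ">2" question would produce the identical partition here.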
@devakihanda9552 5 years ago
Is there a way to visualize a decision tree using Python, in Processing?
@user-xw8xw7vt9o 1 year ago
Sooo dooope !!!! Helpful 🔥🔥🔥
@moeinhasani8718 5 years ago
Very useful. This is the best tutorial out on the web.
@saichandarreddytarugu4997 6 years ago
How do you choose the k value in kNN? Based on accuracy, or the square root of the length of the test data? Can anyone help me?
@edgarpanganiban9339 6 years ago
Help! I can't print my_tree using the print_tree() function... It only shows this: Predict {'Good': 47, 'Bad': 150, 'Average': 89}. Please help...
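A single "Predict {...}" line usually means the tree is just one Leaf: no question produced positive information gain, so the build returned a leaf immediately (worth checking the data and labels). A reconstructed sketch of a recursive printer — class names are hypothetical, modeled on but not copied from the video's repo:

```python
class Leaf:
    def __init__(self, predictions):
        self.predictions = predictions  # e.g. {'Apple': 1, 'Lemon': 1}

class DecisionNode:
    def __init__(self, question, true_branch, false_branch):
        self.question = question
        self.true_branch = true_branch
        self.false_branch = false_branch

def print_tree(node, indent=""):
    # A tree that is only a Leaf prints exactly one "Predict {...}" line.
    if isinstance(node, Leaf):
        print(indent + "Predict", node.predictions)
        return
    print(indent + str(node.question))
    print(indent + "--> True:")
    print_tree(node.true_branch, indent + "  ")
    print(indent + "--> False:")
    print_tree(node.false_branch, indent + "  ")

tree = DecisionNode("Is diameter >= 3?",
                    DecisionNode("Is color == Yellow?",
                                 Leaf({"Apple": 1, "Lemon": 1}),
                                 Leaf({"Apple": 1})),
                    Leaf({"Grape": 2}))
print_tree(tree)
```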
@fathimadji8570 3 years ago
Excuse me, I am still not clear about how the value of 0.64 comes out. Can you explain a little more?
@ritikvimal4915 4 years ago
well explained in such a short time
@KamEt-69 3 years ago
How come, in the calculation of the Gini impurity, we subtract the square of the probability of each label from 1?
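The two forms of the formula are algebraically identical: the chance of drawing class i and then mislabeling it is p_i(1 - p_i), and summing over classes gives Σp_i - Σp_i² = 1 - Σp_i². A quick numeric check:

```python
def gini_via_mistakes(probs):
    # Chance of drawing class i, then assigning a random label other than i.
    return sum(p * (1 - p) for p in probs)

def gini_via_squares(probs):
    # Same quantity: sum p(1-p) = sum p - sum p^2 = 1 - sum p^2.
    return 1.0 - sum(p * p for p in probs)

probs = [0.4, 0.4, 0.2]  # apple, grape, lemon proportions from the video
print(gini_via_mistakes(probs), gini_via_squares(probs))  # both ~0.64
```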
@tenebreux3 5 years ago
Is it possible to run an ML algorithm on multiple datasets?
@jakobmethfessel6226 4 years ago
I thought CART determined splits solely on the Gini index, and that ID3 uses the average impurity to produce information gain.
@tooniatoonia2830 2 years ago
I built a tree from scratch but I am stuck making a useful plot like the one obtainable in sklearn. Any help?
@hyperealisticglass 5 years ago
This single 9-minute video does a way better job than what my ML teacher did for 3 hours.
@marklybeer9038 3 years ago
I know, right? I had the same experience with an instructor... it was a horrible memory. Thanks for the video!
@bharathjc4700 6 years ago
Great video. Thanks a ton.
@aydinahmadli7005 5 years ago
great tutorial!
@aseperate 1 year ago
The Gini impurity function in the code does not output the same responses listed in the video. It's quite confusing.
@Julia-zi9cl 5 years ago
Does anyone know how they created those flowcharts with tables at 1:24–1:27?
@mohammadbayat1635 8 months ago
Why is the impurity 0.62 after partitioning on "Is the color green?" on the left subtree?
@allthingsmmm 4 years ago
Could you do an example in which the output triggers a method that changes itself based on success or failure? An easier example: iterations increase or decrease based on probability; or left, right, up, down memorizing a maze pattern?
@senyotsedze3388 8 months ago
You are awesome, man! But why is it that, for the second question (is the color yellow?), only the apple is separated, when the two grapes are red? Or is it because they were already handled by the false branch of the first split?
@hiskaya 8 months ago
thanks! that was helpful)
@alirezagh1456 4 months ago
One of the best courses I have ever seen.