This is the single best explanation of backprop in code that I've seen so far. I once implemented a neural network from scratch, except for the autograd part, so Micrograd is a good fit, and it's so clear and accessible. Thanks Andrej!
Actually true. And same here: I once implemented a neural network from scratch and broadly understood it, but this is the best explanation of backpropagation I've seen. Excellent work.
I can't even comprehend the level of mastery it must take to distill such a complex topic into such a simple format, and the humility to give it out for free so that others may learn. Thank you so much, Andrej, for doing this; you're truly amazing.
This is literally gold, you have explained everything so intuitively and made it so much easier to understand! Thank you so much Andrej for sharing this in-depth knowledge for free!
@ophello I know this is a year-old comment, and my reply is pointless, but _technically_ 🤓 Merriam-Webster lists "used in an exaggerated way to emphasize a statement or description that is not literally true or possible" as one of the definitions. People define the dictionary. Not the other way around. And yes, it *literally* doesn't matter at all, but it annoyed me that you were wrong when trying to _correct_ somebody else's well-meaning compliment.
"remember back in your calculus class?...." nope. I'm subscribing anyway whenever I need a humble reminder that I don't know anything and there are people way way smarter than I am.
The beautiful part of tech is the feeling of constantly being mind-blown when realizing how little one knows and how much there is to learn. Studying micrograd has been on my list for a while thanks to George Hotz, and this series is making it so much easier to really own this content. Loving it. ❤️
This was an exceptional lecture. Just wanted to say thank you for taking the time to make this. I have spent time in university courses, reading books, doing assignments, and yet I truly understood more from this single lecture than from anything else prior.
That was incredible. Never has anyone been able to simplify neural networks in this manner for me. Please keep making such videos, you're doing god's work. By god, I mean the imminent AGI :)
Hey Andrej, idk if you'll read this but I wanted to echo others' appreciation for this fantastic introduction. I've been a SWE for many years but always ML-adjacent despite a maths background. This simple video has instilled a lot of intuition and confidence that I actually grasp what these NNs are doing, and it's a lot of fuel in my engine to keep diving in. Thank you!
Andrej, the fact that you're making videos like this is AMAZING! Thank you so much for doing this. I will be spending some quality time with this one tonight (and probably tomorrow lol) and can't wait for the next one. Thank you, thank you, thank you!
Thank you, Andrej! Your implementation of neural networks from scratch is impressive. The clarity and simplicity in your code make complex concepts like backpropagation much easier to grasp.
Thank you so much for doing a step-by-step simulation of how gradient descent works. I am grateful for the passion and effort you put into teaching. These lessons are essential as we continue to dive deeper into learning.
Wanted to say thanks for that awesome backpropagation video. I've been scratching my head over this stuff for a while now - had all these bits and pieces floating around in my brain but couldn't quite connect the dots. Your explanation was like a lightbulb moment for me! Everything finally clicked into place. Really appreciate you putting this out there for us to learn from.🙌🙌🙌
It's the clearest and most straightforward explanation of backpropagation and the training of neural networks I have ever come across, and it's effortless to understand even with only a minor background in CS and math!
Man, this is an absolute masterpiece. I can finish at my own pace, and the intricate details and possible bugs are explained clearly. Feels like Morgan Freeman narrating. I can listen to Andrej all day long.
58:41 "As long as you know how to create the local derivative - then that's all you need". Ok Karpathy. Next paper title "Local derivatives are all you need". Nice to see you on RU-vid! :))
This is a combination of topic mastery and communication expertise. I thought I fully understood gradient descent/backprop, and have used it for years. However, I've never dived into manually calculating gradients because it felt... gratuitous. I'm glad I set aside the 2 hours for this video, however. Now I understand it well enough to explain it to an intern conceptually, without leaning on formulae and hand-waving, which is a great feeling. Thanks Andrej!
Amazing how you broke this down into first principles. I understood a lot of these concepts before now, but I'm pleasantly surprised at how much clarity I gained by watching this video. Thanks.
Thank you, Andrej, for this incredible and detailed video. The clarity with which you explain backpropagation and the construction of micrograd is exceptional. Bravo, and thank you for sharing your knowledge with us. You are an immeasurable source of inspiration.
Thank you for making this! As someone trying to understand from the ground up how neural nets are trained and how GPT works, for the purposes of skilling up in the AI Safety field, this was really educational and informative, while being super easy to follow! While I'm still a bit confused about some of the Python syntax (having not worked with it for a while, and not in incredible depth), this was still super helpful in understanding conceptually what backpropagation and gradient descent look like in a step-by-step fashion at the code level. Looking forward to working through more of your videos!
I'm pausing frequently each time I encounter something I don't understand and using GPT-4 as an assistant to dive deeper. thank you for this and other amazing instructional videos. I (we) truly appreciate your efforts.
This guy is a legend. Truly honored to listen to his lectures and finally understand all the operations happening under the hood. Lots of respect to Andrej Karpathy for devoting his time to educate the masses. No words to express the gratitude for his effort.
Absolute gold, watched this after 3B1B's series on neural nets and I must say, these videos have shifted my view of DL from "dauntingly complex, evolving and fast-paced" to "maybe I can learn this". Really grateful for the content you post @AndrejKarpathy !
Lol, the patience to put this together AND the grace to let us see your bugs - all in one human. Thank you for spelling out the detail of what goes on in that simple diagram with boiler-plate description that we’ve all seen a million times. I finally feel like I really understand it. Now if I can just remember!
Wooow what an introduction! It is by far the best and the easiest to understand. The way you break things up and simplify them so that we never lose focus on WHY we are doing this is absolutely impressive! Thanks for sharing your knowledge.
With a strong background in calculus, it was pretty easy for me to understand backprop (I was even like "HECK yeah, I still got it" as I answered your questions throughout the video), but I have zero coding knowledge. I just started Python a few months back and now I'm getting used to it, but WOW, am I ever equipped for neural nets after just one video. Thanks, Andrej!
These are amazing, Andrej. Beautifully explained, logical, easy to follow. Thank you so much for the generous knowledge sharing and the time you put into creating this content.
I am a beginner in the path of AI and this video helps a LOT on how to implement and understand the core components of a neural net, thank you for this video and god bless you 🙏
This is my first ever comment on any YouTube video, and I wanted to write this because of just how grateful I am for your video series. Your intuitive explanation has been a tremendous help for me to understand lots of concepts within NNs. Thank you for making world-class knowledge accessible, Andrej. Onto the next video I go. :)
I usually don't comment on YouTube, but dude, seriously, this has been hands down the best explanation of backpropagation I have come across!! Thank you so much!
I just watched it end to end. Now I will watch it again, but this time I will write out the code and take notes. Best ML video I have seen this year. Can't believe I haven't stumbled onto it earlier.
It is literally "Zero to Hero". For some reasons, it took me some days to fully understand this video, but i know i will cover all the videos in depth. It is so fortunate for me noticing your videos.
Great lecture - simply and thoroughly explained with neat and clear code. One of the best lectures I have ever heard. Happy to have found this treasure. Thank you, Andrej!
Honestly, I have no words. This is an amazing presentation, in terms of code, math, logic. Can't wait to continue with the other videos. Just amazing. Thank you so much for taking the time, and sharing your knowledge
Just filled with gratitude today! Whatever led to this video... Whoever created diodes and integrated circuits, whoever created the computer and made the personal computer accessible, whoever created video formats, Al Gore for creating the internet, the folks making streaming video possible, the creators of YouTube, Jupyter, matplotlib, Python... thank you all! Andrej, this video was magical 🙏
I really appreciate your effort on this class. It was amazing, and finally I can say I have gotten a better understanding of this topic. You are a great and wonderful person. Be happy and take care...!!
2:07:00 I just get a big stupid smile on my face, seeing this magic and understanding how gradient descent is done now 😁 Beautiful work and explanation!
Too good! So many 'aha' moments in one single lecture. I coded alongside, sometimes before the explanation (e.g. The zero_grad() before backward()). Very fulfilling experience. I will recommend it to my sons (one working and other in college) and other people. Also can't wait to follow along coding on more such teaching videos. PS: The bloopers after the curtains are funny 😀
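For anyone who hits the same zero_grad() gotcha while coding along, here is a sketch of the training loop using the classes from the micrograd repo (the repo's MLP uses ReLU rather than the lecture's tanh, and the step size and step count here are arbitrary, so treat it as a sketch):

from micrograd.nn import MLP

xs = [[2.0, 3.0, -1.0], [3.0, -1.0, 0.5], [0.5, 1.0, 1.0], [1.0, 1.0, -1.0]]
ys = [1.0, -1.0, -1.0, 1.0]
n = MLP(3, [4, 4, 1])

for step in range(20):
    ypred = [n(x) for x in xs]                                   # forward pass
    loss = sum((yout - ygt)**2 for ygt, yout in zip(ys, ypred))  # squared-error loss
    for p in n.parameters():
        p.grad = 0.0                                             # zero_grad BEFORE backward
    loss.backward()                                              # backward pass
    for p in n.parameters():
        p.data += -0.05 * p.grad                                 # gradient descent step
    print(step, loss.data)

Without the reset, gradients from the previous step keep accumulating and the updates go wrong, which is exactly the bug discussed in the lecture.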
Now that every other comment can validate the indispensability of this gem of a video, I would like to mention that the bloopers at the end really cracked me up 😂. Thank you for this humility! Really appreciate what you are doing!
Thank you for making this tutorial! I have always been on a lookout for something like this. Normal videos either discuss super deep details or go on a brief overview. This was a perfect balance between depth and showing the actual usage of what we built. Bingeing your playlist now! :D
@Andrej About implementing the backward function recursively: I think it is actually BFS (breadth-first search), not a topological sort. A topological sort starts from the outer nodes (here, the data nodes and the output node) and ends with the inner nodes.
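For reference, here is roughly how the backward() from the lecture and the micrograd repo builds that ordering: a depth-first post-order traversal collects the nodes, and the list is then walked in reverse. This is written from memory, so treat it as a sketch rather than the exact code:

def backward(self):
    # collect nodes in topological order via a depth-first post-order traversal
    topo, visited = [], set()
    def build_topo(v):
        if v not in visited:
            visited.add(v)
            for child in v._prev:
                build_topo(child)
            topo.append(v)   # a node is appended only after all of its children
    build_topo(self)

    # seed the output gradient and apply the chain rule from the root backward
    self.grad = 1.0
    for v in reversed(topo):
        v._backward()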
2:13:00 Micrograd implements backprop. You can create Value objects and do operations on them. In the background it builds a computational graph and keeps track of everything. You can then call backward() on a Value object, which applies the chain rule to do backprop.
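A tiny usage sketch of what that looks like, assuming the Value class from the video / the micrograd repo (the attribute names _prev and _op are the ones used there):

from micrograd.engine import Value

a = Value(2.0)
b = Value(-3.0)
c = a * b          # the multiply records its operands, so the graph grows as you compute
print(c.data)      # -6.0
print(c._op)       # '*'
print(c._prev)     # the set of "children" c was built from, i.e. a and b

c.backward()       # walks the recorded graph and applies the chain rule
print(a.grad, b.grad)   # -3.0 2.0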
If you're following along, at 1:54:44, you need to have implemented an __radd__ function in the Value object to allow you to subtract a value object from an int
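In case it helps anyone stuck on the same error, here is a minimal, gradient-free sketch of how the reflected operators can slot into a Value-style class; the real class from the video also records children and a _backward closure, so this only shows the shape of it:

class Value:
    def __init__(self, data):
        self.data = data

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data + other.data)

    def __radd__(self, other):   # called for `0 + Value(...)`, e.g. when sum() starts from 0
        return self + other

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data * other.data)

    def __neg__(self):           # -self
        return self * -1

    def __sub__(self, other):    # Value - number
        other = other if isinstance(other, Value) else Value(other)
        return self + (-other)

    def __rsub__(self, other):   # number - Value
        return other + (-self)

print((2 - Value(0.5)).data)                 # 1.5
print(sum([Value(1.0), Value(2.0)]).data)    # 3.0, works because of __radd__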
The reason you use a set at 23:11 is so that if you do a + a you don't end up with a in _prev twice, which would presumably screw with the backprop later.
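A quick way to see what happens in the a + a case once the full class from the video is in place (this assumes the `+=` gradient accumulation that gets added later in the lecture):

from micrograd.engine import Value   # or the Value class built in the video

a = Value(3.0)
b = a + a        # a shows up only once in b._prev, since _prev is a set
b.backward()
print(a.grad)    # 2.0, because each use of a contributes via `self.grad += ...`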
Absolutely amazing lecture and repo. This really helped me grasp the concept of backpropagation, both mathematically and programmatically! My only critique is that your implementation of the MSE loss function in your lecture was missing the 1/n, making it more of a sum squared error than a mean squared error.
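For anyone adapting the lecture's loss, the difference is just a 1/n factor. A small sketch using the lecture's toy targets and some made-up predictions for illustration:

ys    = [1.0, -1.0, -1.0, 1.0]     # desired targets from the lecture's toy dataset
ypred = [0.7, -0.9, -0.8, 0.6]     # hypothetical predictions, just for illustration

sse = sum((yout - ygt)**2 for ygt, yout in zip(ys, ypred))   # what the lecture minimizes
mse = sse / len(ys)                                          # divide by n for a true MSE
print(sse, mse)   # roughly 0.3 and 0.075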
Why the inputs to the operations are considered 'children' becomes clear when you go to actually run the backprop. Basically you topologically sort the computational graph and start with the final output and then go backward from there. So if the final output is the root, then the inputs to that output are its children.
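To make that direction of flow concrete, here is a tiny made-up graph (numbers similar to the lecture's first worked example), assuming the Value class from the repo:

from micrograd.engine import Value

a = Value(2.0)
b = Value(-3.0)
c = Value(10.0)
e = a * b          # e's children are a and b
L = e + c          # L is the root; its children are e and c

L.backward()       # gradients flow from the root L down to its children
print(L.grad)           # 1.0
print(e.grad, c.grad)   # 1.0 1.0, since L = e + c
print(a.grad, b.grad)   # -3.0 2.0, via the chain rule through e = a * b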