Alfredo Canziani
Music, math, and deep learning from scratch
06 - Optimisation and gradient ascent
58:59
21 days ago
Chapter 1, video 1-3
0:46
1 month ago
00 - Course introduction
2:48
1 month ago
03 - Inference with neural nets
1:07:19
1 year ago
13L - Optimisation for Deep Learning
1:51:32
2 years ago
13 - The Truck Backer-Upper
1:01:22
2 years ago
04L - ConvNet in practice
51:41
2 years ago
02L - Modules and architectures
1:42:27
3 years ago
Comments
@TomChenyangJI • 3 days ago
Only a few words on his own masterpiece, haha
@alfcnz • 3 days ago
🤭🤭🤭
@PedroAugusto-kg1ss • 6 days ago
Hello! First of all, thank you for uploading the material. Very, very good course. However, in this part on EBMs I'm a little bit confused: let's suppose I've trained a denoising AE (or another variation) on a bunch of y's. After training, how do I use it in practice? Would I pick a random z and use it to generate a y_tilde? From which distribution would I sample such a z?
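For readers with the same doubt, below is a minimal sketch of the generic latent-variable generation step the question describes; the standard-normal prior over z and the decoder network are illustrative assumptions, not something the lecture specifies:

import torch

# Hypothetical decoder standing in for whatever network was trained;
# its layer sizes and the latent dimension (2) are illustrative assumptions.
decoder = torch.nn.Sequential(
    torch.nn.Linear(2, 100),
    torch.nn.ReLU(),
    torch.nn.Linear(100, 784),
)

z = torch.randn(8, 2)     # sample latents from N(0, I), an assumed prior
y_tilde = decoder(z)      # decode each z into a generated y_tilde
print(y_tilde.shape)      # torch.Size([8, 784])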
@user-co6pu8zv3v • 8 days ago
Thank you, Alfredo! :)
@alfcnz • 8 days ago
You’re welcome! 😀
@sudarshantak2680 • 10 days ago
@alfcnz • 10 days ago
😀😀😀
@user-co6pu8zv3v • 10 days ago
Thank you, Alfredo. You made such a great visualization!
@alfcnz • 10 days ago
😊😊😊
@aloklal99 • 12 days ago
How were neural nets trained before 1985, i.e., before backprop was invented?
@alfcnz • 12 days ago
I have a few videos on that in my most recent playlist, second chapter. There, I explain how the Perceptron (a binary neuron with an arbitrary number of inputs) used an error-correction strategy for learning. Let me know if you have any other questions. 😇😇😇 Chapter 2, video 4-6: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-g4sSU6B99Ek.html
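As a rough illustration of that error-correction strategy (a sketch, not the exact formulation from the videos), a minimal Perceptron training loop might look like this:

import numpy as np

def perceptron_train(X, y, epochs=50):
    """Perceptron error-correction learning.
    X: (n, d) array of inputs; y: (n,) array of labels in {-1, +1}."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for x_i, y_i in zip(X, y):
            if y_i * (w @ x_i + b) <= 0:  # mistake: wrong side of the boundary
                w += y_i * x_i            # nudge the weights towards x_i
                b += y_i                  # and the bias towards the label
    return w, b

# Tiny usage example: learn the (linearly separable) AND function.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([-1, -1, -1, 1])
w, b = perceptron_train(X, y)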
@aloklal99 • 8 days ago
@alfcnz Thanks! 🙏
@user-co6pu8zv3v • 13 days ago
Thank you, Alfredo :)
@alfcnz • 13 days ago
🤗🤗🤗
@NewGirlinCalgary • 15 days ago
Amazing Lecture!
@alfcnz • 15 days ago
🥳🥳🥳
@user-co6pu8zv3v • 15 days ago
Thank you, Alfredo; you have given a very clear explanation of this topic. :)
@user-co6pu8zv3v • 20 days ago
Thank you, Alfredo!
@alfcnz • 20 days ago
🥰🥰🥰
@housebyte • 22 days ago
This principle of running differential equations backwards is used in diffusion, where you find the Lagrangian loss function from the score, which is the time-reversing Langevin dynamics equation. Cost and energy, or momentum and energy: both are deterministic, reversible dynamical systems.
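For context, the standard score-based formulation this comment seems to allude to is the reverse-time SDE (Anderson, 1982), in which the score \nabla_x \log p_t(x) provides the correction:

\mathrm{d}x = f(x,t)\,\mathrm{d}t + g(t)\,\mathrm{d}w \quad \text{(forward diffusion)}

\mathrm{d}x = \bigl[ f(x,t) - g(t)^2\,\nabla_x \log p_t(x) \bigr]\,\mathrm{d}t + g(t)\,\mathrm{d}\bar{w} \quad \text{(time reversal)}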
@alfcnz • 21 days ago
Without a timestamp I have no clue what you’re referring to.
@user-co6pu8zv3v • 22 days ago
Thank you, Alfredo! I am happy that you are back
@alfcnz • 21 days ago
🥳🥳🥳
@user-co6pu8zv3v • 22 days ago
Hello, Alfredo! :)
@alfcnz • 22 days ago
Long time no see! 👋🏻
@dimitri30 • 27 days ago
Thank you for sharing. I have one question about the NNs on scrambled data. If I had to make a prediction, I would have said we'd get an accuracy of about 15%, not more, given how much the pixel count can help determine which digit it is. So is that enough to reach an accuracy of 83-85%, or is there something else? I supposed that the fully connected neural network would have duplicated the filters, but there is no change with the scrambled data.
@alfcnz • 27 days ago
I don’t understand the question. Try asking in your native language.
@dimitri30 • 27 days ago
@alfcnz Yes, of course; I think my French explanation was not clear either. I would have assumed that with scrambled data we would have had an accuracy of around 15%, not more (which is more than 10% because, by counting the number of pixels, the model can get an idea of which digit is the most probable). I have trouble understanding how the model can achieve results as "good" as 85% on scrambled data. Does the model count the number of pixels and determine it that way, or is there something else? I had assumed that, in reality, the dense model would have worked like a ConvNet by learning the same kernels multiple times; essentially, we would have had weight redundancy to get something similar to a ConvNet. Is it because of the lack of parameters in the dense network? If we had given it a lot more parameters, would it have come back to having a ConvNet with weight redundancy to "simulate" the filter's movement? Thank you
@alfcnz • 27 days ago
There's a lot going on in this question. First, let's address the fully-connected model. The model does not care whether you scramble the input or not. If smartly initialised, the model will learn *the same* weights but in a permuted order. That's why the model's performance is (basically) the same before and after permutation. Up to here, are you following? Do you have any specific question about this first part of my answer?
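A minimal sketch of that argument in code (sizes and names are illustrative): applying one fixed permutation to every input is exactly undone by permuting the first layer's weight columns, so the fully-connected net faces the same learning problem:

import torch

torch.manual_seed(0)
perm = torch.randperm(784)             # one fixed, deterministic pixel shuffle

linear = torch.nn.Linear(784, 128)     # first layer of a fully-connected net
x = torch.rand(1, 784)                 # a flattened "image"

out_original = linear(x)

# Build the layer the permuted problem would learn: same weights,
# with the columns shuffled by the same permutation.
linear_perm = torch.nn.Linear(784, 128)
with torch.no_grad():
    linear_perm.weight.copy_(linear.weight[:, perm])
    linear_perm.bias.copy_(linear.bias)

out_scrambled = linear_perm(x[:, perm])   # scrambled input, permuted weights
print(torch.allclose(out_original, out_scrambled, atol=1e-5))  # True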
@dimitri30 • 26 days ago
@alfcnz Thanks for your reply. I'm sorry for wasting your time, I just didn't pay enough attention to the fact that this is a DETERMINISTIC shuffle.
@alfcnz • 26 days ago
Oh, yes! It is! The point here was to show how convolutional nets should be used only when specific assumptions hold for the input data. 😊😊😊
@Acn0w • 29 days ago
Thanks a lot! Your content is giving me motivation to get back into this field. Keep it up please 👏🙏
@alfcnz • 29 days ago
Happy to hear that! I'll keep feeding knowledge to my subscribers! 😇😇😇
@PicaPauDiablo1 • 29 days ago
Thank you for posting this. Looks like a great hour is ahead.
@alfcnz • 29 days ago
You bet! 😎😎😎
@tantzer6113 • 1 month ago
@14:04 Paraphrase: missing a positive (i.e., a false negative) is more critical (i.e., worse) than a FALSE POSITIVE. (Note: "falsely identify a negative case" means "falsely identify AS A POSITIVE what is actually a negative case.")
@alfcnz • 1 month ago
This is true _for the specific case_ of medical diagnosis. The contrary is true for other applications, such as spam detection.
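A toy illustration of that asymmetry, with made-up error counts and costs:

# Made-up confusion counts for some classifier.
fn, fp = 5, 20   # false negatives, false positives

# Medical diagnosis: missing a disease (FN) is assumed far costlier.
cost_medical = 100 * fn + 1 * fp    # = 520; dominated by the few misses

# Spam filtering: binning a real e-mail (FP) is assumed far costlier.
cost_spam = 1 * fn + 100 * fp       # = 2005; dominated by false alarms

print(cost_medical, cost_spam)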
@TemporaryForstudy • 1 month ago
Loved the video ❤. Hey, I am working as an NLP engineer in India. Do you have any remote opportunities for me? Let me know if you have something.
@alfcnz • 1 month ago
Thanks for your appreciation! 🥰 Currently, I’m video editing and writing a textbook. Not sure these tasks are suitable for opportunities. 🥺
@Palatino-Art • 28 days ago
@TemporaryForstudy *I am from India too, learning machine learning. Can I get your contact?*
@CyberwizardProductions • 1 month ago
that's the entire reason to teach :) learn how to do something and pass it on
@alfcnz • 1 month ago
🥳🥳🥳
@tantzer6113 • 1 month ago
@5:24 Paraphrase: "So, what is the accuracy of a classifier that classifies everything as HAM, detecting no SPAM, thus yielding NO POSITIVES?"
@alfcnz • 1 month ago
Yup, precisely! 😊😊😊
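Concretely, such a classifier's accuracy is just the proportion of ham in the data; with made-up counts:

# If 90% of the messages are ham, always answering "ham" is 90% accurate,
# while catching zero spam. Counts are illustrative.
n_ham, n_spam = 900, 100
accuracy = n_ham / (n_ham + n_spam)   # 0.9, with no true positives at all
print(accuracy)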
@monanasery1992 • 1 month ago
Thank you so much for sharing this series. I especially loved the vintage ConvNets and the brain part :) I have a question: I didn't understand how we define the number of feature maps. For example, at 1:27:00, how did we go from 6 feature maps in layer 2 to 12 feature maps in layer 3? (By the way, there are 16 feature maps in layer 3 (C3) in the architecture of LeNet-5 in this paper: yann.lecun.com/exdb/publis/pdf/lecun-98.pdf, Fig. 2, the architecture of LeNet-5.)
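For reference, the number of feature maps per layer is a design choice (a hyperparameter) rather than something derived; a sketch of the two convolutional stages with the counts from the cited 1998 paper (6, then 16) might look like this in PyTorch, with the remaining layer details being illustrative assumptions:

import torch.nn as nn

features = nn.Sequential(
    nn.Conv2d(1, 6, kernel_size=5),    # C1: 6 feature maps, a design choice
    nn.Tanh(),
    nn.AvgPool2d(2),                   # S2: subsampling
    nn.Conv2d(6, 16, kernel_size=5),   # C3: 16 maps in LeCun et al. (1998)
    nn.Tanh(),
    nn.AvgPool2d(2),                   # S4: subsampling
)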
@wolpumba4099 • 1 month ago
*Summary*

*Probability Recap:*
* [0:00] *Degree of Belief:* Probability represents a degree of belief in a statement, not just true or false.
* [0:00] *Propositions:* Lowercase letters (e.g., cavity) represent propositions (statements); uppercase letters (e.g., Cavity) are random variables.
* [5:15] *Full Joint Probability Distribution:* Represented as a table, it shows probabilities for all possible combinations of random variables.
* [10:08] *Marginalization:* Calculating the probability of a subset of variables by summing over all possible values of the remaining variables.
* [17:04] *Conditional Probability:* The probability of an event given that another event has occurred, calculated as the ratio of the joint probability to the probability of the conditioning event.
* [16:14] *Prior Probability:* The initial belief about an event before observing any evidence.
* [16:40] *Posterior Probability:* The updated belief about an event after considering new evidence.

*Naive Bayes Classification:*
* [32:48] *Assumption:* Features (effects) are conditionally independent given the class label (cause), which simplifies the probability calculations.
* [32:48] *Goal:* Predict the most likely class label given a set of observed features (evidence).
* [44:04] *Steps:*
  * Calculate the joint probability of each class label and the observed features, using the naive Bayes assumption.
  * Calculate the probability of the evidence (observed features) by summing the joint probabilities over all classes.
  * Calculate the posterior probability of each class label by dividing its joint probability by the probability of the evidence.
  * Choose the class label with the highest posterior probability as the prediction.
* [36:24] *Applications:*
  * *Digit Recognition:* Classify handwritten digits based on pixel values as features.
  * [47:34] *Spam Filtering:* Classify e-mails as spam or ham based on the presence of specific words.
* [33:56] *Limitations:*
  * *Naive Assumption:* The assumption of feature independence is often unrealistic in real-world data.
  * [42:11] *Data Sparsity:* The model can struggle with unseen feature combinations if the training data is limited.

*Next Steps:*
* [1:05:58] *Parameter Estimation:* Learn the probabilities (parameters) of the model from training data.
* [59:53] *Handling Underflow:* Use techniques like logarithms and softmax to prevent numerical underflow when multiplying small probabilities.

I used Gemini 1.5 Pro to summarize the transcript.
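To make the classification steps above concrete, here is a minimal Bernoulli naive Bayes prediction sketch in Python; every parameter below is made up for illustration (nothing is estimated from the lecture's data):

import numpy as np

rng = np.random.default_rng(0)

prior = np.array([0.5, 0.5])             # P(class), for two classes
theta = rng.uniform(0.1, 0.9, (2, 784))  # P(pixel_i = 1 | class), made up

def predict(x):
    """MAP class for a binary feature vector x of shape (784,).
    Works in log space, avoiding the underflow mentioned above.
    log P(c | x) is, up to a constant, log P(c) + sum_i log P(x_i | c)."""
    log_joint = np.log(prior) + (
        x * np.log(theta) + (1 - x) * np.log(1 - theta)
    ).sum(axis=1)
    return np.argmax(log_joint)

x = rng.integers(0, 2, 784)              # a random binary "image"
print(predict(x))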
@alfcnz • 1 month ago
They are a bit off. The first two titles should be neither simultaneous nor at the very beginning. Similarly, Gemini thinks the first two titles under Naive Bayes Classification are also simultaneous. I can see, though, how these could be helpful if refined a bit.
@yacinebelhadj9749 • 1 month ago
Thanks ☺️. Can't wait for the release of your book!
@alfcnz • 1 month ago
🥳🥳🥳
@edbertkwesi4931 • 1 month ago
Alfredo the besto
@alfcnz • 1 month ago
🤓🤓🤓
@monanasery1992 • 1 month ago
Thank you so much for sharing this 🥰 This was the best video for learning gradient descent and backpropagation.
@alfcnz • 1 month ago
🥳🥳🥳
@guiomoff2438 • 1 month ago
Can the first part be found anywhere other than on YouTube, or on another YouTube channel? Thank you for sharing your part of the course :)
@alfcnz • 1 month ago
I just asked. No, all I can share is the book chapters, as pointed out in the video. Sorry about that. ☹️
@groupconviction • 1 month ago
The course is awesome, your humour is even better!😂
@alfcnz • 1 month ago
Humour? 😮😮😮 Where? Who, me?
@tantzer6113 • 1 month ago
Looking forward to an update on the book after you’ve read more of it.
@alfcnz • 1 month ago
🤓🤓🤓 It’s a bit slow and trivial. But, I may not be the _intended audience_ 😅😅😅
@SebastianRaschka • 1 month ago
You are back! That's exciting. Also awesome topic. Naive Bayes was the very first algo I learned when I started with ML/AI stuff as a fresh student!
@alfcnz • 1 month ago
Heyyyyyy! 🤗🤗🤗 Yes! First time I’m learning/teaching probability & co! I’d say it’s rather a pretty framework! 😊
@dimitri30 • 1 month ago
Thank you for sharing this with everyone. I was bored by my university courses, which are not very understandable, passionate, or advanced.
@alfcnz • 1 month ago
You’re welcome! 😉 Let me know if you have any questions about the course material! 🤓
@joeeeee8738 • 1 month ago
Great intro!! Btw, which software do you use to present?
@alfcnz • 1 month ago
Microsoft PowerPoint 😅
@petrdvoracek670 • 1 month ago
Nice to see you again!
@alfcnz • 1 month ago
Hey, thanks! 🤩🤩🤩
@datagigs5478 • 1 month ago
Do you cover the whole course on RU-vid?
@alfcnz • 1 month ago
Please check out the first video of the playlist, where an overview of the course is provided. ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-GyKlMcsl72w.html
@joeeeee8738 • 1 month ago
What software do you use to present? Looks great!
@alfcnz • 1 month ago
Microsoft PowerPoint 🙃
@AntonioFlores-tk6pw • 1 month ago
Thank you!
@alfcnz • 1 month ago
You're welcome! 😀😀😀
@yussefleon4904 • 1 month ago
Hello, it’s been a while
@alfcnz • 1 month ago
Yeah… After the pandemic, classes went back to being held in person. This semester it was a surprise that I'd be teaching remotely again. And I have to say I prefer it, since this way I can share my work with y'all! ❤️❤️❤️
@st0ox • 1 month ago
Delicious topics indeed
@alfcnz • 1 month ago
😋😋😋
@muthukamalan.m6316 • 1 month ago
❤❤
@alfcnz • 1 month ago
🥰🥰🥰
@atinjanki2651 • 1 month ago
Thank you!
@alfcnz • 1 month ago
Welcome! 🥳🥳🥳
@user-xi4lu2so5z • 1 month ago
May I ask if the author still insists on using Torch in Lua... 🤭
@alfcnz • 1 month ago
Haha, no, no, my undergraduate student created PyTorch in 2016, so I've been using it ever since! 😄😄😄
@user-xi4lu2so5z • 1 month ago
When I first started college a year ago, I found that the Lua interpreter is faster than Python's, so I'd been wondering if I could use Lua to build deep learning models... In the past two months, I was surprised to find that my idea had already been abandoned back when I was in primary school... But I'm still curious, so now I'm trying out deep learning and data science computing tasks with Torch in Lua...
@alfcnz • 1 month ago
I would recommend moving to PyTorch since that’s the latest version of Torch itself. Yes, you could use Lua out of curiosity, but as of 2024, there are no advantages in doing so. Python has become the lingua franca of deep learning and has a large wealth of libraries, making the whole ecosystem very convenient to use.
@janos1945 • 1 month ago
🥵🥵♥️
@alfcnz • 1 month ago
😁😁😁
@MateuszModrzejewski • 1 month ago
Fantastic lecture as usual, thanks for sharing! Looking forward to further ones!
@alfcnz • 1 month ago
😀😀😀
@mohamedrefaat197 • 1 month ago
"who's that man?"
@alfcnz • 1 month ago
Hahaha 🤪🤪🤪
@hamzaomari7052 • 1 month ago
Thanks for sharing, I like the haircut too *-*
@alfcnz • 1 month ago
Thanks! 🥰🥰🥰
@johnini • 1 month ago
Alfredo!! I talk about you with random people I've met... You always do a good job and always entertain!!
@alfcnz • 1 month ago
Thanks! ☺☺☺
@Zoronoa01 • 1 month ago
Is this video part of a course? Where is part one, please?
@alfcnz • 1 month ago
I'll post the series intro video in a few days, where I explain what it's about.
@Zoronoa01 • 1 month ago
@alfcnz Looking forward to that. Thank you!
@respair1385 • 1 month ago
Wow, never been happier with a notification! I think I missed this course; may I ask where's the first part? By the way, are we getting any introduction to diffusion methods this time?
@alfcnz • 1 month ago
This is a new undergraduate course I just started teaching this year. I'll post an intro video about this series in a few days. Diffusion models are taught in my graduate-level class (and not by me yet).
@respair1385 • 1 month ago
@alfcnz Thanks for the response. Is there any way to access your graduate courses? It's fine if it isn't free. As much as I'd like to attend your classes, unfortunately people like me are two continents away, and applying for the classes isn't very feasible. My only hope is to get them digitally.
@alfcnz • 1 month ago
There are several editions of my graduate course already available here on RU-vid. Feel free also to check out my homepage to see all the courses I'm offering.
@hesamce • 1 month ago
So happy about the video notification. Thanks for sharing! 😁❤️
@alfcnz • 1 month ago
Yay! 🥳🥳🥳
@CyberwizardProductions • 1 month ago
YAY!!!
@alfcnz • 1 month ago
🤗🤗🤗
@PicaPauDiablo1 • 1 month ago
Thank you!
@alfcnz • 1 month ago
You're welcome! 😇