05.1 - Latent Variable Energy Based Models (LV-EBMs), inference 

Alfredo Canziani
Published: 28 Aug 2024 · 17K views
Comments: 64
@liuculiu8366 3 years ago
Really appreciate your generosity in sharing these courses online. We live in the best era.
@alfcnz 3 years ago
😇😇😇
@punkbuster89 2 years ago
"We're gonna actually see next week how we learn shi... eeerm, stuff..." Cracked me up XD. Thanks for the amazing course BTW, really enjoying it!
@alfcnz 2 years ago
🥳🥳🥳
@bibhabasumohapatra 1 year ago
I've watched this video 5-6 times over the last year, but now that I finally understand 50% of it I'm just floored. Insane... Amazing.
@alfcnz 1 year ago
Check out the 2023 edition! It’s much better! 🥳🥳🥳
@datonefaridze1503 2 years ago
You explain like Andrew Ng; giving examples is essential for proper understanding. Thank you so much, great content.
@alfcnz 2 years ago
❤️❤️❤️
@kseniiaikonnikova576 3 years ago
yay, new video! 😍 Alfredo, many thanks!
@alfcnz 3 years ago
🥳🥳🥳
@francistembo650 3 years ago
First comment. Couldn't resist my favourite teacher's video.
@alfcnz 3 years ago
❤️❤️❤️
@rsilveira79 3 years ago
Awesome material, really looking forward to the next classes. Thanks for all the effort you put into designing these classes.
@alfcnz 3 years ago
You're welcome 😁
@robinranabhat3125 1 year ago
THANK YOU!
@alfcnz 1 year ago
You're welcome! 🥰🥰🥰
@soumyasarkar4100 3 years ago
Wow, what cool visualisations!!
@alfcnz 3 years ago
🎨🖌️👨🏼‍🎨
@abdulmajidmurad4667 3 years ago
Thanks Alfredo (pretty cool plots btw).
@alfcnz 3 years ago
😍😍😍
@user-co6pu8zv3v 3 years ago
Thank you, Alfredo!
@alfcnz 3 years ago
(In Bulgarian:) Welcome, Nikolay. 😊😊😊
@user-co6pu8zv3v 3 years ago
It's in another language :))))
@alfcnz 3 years ago
(In Russian:) Oops, do you speak Russian?
@user-co6pu8zv3v 3 years ago
(In Russian:) Yep, it's my native language. I'm from Russia :))
@alfcnz 3 years ago
@@user-co6pu8zv3v (In Russian:) Haha, okay, so I got the language wrong earlier 😅😅😅
@WolfgangWaltenberger 3 years ago
These are super cool, pedagogical videos. I wonder what software stack you guys are using to produce them.
@alfcnz 3 years ago
Hum, PowerPoint, LaTeXiT, matplotlib, Zoom, Adobe After Effects and Premiere.
@pastrop2003 3 years ago
Thank you, Alfredo, great video. I have been reading about energy models for a few weeks now and still have a nagging question: is the energy function a generalized loss function? I keep thinking that I can reframe any traditional neural network loss as an energy function. What am I missing here?
@alfcnz 3 years ago
Next episode I'll talk about the loss. Stay tuned.
@bibhabasumohapatra 1 year ago
Basically, in layman's terms, we are choosing the best y_pred out of n y_pred's for each ground truth y. Right?
@alfcnz 2 months ago
Yes, that’s correct! And we need to do so because otherwise we would be learning to predict the average target y.
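A minimal sketch of that idea in PyTorch, assuming a hypothetical linear decoder and a squared-error energy (both are toy stand-ins, not the lecture's actual model): for one target y we score n candidate latents and keep the lowest-energy prediction instead of averaging.

```python
import torch

decoder = torch.nn.Linear(1, 2)  # hypothetical decoder: latent z -> prediction ŷ

def energy(y, z):
    # Squared reconstruction error between the target y and the decoded ŷ
    return ((y - decoder(z)) ** 2).sum(dim=-1)

y = torch.tensor([0.5, -1.0])                              # one ground-truth target
z_candidates = torch.linspace(-2.0, 2.0, 48).unsqueeze(1)  # n = 48 candidate latents

with torch.no_grad():
    energies = energy(y, z_candidates)     # one energy per candidate ŷ
    best = energies.argmin()               # index of the lowest-energy prediction
    y_best = decoder(z_candidates[best])   # the ŷ we keep for this y
    free_energy = energies[best]           # F(y) = min_z E(y, z)
```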
@chetanpandey8722 1 year ago
Thank you for making such amazing videos. I have a doubt: whenever you talk about optimizing the energy function to find its minimum, you say we should use gradient descent and not stochastic gradient descent. In my understanding, in gradient descent we compute the gradient on the whole dataset and then make an update, while in the stochastic case we take random data points to compute the gradient and then update. So I can't see what the problem with stochastic gradient descent is.
@alfcnz 7 months ago
The energy is a scalar value for a given input x, y, z. You want to minimise this energy, for example, wrt z. There’s nothing stochastic here. When training a model, we minimise the loss by following a noisy gradient computed for a given per-sample (or per-batch) loss.
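A sketch of that distinction, under the same toy assumptions as above (hypothetical decoder, squared-error energy): the energy is a scalar function of z, so inference follows its exact gradient; no data sampling, hence nothing stochastic.

```python
import torch

decoder = torch.nn.Linear(1, 2)   # hypothetical decoder, as above
y = torch.tensor([0.5, -1.0])     # x and y are fixed during inference

z = torch.zeros(1, requires_grad=True)
for _ in range(100):
    E = ((y - decoder(z)) ** 2).sum()  # scalar energy E(y, z)
    E.backward()                       # exact gradient wrt z — nothing stochastic
    with torch.no_grad():
        z -= 0.1 * z.grad              # plain gradient descent on the latent only;
    z.grad.zero_()                     # the decoder's weights are never updated
```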
@blackcurrant0745 2 years ago
At 17:56 you say there are 48 different z's, then later you have only 24 of them at 29:43, and then later yet one can count 48 lilac z points in the graph at 42:45. What's the reason for changing the number of z points back and forth?
@alfcnz 2 years ago
Good catch! z is continuous; the number of distinct values I pick is arbitrary. In the following edition of these slides there are no more distinct z points: they are shown as a continuous line, so there are infinitely many z. Why 24 and 48? The 24 are my y's, which I used to generate with 24 equally spaced z. When I show the 'continuous' manifold, I should show more points than training samples, so I doubled them, hence 48. It looks like I didn't use the doubled version for the plot with the 24 squares. In the following edition of this lesson (not online, because only minor changes have been made and these videos take me forever to put together) and in the book (which replaces my video-editing time) there are no more discrete dots for the latents.
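A toy sketch of how such a figure could be generated (the closed-curve decoder below is made up; the lecture's manifold differs): 24 equally spaced latents produce the training y's, and a doubled grid of 48 latents draws the denser "continuous" manifold.

```python
import torch
import matplotlib.pyplot as plt

def decode(z):
    # Made-up decoder mapping a 1D latent to a 2D point on a closed curve
    return torch.stack([torch.cos(z), 0.5 * torch.sin(z)], dim=-1)

z_train = torch.linspace(0, 2 * torch.pi, 25)[:-1]  # 24 equally spaced latents
z_dense = torch.linspace(0, 2 * torch.pi, 49)[:-1]  # doubled: 48 manifold points

y_train, manifold = decode(z_train), decode(z_dense)
plt.plot(manifold[:, 0], manifold[:, 1], '.', label='manifold (48 z)')
plt.plot(y_train[:, 0], y_train[:, 1], 's', label='training y (24 z)')
plt.legend()
plt.show()
```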
@flaskapp9885 3 years ago
Amazing video, Alfredo :)
@alfcnz 3 years ago
And more to come! 😇😇😇
@flaskapp9885 3 years ago
@@alfcnz Thanks! Please make a guide video for becoming an NLP engineer or something :) There's hardly anything about NLP engineering on the internet :)
@alfcnz 3 years ago
@@flaskapp9885 NLP engineering? 🤔🤔🤔 What is it?
@flaskapp9885 3 years ago
@@alfcnz Yes sir, NLP engineering. I'm thinking of doing that. :)
@alfcnz 3 years ago
@@flaskapp9885 I don't know what that is. Can you explain?
@keshavsingh489 3 years ago
Great explanation. Just one question: why is it called an energy function when it looks just like a loss function with a latent variable?
@alfcnz 3 years ago
A “loss” measures the network performance and it's minimised during training. We'll see more about this in the next episode. An “energy” is an actual output produced by a model and it's used during inference. In this episode we didn't train anything, still we've used gradient descent to perform inference of latent variables.
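A compact sketch of that difference, reusing the same hypothetical decoder and squared-error energy: at inference the gradient of the energy flows into the latent z; during training (next episode) a loss would instead drive updates of the model's parameters.

```python
import torch

decoder = torch.nn.Linear(1, 2)            # hypothetical model
y = torch.tensor([0.5, -1.0])
z = torch.zeros(1, requires_grad=True)

# Inference: the energy is the model's output; descend it wrt the latent z.
E = ((y - decoder(z)) ** 2).sum()
E.backward()
with torch.no_grad():
    z -= 0.1 * z.grad                      # update the latent, not the weights

# Training (next episode) would instead descend a loss wrt decoder.parameters().
```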
@keshavsingh489 3 years ago
Thank you so much for explaining. Looking forward to the next lecture.
@anondoggo 2 years ago
So inference means we're given x and y and we want to predict an energy score E(x, y)? I thought the LV-EBM was supposed to produce predictions for y; better go back to the slides :/
@anondoggo 2 years ago
OK, so I think what's going on is: during training, y is the target which we should give low E to; during inference, we're choosing the y that gives the lowest energy, so y is an input. Mind is blown :/
@alfcnz 1 year ago
I think your sentence is broken. As for «during inference…»: we'd like to test how far a given y is from the data manifold.
@mythorganizer4222 3 years ago
Hello Mr. Canziani!
@alfcnz 3 years ago
“Prof” Canziani 😜
@mythorganizer4222 3 years ago
@@alfcnz I am sorry, Professor Canziani. I want to tell you that your videos are the best learning resource for people who want to study deep learning but can't afford it. Oh, your videos and also Deep Learning by Ian Goodfellow; it is a very good book. Thank you for all the effort you put in, sir :D
@alfcnz 3 years ago
😇😇😇
@sutharsanmahendren1071 3 years ago
Thank you for your great explanation and for making your course material available to all. I have a small doubt at 45:25, where you compute the energy for all inferences from the z samples. Is Euclidean distance the right way to compute the distance from the reference point (y) to all the points (ŷ) on the manifold? Would it be more appropriate if points from the bottom half of the manifold resulted in more energy than the top half?
@alfcnz 3 years ago
E is a function of y and z. Given a y, say y', E is a function of z only. What I'm computing there is E(y', z) for a few values of z. In this example, for every z the decoder will give me ỹ. Finally, the energy function of choice, in this case, is the reconstruction error.
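In code, that reply reads roughly as follows (toy linear decoder again as a stand-in; the reconstruction-error energy is the one named above): given a fixed y', the energy becomes a function of z alone.

```python
import torch

decoder = torch.nn.Linear(1, 2)                  # stand-in for the slides' decoder

y_prime = torch.tensor([1.0, 0.0])               # a given y'
zs = torch.linspace(-2.0, 2.0, 10).unsqueeze(1)  # a few values of z

with torch.no_grad():
    y_tilde = decoder(zs)                        # ỹ = dec(z) for each z
    E = ((y_prime - y_tilde) ** 2).sum(dim=-1)   # E(y', z): reconstruction error
# The z with the smallest E(y', z) decodes to the manifold point closest to y'.
```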
@sutharsanmahendren1071 3 years ago
@@alfcnz Thank you so much for your reply. I understand that reconstruction error is one of the choices for the energy function here.
@alfcnz 3 years ago
@@sutharsanmahendren1071 okay, so your question is… not yet answered? Or did I nail it above?
@sutharsanmahendren1071 3 years ago
@@alfcnz Actually, my question is: is reconstruction error the best choice for an EBM? (Funny ideas: construct a KNN graph with the ŷ manifold and the observation y and find the shortest path from y to all other ŷ; or, instead of computing the energy between two points, can't we measure the energy between the two distributions formed by y and ŷ in an EBM?)
@mikhaeldito 3 years ago
Office hours, please!
@alfcnz 3 years ago
Okay, okay.
@kevindoran9031 2 years ago
How we learn s*** 😂
@alfcnz 2 years ago
Without timestamp it's hard to double check 🥺🥺🥺