I would pay so much to have you as my teacher. That's not only the best video I've ever seen on deep learning, but probably the most appealing way anyone has ever taught me CS!
Thank you for creating such a nice explanation of the VAE. When are you releasing the practical implementation video of the VAE? Please make a video for that as well.
Thanks, this video has many explanations that are missing from other tutorials on VAE, like the part from 22:45 onwards. I saw a lot of other videos that didn't explain how the p and q functions relate to the encoder and decoder. (Every other tutorial felt like it started talking about VAE and then suddenly changed subject to some distribution functions for no obvious reason.)
Great video (I am about halfway through). I think there is a misstatement around minute 13 (I think): in general, if g(x) is greater than h(x), finding the max of h doesn't mean we have located the max of g.
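A small counterexample (my own, not from the video) makes the commenter's general point concrete, along with the reason ELBO maximization is still a sensible surrogate:

$$g(x) = 1 - \tfrac{(x-3)^2}{10}, \qquad h(x) = -x^2, \qquad g(x) - h(x) = \tfrac{(3x+1)^2}{10} \ge 0,$$

so $g \ge h$ everywhere, yet $h$ peaks at $x = 0$ while $g$ peaks at $x = 3$. For the VAE the gap is not arbitrary, though: $\log p(x) = \mathrm{ELBO} + D_{KL}\big(q(z|x)\,\|\,p(z|x)\big)$, so maximizing the ELBO over the variational parameters also shrinks the gap.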
Both the VAE and the AE map the input onto a latent space; the difference lies in the **structure** of this latent space. The AE's latent space is not as "well-organized" as the VAE's latent space.
At the very end of the video Umar mentions that in "the next video" he will explain the coding of the VAE along with some examples. I could not find that video. Was it made? What is its title? Thanks 🙂
The latent variable is low-dimensional compared to the input, which is high-dimensional, so this latent variable contains features that are robust, meaning these robust features survive the encoding process because encoding removes redundant features. Imagine a collection of cat and bird images: what an encoder can do in such a process is capture a bird or a cat by its outline without going into the details of colours and texture. These outlines are more than enough to distinguish a bird from a cat without the high-dimensional texture and colour information.
@@quonxinquonyi8570 That doesn't answer the question. The latent space in autoencoders doesn't capture semantic meaning, but when we enforce regularization on the latent space and learn a distribution, that's when it learns some manifold.
@@prateekpatel6082 Learning the distribution means you can generate from that distribution, in other words sample from it. But since the "sample-generating distribution" can be too hard to learn directly, we go for the reparametrization technique, sampling from a standard normal distribution so that we can still optimize.
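As a reader's note, here is a minimal PyTorch-style sketch of the reparametrization step being described (the function and variable names are illustrative, not from the video):

```python
import torch

def reparameterize(mu, log_var):
    # eps ~ N(0, I) is the only source of randomness, so gradients
    # can still flow through mu and log_var (the learned parameters).
    eps = torch.randn_like(mu)
    std = torch.exp(0.5 * log_var)  # learning the log-variance keeps std positive
    return mu + std * eps           # z ~ N(mu, diag(std^2)), written differentiably

# illustrative usage: a batch of 4 latent vectors of dimension 2
mu, log_var = torch.zeros(4, 2), torch.zeros(4, 2)
z = reparameterize(mu, log_var)
```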
"Learning the manifold" doesn't quite make sense in the context of a variational autoencoder. To learn the manifold we would have to approach the "score function", i.e. the gradient of the log-density of the original input distribution. There we have to noise and denoise the data to get some sense of the generating distribution, but the problem remains in the form of the normalizing denominator of the density function. So we work with the derivative of the log of the distribution to cancel out that constant denominator, and then use basic first-order derivative methods on the perturbed density to learn the noise.
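For reference, the point about the "constant denominator" is the standard score-function identity (not specific to this video): if the density is known only up to a normalizing constant, taking the gradient of the log-density makes that constant vanish:

$$p(x) = \frac{e^{-E(x)}}{Z}, \qquad \nabla_x \log p(x) = -\nabla_x E(x) - \underbrace{\nabla_x \log Z}_{=\,0},$$

so working with the score $\nabla_x \log p(x)$ sidesteps the intractable $Z$.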
Great video on VAE, Umar, but I'm left with one question: at 11:49 the KL divergence is written as D_{KL}(numerator || denominator), while at 11:34 you wrote the KL divergence as D_{KL}(denominator || numerator). Why is that? I guess it is not because of the minus sign.
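For anyone reading along: the order matters because KL divergence is not symmetric (standard definition below); the comment alone doesn't tell us which form the video intends at each timestamp:

$$D_{KL}(q\,\|\,p) = \mathbb{E}_{z\sim q}\!\left[\log\frac{q(z)}{p(z)}\right] \neq D_{KL}(p\,\|\,q) \text{ in general.}$$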
🎯 Key points for quick navigation:
- 00:13 *understand fundamental ideas*
- 00:41 *explain autoencoder concept*
- 02:36 *issues with traditional autoencoders*
- 03:02 *introduce variational autoencoder*
- 04:13 *sampling from latent space*
- 25:02 *Model and prior choice.*
- 25:17 *Element-wise noise multiplication.*
- 25:32 *Learning log variance.*
- 26:00 *Maximizing ELBO for space learning.*
- 26:15 *Derivation of loss function.*

Made with HARPA AI
Variational methods are a class of methods I've always struggled to understand. I think your explanation addresses some key questions, but I think you jumped into the math too quickly. Questions I'm still wondering about:
- What are the "variational" parameters? They are represented by phi, but what do they do/mean?
- Why is having the noise source necessary? What happens if I set epsilon to be 0 all the time?
Thanks for sharing. In the chicken-and-egg example, will p(x, z) be tractable? If x and z are unrelated and z has a prior distribution, can p(x, z) be written in a formalized way?
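One standard way to see why the joint is usually treated as tractable (a general VAE fact, not specific to the chicken-and-egg example):

$$p_\theta(x, z) = p_\theta(x \mid z)\, p(z),$$

so with a simple prior $p(z)$, e.g. a standard normal, and a decoder density $p_\theta(x \mid z)$, the joint can be evaluated directly; what remains intractable is the marginal $p_\theta(x) = \int p_\theta(x \mid z)\, p(z)\, dz$.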