
Variational Auto Encoder (VAE) - Theory 

Meerkat Statistics · 8K subscribers
21K views
VAEs are a mix of Variational Inference (VI) and autoencoder neural networks. They are used mainly for generating new data. In this video we outline the theory behind the original paper, including a look at regular autoencoders, variational inference, and how the two combine to create the VAE.
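To make the mix concrete: the encoder outputs the mean and log-variance of an approximate posterior q(z|x), and training minimizes the negative ELBO, i.e. a reconstruction error plus a KL term pulling q(z|x) toward the prior N(0, I). A minimal NumPy sketch of those ingredients (illustrative only; the function names are my own, not from the video):

```python
import numpy as np

def gaussian_kl(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), closed form, summed over latent dims."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

def reparameterize(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def neg_elbo(x, x_recon, mu, log_var):
    """Negative ELBO: squared-error reconstruction term plus the KL regularizer."""
    recon = np.sum((x - x_recon) ** 2)
    return recon + gaussian_kl(mu, log_var)

rng = np.random.default_rng(0)
mu, log_var = np.zeros(2), np.zeros(2)   # q(z|x) = N(0, I)
z = reparameterize(mu, log_var, rng)     # a sample from q via the trick
print(gaussian_kl(mu, log_var))          # 0.0 when q already equals the prior
```

The closed-form KL here is specific to the Gaussian q and Gaussian prior used in the original paper; other choices need a Monte Carlo estimate instead.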
Original Paper (Kingma & Welling 2014): arxiv.org/pdf/...
The first and only Variational Inference (VI) course online!
Become a member and get full access to this online course:
meerkatstatist...
** 🎉 Special RU-vid 60% Discount on Yearly Plan - valid for the 1st 100 subscribers; Voucher code: First100 🎉 **
“VI in R” Course Outline:
Administration
* Administration
Intro
* Intuition - what is VI?
* Notebook - Intuition
* Origin, Outline, Context
KL Divergence
* KL Introduction
* KL - Extra Intuition
* Notebook - KL - Exercises
* Notebook - KL - Additional Topics
* KL vs. Other Metrics
VI vs. ML
* VI (using KL) vs. Maximum Likelihood
ELBO & “Mean Field”
* ELBO
* “Mean Field” Approximation
Coordinate Ascent VI (CAVI)
* Coordinate Ascent VI (CAVI)
* Functional Derivative & Euler-Lagrange Equation
* CAVI - Toy Example
* CAVI - Bayesian GMM Example
* Notebook - Normal-Gamma Conjugate Prior
* Notebook - Bayesian GMM - Unknown Precision
* Notebook - Image Denoising (Ising Model)
Exponential Family
* CAVI for the Exponential Family
* Conjugacy in the Exponential Family
* Notebook - Latent Dirichlet Allocation Example
VI vs. EM
* VI vs. EM
Stochastic VI / Advanced VI
* SVI - Review
* SVI for Exponential Family
* Automatic Differentiation VI (ADVI)
* Notebook - ADVI Example (using STAN)
* Black Box VI (BBVI)
* Notebook - BBVI Example
Expectation Propagation
* Forward vs. Reverse KL
* Expectation Propagation
Variational Auto Encoder
Why become a member?
* All video content
* Extra material (notebooks)
* Access to code and notes
* Community Discussion
* No Ads
* Support the Creator ❤️
VI (restricted) playlist: bit.ly/389QSm1
If you’re looking for statistical consultation, someone to work on interesting projects, or to give training workshops, visit my website meerkatstatist... or contact me directly at david@meerkatstatistics.com
~~~~~ SUPPORT ~~~~~
Paypal me: paypal.me/Meer...
~~~~~~~~~~~~~~~~~
Intro/Outro Music: Dreamer - by Johny Grimes
• Johny Grimes - Dreamer

Published: 27 Sep 2024
Comments: 22
@Omsip123
@Omsip123 21 days ago
Thanks for your efforts, very well explained
@paedrufernando2351
@paedrufernando2351 8 months ago
@6:10 VI starts. The rundown was awesome; it puts everything into perspective.
@evgeniyazarov4230
@evgeniyazarov4230 9 months ago
Great explanation! The two ways of looking at the loss function are insightful.
@YICHAOCAI
@YICHAOCAI 8 months ago
Fantastic video! This effectively resolved my queries.
@123sendodo4
@123sendodo4 1 year ago
Very clear and useful information!
@SpringhallBess-g1b
@SpringhallBess-g1b 12 days ago
Lindgren Expressway
@shounakdesai4283
@shounakdesai4283 7 months ago
Awesome video.
@JeffreyParsons-h5u
@JeffreyParsons-h5u 2 days ago
Peyton Place
@evaggelosantypas5139
@evaggelosantypas5139 1 year ago
Hey, great video, thank you for your efforts. Is it possible to get your slides?
@MeerkatStatistics
@MeerkatStatistics 1 year ago
Thanks. The slides are offered on my website meerkatstatistics.com/courses/variational-inference-in-r/lessons/variational-auto-encoder-theory/ for members. Please consider subscribing to also support this channel.
@evaggelosantypas5139
@evaggelosantypas5139 1 year ago
@@MeerkatStatistics OK, thanks
@minuklee6735
@minuklee6735 5 months ago
Thank you for the awesome video! I have a question about @11:35. I don't clearly understand why g_\theta takes x. Am I correct that it does not take x if g_\theta is a Gaussian, since it would just be g_\theta(\epsilon) = \sigma \epsilon + \mu (where \sigma and \mu come from \theta)? Again, I appreciate your video a lot!
@MeerkatStatistics
@MeerkatStatistics 5 months ago
Although not explicitly denoted, q(z) also depends on the data. This is why g_\theta will usually also depend on x. I didn't want to write q(z|x) as in the paper, because it is not a posterior, but rather a distribution whose parameters you tweak until it reaches the true posterior p(z|x). I have a simple example (for the CAVI algorithm) on my website (for members) meerkatstatistics.com/courses/variational-inference-in-r/lessons/cavi-toy-example/ and also a bit more elaborate example free on RU-vid ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-8DzIPZnZ12k.htmlsi=8Un505QqOEtij9XV - in both cases you'll see a q(z) that is Gaussian, but whose parameters depend on the data x.
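To put the reply above in concrete terms: with a Gaussian q, g_\theta takes x only because \mu and \sigma are themselves functions of x. A toy NumPy sketch, where the linear `encoder` and the weights `W_mu` and `W_logvar` are hypothetical stand-ins for the encoder network:

```python
import numpy as np

def encoder(x, W_mu, W_logvar):
    """Hypothetical linear encoder: maps data x to the parameters of q(z|x)."""
    return W_mu @ x, W_logvar @ x

def g(eps, x, W_mu, W_logvar):
    """g_theta(eps, x) = mu_theta(x) + sigma_theta(x) * eps.
    It depends on x because the Gaussian's parameters are produced from x."""
    mu, log_var = encoder(x, W_mu, W_logvar)
    return mu + np.exp(0.5 * log_var) * eps

rng = np.random.default_rng(1)
x = np.array([1.0, 2.0])
W_mu, W_logvar = np.eye(2), np.zeros((2, 2))   # toy parameters theta
eps = rng.standard_normal(2)
z = g(eps, x, W_mu, W_logvar)   # a different x gives a different distribution over z
```

With a fixed \theta, feeding in a different x shifts \mu and \sigma, so the same \epsilon maps to a different z; that is the sense in which g depends on the data.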
@stazizov
@stazizov 5 months ago
Could you please tell me if there is a mistake in the notation at @8:26? Should z_{i} be z_{l}?
@MeerkatStatistics
@MeerkatStatistics 5 months ago
Hey, yes of course. Sorry for the typo.
@stazizov
@stazizov 5 months ago
​@@MeerkatStatistics Thank you so much) Great video!!! 🔥
@BackBaird-y8l
@BackBaird-y8l 13 days ago
Mertie Flat
@LauraJohnson-f3v
@LauraJohnson-f3v 9 days ago
Price Tunnel
@marcospiotto9755
@marcospiotto9755 4 months ago
What is the difference between denoting p_theta (x|z) vs p(x|z,theta) ?
@MeerkatStatistics
@MeerkatStatistics 4 months ago
I think "subscript" theta is just the standard way of denoting that we are optimizing theta, i.e. that theta is being changed, while "conditioned on" theta usually means the thetas are given. Also note that the subscript theta refers to the NN parameters, while "conditioned on" often refers to distributional parameters. I don't think these rules are set in stone, though, and I'm not an expert on notation. As long as you understand what's going on - that's the important part.
@marcospiotto9755
@marcospiotto9755 4 months ago
@@MeerkatStatistics got it, thanks!
@tassangherman
@tassangherman 1 month ago
You're awesome !