Lisa Kreusser - Unveiling the role of the Wasserstein Distance in Generative Modelling

Abstract: Generative models have become very popular over the last few years in the machine learning community. These are generally based on likelihood-based models (e.g. variational autoencoders), implicit models (e.g. generative adversarial networks), or score-based models. As part of this talk, I will provide insights into our recent research in this field, focussing on (i) Wasserstein generative adversarial networks and (ii) score-based diffusion models. Wasserstein GANs (WGANs) were originally motivated by the idea of minimising the Wasserstein distance between a real and a generated distribution. We gather theoretical and empirical evidence that the WGAN loss is not a meaningful approximation of either the distributional or the batch Wasserstein distance, and argue that the success of WGANs can be attributed to the failure to approximate the batch Wasserstein distance. Score-based diffusion models have emerged as one of the most promising frameworks for deep generative modelling, due to their state-of-the-art performance in many generation tasks while relying on mathematical foundations such as stochastic differential equations (SDEs) and ordinary differential equations (ODEs). We systematically analyse the difference between the ODE and SDE dynamics of score-based diffusion models, link it to an associated Fokker-Planck equation, and provide a theoretical upper bound on the Wasserstein 2-distance between the ODE- and SDE-induced distributions in terms of a Fokker-Planck residual. We also show numerically that reducing the Fokker-Planck residual by adding it as an additional regularisation term helps close the gap between the ODE- and SDE-induced distributions. Our experiments suggest that this regularisation can improve the distribution generated by the ODE, but that this can come at the cost of degraded SDE sample quality.
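
For readers unfamiliar with the term, the "batch Wasserstein distance" mentioned above is taken here to mean the Wasserstein distance between the empirical distributions of two finite batches of samples. The abstract does not give the talk's own definition or code, so the following is only a minimal sketch of the one-dimensional special case, where the distance between two equal-sized, uniformly weighted batches reduces to matching sorted samples. The function name `batch_wasserstein_1d` is hypothetical and chosen purely for illustration.

```python
import numpy as np


def batch_wasserstein_1d(x, y, p=1):
    """Wasserstein-p distance between two equal-sized 1D sample batches.

    Illustrative assumption: both batches carry uniform weights and have
    the same number of samples, so the optimal coupling pairs the i-th
    smallest value of x with the i-th smallest value of y.
    """
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.ndim != 1 or x.shape != y.shape:
        raise ValueError("expected two 1D batches of equal length")
    # Average transport cost under the sorted matching, then take the p-th root.
    return float(np.mean(np.abs(x - y) ** p) ** (1.0 / p))


if __name__ == "__main__":
    real = [0.0, 1.0, 2.0]
    generated = [3.0, 1.0, 2.0]  # order within a batch does not matter
    print(batch_wasserstein_1d(real, generated, p=1))
    print(batch_wasserstein_1d(real, generated, p=2))
```

In higher dimensions the sorted matching no longer applies and the batch distance requires solving an optimal transport problem between the two sample sets, which this sketch does not attempt.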