Vanilla VAE

Key Insight

A vanilla VAE upgrades a plain autoencoder with one crucial twist: its encoder outputs a small cloud of possibility for each image (a Gaussian described by a mean and a variance) rather than a single point, and it is trained on the ELBO, the evidence lower bound, whose KL term gently presses all those clouds to pile up under one standard bell curve while its reconstruction term keeps the decoded images faithful. Once trained, you can ignore the encoder entirely, draw a random point straight from that bell curve, decode it, and get a brand-new digit, something a plain autoencoder simply cannot do. The price is blur: because the model averages over every plausible reconstruction and the sampling step adds noise, VAE samples look softer than the originals. The reparameterization trick is the small piece of math that makes all this trainable: it rewrites the random draw as the mean plus the standard deviation times independent noise, so gradients can pass through the sampling step to the encoder. Comparing its fuzzy samples to the sharp-but-uncreative autoencoder shows you the central trade-off of generative modeling laid bare.
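To make the pieces above concrete, here is a minimal sketch in plain Python of the sampling step, the KL term of the ELBO, and drawing a latent point from the prior. It assumes a diagonal Gaussian encoder that outputs a mean and a log-variance per latent dimension, which is the usual choice but is not stated in the text above. The function names (`reparameterize`, `kl_to_standard_normal`, `sample_prior`) are hypothetical, and no encoder or decoder network is included.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1), one value per dimension.

    All randomness lives in eps, so z is a deterministic function of mu and
    log_var, and gradients can pass through it (dz/dmu = 1, dz/dsigma = eps).
    """
    eps = [rng.gauss(0.0, 1.0) for _ in mu]
    sigma = [math.exp(0.5 * lv) for lv in log_var]  # log-variance -> std dev
    z = [m + s * e for m, s, e in zip(mu, sigma, eps)]
    return z, eps


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over dimensions."""
    return 0.5 * sum(
        m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var)
    )


def sample_prior(latent_dim, rng):
    """After training, skip the encoder and draw z straight from N(0, I)."""
    return [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]


if __name__ == "__main__":
    rng = random.Random(0)
    mu, log_var = [1.0, -2.0], [0.0, -1.0]  # made-up encoder outputs
    z, eps = reparameterize(mu, log_var, rng)
    print("z:", z)
    print("KL term:", kl_to_standard_normal(mu, log_var))
    print("prior sample:", sample_prior(2, rng))
```

In training, the loss for one image would be a reconstruction error computed on the decoded `z` plus this KL term; the KL term is zero only when the encoder's cloud is exactly the standard bell curve, which is the "gentle press" described above.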