On the natural gradient of the evidence lower bound
Citation Link: https://doi.org/10.15480/882.16720
Publication Type
Journal Article
Date Issued
2025-09
Language
English
TORE-DOI
10.15480/882.16720
Volume
26
Start Page
1
End Page
37
Citation
Journal of Machine Learning Research 26: 1-37 (2025)
Publisher
Microtome Publishing
This article studies the Fisher-Rao gradient, also referred to as the natural gradient, of the evidence lower bound (ELBO), which plays a central role in generative machine learning. It reveals that the gap between the evidence and its lower bound, the ELBO, has an essentially vanishing natural gradient under unconstrained optimization. As a result, maximization of the ELBO is equivalent to minimization of the Kullback-Leibler divergence from a target distribution, the primary objective function of learning. Building on this insight, we derive a condition under which this equivalence persists even when optimization is constrained to a model. This condition yields a geometric characterization, which we formalize through the notion of a cylindrical model.
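For reference, the equivalence described in the abstract rests on the standard decomposition of the log-evidence into the ELBO and the variational gap. The following is a minimal sketch in LaTeX, assuming the usual variational-inference notation (q(z) for the variational distribution, p(x, z) for the joint model); these symbols are illustrative and not taken from the record itself.

% Standard evidence decomposition (usual VI notation; symbols are
% illustrative assumptions, not drawn from this record):
%   log-evidence = ELBO + variational gap (a KL divergence).
\begin{align*}
\log p(x)
  = \underbrace{\mathbb{E}_{q(z)}\!\left[\log \frac{p(x,z)}{q(z)}\right]}_{\mathrm{ELBO}(q)}
  + \underbrace{\mathrm{KL}\!\left(q(z)\,\middle\|\,p(z \mid x)\right)}_{\text{variational gap}}
\end{align*}

Since log p(x) does not depend on q, maximizing the ELBO over q is equivalent to minimizing the gap; the article's result concerns when this equivalence also holds for the natural (Fisher-Rao) gradient once q is constrained to a model.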
Subjects
Evidence lower bound
variational gap
natural gradient
information geometry
variational inference
DDC Class
006.3: Artificial Intelligence
519: Applied Mathematics, Probabilities
Publication version
publishedVersion
Name
24-0606.pdf
Type
Main Article
Size
1.15 MB
Format
Adobe PDF