Abstract

Variational approximations are increasingly based on gradient-based optimization of expectations estimated by sampling. Handling discrete latent variables is then challenging because the sampling process is not differentiable. Continuous relaxations, such as the Gumbel-Softmax for the categorical distribution, enable gradient-based optimization, but do not define a valid probability mass for discrete observations. In practice, selecting the amount of relaxation is difficult, and one must optimize an objective that does not align with the desired one, which causes problems especially for models with strong, meaningful priors. We provide an alternative differentiable reparameterization for the categorical distribution by composing it as a mixture of discrete normalizing flows. It defines a proper discrete distribution, allows direct optimization of the evidence lower bound, and is less sensitive to the hyperparameter controlling relaxation.
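
For context, here is a minimal sketch of the standard Gumbel-Softmax relaxation that the abstract contrasts against. This is the baseline technique, not the paper's proposed mixture-of-flows method, and the function and variable names are illustrative:

```python
import torch
import torch.nn.functional as F

def gumbel_softmax_sample(logits: torch.Tensor, tau: float = 1.0) -> torch.Tensor:
    """Draw a differentiable, relaxed categorical sample from unnormalized logits."""
    # Sample Gumbel(0, 1) noise via the inverse transform of uniform samples.
    gumbels = -torch.log(-torch.log(torch.rand_like(logits)))
    # The temperature tau controls the amount of relaxation: small tau gives
    # near one-hot samples but high-variance gradients; large tau gives smooth
    # samples far from any true categorical outcome. Selecting tau is the
    # difficulty the abstract refers to.
    return F.softmax((logits + gumbels) / tau, dim=-1)

# Example: a relaxed sample over 4 categories (sums to 1 but is not one-hot).
sample = gumbel_softmax_sample(torch.tensor([0.5, 1.0, -0.3, 0.1]), tau=0.5)
```

PyTorch also ships this relaxation as `torch.nn.functional.gumbel_softmax`. Because the output is a point on the probability simplex rather than a one-hot vector, it does not define a valid probability mass over discrete outcomes, which is the shortcoming the proposed mixture of discrete normalizing flows is designed to avoid.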
