Mixed Variational Flows for Discrete Variables (2308.15613v3)
Abstract: Variational flows allow practitioners to learn complex continuous distributions, but approximating discrete distributions remains a challenge. Current methodologies typically embed the discrete target in a continuous space - usually via continuous relaxation or dequantization - and then apply a continuous flow. These approaches involve a surrogate target that may not capture the original discrete target, might have biased or unstable gradients, and can create a difficult optimization problem. In this work, we develop a variational flow family for discrete distributions without any continuous embedding. First, we develop a measure-preserving and discrete (MAD) invertible map that leaves the discrete target invariant, and then create a mixed variational flow (MAD Mix) based on that map. Our family provides access to i.i.d. sampling and density evaluation with virtually no tuning effort. We also develop an extension to MAD Mix that handles joint discrete and continuous models. Our experiments suggest that MAD Mix produces more reliable approximations than continuous-embedding flows while being significantly faster to train.
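To make the abstract's central idea concrete, here is a minimal sketch of a measure-preserving, invertible deterministic map on an augmented discrete state, in the spirit of the MAD map described above. This is not the authors' construction: the function names (`mad_step`, `mad_inverse`, `RHO`) and the specific mechanism (an inverse-CDF coupling of the discrete variable with a uniform auxiliary variable, composed with an irrational rotation of the unit interval) are illustrative assumptions, chosen because they yield a map that is exactly invertible and leaves the target pmf invariant.

```python
import numpy as np

RHO = (np.sqrt(5.0) - 1.0) / 2.0  # irrational shift; one simple ergodic choice

def mad_step(x, u, p, rho=RHO):
    """One deterministic, invertible update of the augmented state (x, u).

    If x ~ p (a normalized pmf over {0, ..., K-1}) and u ~ Uniform[0, 1)
    independently, the output pair has the same joint distribution, so the
    map leaves the target invariant. Hypothetical illustration only.
    """
    cdf = np.cumsum(p)
    f_prev = cdf[x] - p[x]                 # F(x - 1)
    t = f_prev + u * p[x]                  # t ~ Uniform[0, 1) under the target
    t = (t + rho) % 1.0                    # measure-preserving shift on [0, 1)
    # Invert the coupling: recover the discrete state via the inverse CDF,
    # clipping guards against floating-point overshoot past cdf[-1].
    x_new = min(int(np.searchsorted(cdf, t, side="right")), len(p) - 1)
    u_new = (t - (cdf[x_new] - p[x_new])) / p[x_new]
    return x_new, u_new

def mad_inverse(x, u, p, rho=RHO):
    """Exact inverse of mad_step: shift by -rho instead of rho."""
    return mad_step(x, u, p, rho=-rho)

# Quick sanity check of invariance: push many target samples through the
# map and compare empirical frequencies before and after.
rng = np.random.default_rng(0)
p = np.array([0.2, 0.5, 0.3])
xs = rng.choice(3, size=50_000, p=p)
us = rng.random(50_000)
out = np.array([mad_step(x, u, p)[0] for x, u in zip(xs, us)])
print(np.bincount(out, minlength=3) / out.size)  # approx. [0.2, 0.5, 0.3]
```

Because the map is a deterministic bijection on the augmented space, both directions can be evaluated exactly, which is the property that lets a mixture over iterates of such a map support both i.i.d. sampling and density evaluation; the paper's actual MAD construction and its MixFlows-style averaging differ in detail from this sketch.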
Authors: Gian Carlo Diluvi, Benjamin Bloem-Reddy, Trevor Campbell