Improving Variational Autoencoder Estimation from Incomplete Data with Mixture Variational Families (2403.03069v2)
Abstract: We consider the task of estimating variational autoencoders (VAEs) when the training data is incomplete. We show that missing data increases the complexity of the model's posterior distribution over the latent variables compared to the fully-observed case. The increased complexity may adversely affect the fit of the model due to a mismatch between the variational and model posterior distributions. We introduce two strategies based on (i) finite variational-mixture and (ii) imputation-based variational-mixture distributions to address the increased posterior complexity. Through a comprehensive evaluation of the proposed approaches, we show that variational mixtures are effective at improving the accuracy of VAE estimation from incomplete data.
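As a minimal sketch of the argument, in illustrative notation of our own (the symbols x_obs, x_mis, K, and pi_k below are not necessarily those used in the paper): marginalising the missing part of a data point turns the model's posterior over the latent variables into a continuous mixture over imputations, which a single unimodal variational factor cannot match in general. The two proposed families approximate this mixture directly.

```latex
% Illustrative notation; not taken verbatim from the paper's derivations.
% Marginalising the missing values x_mis makes the posterior a
% continuous mixture over imputations:
p_\theta(z \mid x_{\mathrm{obs}})
  = \int p_\theta(z \mid x_{\mathrm{obs}}, x_{\mathrm{mis}})\,
         p_\theta(x_{\mathrm{mis}} \mid x_{\mathrm{obs}})\,
    \mathrm{d}x_{\mathrm{mis}}

% Strategy (i): a finite variational mixture with K learned components
q_\phi(z \mid x_{\mathrm{obs}})
  = \sum_{k=1}^{K} \pi_k(x_{\mathrm{obs}})\,
    q_\phi(z \mid x_{\mathrm{obs}}, k)

% Strategy (ii): an imputation-based mixture, averaging the encoder
% over K imputations of the missing values:
q(z \mid x_{\mathrm{obs}})
  = \frac{1}{K} \sum_{k=1}^{K}
    q_\phi\bigl(z \mid x_{\mathrm{obs}}, \tilde{x}^{(k)}_{\mathrm{mis}}\bigr)
```

Even when each complete-data posterior p_theta(z | x_obs, x_mis) is well approximated by a single Gaussian factor, the marginal above generally is not; this is the mismatch between the variational and model posteriors that the mixture families are intended to reduce.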