Chain of Log-Concave Markov Chains (2305.19473v2)
Abstract: We introduce a theoretical framework for sampling from unnormalized densities based on a smoothing scheme that uses an isotropic Gaussian kernel with a single fixed noise scale. We prove that, under minimal assumptions on the density, sampling from it can be decomposed into a sequence of log-concave conditional sampling problems, obtained by accumulating noisy measurements taken at equal noise levels. Our construction is unique in that it keeps track of a history of samples, making it non-Markovian as a whole, but it is lightweight algorithmically as the history appears only through a running empirical mean of samples. Our sampling algorithm generalizes walk-jump sampling (Saremi & Hyvärinen, 2019): the "walk" phase becomes a (non-Markovian) chain of (log-concave) Markov chains, and the "jump" from the accumulated measurements is obtained by empirical Bayes. We study our sampling algorithm quantitatively using the 2-Wasserstein metric and compare it with various Langevin MCMC algorithms. We also report a remarkable capacity of our algorithm to "tunnel" between modes of a distribution.
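To make the walk-jump construction concrete, below is a minimal sketch in NumPy for a toy one-dimensional target, an equal-weight two-component Gaussian mixture, where the smoothed score needed by both phases is available in closed form. Everything here (the target, the noise scale, the measurement count, the Langevin step size, and helper names such as `posterior_mean` and `conditional_score`) is an illustrative assumption, not the paper's implementation; in the general setting the score would be estimated, e.g. with a learned model as in neural empirical Bayes (Saremi & Hyvärinen, 2019).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy target: an equal-weight two-component Gaussian mixture in 1D with unit
# component variance. Its smoothed score has a closed form, which keeps the
# sketch self-contained; in general this score must be estimated.
mu = np.array([-2.0, 2.0])  # component means (illustrative)
sigma = 4.0  # single fixed noise scale, set large relative to the mode
             # separation so the conditionals below are log-concave (assumption)
m = 64       # number of accumulated noisy measurements (illustrative)

def posterior_mean(z, k):
    """Empirical-Bayes estimate of E[x | mean of k measurements = z].

    The running mean of k measurements is z = x + eps, eps ~ N(0, sigma^2/k),
    so by the Miyasawa/Tweedie identity
        E[x | z] = z + (sigma^2/k) * d/dz log q_k(z),
    where q_k is the target smoothed by a Gaussian of variance sigma^2/k.
    """
    s2 = 1.0 + sigma**2 / k               # variance of each smoothed component
    logw = -0.5 * (z - mu) ** 2 / s2      # log responsibilities (equal weights)
    w = np.exp(logw - logw.max())
    w /= w.sum()
    score = np.sum(w * (mu - z)) / s2     # closed-form score of the smoothed mixture
    return z + (sigma**2 / k) * score

def conditional_score(y, ybar, k):
    """Score of p(y_{k+1} | y_{1:k}); the history enters only through ybar."""
    z = (k * ybar + y) / (k + 1)          # running mean with the candidate y included
    return (posterior_mean(z, k + 1) - y) / sigma**2

# Walk: a chain of Markov chains, one inner Langevin chain per measurement.
ybar, y = 0.0, 0.0
step, n_inner = 0.5, 100                  # illustrative ULA settings
for k in range(m):
    for _ in range(n_inner):              # unadjusted Langevin on the conditional
        noise = np.sqrt(2.0 * step) * rng.standard_normal()
        y = y + step * conditional_score(y, ybar, k) + noise
    ybar = (k * ybar + y) / (k + 1)       # only the empirical mean is carried along

# Jump: denoise the accumulated mean with one empirical-Bayes step.
x_hat = posterior_mean(ybar, m)
print(f"running mean = {ybar:+.3f} -> sample estimate = {x_hat:+.3f}")
```

Two points from the abstract are visible in the sketch: the non-Markovian history enters only through the running mean `ybar`, and the final "jump" is a single deterministic empirical-Bayes step (the Miyasawa, 1961 estimator) rather than another round of sampling.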
References:
- Nicolas Bonneel, Julien Rabin, Gabriel Peyré, and Hanspeter Pfister. Sliced and Radon Wasserstein barycenters of measures. Journal of Mathematical Imaging and Vision, 51:22–45, 2015.
- Yu Cao, Jianfeng Lu, and Lihan Wang. Complexity of randomized algorithms for underdamped Langevin dynamics. arXiv preprint arXiv:2003.09906, 2020.
- Xiang Cheng, Niladri S. Chatterji, Peter L. Bartlett, and Michael I. Jordan. Underdamped Langevin MCMC: A non-asymptotic analysis. In Conference on Learning Theory, pp. 300–323, 2018.
- Arnak S. Dalalyan. Theoretical guarantees for approximate sampling from smooth and log-concave densities. Journal of the Royal Statistical Society. Series B (Statistical Methodology), pp. 651–676, 2017.
- Alain Durmus and Éric Moulines. Nonasymptotic convergence analysis for the unadjusted Langevin algorithm. The Annals of Applied Probability, 27(3):1551–1587, 2017.
- Raaz Dwivedi, Yuansi Chen, Martin J. Wainwright, and Bin Yu. Log-concave sampling: Metropolis-Hastings algorithms are fast! In Conference on Learning Theory, pp. 793–797, 2018.
- Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems, 33:6840–6851, 2020.
- Aapo Hyvärinen. Estimation of non-normalized statistical models by score matching. Journal of Machine Learning Research, 6(Apr):695–709, 2005.
- Tero Karras, Miika Aittala, Timo Aila, and Samuli Laine. Elucidating the design space of diffusion-based generative models. Advances in Neural Information Processing Systems, 35:26565–26577, 2022.
- Scott Kirkpatrick, C. Daniel Gelatt, and Mario P. Vecchi. Optimization by simulated annealing. Science, 220(4598):671–680, 1983.
- Ben Leimkuhler and Charles Matthews. Molecular dynamics: With deterministic and stochastic numerical methods. Interdisciplinary Applied Mathematics, 39:443, 2015.
- Sqrt(d) dimension dependence of Langevin Monte Carlo. In International Conference on Learning Representations, 2022.
- Nicholas Metropolis, Arianna W. Rosenbluth, Marshall N. Rosenbluth, Augusta H. Teller, and Edward Teller. Equation of state calculations by fast computing machines. The Journal of Chemical Physics, 21(6):1087–1092, 1953.
- Koichi Miyasawa. An empirical Bayes estimator of the mean of a normal population. Bulletin of the International Statistical Institute, 38(4):181–188, 1961.
- Wenlong Mou, Yi-An Ma, Martin J. Wainwright, Peter L. Bartlett, and Michael I. Jordan. High-order Langevin diffusion yields an accelerated MCMC algorithm. Journal of Machine Learning Research, 22(1):1919–1959, 2021.
- Radford M. Neal. Bayesian learning for neural networks. PhD thesis, University of Toronto, 1995.
- Radford M. Neal. Annealed importance sampling. Statistics and Computing, 11:125–139, 2001.
- Giorgio Parisi. Correlation functions and computer simulations. Nuclear Physics B, 180(3):378–384, 1981.
- Gabriel Peyré and Marco Cuturi. Computational optimal transport: With applications to data science. Foundations and Trends in Machine Learning, 11(5-6):355–607, 2019.
- Herbert Robbins. An empirical Bayes approach to statistics. In Proc. Third Berkeley Symp., volume 1, pp. 157–163, 1956.
- Gareth O. Roberts and Richard L. Tweedie. Exponential convergence of Langevin distributions and their discrete approximations. Bernoulli, pp. 341–363, 1996.
- Matthias Sachs, Benedict Leimkuhler, and Vincent Danos. Langevin dynamics with variable coefficients and nonconservative forces: from stationary states to numerical methods. Entropy, 19(12):647, 2017.
- Saeed Saremi and Aapo Hyvärinen. Neural empirical Bayes. Journal of Machine Learning Research, 20(181):1–23, 2019.
- Saeed Saremi and Rupesh Kumar Srivastava. Multimeasurement generative models. In International Conference on Learning Representations, 2022.
- Ruoqi Shen and Yin Tat Lee. The randomized midpoint method for log-concave sampling. Advances in Neural Information Processing Systems, 32, 2019.
- Jascha Sohl-Dickstein, Eric Weiss, Niru Maheswaranathan, and Surya Ganguli. Deep unsupervised learning using nonequilibrium thermodynamics. In International Conference on Machine Learning, pp. 2256–2265, 2015.
- Saifuddin Syed, Alexandre Bouchard-Côté, George Deligiannidis, and Arnaud Doucet. Non-reversible parallel tempering: a scalable highly parallel MCMC scheme. Journal of the Royal Statistical Society Series B: Statistical Methodology, 84(2):321–350, 2022.
- Cédric Villani. Topics in Optimal Transportation, volume 58. American Mathematical Society, 2021.
- Martin J. Wainwright and Michael I. Jordan. Graphical models, exponential families, and variational inference. Foundations and Trends in Machine Learning, 1(1–2):1–305, 2008.
Authors: Saeed Saremi, Ji Won Park, Francis Bach