
Chain of Log-Concave Markov Chains (2305.19473v2)

Published 31 May 2023 in stat.ML, cs.LG, and stat.CO

Abstract: We introduce a theoretical framework for sampling from unnormalized densities based on a smoothing scheme that uses an isotropic Gaussian kernel with a single fixed noise scale. We prove one can decompose sampling from a density (minimal assumptions made on the density) into a sequence of sampling from log-concave conditional densities via accumulation of noisy measurements with equal noise levels. Our construction is unique in that it keeps track of a history of samples, making it non-Markovian as a whole, but it is lightweight algorithmically as the history only shows up in the form of a running empirical mean of samples. Our sampling algorithm generalizes walk-jump sampling (Saremi & Hyvärinen, 2019). The "walk" phase becomes a (non-Markovian) chain of (log-concave) Markov chains. The "jump" from the accumulated measurements is obtained by empirical Bayes. We study our sampling algorithm quantitatively using the 2-Wasserstein metric and compare it with various Langevin MCMC algorithms. We also report a remarkable capacity of our algorithm to "tunnel" between modes of a distribution.
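To make the abstract concrete, below is a minimal sketch (not the authors' reference implementation) of the chain-of-measurements idea under the Gaussian measurement model y_k = x + ε_k, ε_k ~ N(0, σ²I). It relies on two standard identities for this model: the score of the joint with respect to a single measurement, ∇_{y_k} log p(y_1,…,y_k) = (E[x | y_1,…,y_k] − y_k)/σ², and the empirical Bayes (Miyasawa/Tweedie) estimator E[x | ȳ_m] = ȳ_m + (σ²/m) ∇ log ρ_m(ȳ_m), where ρ_m is the target smoothed by N(0, σ²/m · I). The target here is a toy 1D Gaussian mixture so the smoothed scores are available in closed form; the function names, step sizes, warm-start rule, and chain lengths are illustrative choices, not values from the paper.

```python
# Minimal sketch of a "chain of log-concave Markov chains" sampler, assuming a
# 1D Gaussian-mixture target so all smoothed scores are available in closed form.
import numpy as np

rng = np.random.default_rng(0)

# Target p(x): mixture of N(-2, 0.2^2) and N(+2, 0.2^2) with equal weights.
means, stds, weights = np.array([-2.0, 2.0]), np.array([0.2, 0.2]), np.array([0.5, 0.5])
sigma = 1.0              # single fixed measurement noise scale, as in the abstract
m = 20                   # number of accumulated noisy measurements
langevin_steps, step = 200, 0.05   # illustrative inner-chain settings

def smoothed_score(y, noise_var):
    """Score of rho(y) = (p * N(0, noise_var))(y); closed form for a Gaussian mixture."""
    var = stds**2 + noise_var
    log_comp = -0.5 * (y - means) ** 2 / var - 0.5 * np.log(2 * np.pi * var) + np.log(weights)
    w = np.exp(log_comp - log_comp.max())
    w /= w.sum()
    return float(np.sum(w * (means - y) / var))

def posterior_mean(ybar, n):
    """Empirical Bayes (Miyasawa/Tweedie) estimate of x from n accumulated measurements."""
    return ybar + (sigma**2 / n) * smoothed_score(ybar, sigma**2 / n)

# Walk: a chain of Markov chains. Only the running empirical mean of past
# measurements is carried along, so the non-Markovian history stays lightweight.
ybar, y = 0.0, rng.normal(0.0, sigma)
for k in range(1, m + 1):
    for _ in range(langevin_steps):
        # Conditional score d/dy log p(y | y_1..y_{k-1}) = (E[x | y_1..y_{k-1}, y] - y) / sigma^2.
        ybar_new = ((k - 1) * ybar + y) / k
        score = (posterior_mean(ybar_new, k) - y) / sigma**2
        y = y + step * score + np.sqrt(2 * step) * rng.normal()   # unadjusted Langevin step
    ybar = ((k - 1) * ybar + y) / k          # fold the new measurement into the running mean
    y = ybar + rng.normal(0.0, sigma)        # warm start for the next conditional (heuristic)

# Jump: denoise the accumulated mean with the empirical Bayes estimator.
x_hat = posterior_mean(ybar, m)
print(f"running mean = {ybar:.3f}, estimated sample x_hat = {x_hat:.3f}")
```

Running the loop repeatedly from fresh initializations should yield x_hat values near both modes of the mixture; the sketch only illustrates the measurement-accumulation structure and makes no claim about the convergence guarantees proved in the paper.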

References (29)
  1. Sliced and Radon Wasserstein barycenters of measures. Journal of Mathematical Imaging and Vision, 51:22–45, 2015.
  2. Complexity of randomized algorithms for underdamped Langevin dynamics. arXiv preprint arXiv:2003.09906, 2020.
  3. Underdamped Langevin MCMC: A non-asymptotic analysis. In Conference on Learning Theory, pp. 300–323, 2018.
  4. Arnak S. Dalalyan. Theoretical guarantees for approximate sampling from smooth and log-concave densities. Journal of the Royal Statistical Society. Series B (Statistical Methodology), pp. 651–676, 2017.
  5. Nonasymptotic convergence analysis for the unadjusted Langevin algorithm. The Annals of Applied Probability, 27(3):1551–1587, 2017.
  6. Log-concave sampling: Metropolis-Hastings algorithms are fast! In Conference on Learning Theory, pp. 793–797, 2018.
  7. Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems, 33:6840–6851, 2020.
  8. Aapo Hyvärinen. Estimation of non-normalized statistical models by score matching. Journal of Machine Learning Research, 6(Apr):695–709, 2005.
  9. Elucidating the design space of diffusion-based generative models. Advances in Neural Information Processing Systems, 35:26565–26577, 2022.
  10. Optimization by simulated annealing. Science, 220(4598):671–680, 1983.
  11. Molecular dynamics: With deterministic and stochastic numerical methods. Interdisciplinary Applied Mathematics, 39:443, 2015.
  12. Sqrt(d) dimension dependence of Langevin Monte Carlo. In International Conference on Learning Representations, 2022.
  13. Equation of state calculations by fast computing machines. The Journal of Chemical Physics, 21(6):1087–1092, 1953.
  14. Koichi Miyasawa. An empirical Bayes estimator of the mean of a normal population. Bulletin of the International Statistical Institute, 38(4):181–188, 1961.
  15. High-order Langevin diffusion yields an accelerated MCMC algorithm. Journal of Machine Learning Research, 22(1):1919–1959, 2021.
  16. Radford M. Neal. Bayesian learning for neural networks. PhD thesis, University of Toronto, 1995.
  17. Radford M. Neal. Annealed importance sampling. Statistics and Computing, 11:125–139, 2001.
  18. Giorgio Parisi. Correlation functions and computer simulations. Nuclear Physics B, 180(3):378–384, 1981.
  19. Computational optimal transport: With applications to data science. Foundations and Trends® in Machine Learning, 11(5-6):355–607, 2019.
  20. Herbert Robbins. An empirical Bayes approach to statistics. In Proc. Third Berkeley Symp., volume 1, pp. 157–163, 1956.
  21. Exponential convergence of Langevin distributions and their discrete approximations. Bernoulli, pp.  341–363, 1996.
  22. Langevin dynamics with variable coefficients and nonconservative forces: from stationary states to numerical methods. Entropy, 19(12):647, 2017.
  23. Neural empirical Bayes. Journal of Machine Learning Research, 20(181):1–23, 2019.
  24. Multimeasurement generative models. In International Conference on Learning Representations, 2022.
  25. The randomized midpoint method for log-concave sampling. Advances in Neural Information Processing Systems, 32, 2019.
  26. Deep unsupervised learning using nonequilibrium thermodynamics. In International Conference on Machine Learning, pp. 2256–2265, 2015.
  27. Non-reversible parallel tempering: a scalable highly parallel MCMC scheme. Journal of the Royal Statistical Society Series B: Statistical Methodology, 84(2):321–350, 2022.
  28. Cédric Villani. Topics in Optimal Transportation, volume 58. American Mathematical Society, 2021.
  29. Graphical models, exponential families, and variational inference. Foundations and Trends in Machine Learning, 1(1–2):1–305, 2008.
Authors (3)
  1. Saeed Saremi (21 papers)
  2. Ji Won Park (21 papers)
  3. Francis Bach (249 papers)
Citations (4)