
Sequential transport maps using SoS density estimation and $\alpha$-divergences (2402.17943v2)

Published 27 Feb 2024 in stat.ML and cs.LG

Abstract: Transport-based density estimation methods are receiving growing interest because of their ability to efficiently generate samples from the approximated density. We further investigate the sequential transport maps framework proposed in arXiv:2106.04170 and arXiv:2303.02554, which builds on a sequence of composed Knothe-Rosenblatt (KR) maps. Each of those maps is built by first estimating an intermediate density of moderate complexity, and then computing the exact KR map from a reference density to the precomputed approximate density. In our work, we explore the use of Sum-of-Squares (SoS) densities and $\alpha$-divergences for approximating the intermediate densities. Combining SoS densities with $\alpha$-divergences yields convex optimization problems which can be solved efficiently via semidefinite programming. The main advantage of $\alpha$-divergences is that they allow working with unnormalized densities, which is beneficial both numerically and theoretically. In particular, we provide a new convergence analysis of sequential transport maps based on information-geometric properties of $\alpha$-divergences. The choice of intermediate densities is also crucial for the efficiency of the method. While tempered (or annealed) densities are the state of the art, we introduce diffusion-based intermediate densities, which make it possible to approximate densities known only through samples. Such intermediate densities are well established in machine learning for generative modeling. Finally, we propose low-dimensional maps (or lazy maps) to deal with high-dimensional problems, and numerically demonstrate our methods on Bayesian inference problems and unsupervised learning tasks.
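For context, the two ingredients named in the abstract can be written out. The notation below is the standard one from the SoS and information-geometry literature and may differ from the paper's own conventions: $\Phi(x)$ denotes a vector of basis functions (e.g. orthonormal polynomials) and $A$ a Gram matrix.

```latex
% SoS density model: nonnegative by construction, since A is PSD
\tilde\pi(x) = \Phi(x)^\top A \,\Phi(x), \qquad A \succeq 0.

% alpha-divergence between possibly unnormalized densities f and g,
% nonnegative for alpha in (0,1) by the weighted AM-GM inequality
D_\alpha(f \,\|\, g)
  = \frac{1}{\alpha(1-\alpha)}
    \int \Big( \alpha f(x) + (1-\alpha)\, g(x)
               - f(x)^{\alpha}\, g(x)^{1-\alpha} \Big)\,\mathrm{d}x.
```

Because $\tilde\pi$ is linear in $A$ and the PSD constraint is convex, minimizing a suitable $\alpha$-divergence over $A$ can be posed as a semidefinite program, which is the convexity claim made in the abstract.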
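As a toy illustration of the KR building block (not the paper's implementation), in one dimension the Knothe-Rosenblatt map from a uniform reference to a possibly unnormalized target density is simply the inverse CDF, computed here on a grid. All names (`kr_map_1d`, `target_pdf`, `grid`) are illustrative.

```python
import numpy as np

def kr_map_1d(target_pdf, grid):
    """1D KR map: pushes uniform [0,1] samples onto the target density.

    target_pdf may be unnormalized, matching the alpha-divergence setting;
    the CDF is normalized at the end.
    """
    pdf_vals = target_pdf(grid)
    # trapezoidal cumulative integral -> (unnormalized) CDF on the grid
    increments = (pdf_vals[1:] + pdf_vals[:-1]) / 2 * np.diff(grid)
    cdf = np.concatenate([[0.0], np.cumsum(increments)])
    cdf /= cdf[-1]  # normalize so the map takes [0,1] onto the grid range

    def T(u):
        # inverse CDF by monotone linear interpolation
        return np.interp(u, cdf, grid)

    return T

# Example: unnormalized standard Gaussian target on [-5, 5]
grid = np.linspace(-5.0, 5.0, 2001)
T = kr_map_1d(lambda x: np.exp(-x**2 / 2), grid)
u = np.linspace(0.01, 0.99, 99)
samples = T(u)  # monotone, approximately Gaussian-distributed points
```

In higher dimensions the KR map applies this construction coordinate by coordinate to the conditionals, which is what makes the triangular structure exact once the intermediate density is fixed.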

References (45)
  1. Shun-ichi Amari “Information Geometry and Its Applications” 194, Applied Mathematical Sciences Tokyo: Springer Japan, 2016 DOI: 10.1007/978-4-431-55978-8
  2. Ricardo Baptista, Michael C Brennan and Youssef Marzouk “Dimension reduction via score ratio matching”, 2022
  3. Ricardo Baptista, Youssef Marzouk and Olivier Zahm “On the representation and learning of monotone triangular transport maps” In Foundations of Computational Mathematics Springer, 2023, pp. 1–46
  4. “Greedy inference with structure-exploiting lazy maps” In 34th Conference on Neural Information Processing Systems, 2020
  5. “Families of Alpha- Beta- and Gamma- Divergences: Flexible and Robust Measures of Similarities” In Entropy 12.6, 2010, pp. 1532–1568 DOI: 10.3390/e12061532
  6. Chris Coey, Lea Kapelevich and Juan Pablo Vielma “Solving natural conic formulations with Hypatia.jl” In INFORMS Journal on Computing 34.5 INFORMS, 2022, pp. 2686–2699 DOI: https://doi.org/10.1287/ijoc.2022.1202
  7. “Multivariate Approximation in Downward Closed Polynomial Spaces” In Contemporary Computational Mathematics - A Celebration of the 80th Birthday of Ian Sloan Cham: Springer International Publishing, 2018, pp. 233–282 DOI: 10.1007/978-3-319-72456-0_12
  8. “Deep composition of tensor-trains using squared inverse Rosenblatt transports” arXiv:2007.06968 [cs, math, stat] In Foundations of Computational Mathematics, 2021 DOI: 10.1007/s10208-021-09537-5
  9. Tiangang Cui, Sergey Dolgov and Robert Scheichl “Deep Importance Sampling Using Tensor Trains with Application to a Priori and a Posteriori Rare Events” In SIAM Journal on Scientific Computing 46.1 SIAM, 2024, pp. C1–C29
  10. Tiangang Cui, Sergey Dolgov and Olivier Zahm “Scalable conditional deep inverse Rosenblatt transports using tensor trains and gradient-based dimension reduction” In Journal of Computational Physics 485, 2023, pp. 112103 DOI: https://doi.org/10.1016/j.jcp.2023.112103
  11. Tiangang Cui, Sergey Dolgov and Olivier Zahm “Self-reinforced polynomial approximation methods for concentrated probability densities” arXiv:2303.02554 [cs, math, stat] arXiv, 2023 URL: http://arxiv.org/abs/2303.02554
  12. Pierre Del Moral, Arnaud Doucet and Ajay Jasra “Sequential monte carlo samplers” In Journal of the Royal Statistical Society Series B: Statistical Methodology 68.3 Oxford University Press, 2006, pp. 411–436
  13. Pierre Del Moral, Arnaud Doucet and Ajay Jasra “Sequential Monte Carlo Samplers” In Journal of the Royal Statistical Society Series B: Statistical Methodology 68.3, 2006, pp. 411–436 DOI: 10.1111/j.1467-9868.2006.00553.x
  14. Laurent Dinh, Jascha Sohl-Dickstein and Samy Bengio “Density estimation using Real NVP” arXiv:1605.08803 [cs, stat] arXiv, 2017 URL: http://arxiv.org/abs/1605.08803
  15. “Approximation and sampling of multivariate probability distributions in the tensor train decomposition” In Statistics and Computing 30 Springer, 2020, pp. 603–625
  16. Loris Felardos, Jérôme Hénin and Guillaume Charpiat “Designing losses for data-free training of normalizing flows on Boltzmann distributions” arXiv:2301.05475 [cond-mat] arXiv, 2023 URL: http://arxiv.org/abs/2301.05475
  17. “CVX: Matlab Software for Disciplined Convex Programming, version 2.1”, https://cvxr.com/cvx, 2014
  18. “On Sampling with Approximate Transport Maps” In arXiv preprint arXiv:2302.04763, 2023
  19. Priyank Jaini, Kira A Selby and Yaoliang Yu “Sum-of-squares polynomial flow” In International Conference on Machine Learning, 2019, pp. 3009–3018 PMLR
  20. William Ogilvy Kermack, A. G. McKendrick and Gilbert Thomas Walker “A contribution to the mathematical theory of epidemics” _eprint: https://royalsocietypublishing.org/doi/pdf/10.1098/rspa.1927.0118 In Proceedings of the Royal Society of London. Series A, Containing Papers of a Mathematical and Physical Character 115.772, 1927, pp. 700–721 DOI: 10.1098/rspa.1927.0118
  21. Tom H. Koornwinder “Orthogonal polynomials, a short introduction” arXiv:1303.2825 [math], 2013, pp. 145–170 DOI: 10.1007/978-3-7091-1616-6_6
  22. Jean-Bernard Lasserre “Moments, positive polynomials and their applications” OCLC: ocn503631126, Imperial College Press optimization series v. 1 London ; Singapore ; Hackensack, NJ: Imperial College Press ; Distributed by World Scientific Publishing Co, 2010
  23. Steffen L. Lauritzen “Graphical models”, Oxford statistical science series 17 Oxford : New York: Clarendon Press ; Oxford University Press, 1996
  24. “JuMP 1.0: Recent improvements to a modeling language for mathematical optimization” In Mathematical Programming Computation, 2023 DOI: 10.1007/s12532-023-00239-3
  25. Ulysse Marteau-Ferey, Francis Bach and Alessandro Rudi “Non-Parametric Models for Non-Negative Functions” event-place: Vancouver, BC, Canada In Proceedings of the 34th International Conference on Neural Information Processing Systems, NIPS’20 Red Hook, NY, USA: Curran Associates Inc., 2020
  26. “An introduction to sampling via measure transport” arXiv:1602.05023 [math, stat], 2016, pp. 1–41 DOI: 10.1007/978-3-319-11259-6_23-1
  27. Thomas Minka “Divergence measures and message passing” In Microsoft Research Technical Report, 2005
  28. Frank Nielsen “An elementary introduction to information geometry” arXiv:1808.08271 [cs, math, stat] In Entropy 22.10, 2020, pp. 1100 DOI: 10.3390/e22101100
  29. “Boltzmann generators: Sampling equilibrium states of many-body systems with deep learning” In Science 365.6457 American Association for the Advancement of Science, 2019, pp. eaaw1147
  30. “Normalizing Flows for Probabilistic Modeling and Inference” arXiv:1912.02762 [cs, stat] arXiv, 2021 URL: http://arxiv.org/abs/1912.02762
  31. Matthew D Parno and Youssef M Marzouk “Transport map accelerated markov chain monte carlo” In SIAM/ASA Journal on Uncertainty Quantification 6.2 SIAM, 2018, pp. 645–682
  32. “Computational optimal transport: With applications to data science” In Foundations and Trends® in Machine Learning 11.5-6 Now Publishers, Inc., 2019, pp. 355–607
  33. Mihai Putinar “Positive Polynomials on Compact Semi-algebraic Sets” Publisher: Indiana University Mathematics Department In Indiana University Mathematics Journal 42.3, 1993, pp. 969–984 URL: http://www.jstor.org/stable/24897130
  34. “Variational inference with normalizing flows” In International conference on machine learning, 2015, pp. 1530–1538 PMLR
  35. Danilo Jimenez Rezende and Shakir Mohamed “Variational Inference with Normalizing Flows” arXiv:1505.05770 [cs, stat] arXiv, 2016 URL: http://arxiv.org/abs/1505.05770
  36. Murray Rosenblatt “Remarks on a Multivariate Transformation” In The Annals of Mathematical Statistics 23.3, 1952, pp. 470–472 DOI: 10.1214/aoms/1177729394
  37. “Riemannian SOS-Polynomial Normalizing Flows” Series Title: Lecture Notes in Computer Science In Pattern Recognition 12544 Cham: Springer International Publishing, 2021, pp. 218–231 DOI: 10.1007/978-3-030-71278-5_16
  38. Jie Shen, Li-Lian Wang and Haijun Yu “Approximations by orthonormal mapped Chebyshev functions for higher-dimensional problems in unbounded domains” In Journal of Computational and Applied Mathematics 265, 2014, pp. 264–275 DOI: 10.1016/j.cam.2013.09.024
  39. “Deep unsupervised learning using nonequilibrium thermodynamics” In International conference on machine learning, 2015, pp. 2256–2265 PMLR
  40. “Score-Based Generative Modeling through Stochastic Differential Equations” In International Conference on Learning Representations, 2021 URL: https://openreview.net/forum?id=PxTIG12RRHS
  41. Alessio Spantini, Daniele Bigoni and Youssef Marzouk “Inference via low-dimensional couplings” arXiv:1703.06131 [stat] arXiv, 2018 URL: http://arxiv.org/abs/1703.06131
  42. Benigno Uria, Iain Murray and Hugo Larochelle “RNADE: The real-valued neural autoregressive density-estimator” arXiv:1306.0186 [cs, stat] arXiv, 2014 URL: http://arxiv.org/abs/1306.0186
  43. Cédric Villani “Optimal transport: old and new” Springer, 2009
  44. “Measure transport via polynomial density surrogates” arXiv:2311.04172 [cs, math, stat] arXiv, 2023 URL: http://arxiv.org/abs/2311.04172
  45. “Bayesian invariant measurements of generalization” In Neural Processing Letters 2.6, 1995, pp. 28–31 DOI: 10.1007/BF02309013
