
Multi-fidelity Hamiltonian Monte Carlo (2405.05033v1)

Published 8 May 2024 in cs.CE, cs.LG, and stat.ML

Abstract: Numerous applications in biology, statistics, science, and engineering require generating samples from high-dimensional probability distributions. In recent years, the Hamiltonian Monte Carlo (HMC) method has emerged as a state-of-the-art Markov chain Monte Carlo technique, exploiting the shape of such high-dimensional target distributions to efficiently generate samples. Despite its impressive empirical success and increasing popularity, its wide-scale adoption remains limited due to the high computational cost of gradient calculation. Moreover, applying this method is impossible when the gradient of the posterior cannot be computed (for example, with black-box simulators). To overcome these challenges, we propose a novel two-stage Hamiltonian Monte Carlo algorithm with a surrogate model. In this multi-fidelity algorithm, the acceptance probability is computed in the first stage via a standard HMC proposal using an inexpensive differentiable surrogate model, and if the proposal is accepted, the posterior is evaluated in the second stage using the high-fidelity (HF) numerical solver. Splitting the standard HMC algorithm into these two stages allows for approximating the gradient of the posterior efficiently, while producing accurate posterior samples by using HF numerical solvers in the second stage. We demonstrate the effectiveness of this algorithm for a range of problems, including linear and nonlinear Bayesian inverse problems with in-silico data and experimental data. The proposed algorithm is shown to seamlessly integrate with various low-fidelity and HF models, priors, and datasets. Remarkably, our proposed method outperforms the traditional HMC algorithm in both computational and statistical efficiency by several orders of magnitude, all while retaining or improving the accuracy in computed posterior statistics.
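To make the two-stage structure described in the abstract concrete, below is a minimal sketch of one delayed-acceptance HMC step, not the authors' implementation. The function names (`log_post_lf`, `grad_log_post_lf`, `log_post_hf`), the parameters `step_size` and `n_steps`, and the standard leapfrog integrator and second-stage correction factor are illustrative assumptions: stage 1 runs an ordinary HMC accept/reject against the cheap differentiable surrogate, and only surviving proposals trigger a high-fidelity (HF) solve in stage 2.

```python
import numpy as np

def leapfrog(q, p, grad_log_post, step_size, n_steps):
    """Leapfrog integration of Hamiltonian dynamics using the surrogate gradient."""
    q, p = q.copy(), p.copy()
    p += 0.5 * step_size * grad_log_post(q)
    for _ in range(n_steps - 1):
        q += step_size * p
        p += step_size * grad_log_post(q)
    q += step_size * p
    p += 0.5 * step_size * grad_log_post(q)
    return q, -p  # negate momentum so the proposal is reversible

def two_stage_hmc_step(q, log_post_lf, grad_log_post_lf, log_post_hf,
                       step_size=0.1, n_steps=20, rng=np.random.default_rng()):
    """One multi-fidelity (two-stage, delayed-acceptance) HMC step.

    Stage 1: standard HMC accept/reject using the cheap surrogate log-posterior.
    Stage 2: correct an accepted proposal with the high-fidelity log-posterior,
             so HF solves are spent only on proposals that survive stage 1.
    All function names and tuning parameters here are hypothetical placeholders.
    """
    p0 = rng.standard_normal(q.shape)
    q_new, p_new = leapfrog(q, p0, grad_log_post_lf, step_size, n_steps)

    # Stage 1: acceptance based on the surrogate Hamiltonian H = -log pi_LF + |p|^2 / 2.
    h0 = -log_post_lf(q) + 0.5 * p0 @ p0
    h1 = -log_post_lf(q_new) + 0.5 * p_new @ p_new
    if np.log(rng.uniform()) >= h0 - h1:
        return q, False  # rejected cheaply; no HF solve needed

    # Stage 2: delayed-acceptance correction with the HF posterior; the surrogate
    # terms cancel the bias introduced by the stage-1 screening.
    log_alpha2 = (log_post_hf(q_new) - log_post_hf(q)
                  + log_post_lf(q) - log_post_lf(q_new))
    if np.log(rng.uniform()) < min(0.0, log_alpha2):
        return q_new, True
    return q, False
```

In a practical implementation the HF log-posterior of the current state would be cached between iterations, so each stage-1 acceptance costs a single additional HF solve; this is the mechanism behind the computational savings claimed in the abstract.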

