Pareto Smoothed Importance Sampling (1507.02646v9)

Published 9 Jul 2015 in stat.CO, stat.ME, and stat.ML

Abstract: Importance weighting is a general way to adjust Monte Carlo integration to account for draws from the wrong distribution, but the resulting estimate can be highly variable when the importance ratios have a heavy right tail. This routinely occurs when there are aspects of the target distribution that are not well captured by the approximating distribution, in which case more stable estimates can be obtained by modifying extreme importance ratios. We present a new method for stabilizing importance weights using a generalized Pareto distribution fit to the upper tail of the distribution of the simulated importance ratios. The method, which empirically performs better than existing methods for stabilizing importance sampling estimates, includes stabilized effective sample size estimates, Monte Carlo error estimates, and convergence diagnostics. The presented Pareto $\hat{k}$ finite sample convergence rate diagnostic is useful for any Monte Carlo estimator.

Citations (225)

Summary

  • The paper presents Pareto Smoothed Importance Sampling to reduce high variance in Monte Carlo estimates by smoothing extreme importance weights.
  • It fits a generalized Pareto distribution to the tail of importance ratios, yielding a superior bias-variance tradeoff compared to traditional methods.
  • The approach enhances diagnostics and effective sample size assessments, proving beneficial for robust Bayesian computations in high-dimensional settings.

An Overview of Pareto Smoothed Importance Sampling

The paper "Pareto Smoothed Importance Sampling" presents a refined approach to importance sampling, a technique widely used in Monte Carlo computation, especially within Bayesian inference. It addresses a perennial challenge of importance sampling: estimates become highly variable when the distribution of importance ratios has a heavy right tail, which occurs whenever the proposal distribution poorly approximates the target distribution.

The proposed methodology, Pareto Smoothed Importance Sampling (PSIS), introduces a mechanism to stabilize and reduce the variability of importance weights. PSIS achieves this through fitting a generalized Pareto distribution to the tail of the empirical importance ratio distribution. This approach offers several advantages over traditional methods by providing stabilized estimates, allowing effective sample size calculations, and offering convergence diagnostics that are generally applicable to any Monte Carlo estimator.
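
For concreteness, the short sketch below computes the two raw quantities that PSIS later stabilizes: the self-normalized importance sampling estimate and the standard effective sample size $1/\sum_s w_s^2$. The Gaussian target and deliberately too-narrow proposal are illustrative choices of ours, not an example taken from the paper.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Illustrative setup: target p = N(0, 1), proposal q = N(0, 0.6^2).
# The proposal is too narrow, so the importance ratios p/q have a heavy right tail.
theta = rng.normal(0.0, 0.6, size=4000)
log_r = norm.logpdf(theta, 0.0, 1.0) - norm.logpdf(theta, 0.0, 0.6)

r = np.exp(log_r - log_r.max())      # shift by the max for numerical stability
w = r / r.sum()                      # self-normalized importance weights

estimate = np.sum(w * theta**2)      # estimate of E_p[theta^2] = 1
ess = 1.0 / np.sum(w**2)             # standard effective sample size 1 / sum(w_s^2)
print(estimate, ess)
```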

Summary of Methodology and Results

Importance sampling adjusts Monte Carlo integration by weighting draws according to the ratio of the target density to the proposal density. When the proposal distribution represents the target poorly, these ratios acquire a heavy right tail, and the resulting estimates can have very large or even infinite variance. PSIS addresses this by fitting a generalized Pareto distribution to the largest importance ratios and replacing them with the expected values of the corresponding order statistics of the fitted distribution, truncated at the raw maximum. This limits the influence of disproportionately large weights at the cost of a small bias, making the estimate substantially more stable.
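
A minimal sketch of this smoothing step, under a few stated assumptions, is given below: the function name `psis_smooth` is ours, the tail size follows a rule similar to the one used in the paper's reference implementations (roughly the largest 20% of draws, capped at about $3\sqrt{S}$), and SciPy's maximum-likelihood fit stands in for the tail-fitting procedure recommended in the paper.

```python
import numpy as np
from scipy.stats import genpareto

def psis_smooth(log_ratios):
    """Sketch of Pareto smoothing for importance ratios.

    Returns normalized smoothed weights and the estimated GPD shape k-hat.
    """
    r = np.exp(log_ratios - np.max(log_ratios))       # work on the (shifted) ratio scale
    S = r.size
    M = int(np.ceil(min(0.2 * S, 3.0 * np.sqrt(S))))  # number of tail draws to smooth
    order = np.argsort(r)
    tail_idx = order[-M:]                             # indices of the M largest ratios
    cutoff = r[order[-M - 1]]                         # largest ratio left unsmoothed
    r_max = r.max()

    # Fit a generalized Pareto distribution to the exceedances over the cutoff.
    k_hat, _, sigma = genpareto.fit(r[tail_idx] - cutoff, floc=0.0)

    # Replace the tail ratios by expected order statistics of the fitted GPD
    # (inverse CDF at (z - 1/2) / M), truncated at the largest raw ratio.
    probs = (np.arange(1, M + 1) - 0.5) / M
    smoothed = cutoff + genpareto.ppf(probs, k_hat, loc=0.0, scale=sigma)
    r[tail_idx] = np.minimum(smoothed, r_max)

    return r / r.sum(), k_hat
```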

The methodology is validated through empirical studies comparing PSIS against ordinary importance sampling (IS) and truncated importance sampling (TIS). The results indicate that PSIS consistently achieves a better bias-variance tradeoff, with smaller root mean squared errors (RMSEs) and more reliable Monte Carlo standard error (MCSE) estimates across a range of conditions, particularly when the importance weights are highly variable.
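
Reusing the `psis_smooth` sketch above, a toy replication in the spirit of (but not identical to) these experiments might look as follows; the TIS truncation level, the mean weight times $\sqrt{S}$, is a commonly used Ionides-style rule, and the Gaussian setup is again our own choice.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
S, reps, truth = 4000, 200, 1.0                   # estimating E_p[theta^2] = 1
errs = {"IS": [], "TIS": [], "PSIS": []}

for _ in range(reps):
    theta = rng.normal(0.0, 0.6, size=S)          # narrow proposal, target N(0, 1)
    log_r = norm.logpdf(theta, 0.0, 1.0) - norm.logpdf(theta, 0.0, 0.6)
    r = np.exp(log_r - log_r.max())
    h = theta**2

    w_is = r / r.sum()                            # ordinary self-normalized IS
    w_tis = np.minimum(r, r.mean() * np.sqrt(S))  # truncated IS (cap at mean * sqrt(S))
    w_tis = w_tis / w_tis.sum()
    w_psis, _ = psis_smooth(log_r)                # PSIS sketch defined above

    for name, w in (("IS", w_is), ("TIS", w_tis), ("PSIS", w_psis)):
        errs[name].append(float(w @ h) - truth)

for name, e in errs.items():
    print(f"{name:4s} RMSE: {np.sqrt(np.mean(np.square(e))):.4f}")
```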

Implications and Future Directions

By offering a more reliable and diagnostic-rich approach to importance sampling, PSIS significantly enhances practical Bayesian computation, especially in high-dimensional settings where traditional methods falter. The proposed Pareto $\hat{k}$ diagnostic gives researchers a robust tool to assess the stability and reliability of their Monte Carlo approximations by flagging potentially problematic variance in the sampling weights.
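
In practice the diagnostic is usually read against the thresholds popularized by this line of work (roughly 0.5 and 0.7); the small helper below encodes that reading and is purely illustrative.

```python
def interpret_khat(k_hat: float) -> str:
    """Rough reading of the Pareto k-hat diagnostic (commonly cited thresholds)."""
    if k_hat < 0.5:
        return "good: the importance sampling estimate is reliable"
    if k_hat < 0.7:
        return "ok: usable, but convergence is slower and errors are larger"
    if k_hat < 1.0:
        return "bad: the PSIS estimate is unreliable; consider a better proposal"
    return "very bad: the raw ratios have effectively no finite mean; do not trust the estimate"
```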

PSIS's flexible framework has been adopted in several software packages for Bayesian analysis, reflecting its role in modern probabilistic programming. In particular, PSIS enables fast approximate leave-one-out cross-validation (LOO-CV) for Bayesian models by avoiding a full model refit for each held-out observation.
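
As one example of such tooling on the Python side, ArviZ exposes PSIS-based LOO through `az.loo`; assuming an ArviZ installation and one of its bundled example datasets, a minimal call looks roughly like this.

```python
import arviz as az

# Load a bundled example posterior (the "centered eight schools" dataset) and
# compute PSIS-LOO; pointwise=True also returns per-observation Pareto k-hat values.
idata = az.load_arviz_data("centered_eight")
loo_result = az.loo(idata, pointwise=True)
print(loo_result)   # elpd_loo, its standard error, p_loo, and a k-hat summary
```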

Theoretically, PSIS offers a principled way to address the infinite-variance problem, combining smoothing of the weight distribution with explicit error assessment, and thereby extends the theoretical underpinnings of Monte Carlo methods for complex models.

Conclusion

The Pareto Smoothed Importance Sampling methodology innovatively tackles a long-standing issue within Monte Carlo-based Bayesian computations. PSIS not only promotes stability in importance sampling estimates but also provides meaningful diagnostics to assess and potentially preclude the use of unstable estimators. As computational models continue to grow in complexity and dimensionality, approaches like PSIS will be integral to developing robust, reliable, and efficient probabilistic models. Subsequent research may focus on enhancing the applicability of PSIS across a broader range of models, further cementing its role within the computational statistics toolkit.
