Online Calibrated and Conformal Prediction Improves Bayesian Optimization (2112.04620v5)

Published 8 Dec 2021 in cs.LG and stat.ML

Abstract: Accurate uncertainty estimates are important in sequential model-based decision-making tasks such as Bayesian optimization. However, these estimates can be imperfect if the data violates assumptions made by the model (e.g., Gaussianity). This paper studies which uncertainties are needed in model-based decision-making and in Bayesian optimization, and argues that uncertainties can benefit from calibration -- i.e., an 80% predictive interval should contain the true outcome 80% of the time. Maintaining calibration, however, can be challenging when the data is non-stationary and depends on our actions. We propose using simple algorithms based on online learning to provably maintain calibration on non-i.i.d. data, and we show how to integrate these algorithms in Bayesian optimization with minimal overhead. Empirically, we find that calibrated Bayesian optimization converges to better optima in fewer steps, and we demonstrate improved performance on standard benchmark functions and hyperparameter optimization tasks.
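The abstract's notion of calibration — an 80% predictive interval should contain the true outcome 80% of the time — can be maintained online with very simple updates. The sketch below is not the paper's algorithm; it illustrates the general idea behind online conformal/calibration methods with a quantile-tracking rule: widen the interval after a miss, shrink it after a hit, so long-run coverage drifts toward the target even when the model's Gaussian assumption is wrong. The `target`, `gamma`, and Student-t residual stream are illustrative choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

target = 0.8   # desired interval coverage (hypothetical choice)
gamma = 0.05   # online step size (hypothetical choice)
q = 1.0        # interval half-width, adapted online
covered = []

# Residuals from a misspecified model: heavy-tailed (Student-t),
# so a fixed Gaussian interval would under-cover.
for y in rng.standard_t(df=3, size=5000):
    hit = abs(y) <= q
    covered.append(hit)
    miss = 0.0 if hit else 1.0
    # Online gradient step on the pinball loss: at equilibrium the
    # miss rate matches (1 - target), i.e. the interval is calibrated.
    q += gamma * (miss - (1.0 - target))

coverage = float(np.mean(covered))
print(f"empirical coverage: {coverage:.3f}")  # drifts toward the 0.8 target
```

Because the update depends only on the most recent hit/miss indicator, it adds negligible overhead per step, which is what makes this style of recalibration practical inside a sequential loop such as Bayesian optimization.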

