PMBO: Enhancing Black-Box Optimization through Multivariate Polynomial Surrogates (2403.07485v1)
Abstract: We introduce a surrogate-based black-box optimization method, termed Polynomial-Model-Based Optimization (PMBO). The algorithm alternates polynomial approximation with Bayesian optimization steps, using Gaussian processes to model the error between the objective and its polynomial fit. We describe the algorithmic design of PMBO and compare its performance with several optimization methods on a set of analytic test functions. The results show that PMBO outperforms classic Bayesian optimization and is robust with respect to the choice of its correlation function family and its hyper-parameter setting, which, in contrast, need to be carefully tuned in classic Bayesian optimization. Remarkably, PMBO performs comparably with state-of-the-art evolutionary algorithms such as the Covariance Matrix Adaptation Evolution Strategy (CMA-ES). This finding suggests that PMBO is a pivotal choice among surrogate-based optimization methods for low-dimensional optimization problems. Moreover, the simple nature of polynomials opens the opportunity for interpretation and analysis of the inferred surrogate model, providing a macroscopic perspective on the landscape of the objective function.
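The abstract describes PMBO's core loop: fit a polynomial surrogate to the samples, model the residual (objective minus polynomial fit) with a Gaussian process, and use the combined model to choose the next evaluation point. A minimal 1-D sketch of that idea is below; it is an illustration of the mechanism only, not the paper's implementation — the toy objective, the RBF kernel, the polynomial degree, and the lower-confidence-bound acquisition over a fixed grid are all assumptions made here for brevity.

```python
# Illustrative sketch of the PMBO idea: alternate a polynomial fit of the
# objective with a Gaussian-process model of the residual, then pick the
# next sample by minimizing a lower confidence bound (LCB) over a grid.
# All parameter choices are hypothetical, not taken from the paper.
import numpy as np

rng = np.random.default_rng(0)

def objective(x):
    # Toy 1-D objective (illustrative only).
    return np.sin(3.0 * x) + 0.5 * x**2

def fit_polynomial(X, y, degree=4):
    """Least-squares polynomial surrogate of the objective."""
    return np.polynomial.polynomial.Polynomial.fit(X, y, degree)

def gp_posterior(X, r, Xq, length=0.3, noise=1e-6):
    """Posterior mean/std of a zero-mean RBF Gaussian process on residuals r."""
    def k(a, b):
        return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length**2)
    K = k(X, X) + noise * np.eye(len(X))
    Ks = k(Xq, X)
    mean = Ks @ np.linalg.solve(K, r)
    # Diagonal of the posterior covariance: k(xq, xq) - Ks K^{-1} Ks^T.
    var = 1.0 - np.einsum("ij,ij->i", Ks, np.linalg.solve(K, Ks.T).T)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def pmbo_step(X, y, grid, beta=2.0):
    poly = fit_polynomial(X, y)
    resid = y - poly(X)                    # error the GP must explain
    mu, sd = gp_posterior(X, resid, grid)
    lcb = poly(grid) + mu - beta * sd      # polynomial + residual GP, LCB
    return grid[np.argmin(lcb)]

# Usage: a few iterations on the toy objective.
X = rng.uniform(-2.0, 2.0, size=5)
y = objective(X)
grid = np.linspace(-2.0, 2.0, 401)
for _ in range(10):
    x_next = pmbo_step(X, y, grid)
    X = np.append(X, x_next)
    y = np.append(y, objective(x_next))
print("best found:", X[np.argmin(y)], y.min())
```

The split between a global polynomial trend and a GP on the residual is what distinguishes this from plain Bayesian optimization, where the GP must model the whole objective on its own.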
- Luigi Acerbi and Wei Ji Ma. 2017a. Practical Bayesian optimization for model fitting with Bayesian adaptive direct search. arXiv:1705.04405 [stat.ML]
- Luigi Acerbi and Wei Ji Ma. 2017b. Practical Bayesian optimization for model fitting with Bayesian adaptive direct search. Advances in Neural Information Processing Systems 30 (2017), 1834–1844.
- casus/minterpy: Minterpy - Multivariate interpolation in Python. https://github.com/casus/minterpy. https://doi.org/10.14278/rodare.2062 (Accessed on 12/08/2023).
- Two decades of blackbox optimization applications. EURO Journal on Computational Optimization 9 (2021), 100011. https://doi.org/10.1016/j.ejco.2021.100011
- Peter Auer. 2003. Using confidence bounds for exploitation-exploration trade-offs. J. Mach. Learn. Res. 3 (2003), 397–422. https://api.semanticscholar.org/CorpusID:10485293
- Serge Bernstein. 1912. Sur l’ordre de la meilleure approximation des fonctions continues par des polynômes de degré donné. Vol. 4. Hayez, imprimeur des académies royales, Bruxelles.
- Serge Bernstein. 1914. Sur la meilleure approximation de |x| par des polynomes de degrés donnés. Acta Mathematica 37, 1 (1914), 1–57.
- Jean-Paul Berrut and Lloyd N. Trefethen. 2004. Barycentric Lagrange interpolation. SIAM review 46, 3 (2004), 501–517.
- Christopher M Bishop and Nasser M Nasrabadi. 2006. Pattern recognition and machine learning. Vol. 4. Springer, Cambridge.
- Len Bos and Norm Levenberg. 2018. Bernstein-Walsh theory associated to convex bodies and applications to multivariate approximation theory. Computational Methods and Function Theory 18, 2 (2018), 361–388.
- A tutorial on Bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning.
- Taeryon Choi and Mark J. Schervish. 2007. On posterior consistency in nonparametric regression problems. Journal of Multivariate Analysis 98, 10 (Nov. 2007), 1969–1987. https://doi.org/10.1016/j.jmva.2007.01.004
- Louis De Branges. 1959. The Stone-Weierstrass Theorem. Proc. Amer. Math. Soc. 10, 5 (1959), 822–824.
- Georg Faber. 1914. Über die interpolatorische Darstellung stetiger Funktionen. Jber. Deutsch. Math. Verein 23 (1914), 192–210.
- Roman Garnett. 2023. Bayesian optimization. Cambridge University Press. https://doi.org/10.1017/9781108348973
- Nikolaus Hansen. 2023. The CMA Evolution Strategy: A tutorial. arXiv:1604.00772 [cs.LG]
- Comparing results of 31 algorithms from the black-box optimization benchmarking BBOB-2009. In Proceedings of the 12th Annual Conference Companion on Genetic and Evolutionary Computation (Portland, Oregon, USA) (GECCO ’10). Association for Computing Machinery, New York, NY, USA, 1689–1696. https://doi.org/10.1145/1830761.1830790
- COCO: A platform for comparing continuous optimizers in a black-box setting. Optimization Methods and Software 36 (2021), 114–144. Issue 1. https://doi.org/10.1080/10556788.2020.1808977
- CMA - Covariance Matrix Adaptation. https://cma-es.github.io/apidocs-pycma/index.html. (Accessed on 12/08/2023).
- A quadratic-time algorithm for general multivariate polynomial interpolation. arXiv:1710.10846
- Multivariate interpolation on unisolvent nodes – lifting the curse of dimensionality. https://doi.org/10.48550/ARXIV.2010.10824
- Multivariate Newton interpolation. arXiv:1812.04256
- Michael Hecht and Ivo F. Sbalzarini. 2018. Fast Interpolation and Fourier Transform in High-Dimensional Spaces. In Intelligent Computing. Proc. 2018 IEEE Computing Conf., Vol. 2, (Advances in Intelligent Systems and Computing, Vol. 857), K. Arai, S. Kapoor, and R. Bhatia (Eds.). Springer Nature, London, UK, 53–75.
- Minterpy - multivariate polynomial interpolation (Version 0.2.0-alpha). https://doi.org/10.14278/rodare.2062
- John H Holland. 1992. Genetic algorithms. Scientific American 267, 1 (1992), 66–73.
- Bayesian hyperparameter optimization of deep neural network algorithms based on ant colony optimization. In Document Analysis and Recognition – ICDAR 2021, Josep Lladós, Daniel Lopresti, and Seiichi Uchida (Eds.). Springer International Publishing, Cham, 585–594.
- Efficient global optimization of expensive black-box functions. Journal of Global Optimization 13 (Dec. 1998), 455–492. https://doi.org/10.1023/A:1008306431147
- Gaussian processes and kernel methods: A review on connections and equivalences. arXiv:1807.02582
- Jungtaek Kim and Seungjin Choi. 2023. BayesO: A Bayesian optimization framework in Python. Journal of Open Source Software 8, 90 (Oct. 2023), 5320. https://doi.org/10.21105/joss.05320
- Harold J. Kushner. 1964. A new method of locating the maximum point of an arbitrary multipeak curve in the presence of noise. Journal of Basic Engineering 86 (1964), 97–106. https://api.semanticscholar.org/CorpusID:62599010
- Franciszek Leja. 1957. Sur certaines suites liées aux ensembles plans et leur application à la représentation conforme. In Annales Polonici Mathematici, Vol. 1. Instytut Matematyczny Polskiej Akademi Nauk, 8–13.
- E. Meijering. 2002. A chronology of interpolation: From ancient astronomy to modern signal and image processing. Proc. IEEE 90, 3 (March 2002), 319–342.
- Pierre-Luc Huot, Annie Poulin, Charles Audet, and Stéphane Alarie. 2019. A hybrid optimization approach for efficient calibration of computationally intensive hydrological models. Hydrological Sciences Journal 64, 10 (2019), 1204–1222. https://doi.org/10.1080/02626667.2019.1624922
- Optimizing dosage-specific treatments in a multi-scale model of a tumor growth. Frontiers in Molecular Biosciences 9 (2022). https://doi.org/10.3389/fmolb.2022.836794
- Carl Edward Rasmussen and Christopher K. I. Williams. 2006. Gaussian processes for machine learning. Vol. 2. MIT Press, Cambridge, MA.
- Luis Miguel Rios and Nikolaos V. Sahinidis. 2012. Derivative-free optimization: a review of algorithms and comparison of software implementations. Journal of Global Optimization 56, 3 (July 2012), 1247–1293. https://doi.org/10.1007/s10898-012-9951-y
- Inverse problems and data assimilation. arXiv:1810.06191 [stat.ME]
- Gurjeet Sangra Singh and Luigi Acerbi. 2023. PyBADS: Fast and robust black-box optimization in Python. https://doi.org/10.48550/ARXIV.2306.15576
- Practical Bayesian optimization of machine learning algorithms. arXiv:1206.2944 [stat.ML]
- Hillel Tal-Ezer. 1988. High degree interpolation polynomial in Newton form. Contractor Report 181677, ICASE report No. 88-39. NASA Langley Research Center.
- Aretha L. Teckentrup. 2020. Convergence of Gaussian process regression with estimated hyper-parameters and applications in Bayesian inverse problems. SIAM/ASA Journal on Uncertainty Quantification 8, 4 (Jan. 2020), 1310–1337. https://doi.org/10.1137/19m1284816
- Lloyd N. Trefethen. 2017. Multivariate polynomial approximation in the hypercube. Proc. Amer. Math. Soc. 145, 11 (2017), 4837–4844.
- Lloyd N. Trefethen. 2019. Approximation theory and approximation practice, Extended Edition. Society for Industrial and Applied Mathematics, Philadelphia, PA. https://doi.org/10.1137/1.9781611975949
- Multivariate polynomial regression of Euclidean degree extends the stability for fast approximations of Trefethen functions. https://doi.org/10.48550/arXiv.2212.11706
- Surrogate-based methods for black-box optimization. International Transactions in Operational Research 24, 3 (April 2016), 393–424. https://doi.org/10.1111/itor.12292
- Wenjia Wang and Bing-Yi Jing. 2022. Convergence of Gaussian process regression: Optimality, robustness, and relationship with kernel ridge regression. arXiv:2104.09778 [math.ST]
- Karl Weierstrass. 1885. Über die analytische Darstellbarkeit sogenannter willkürlicher Funktionen einer reellen Veränderlichen. Sitzungsberichte der Königlich Preußischen Akademie der Wissenschaften zu Berlin 2 (1885), 633–639.
- Damar Wicaksono. 2018. Bayesian uncertainty quantification of physical models in thermal-hydraulics system codes. Ph.D. Dissertation. École Polytechnique Fédérale de Lausanne.
- D. H. Wolpert and W. G. Macready. 1997. No free lunch theorems for optimization. IEEE Transactions on Evolutionary Computation 1, 1 (1997), 67–82.
- Convergence guarantees for Gaussian process means with misspecified likelihoods and smoothness. The Journal of Machine Learning Research 22, 1 (2021), 5468–5507.
- Gradient-free multi-domain optimization for autonomous systems. arXiv:2202.13525 [cs.RO]