Large Language Models As Evolution Strategies (2402.18381v1)
Abstract: Large Transformer models are capable of implementing a plethora of so-called in-context learning algorithms. These include gradient descent, classification, sequence completion, transformation, and improvement. In this work, we investigate whether LLMs, which never explicitly encountered the task of black-box optimization, are in principle capable of implementing evolutionary optimization algorithms. While previous works have solely focused on language-based task specification, we move forward and focus on the zero-shot application of LLMs to black-box optimization. We introduce a novel prompting strategy, consisting of least-to-most sorting of discretized population members and querying the LLM to propose an improvement to the mean statistic, i.e., to perform a type of black-box recombination operation. Empirically, we find that our setup allows the user to obtain an LLM-based evolution strategy, which we call EvoLLM, that robustly outperforms baseline algorithms such as random search and Gaussian Hill Climbing on synthetic BBOB functions as well as small neuroevolution tasks. Hence, LLMs can act as 'plug-in' in-context recombination operators. We provide several comparative studies of the LLM's model size, prompt strategy, and context construction. Finally, we show that one can flexibly improve EvoLLM's performance by providing teacher algorithm information via instruction fine-tuning on previously collected teacher optimization trajectories.
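The abstract only summarizes the method, so a minimal sketch of the ask-eval-tell loop it describes may help. Everything below is an illustrative assumption rather than the paper's exact protocol: the bin-based discretization, the prompt layout, the population size, and the hypothetical `query_llm` callable and its reply parsing are all placeholders.

```python
import numpy as np

def discretize(x, lo=-1.0, hi=1.0, n_bins=100):
    """Map a real-valued vector to integer bins in [0, n_bins) (illustrative scheme)."""
    x = np.clip(x, lo, hi)
    return np.round((x - lo) / (hi - lo) * (n_bins - 1)).astype(int)

def build_prompt(population, fitness, n_bins=100):
    """Least-to-most context: worst candidates first, best last, then the query."""
    order = np.argsort(fitness)[::-1]  # descending fitness = worst first (minimization)
    lines = []
    for rank, i in enumerate(order):
        bins = ",".join(map(str, discretize(population[i], n_bins=n_bins)))
        lines.append(f"{rank}: x=[{bins}] f={fitness[i]:.3f}")
    lines.append("Propose a new x (same integer format) with lower f than all of the above:")
    return "\n".join(lines)

def evollm_step(mean, sigma, objective, query_llm,
                popsize=8, lo=-1.0, hi=1.0, n_bins=100):
    """One ask-eval-tell step of an LLM-driven evolution strategy (sketch)."""
    # Ask: sample candidates around the current mean of the search distribution.
    population = mean + sigma * np.random.randn(popsize, mean.shape[0])
    fitness = np.array([objective(x) for x in population])
    # Tell: query the LLM for an improved mean via the sorted, discretized context,
    # i.e. use it as a black-box recombination operator.
    reply = query_llm(build_prompt(population, fitness, n_bins))
    bins = np.array([int(t) for t in reply.strip("[] \n").split(",")])
    new_mean = lo + bins / (n_bins - 1) * (hi - lo)  # map bins back to the search range
    return new_mean, population, fitness
```

Here `query_llm` stands in for whichever text-completion endpoint is available; only the prompt and parsing convention above is assumed, not any specific model or API.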