ExpeL: LLM Agents Are Experiential Learners (2308.10144v3)
Abstract: The recent surge of research interest in applying LLMs to decision-making tasks has been driven by the extensive world knowledge embedded in these models. While demand is growing to tailor LLMs for custom decision-making tasks, finetuning them for specific tasks is resource-intensive and can diminish the model's generalization capabilities. Moreover, state-of-the-art LLMs such as GPT-4 and Claude are primarily accessible through API calls, with their parametric weights remaining proprietary and unavailable to the public. This scenario underscores the need for new methodologies that let agents learn from experience without requiring parametric updates. To address these problems, we introduce the Experiential Learning (ExpeL) agent. Our agent autonomously gathers experiences and extracts knowledge in natural language from a collection of training tasks. At inference, the agent recalls its extracted insights and past experiences to make informed decisions. Our empirical results highlight the robust learning efficacy of the ExpeL agent, showing consistent performance gains as it accumulates experience. We further explore the ExpeL agent's emergent capabilities and transfer-learning potential through qualitative observations and additional experiments.
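The abstract describes three phases: autonomous experience gathering over training tasks, extraction of natural-language insights, and recall of both insights and past experiences at inference time. The sketch below illustrates how such a loop could be wired together. It is a minimal sketch, not the authors' implementation: `llm`, `run_agent`, the retry-on-failure loop, and the word-overlap retrieval are hypothetical placeholders standing in for an actual LLM API, a ReAct-style rollout, and embedding-based retrieval.

```python
"""Minimal sketch of an ExpeL-style experiential-learning loop.
`llm` and `run_agent` are hypothetical stand-ins, not the paper's code."""
from dataclasses import dataclass, field


@dataclass
class Trajectory:
    task: str
    steps: list          # (thought, action, observation) tuples
    success: bool


@dataclass
class ExperiencePool:
    trajectories: list = field(default_factory=list)
    insights: list = field(default_factory=list)   # natural-language rules


def llm(prompt: str) -> str:
    """Hypothetical LLM call; replace with an API client of your choice."""
    return "Insight: verify an object's location before acting on it."


def run_agent(task: str, insights: list, exemplars: list) -> Trajectory:
    """Hypothetical rollout conditioned on insights and retrieved exemplars."""
    return Trajectory(task=task, steps=[("think", "act", "observe")], success=True)


def gather_experiences(pool: ExperiencePool, train_tasks: list, retries: int = 3):
    """Training phase: collect trajectories, retrying tasks that fail."""
    for task in train_tasks:
        for _ in range(retries):
            traj = run_agent(task, pool.insights, exemplars=[])
            pool.trajectories.append(traj)
            if traj.success:
                break


def extract_insights(pool: ExperiencePool):
    """Insight-extraction phase: ask the LLM to distill reusable rules
    from the collected successes and failures."""
    successes = [t.task for t in pool.trajectories if t.success]
    failures = [t.task for t in pool.trajectories if not t.success]
    prompt = (
        "Compare the successful and failed trajectories below and state "
        "general rules for solving such tasks.\n"
        f"Successes: {successes}\nFailures: {failures}"
    )
    pool.insights.append(llm(prompt))


def retrieve_exemplars(pool: ExperiencePool, task: str, k: int = 2) -> list:
    """Inference-time recall: fetch the k most similar successful trajectories.
    Word overlap stands in for embedding-based nearest-neighbor search."""
    successes = [t for t in pool.trajectories if t.success]
    overlap = lambda t: len(set(task.split()) & set(t.task.split()))
    return sorted(successes, key=overlap, reverse=True)[:k]


if __name__ == "__main__":
    pool = ExperiencePool()
    gather_experiences(pool, ["put a clean mug on the desk", "heat an egg"])
    extract_insights(pool)
    exemplars = retrieve_exemplars(pool, "put a clean plate on the table")
    result = run_agent("put a clean plate on the table", pool.insights, exemplars)
    print(result.success, pool.insights)
```

In a full system, the word-overlap ranking would likely be replaced by k-nearest-neighbor search over sentence embeddings of the successful trajectories, and insight extraction would be run iteratively so that the rule set can grow and be revised as more experiences accumulate; those details are assumptions here, not claims about the paper's exact pipeline.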