EnvGen: Generating and Adapting Environments via LLMs for Training Embodied Agents

(2403.12014)
Published Mar 18, 2024 in cs.CL , cs.AI , and cs.LG

Abstract

Recent SOTA approaches for embodied learning via interaction directly employ LLMs as agents to determine the next steps in an environment. Due to their world knowledge and reasoning capabilities, LLM agents achieve stronger performance than previous smaller agents based on reinforcement learning (RL); however, frequently calling LLMs is slow and expensive. Instead of directly employing LLMs as agents, can we use LLMs' reasoning capabilities to adaptively create training environments to help smaller embodied RL agents learn useful skills that they are weak at? We propose EnvGen, a novel framework to address this question. First, we prompt an LLM to generate training environments that allow agents to quickly learn different tasks in parallel. Concretely, the LLM is given the task description and simulator objectives that the agents should learn and is then asked to generate a set of environment configurations (e.g., different terrains, items given to agents, etc.). Next, we train a small RL agent in a mixture of the original and LLM-generated environments. Then, we enable the LLM to continuously adapt the generated environments to progressively improve the skills that the agent is weak at, by providing feedback to the LLM in the form of the agent's performance. We demonstrate the usefulness of EnvGen with comprehensive experiments in Crafter and Heist environments. We find that a small RL agent trained with EnvGen can outperform SOTA methods, including a GPT-4 agent, and learns long-horizon tasks significantly faster. We show qualitatively how the LLM adapts training environments to help improve RL agents' weaker skills over time. Additionally, EnvGen is substantially more efficient as it only uses a small number of LLM calls (e.g., 4 in total), whereas LLM agents require thousands of LLM calls. Lastly, we present detailed ablation studies for our design choices.

Figure: Prompts provided to the LLM for creating environments in the initial step of environment generation.

Overview

  • EnvGen introduces a framework using LLMs to dynamically generate and adapt training environments for RL agents, targeting the enhancement of their long-horizon task performance.

  • The framework operates by utilizing LLMs to design environments that address an agent's specific weaknesses, facilitating rapid skill acquisition without direct LLM involvement in task execution.

  • Experimental results indicate that agents trained with the EnvGen framework outperform those trained under conventional conditions, especially in complex, sequence-dependent tasks.

  • EnvGen's approach posits a new method of employing LLMs' vast knowledge and reasoning capabilities in a cost-effective manner, opening paths for future AI training innovations.

Adaptive Environment Generation with LLMs for Enhanced Training of Embodied Agents

Introduction to the EnvGen Framework

Recent advancements in embodied AI emphasize learning through environmental interaction, a stark departure from traditional dataset-based approaches. Environments offering complex tasks necessitate agents capable of long-horizon planning, a significant challenge for conventional reinforcement learning (RL) paradigms due to sparse reward distributions. This paper introduces EnvGen, a framework leveraging LLMs to dynamically create and adapt training environments for small RL agents. By generating tailored environments aimed at addressing an agent's weaknesses, EnvGen facilitates efficient skill acquisition, particularly for tasks requiring extensive action sequences.

Challenges with Long-horizon Task Learning

Traditional RL agents often stumble on tasks that demand unlocking achievements in sequence, primarily because the rewards for such tasks are sparse and delayed. LLMs, equipped with extensive world knowledge and sophisticated reasoning capabilities, offer a promising alternative, yet employing them directly as agents is slow and expensive, since every action the agent takes requires an LLM call.

EnvGen: Adaptive Environment Generation

EnvGen circumvents the limitations of direct LLM use by instead leveraging LLMs to generate and adapt training environments. Initiated with a descriptive prompt about the task and simulator capabilities, the LLM proposes a set of environment configurations. An RL agent is trained within these LLM-suggested environments before being evaluated in the original setting. This feedback loop allows for iterative refinement, with the LLM tailoring subsequent environments to specifically bolster the agent’s underdeveloped skills. EnvGen proposes a cost-effective method that significantly reduces the need for direct LLM invocation.
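The feedback loop described above can be sketched in a few lines of Python. This is a minimal toy illustration, not the authors' implementation: the function names (`generate_env_configs`, `train_agent`, `evaluate_agent`), the skill list, and the toy skill-gain arithmetic are all hypothetical stand-ins, with the LLM call replaced by a simple rule that targets the weakest skills.

```python
# Toy sketch of the EnvGen cycle: generate environments targeting weak
# skills, train in them, evaluate in the original setting, repeat.
# All names and numbers here are illustrative, not from the paper's code.

SKILLS = ["collect_wood", "make_pickaxe", "defeat_zombie"]

def generate_env_configs(feedback):
    """Stand-in for one LLM call: propose environment configurations
    that emphasize the skills the agent is currently weakest at."""
    weak = [s for s, rate in feedback.items() if rate < 0.5] or SKILLS
    return [{"focus_skill": s, "spawn_helpful_items": True} for s in weak]

def train_agent(agent, configs):
    """Stand-in for RL training in a mixture of LLM-generated and
    original environments; here, focused skills simply improve."""
    for cfg in configs:
        skill = cfg["focus_skill"]
        agent[skill] = min(1.0, agent[skill] + 0.3)  # toy skill gain
    return agent

def evaluate_agent(agent):
    """Stand-in for evaluation in the original environment, returning
    per-skill success rates that are fed back to the LLM."""
    return dict(agent)

agent = {s: 0.1 for s in SKILLS}        # small RL agent, initially weak
feedback = evaluate_agent(agent)
for cycle in range(4):                  # EnvGen needs few LLM calls (e.g., 4)
    configs = generate_env_configs(feedback)
    agent = train_agent(agent, configs)
    feedback = evaluate_agent(agent)

print({s: round(r, 1) for s, r in feedback.items()})
```

The key design point the sketch captures is that the LLM sits outside the agent's action loop: it is called once per adaptation cycle (a handful of calls in total) rather than once per environment step, which is what makes EnvGen far cheaper than using an LLM as the agent itself.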

Empirical Validation

The effectiveness of EnvGen is validated through comprehensive experiments within the Crafter and Heist simulation environments. Findings demonstrate that RL agents trained under the EnvGen framework surpass state-of-the-art counterparts, achieving superior performance in complex, long-horizon tasks. Notably, a small RL agent trained with EnvGen manages to outperform a GPT-4-driven agent, highlighting EnvGen's efficiency in leveraging LLM capabilities without incurring prohibitive computational or financial costs.

Theoretical Implications and Practical Applications

The EnvGen framework exemplifies the practical integration of LLMs into RL workflows, deviating from direct usage paradigms. This technique opens new avenues for exploiting LLMs' comprehensive world knowledge and reasoning prowess in a manner that is both computationally and economically viable. The ability of EnvGen to adaptively refine training environments based on agent performance underscores the potential of LLMs in crafting highly specialized, skill-targeted learning contexts.

Future Perspectives in AI Training

EnvGen marks a significant step forward in the symbiotic use of LLMs and RL agents, providing a blueprint for future explorations in adaptive learning environments. As LLMs continue to evolve, their integration into embodied AI training through frameworks like EnvGen could revolutionize our approach to nurturing intelligent, highly capable agents. Future research may explore the extension of this methodology across a broader spectrum of simulation environments, further cementing the role of LLMs in the efficient training of embodied agents.

Conclusion

EnvGen presents a novel approach to leveraging the analytical strengths of LLMs for the advancement of embodied AI. By refocusing the role of LLMs from direct action planning to the generation and adaptation of training environments, EnvGen offers a scalable, efficient method for enhancing RL agent performance. This work paves the way for innovative uses of LLMs in AI training, promising significant improvements in agent learning efficiency and skill acquisition within complex, dynamic environments.