Code as Policies: Language Model Programs for Embodied Control (2209.07753v4)

Published 16 Sep 2022 in cs.RO

Abstract: LLMs trained on code completion have been shown to be capable of synthesizing simple Python programs from docstrings [1]. We find that these code-writing LLMs can be re-purposed to write robot policy code, given natural language commands. Specifically, policy code can express functions or feedback loops that process perception outputs (e.g., from object detectors [2], [3]) and parameterize control primitive APIs. When provided as input several example language commands (formatted as comments) followed by corresponding policy code (via few-shot prompting), LLMs can take in new commands and autonomously re-compose API calls to generate new policy code respectively. By chaining classic logic structures and referencing third-party libraries (e.g., NumPy, Shapely) to perform arithmetic, LLMs used in this way can write robot policies that (i) exhibit spatial-geometric reasoning, (ii) generalize to new instructions, and (iii) prescribe precise values (e.g., velocities) to ambiguous descriptions ("faster") depending on context (i.e., behavioral commonsense). This paper presents code as policies: a robot-centric formulation of LLM generated programs (LMPs) that can represent reactive policies (e.g., impedance controllers), as well as waypoint-based policies (vision-based pick and place, trajectory-based control), demonstrated across multiple real robot platforms. Central to our approach is prompting hierarchical code-gen (recursively defining undefined functions), which can write more complex code and also improves state-of-the-art to solve 39.8% of problems on the HumanEval [1] benchmark. Code and videos are available at https://code-as-policies.github.io


Summary

The paper introduces Code as Policies, repurposing LLMs to generate executable code for robotic control from natural language prompts.
It employs hierarchical code generation, recursively defining functions to produce complex, adaptable policies using API calls.
Hierarchical code generation also improves the state of the art on HumanEval, solving 39.8% of problems, and the approach performs well on long-horizon and spatial tasks.

Overview of "Code as Policies: Language Model Programs for Embodied Control"

The paper presents an approach called "Code as Policies" (CaP), which uses LLMs to generate programs that control robots. The central idea is to take LLMs trained on code completion and have them convert natural language instructions into executable policy code that can be deployed on robotic systems. The prompt contains a few example commands, formatted as comments, each followed by the corresponding policy code; given a new command, the LLM recomposes the available API calls into new policy code.
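The prompt format described above can be sketched in a few lines of Python. This is only an illustration of the comment-then-code layout: the example commands, the helper `build_prompt`, and the API names `get_obj_pos` and `put_first_on_second` are assumptions made for this sketch, not the authors' actual prompts.

```python
# Hypothetical few-shot examples: each pairs a command with policy code.
# The API names used inside the code strings are illustrative assumptions.
FEW_SHOT_EXAMPLES = [
    ("move the red block onto the blue bowl",
     "put_first_on_second('red block', 'blue bowl')"),
    ("move the green block 10 cm to the right",
     "target = get_obj_pos('green block') + np.array([0.1, 0.0])\n"
     "put_first_on_second('green block', target)"),
]


def build_prompt(command, examples=FEW_SHOT_EXAMPLES):
    """Format each example as a '# command' comment followed by its code,
    then append the new command as a trailing comment to be completed."""
    parts = ["import numpy as np", ""]
    for example_command, code in examples:
        parts.append(f"# {example_command}.")
        parts.append(code)
        parts.append("")
    parts.append(f"# {command}.")
    return "\n".join(parts) + "\n"


prompt = build_prompt("stack the yellow block on the red block")
print(prompt)
```

A code-completion model given this prompt would be expected to continue after the final comment with policy code for the new command; executing that code against the real APIs is a separate step not shown here.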

Key Contributions

Repurposing LLMs for Robotic Control:
- The authors repurpose LLMs, traditionally used for synthesizing simple programs, to write robot policies. These policies process perception outputs and parameterize control APIs.
Hierarchical Code Generation:
- They propose a method of hierarchical code generation that recursively defines undefined functions, allowing for the generation of more complex robotic policies.
Use of Perception and Control APIs:
- The generated policies call perception and control APIs, can use third-party libraries such as NumPy and Shapely for geometric reasoning, and express classic logic structures such as loops and conditionals.
Achievement in Benchmarks:
- Hierarchical code generation improves the state of the art on the HumanEval benchmark, solving 39.8% of problems, which shows that the technique also helps on generic code-generation problems.
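The idea of recursively defining undefined functions can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the `stub_llm` stand-in (which returns canned definitions instead of calling a model), and the depth limit are all assumptions made for this example.

```python
import ast
import builtins


def undefined_functions(code, known):
    """Names called in `code` that are not defined in it, not builtins,
    and not in `known` (the set of already available functions)."""
    tree = ast.parse(code)
    defined = {n.name for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)}
    missing = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            name = node.func.id
            if (name not in defined and name not in known
                    and not hasattr(builtins, name) and name not in missing):
                missing.append(name)
    return missing


def generate_hierarchically(code, llm, known, max_depth=3):
    """Ask `llm` to define each undefined function, recursing into the
    generated bodies. Returns definitions with dependencies first."""
    definitions = []
    if max_depth == 0:
        return definitions
    for name in undefined_functions(code, known):
        if name in known:  # already defined during an earlier recursion
            continue
        known.add(name)
        body = llm(name)
        definitions += generate_hierarchically(body, llm, known, max_depth - 1)
        definitions.append(body)
    return definitions


# Stand-in for a code-writing LLM (hypothetical canned outputs).
CANNED = {
    "stack_objects": (
        "def stack_objects(names):\n"
        "    for top, bottom in zip(names[1:], names[:-1]):\n"
        "        place_on(top, bottom)\n"
    ),
    "place_on": (
        "def place_on(top, bottom):\n"
        "    put_first_on_second(top, bottom)\n"
    ),
}


def stub_llm(name):
    return CANNED[name]


top_level = "stack_objects(['red block', 'blue block'])"
defs = generate_hierarchically(top_level, stub_llm, known={"put_first_on_second"})
print("\n".join(defs))
```

In this sketch the top-level code calls `stack_objects`, whose generated body in turn calls the undefined `place_on`, so two definitions are produced, with `place_on` listed first. The depth limit is a guard added here against unbounded recursion.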

Strong Numerical Results and Claims

The authors report success rates of up to 97.2% on long-horizon tasks and 89.3% on spatial-geometric reasoning tasks in simulated environments. These figures are for specific manipulation scenarios and should be read in that context.
Hierarchical code generation is highlighted as particularly effective: it improves pass rates across different LLMs on both the RoboCodeGen and HumanEval benchmarks.
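To make the spatial-geometric category concrete, here is a sketch of the kind of policy code a command such as "put the red block to the left of the bowl closest to it" might translate to. The scene coordinates, the `get_obj_pos` perception stand-in, the helper names, and the convention that "left" means negative x are all assumptions for illustration. The paper's policies use libraries such as NumPy and Shapely; this sketch sticks to the standard library to stay dependency-free.

```python
import math

# Stand-in for a perception API: made-up (x, y) positions in metres.
SCENE = {
    "red block": (0.30, 0.10),
    "blue bowl": (0.50, 0.20),
    "green bowl": (0.10, 0.40),
}


def get_obj_pos(name):
    """Hypothetical perception call returning an object's (x, y) position."""
    return SCENE[name]


def closest_object(target, candidates):
    """Return the candidate with the smallest Euclidean distance to target."""
    target_pos = get_obj_pos(target)
    return min(candidates, key=lambda c: math.dist(get_obj_pos(c), target_pos))


def point_left_of(name, offset=0.10):
    """Point `offset` metres to the left of an object (assumes -x is left)."""
    x, y = get_obj_pos(name)
    return (x - offset, y)


# "put the red block to the left of the bowl closest to it"
bowl = closest_object("red block", ["blue bowl", "green bowl"])
place_xy = point_left_of(bowl)
print(bowl, place_xy)
```

The resulting `place_xy` would then be passed to a control primitive such as a pick-and-place call, which is omitted here.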

Practical and Theoretical Implications

Practical:
- The CaP formulation allows for versatile robot programming, capable of generating and adapting policies for multiple robotic systems without needing additional data or training.
- By utilizing off-the-shelf models for perception (e.g., MDETR, ViLD), the approach remains flexible and applicable to various real-world robotic platforms, from UR5e arms to mobile robots in complex environments like kitchens.
Theoretical:
- The paper advances our understanding of LLMs' capabilities in code synthesis, particularly in the field of robotic control, where the need for explicit programming can be reduced.

Future Developments

Future research might make hierarchical code generation more robust, extend the approach to more complex tasks, and support adaptation across significantly different robot embodiments.
Further refinements may address current limitations, such as handling longer or more abstract commands and widening the range of controllable parameters.

Conclusion

The "Code as Policies" framework represents an innovative utilization of LLMs for robotic control, combining modern NLP advancements with robotic applications. By generating interpretable and adaptable code for robots, this work opens pathways to more interactive and intelligent robot programming paradigms, presenting a noteworthy contribution to both AI and robotics fields.
