LARG, Language-based Automatic Reward and Goal Generation (2306.10985v1)
Abstract: Goal-conditioned and Multi-Task Reinforcement Learning (GCRL and MTRL) address numerous problems related to robot learning, including locomotion, navigation, and manipulation scenarios. Recent works focusing on language-defined robotic manipulation tasks have led to the tedious production of massive human annotations to create datasets of textual descriptions associated with trajectories. To leverage reinforcement learning with text-based task descriptions, we need to produce reward functions associated with individual tasks in a scalable manner. In this paper, we leverage recent capabilities of LLMs and introduce LARG, Language-based Automatic Reward and Goal Generation, an approach that converts a text-based task description into its corresponding reward and goal-generation functions. We evaluate our approach for robotic manipulation and demonstrate its ability to train and execute policies in a scalable manner, without the need for handcrafted reward functions.
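As a rough illustration of the two kinds of functions the abstract refers to, the following Python sketch shows what a goal-generation function and a reward function might look like for a hypothetical task such as "move the cube to a target position on the table". The function names, table bounds, success threshold, and reward shape are all assumptions made for illustration and are not taken from the paper; in the approach described, such functions would be produced by an LLM from the text description rather than written by hand.

```python
import math
import random


def generate_goal(rng, table_bounds=((-0.3, 0.3), (-0.3, 0.3)), table_height=0.0):
    """Sample a target (x, y, z) position on the table surface.

    Hypothetical goal generator: the bounds and height are illustrative values.
    """
    (x_min, x_max), (y_min, y_max) = table_bounds
    return (rng.uniform(x_min, x_max), rng.uniform(y_min, y_max), table_height)


def compute_reward(object_pos, goal, threshold=0.05):
    """Goal-conditioned reward: negative distance to the goal, plus a bonus
    of 1.0 once the object is within `threshold` of the goal.

    Hypothetical reward shape chosen for illustration only.
    """
    dist = math.dist(object_pos, goal)
    bonus = 1.0 if dist < threshold else 0.0
    return -dist + bonus


if __name__ == "__main__":
    rng = random.Random(0)
    goal = generate_goal(rng)
    print("sampled goal:", goal)
    print("reward at goal:", compute_reward(goal, goal))
```

In a goal-conditioned setup of this kind, the sampled goal would be appended to the policy's observation and the reward evaluated against it at every step.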