Intelligent Switching for Reset-Free RL (2405.01684v1)
Abstract: In the real world, the strong episode resetting mechanisms that are needed to train agents in simulation are unavailable. The resetting assumption limits the potential of reinforcement learning in the real world, as providing resets to an agent usually requires the creation of additional handcrafted mechanisms or human interventions. Recent work aims to train agents (forward) with learned resets by constructing a second (backward) agent that returns the forward agent to the initial state. We find that the termination and timing of the transitions between these two agents are crucial for algorithm success. With this in mind, we create a new algorithm, Reset Free RL with Intelligently Switching Controller (RISC), which intelligently switches between the two agents based on the agent's confidence in achieving its current goal. Our new method achieves state-of-the-art performance on several challenging environments for reset-free RL.
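
The switching idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how confidence is estimated or how it is turned into a switching decision, so everything here is an assumption for illustration: the names `should_switch` and `next_agent` are hypothetical, `confidence` is taken to be a number in [0, 1] supplied by some external estimator, and the rule "switch with probability equal to the confidence" is one plausible choice rather than the paper's stated method.

```python
import random


def should_switch(confidence: float, rng: random.Random) -> bool:
    """Decide whether to hand control to the other agent.

    Assumed rule (not taken from the abstract): switch with probability
    equal to the active agent's confidence that it can reach its goal.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    # rng.random() is in [0, 1), so confidence 0 never switches
    # and confidence 1 always switches.
    return rng.random() < confidence


def next_agent(active: str, confidence: float, rng: random.Random) -> str:
    """Return which agent ('forward' or 'backward') acts next."""
    if active not in ("forward", "backward"):
        raise ValueError("active must be 'forward' or 'backward'")
    if should_switch(confidence, rng):
        return "backward" if active == "forward" else "forward"
    return active


if __name__ == "__main__":
    rng = random.Random(0)
    # A fully confident forward agent hands over to the backward agent.
    print(next_agent("forward", 1.0, rng))
    # An agent with no confidence keeps practicing its current goal.
    print(next_agent("forward", 0.0, rng))
```

Under this assumed rule, an agent that is already confident of reaching its goal spends little further time on it, while an uncertain agent keeps control and continues collecting experience.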