
Learning Visual Quadrupedal Loco-Manipulation from Demonstrations (2403.20328v2)

Published 29 Mar 2024 in cs.RO and cs.LG

Abstract: Quadruped robots are progressively being integrated into human environments. Despite the growing locomotion capabilities of quadrupedal robots, their interaction with objects in realistic scenes is still limited. While additional robotic arms on quadrupedal robots enable manipulating objects, they are sometimes redundant given that a quadruped robot is essentially a mobile unit equipped with four limbs, each possessing 3 degrees of freedom (DoFs). Hence, we aim to empower a quadruped robot to execute real-world manipulation tasks using only its legs. We decompose the loco-manipulation process into a low-level reinforcement learning (RL)-based controller and a high-level Behavior Cloning (BC)-based planner. By parameterizing the manipulation trajectory, we synchronize the efforts of the upper and lower layers, thereby leveraging the advantages of both RL and BC. Our approach is validated through simulations and real-world experiments, demonstrating the robot's ability to perform tasks that demand mobility and high precision, such as lifting a basket from the ground while moving, closing a dishwasher, pressing a button, and pushing a door. Project website: https://zhengmaohe.github.io/leg-manip

Authors (6)
  1. Zhengmao He (5 papers)
  2. Kun Lei (6 papers)
  3. Yanjie Ze (20 papers)
  4. Koushil Sreenath (90 papers)
  5. Zhongyu Li (72 papers)
  6. Huazhe Xu (93 papers)
Citations (7)

Summary

  • The paper introduces a hierarchical framework merging behavior cloning and reinforcement learning to enable leg-based loco-manipulation in quadrupedal robots.
  • The framework employs a high-level BC planner to generate manipulation trajectories and a low-level RL controller for precise, dynamic end-effector tracking.
  • Experiments demonstrate robust task performance and efficient sim-to-real transfer, with the high-level planner trained on only about 20,000 visual timesteps, a fraction of the data required by baseline methods.

Learning Visual Quadrupedal Loco-Manipulation from Demonstrations

Introduction

Quadruped robots, equipped with four highly mobile and adaptable limbs, have ushered in a new era for robotic locomotion and manipulation in complex environments. Despite these advancements, combining locomotion with object manipulation remains a challenging frontier, primarily due to the dynamic instability and control intricacies involved. Addressing this, the work presents a hierarchical learning framework that integrates the strengths of Behavior Cloning (BC) and Reinforcement Learning (RL), empowering quadruped robots to execute real-world manipulation tasks using only their legs and eliminating the need for an additional mechanical arm.

Related Work

Research in mobile manipulation has predominantly centered on wheeled robots with mounted mechanical arms, which limits the terrains they can operate in. Meanwhile, legged locomotion research has made substantial progress in enabling robots to traverse challenging terrains. Integrating manipulation with locomotion on quadrupedal platforms, however, remains less explored: existing work either attaches extra hardware, compromising cost and mobility, or attempts leg-based manipulation with limited versatility, precision, and dynamic capability.

Hierarchical Learning Framework

Framework Overview

The proposed framework segments loco-manipulation into two levels: a high-level BC-based planner that generates manipulation trajectories from visual inputs and a low-level RL-based controller that executes these trajectories with precise control. A rational Bézier curve parameterizes the manipulation trajectories, offering a flexible representation that encapsulates both positional and orientational targets for the end-effectors. This dual-layered approach effectively harnesses the respective strengths of BC in handling high-dimensional visual data and RL's proficiency in controlling dynamic systems.
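
To make the parameterization concrete, below is a minimal sketch of a rational Bézier curve evaluator in Python. The degree, control-point layout, and weights here are illustrative assumptions, not the paper's exact formulation.

import numpy as np
from math import comb

def rational_bezier(control_points: np.ndarray,
                    weights: np.ndarray,
                    t: float) -> np.ndarray:
    """Evaluate a rational Bezier curve at parameter t in [0, 1]."""
    n = len(control_points) - 1
    # Bernstein basis of degree n evaluated at t.
    basis = np.array([comb(n, i) * t**i * (1 - t)**(n - i)
                      for i in range(n + 1)])
    wb = weights * basis
    return (wb[:, None] * control_points).sum(axis=0) / wb.sum()

# Example: a cubic curve lifting an end-effector from the ground.
ctrl = np.array([[0.0, 0.0, 0.00],
                 [0.1, 0.0, 0.20],
                 [0.3, 0.0, 0.30],
                 [0.4, 0.0, 0.25]])
w = np.array([1.0, 2.0, 2.0, 1.0])  # heavier mid weights pull the curve upward
waypoints = np.stack([rational_bezier(ctrl, w, t)
                      for t in np.linspace(0.0, 1.0, 50)])

Because the weights bias the curve toward particular control points, a rational Bézier can encode both smooth approach motions and sharper contact phases with only a handful of parameters, which is what makes it a compact interface between the two layers.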

High-level Planner

At the heart of the high-level planner lies a diffusion-based BC policy trained on a dataset of expert demonstrations, mapping visual point clouds and robot states to trajectory parameters. These parameters outline the desired manipulative actions of the robot in a scenario-independent manner. Extensive simulations facilitate the collection of a diverse set of these expert demonstrations, ensuring the adaptability of the learned planner across various tasks.
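
The planner's interface can be pictured as follows. This is a hypothetical sketch in which a trained BC policy (in the paper, a diffusion-based one; here a stub) maps a point cloud and robot state to a flat vector that is unpacked into trajectory parameters. All names, shapes, and the parameter layout (7 control points with 6-DoF targets) are assumptions for illustration, not the paper's specification.

from dataclasses import dataclass
import numpy as np

@dataclass
class TrajectoryParams:
    control_points: np.ndarray  # (7, 6): position + orientation targets
    weights: np.ndarray         # (7,) rational Bezier weights
    duration: float             # seconds to traverse the curve

class HighLevelPlanner:
    def __init__(self, policy):
        self.policy = policy  # trained BC model, e.g. a diffusion policy

    def plan(self, point_cloud: np.ndarray,
             robot_state: np.ndarray) -> TrajectoryParams:
        raw = self.policy(point_cloud, robot_state)   # flat vector, len 50
        return TrajectoryParams(
            control_points=raw[:42].reshape(7, 6),
            weights=np.abs(raw[42:49]) + 1e-3,        # keep weights positive
            duration=float(raw[49]) if raw[49] > 0 else 1.0)

# Stub policy so the sketch runs end to end.
planner = HighLevelPlanner(lambda pc, state: np.zeros(50))
params = planner.plan(np.zeros((1024, 3)), np.zeros(48))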

Low-level Controller

The low-level controller, trained with RL, equips the robot to track both position and orientation targets for the end-effector while maintaining dynamic stability across all limbs. Because it consumes the parameterized target trajectories directly, it supports real-time adjustments and handles external disturbances robustly during task execution.
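
While the paper's exact reward terms are not reproduced here, a tracking objective of this kind is typically shaped as in the hedged sketch below: exponential kernels on end-effector position and orientation error against the current Bézier target, minus a base-stability penalty. The constants and shaping are illustrative assumptions, not the paper's tuned values.

import numpy as np

def tracking_reward(ee_pos, ee_quat, target_pos, target_quat,
                    base_ang_vel, w_pos=1.0, w_ori=0.5, w_stab=0.1):
    pos_err = np.linalg.norm(ee_pos - target_pos)
    # Geodesic angle between unit quaternions (sign-invariant).
    ori_err = 2.0 * np.arccos(np.clip(abs(np.dot(ee_quat, target_quat)),
                                      0.0, 1.0))
    stability_penalty = np.linalg.norm(base_ang_vel)
    return (w_pos * np.exp(-pos_err ** 2)
            + w_ori * np.exp(-ori_err ** 2)
            - w_stab * stability_penalty)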

Design of Tasks for Loco-Manipulation

To evaluate and demonstrate the efficacy of the proposed framework, a suite of tasks encompassing a broad range of manipulation challenges was introduced. These tasks, designed around daily scenarios that a quadruped robot could encounter, necessitate a combination of intricate maneuvers including pushing, pulling, lifting, and precise positional adjustments, providing a comprehensive assessment platform for loco-manipulation capabilities.

Experiments

Experimental validations highlight the superiority of the hierarchical learning framework over traditional approaches, achieving higher task success rates across multiple scenarios. Notably, the framework is remarkably data-efficient, training the high-level planner on only around 20,000 timesteps of visual data, a fraction of what baselines require. Sim-to-real experiments further confirm the practical viability of the approach, with successful task executions in real-world settings without additional adjustments or fine-tuning. The control policy demonstrates precise end-effector tracking under varied task conditions, signifying robustness against disturbances and real-time adaptability.

Conclusion and Future Works

This research marks a significant step toward versatile and dynamically stable loco-manipulation with quadruped robots, without additional manipulative hardware. The hierarchical learning framework, merging BC and RL, charts a path toward autonomous execution of complex real-world tasks through leg-based manipulation. Future work may expand the diversity and complexity of tasks, improve data collection methodologies to enhance adaptability, and develop strategies to further narrow the gap between simulation and real-world deployment.
