Emergent Mind

Learning Visual Quadrupedal Loco-Manipulation from Demonstrations

(2403.20328)
Published Mar 29, 2024 in cs.RO and cs.LG

Abstract

Quadruped robots are progressively being integrated into human environments. Despite the growing locomotion capabilities of quadrupedal robots, their interaction with objects in realistic scenes is still limited. While additional robotic arms on quadrupedal robots enable manipulating objects, they are sometimes redundant given that a quadruped robot is essentially a mobile unit equipped with four limbs, each possessing 3 degrees of freedom (DoFs). Hence, we aim to empower a quadruped robot to execute real-world manipulation tasks using only its legs. We decompose the loco-manipulation process into a low-level reinforcement learning (RL)-based controller and a high-level Behavior Cloning (BC)-based planner. By parameterizing the manipulation trajectory, we synchronize the efforts of the upper and lower layers, thereby leveraging the advantages of both RL and BC. Our approach is validated through simulations and real-world experiments, demonstrating the robot's ability to perform tasks that demand mobility and high precision, such as lifting a basket from the ground while moving, closing a dishwasher, pressing a button, and pushing a door. Project website: https://zhengmaohe.github.io/leg-manip

Summary: Overview of the 9 non-prehensile loco-manipulation tasks trained for the robot's legs.

Overview

  • The paper introduces a hierarchical learning framework combining Behavior Cloning (BC) and Reinforcement Learning (RL) to enable quadruped robots to perform manipulation tasks using their legs.

  • It presents a novel approach for loco-manipulation by dividing the process into a high-level BC-based planner for trajectory generation from visual inputs and a low-level RL-based controller for precise execution.

  • Experimental results show higher task success rates than baseline methods and efficient learning from visual data, and the learned policies run on the real robot without further adjustments or fine-tuning.

  • The work is a step toward versatile, dynamically stable quadruped loco-manipulation without additional manipulation hardware, and it points to future work on task diversity and sim-to-real transfer.

Introduction

Quadruped robots, with four highly mobile limbs, are increasingly capable of locomotion in complex environments. Combining locomotion with object manipulation on these robots remains challenging, however, primarily because of the dynamic instability and control complexity involved. To address this, the paper presents a hierarchical learning framework that integrates Behavior Cloning (BC) and Reinforcement Learning (RL) so that a quadruped robot can perform real-world manipulation tasks with its legs, without an additional robotic arm.

Related Work

Research in mobile manipulation has predominantly centered on wheeled robots with mounted arms, which limits the terrain they can operate on. Meanwhile, legged locomotion research has made substantial progress in enabling robots to traverse challenging terrain. Integrating manipulation with locomotion on quadrupedal platforms is less explored: existing work either attaches extra hardware, which compromises cost and mobility, or attempts leg-based manipulation with limited versatility, precision, and use of dynamic motion.

Hierarchical Learning Framework

Framework Overview

The proposed framework splits loco-manipulation into two levels: a high-level BC-based planner that generates manipulation trajectories from visual inputs, and a low-level RL-based controller that executes these trajectories with precise control. The manipulation trajectories are parameterized by a rational Bézier curve, a flexible representation that encodes both position and orientation targets for the end-effector. This two-level design plays to the strengths of each method: BC handles high-dimensional visual data, and RL handles control of a dynamic system.
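To make the trajectory parameterization concrete, the following is a minimal sketch of how a rational Bézier curve can be evaluated from control points and weights. The function name, the curve degree, and the weights are assumptions chosen for illustration; the summary does not state the degree, the weights, or how orientation targets are encoded in the paper, so only the positional part is shown.

```python
import numpy as np
from math import comb


def rational_bezier(control_points, weights, t):
    """Evaluate a rational Bezier curve at parameter values t in [0, 1].

    control_points: (n+1, d) array-like of control points.
    weights:        (n+1,) array-like of positive weights.
    t:              scalar or (m,) array-like of curve parameters.
    Returns an (m, d) array of points on the curve.
    """
    P = np.asarray(control_points, dtype=float)
    w = np.asarray(weights, dtype=float)
    t = np.atleast_1d(np.asarray(t, dtype=float))
    n = len(P) - 1

    # Bernstein basis: B[i, k] = C(n, k) * t_i^k * (1 - t_i)^(n - k)
    k = np.arange(n + 1)
    binom = np.array([comb(n, j) for j in k], dtype=float)
    B = binom * t[:, None] ** k * (1.0 - t[:, None]) ** (n - k)

    # Weighted basis, normalized so the weighted coefficients sum to one.
    wB = B * w
    return (wB @ P) / wB.sum(axis=1, keepdims=True)
```

With all weights equal, this reduces to an ordinary Bézier curve. Raising one weight pulls the curve toward the corresponding control point, so a small set of numbers (control points plus weights) can describe a whole end-effector path, which is what makes the representation convenient as a planner output.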

High-level Planner

At the heart of the high-level planner is a diffusion-based BC policy, trained on a dataset of expert demonstrations, that maps visual point clouds and robot states to trajectory parameters. These parameters describe the desired manipulation motion of the robot in a scenario-independent manner. The expert demonstrations are collected in simulation, which yields a diverse dataset and helps the learned planner adapt across tasks.
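As a rough illustration of how a diffusion-based policy produces trajectory parameters at inference time, the sketch below runs a standard DDPM-style reverse (denoising) loop conditioned on an observation vector. Everything here is an assumption for illustration: the noise schedule, the step count, the function names, and above all `make_oracle_eps_model`, which is a hypothetical stand-in for the trained noise-prediction network. It is not the paper's architecture or sampler.

```python
import numpy as np


def make_schedule(num_steps=50, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (an assumed choice, common in DDPM examples)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars


def sample_trajectory_params(eps_model, cond, dim, schedule, rng):
    """Denoise a random vector into trajectory parameters, given conditioning.

    eps_model(x, t, cond) must return the predicted noise in x at step t.
    cond stands in for encoded point-cloud and robot-state features.
    """
    betas, alphas, alpha_bars = schedule
    x = rng.standard_normal(dim)  # start from pure Gaussian noise
    for t in reversed(range(len(betas))):
        eps = eps_model(x, t, cond)
        # Posterior mean of the previous, less noisy sample.
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        # No noise is added on the final step.
        noise = rng.standard_normal(dim) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x


def make_oracle_eps_model(schedule, cond_to_params):
    """Hypothetical stand-in for a trained network.

    It assumes the demonstrations contain exactly one parameter vector per
    observation, given by cond_to_params, so the ideal noise prediction has
    a closed form. A real policy would learn this mapping from data.
    """
    _, _, alpha_bars = schedule

    def eps_model(x, t, cond):
        x0 = cond_to_params(cond)
        return (x - np.sqrt(alpha_bars[t]) * x0) / np.sqrt(1.0 - alpha_bars[t])

    return eps_model
```

In this toy setup the sampler recovers the vector that `cond_to_params` assigns to the observation, which is a convenient sanity check for the loop itself. With a learned network, the same loop would instead draw a sample from the distribution of demonstrated trajectory parameters for that observation.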

Low-level Controller

The low-level controller, trained with RL, tracks both position and orientation targets for the end-effector while maintaining dynamic stability across all limbs. The controller relies on the parameterized target trajectories, which allows real-time adjustment and robust handling of external disturbances during task execution.
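One common way to train a controller to track position and orientation targets is to reward small tracking errors. The sketch below shows such a reward under stated assumptions: the exponential form, the scale values `sigma_pos` and `sigma_rot`, and the function names are illustrative choices, not the reward used in the paper, which this summary does not describe.

```python
import numpy as np


def quat_angle(q_a, q_b):
    """Smallest rotation angle in radians between two quaternions."""
    q_a = np.asarray(q_a, dtype=float)
    q_b = np.asarray(q_b, dtype=float)
    q_a = q_a / np.linalg.norm(q_a)
    q_b = q_b / np.linalg.norm(q_b)
    # q and -q encode the same rotation, so use the absolute dot product.
    dot = abs(float(np.dot(q_a, q_b)))
    return 2.0 * np.arccos(np.clip(dot, 0.0, 1.0))


def tracking_reward(pos, quat, target_pos, target_quat,
                    sigma_pos=0.05, sigma_rot=0.5):
    """Reward in (0, 2]: one term for position error, one for orientation error.

    sigma_pos and sigma_rot are assumed scales that set how quickly each
    term decays as the error grows.
    """
    pos_err = np.linalg.norm(np.asarray(pos, dtype=float)
                             - np.asarray(target_pos, dtype=float))
    rot_err = quat_angle(quat, target_quat)
    return float(np.exp(-pos_err ** 2 / sigma_pos)
                 + np.exp(-rot_err ** 2 / sigma_rot))
```

At each control step, the target pose would come from evaluating the parameterized trajectory at the current phase, and the measured end-effector pose from the robot's kinematics. A full training setup would add further terms, for example for balance and energy use, which are omitted here.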

Design of Tasks for Loco-Manipulation

To evaluate and demonstrate the efficacy of the proposed framework, a suite of tasks encompassing a broad range of manipulation challenges was introduced. These tasks, designed around daily scenarios that a quadruped robot could encounter, necessitate a combination of intricate maneuvers including pushing, pulling, lifting, and precise positional adjustments, providing a comprehensive assessment platform for loco-manipulation capabilities.

Experiments

Experiments show that the hierarchical learning framework achieves higher task success rates than baseline approaches across multiple scenarios. The framework is also data-efficient: the high-level planner is trained on around 20,000 timesteps of visual data, a fraction of what the baselines require. Sim-to-real experiments confirm the practical viability of the approach, with tasks executed successfully in real-world settings without further adjustments or fine-tuning. The control policy tracks the end-effector targets precisely under varied task conditions, which indicates robustness to disturbances and the ability to adapt in real time.

Conclusion and Future Work

This research marks a significant step toward versatile and dynamically stable loco-manipulation with quadruped robots that carry no additional manipulation equipment. The hierarchical learning framework, which merges BC and RL, shows a path toward autonomous execution of complex real-world tasks through leg-based manipulation. Future work may explore expanding the diversity and complexity of tasks, improving data collection methods to enhance adaptability, and developing strategies to further bridge the gap between simulation and real-world applications.
