
Learning Human-to-Humanoid Real-Time Whole-Body Teleoperation

(2403.04436)
Published Mar 7, 2024 in cs.RO, cs.AI, cs.LG, cs.SY, and eess.SY

Abstract

We present Human to Humanoid (H2O), a reinforcement learning (RL) based framework that enables real-time whole-body teleoperation of a full-sized humanoid robot with only an RGB camera. To create a large-scale retargeted motion dataset of human movements for humanoid robots, we propose a scalable "sim-to-data" process to filter and pick feasible motions using a privileged motion imitator. Afterwards, we train a robust real-time humanoid motion imitator in simulation using these refined motions and transfer it to the real humanoid robot in a zero-shot manner. We successfully achieve teleoperation of dynamic whole-body motions in real-world scenarios, including walking, back jumping, kicking, turning, waving, pushing, boxing, etc. To the best of our knowledge, this is the first demonstration to achieve learning-based real-time whole-body humanoid teleoperation.

Overview

  • The paper introduces Human to Humanoid (H2O), a novel framework for real-time whole-body teleoperation of humanoid robots using just an RGB camera.

  • H2O employs reinforcement learning and a "sim-to-data" process to refine human motion data for compatibility with humanoid constraints, achieving zero-shot transfer from simulation to the real robot.

  • The framework demonstrates the ability to have a humanoid robot mimic dynamic human motions with high fidelity in real time.

  • The successful application of H2O suggests significant future advancements in humanoid robot capabilities and human-robot collaboration in complex tasks.


Introduction

Humanoid robots, by virtue of their design, offer unique advantages for tasks that require human-like interaction with the environment. However, controlling these robots to replicate the wide range of human motions in real-time presents a substantial challenge. Conventional model-based approaches to humanoid teleoperation often require simplifications and are heavily reliant on external sensor setups, limiting their applicability in dynamic tasks. Recent advancements in reinforcement learning offer promising solutions, but the application of these techniques to real-world humanoid teleoperation, particularly at the whole-body level, remains largely unexplored.

This study presents Human to Humanoid (H2O), a novel framework that enables the teleoperation of a full-sized humanoid robot using only an RGB camera. The approach is grounded in reinforcement learning, augmented with a "sim-to-data" process that refines a large-scale human motion dataset for feasibility with real-world humanoid constraints. Crucially, this system achieves zero-shot transfer from simulation to real-world application, demonstrating the robot's capability to mimic dynamic human motions such as walking, kicking, and complex gesturing in real time.
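The real-time pipeline described here — an RGB frame is lifted to a human pose estimate, the pose is retargeted into a motion goal for the humanoid, and an RL policy maps the robot state and goal to an action — can be sketched as a single control step. This is a minimal sketch only; the component names (`pose_estimator`, `retarget`, `policy`) are hypothetical placeholders, not the paper's actual interfaces:

```python
def teleoperation_step(frame, pose_estimator, retarget, policy, robot_state):
    """One teleoperation control cycle: RGB frame -> human pose -> goal -> action.

    All callables are stand-ins for the framework's components:
      pose_estimator(frame)        -> 3D human pose from the RGB image
      retarget(human_pose)         -> humanoid-compatible reference motion
      policy(robot_state, goal)    -> RL policy output (e.g., joint targets)
    """
    human_pose = pose_estimator(frame)   # perceive the operator from one camera
    goal = retarget(human_pose)          # map human motion onto the humanoid body
    action = policy(robot_state, goal)   # track the goal with the learned policy
    return action
```

Running this loop once per camera frame is what makes the teleoperation "real-time": no offline trajectory optimization sits between the operator and the robot.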

Methodology

Retargeting Human Motions

A key innovation in H2O is its scalable approach to retargeting human motion for humanoid robots. This process involves adjusting large-scale human motion data to fit the physical constraints and capabilities of a humanoid. The study introduces a two-step retargeting process, initially adapting the human body model (SMPL) to match the humanoid's structure, followed by a novel "sim-to-data" method. This method employs a privileged motion imitator to filter out infeasible motions from the retargeted dataset, resulting in a refined set of motions that the real-world humanoid can feasibly execute.
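The "sim-to-data" filtering step can be sketched as follows: roll out the privileged imitator on each retargeted clip in simulation and keep only the clips it can track within an error bound. This is an illustrative sketch under assumed interfaces — the function names, error metric, and threshold are placeholders, not the paper's implementation:

```python
def filter_feasible_motions(clips, tracking_error_fn, err_threshold=0.5):
    """Keep only clips a privileged motion imitator can track in simulation.

    clips:             dict mapping clip name -> motion sequence
    tracking_error_fn: callable(clip) -> per-frame tracking errors from a
                       simulated rollout of the privileged imitator
    err_threshold:     illustrative cutoff, not the paper's actual value
    """
    feasible = {}
    for name, clip in clips.items():
        errors = tracking_error_fn(clip)
        # Reject a clip if the imitator's tracking error ever exceeds the bound:
        # such motions are physically infeasible for the humanoid.
        if max(errors) < err_threshold:
            feasible[name] = clip
    return feasible
```

The design point is that feasibility is decided by a simulated controller rather than by hand-written kinematic rules, which is what makes the filtering scale to a large motion dataset.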

Real-Time Whole-Body Teleoperation Training

The study articulates a comprehensive training regimen for the humanoid robot, utilizing the retargeted and refined motion dataset. Key to this process is the formulation of an appropriate state space that captures essential motion details while remaining computationally tractable for real-time application. The framework incorporates advanced domain randomization techniques to ensure robustness and generalization of the control policy from simulation to real-world execution.
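Domain randomization of the kind mentioned above typically perturbs simulator physics each training episode so the policy cannot overfit to one dynamics model. A minimal sketch, assuming typical randomized quantities (friction, link masses, motor strength, action delay) with illustrative ranges — neither the parameter set nor the ranges are taken from the paper:

```python
import random
from dataclasses import dataclass

@dataclass
class SimParams:
    friction: float
    link_mass_scale: float
    motor_strength_scale: float
    action_delay_steps: int

# Illustrative ranges only; the paper's actual randomization ranges are not given here.
RANGES = {
    "friction": (0.5, 1.25),
    "link_mass_scale": (0.9, 1.1),
    "motor_strength_scale": (0.8, 1.2),
    "action_delay_steps": (0, 2),
}

def sample_domain_randomization(rng=random):
    """Draw one randomized set of simulator parameters for a training episode."""
    return SimParams(
        friction=rng.uniform(*RANGES["friction"]),
        link_mass_scale=rng.uniform(*RANGES["link_mass_scale"]),
        motor_strength_scale=rng.uniform(*RANGES["motor_strength_scale"]),
        action_delay_steps=rng.randint(*RANGES["action_delay_steps"]),
    )
```

Resampling these parameters at every episode reset forces the policy to remain stable across a family of dynamics, which is what underwrites the zero-shot sim-to-real transfer.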

Evaluation and Results

The H2O framework was rigorously tested in both simulated environments and real-world scenarios. In simulation, H2O demonstrated superior performance in motion tracking accuracy and success rate over baseline approaches. Notably, the framework exhibited significant resilience to data reduction, maintaining high success rates even when trained on a substantially smaller motion dataset. In real-world tests, H2O enabled a full-sized Unitree H1 humanoid robot to replicate a wide variety of dynamic human motions with high fidelity, signifying a substantial leap forward in humanoid teleoperation capabilities.

Implications and Future Directions

The success of H2O in enabling advanced humanoid teleoperation has broad implications for the use of humanoid robots in environments that demand human-like dexterity and adaptability. Looking ahead, the study outlines potential avenues for further research, including improving the representation of motion goals, closing the embodiment gap between humans and robots, and advancing human-robot interaction to make teleoperation more efficient and intuitive.

In summary, H2O represents a significant advancement in the field of humanoid robotics, offering a robust and scalable solution for real-time whole-body teleoperation using only an RGB camera. This work not only extends the frontier of humanoid robot control but also paves the way for greater human-robot collaboration in complex real-world tasks.
