
MoCapAct: A Multi-Task Dataset for Simulated Humanoid Control (2208.07363v3)

Published 15 Aug 2022 in cs.RO, cs.GR, cs.LG, cs.SY, and eess.SY

Abstract: Simulated humanoids are an appealing research domain due to their physical capabilities. Nonetheless, they are also challenging to control, as a policy must drive an unstable, discontinuous, and high-dimensional physical system. One widely studied approach is to utilize motion capture (MoCap) data to teach the humanoid agent low-level skills (e.g., standing, walking, and running) that can then be re-used to synthesize high-level behaviors. However, even with MoCap data, controlling simulated humanoids remains very hard, as MoCap data offers only kinematic information. Finding physical control inputs to realize the demonstrated motions requires computationally intensive methods like reinforcement learning. Thus, despite the publicly available MoCap data, its utility has been limited to institutions with large-scale compute. In this work, we dramatically lower the barrier for productive research on this topic by training and releasing high-quality agents that can track over three hours of MoCap data for a simulated humanoid in the dm_control physics-based environment. We release MoCapAct (Motion Capture with Actions), a dataset of these expert agents and their rollouts, which contain proprioceptive observations and actions. We demonstrate the utility of MoCapAct by using it to train a single hierarchical policy capable of tracking the entire MoCap dataset within dm_control and show the learned low-level component can be re-used to efficiently learn downstream high-level tasks. Finally, we use MoCapAct to train an autoregressive GPT model and show that it can control a simulated humanoid to perform natural motion completion given a motion prompt. Videos of the results and links to the code and dataset are available at https://microsoft.github.io/MoCapAct.

Authors (6)
  1. Nolan Wagener
  2. Andrey Kolobov
  3. Felipe Vieira Frujeri
  4. Ricky Loynd
  5. Ching-An Cheng
  6. Matthew Hausknecht

Summary

  • The paper trains per-snippet expert policies on segmented MoCap clips, achieving high tracking accuracy in simulated humanoid control.
  • It introduces MoCapAct, a dataset of 2589 motion snippets that lowers computational barriers and democratizes research in advanced motion control.
  • The study leverages hierarchical and autoregressive models to enhance skill generalization and enable generative motion completion using GPT.

Overview of "MoCapAct: A Multi-Task Dataset for Simulated Humanoid Control"

The paper "MoCapAct: A Multi-Task Dataset for Simulated Humanoid Control" addresses the challenges and opportunities associated with controlling simulated humanoid agents using motion capture (MoCap) data. This work focuses on training policies that can manage the high-dimensional, unstable, and discontinuous nature of humanoid simulation environments, particularly when leveraging MoCap data for low-level skill acquisition.

The authors introduce MoCapAct, a dataset that significantly lowers the computational barriers typically associated with large-scale MoCap data usage. By training and releasing expert policies capable of tracking over three hours of data within a physics-based environment, the authors enable broader access for research institutions constrained by computational resources. These experts facilitate the synthesis of high-level humanoid behaviors from MoCap kinematic information, allowing MoCapAct to serve as a foundation for more advanced AI and machine learning research in humanoid control.
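
To make the rollout data concrete: the abstract notes that each rollout pairs proprioceptive observations with the expert's actions. The sketch below shows one way to iterate over such data for imitation learning; the file name and the "observations"/"actions" keys are illustrative assumptions, not the released dataset's actual schema.

```python
# Minimal sketch of consuming MoCapAct-style rollouts for imitation learning.
# The file name and the "observations"/"actions" keys are hypothetical;
# consult the released dataset for the actual HDF5 schema.
import h5py
import numpy as np

def load_snippet_rollouts(path: str):
    """Yield (snippet_id, observations, actions) for each snippet in the file."""
    with h5py.File(path, "r") as f:
        for snippet_id in f.keys():
            group = f[snippet_id]
            obs = np.asarray(group["observations"])  # (T, obs_dim) proprioception
            act = np.asarray(group["actions"])       # (T, act_dim) joint controls
            yield snippet_id, obs, act

if __name__ == "__main__":
    for snippet_id, obs, act in load_snippet_rollouts("mocapact_rollouts.hdf5"):
        print(snippet_id, obs.shape, act.shape)
```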

Key Contributions

  1. Expert Policy Training: The authors split long MoCap clips into manageable snippets and train an individual expert policy to track each one. Each expert is a Gaussian policy optimized with Proximal Policy Optimization (PPO), yielding high tracking accuracy across diverse motion types (a PPO training sketch follows this list).
  2. Release of MoCapAct: The dataset encompasses 2589 snippets derived from over 800 MoCap clips, offering extensive coverage of human motion styles and complexities. This resource democratizes access to high-quality motion datasets that were formerly only feasible for institutions with substantial computational capabilities.
  3. Multi-Clip Tracking Policy: Using a hierarchical encoder-decoder design, the authors train a single policy that tracks the entire range of MoCap snippets, showcasing potential applications in hierarchical reinforcement learning and skill generalization (an architectural sketch follows this list).
  4. Generative Motion Models with GPT: The authors train an autoregressive GPT on the dataset and show it can complete a motion given a short motion prompt, highlighting the dataset's utility not only for control but also for generative models of motion (a rollout-loop sketch follows this list).
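
For the per-snippet experts (item 1), a minimal training sketch using Stable-Baselines3's PPO implementation is shown below. The `make_snippet_tracking_env` helper is hypothetical, standing in for a Gym-wrapped dm_control clip-tracking task, and the timestep budget is illustrative rather than the paper's setting.

```python
# Hypothetical sketch: train one tracking expert per MoCap snippet with PPO.
from stable_baselines3 import PPO

def make_snippet_tracking_env(snippet_id: str):
    """Stand-in for a Gym-wrapped dm_control clip-tracking task;
    not part of the released code."""
    raise NotImplementedError("wrap the dm_control tracking task for this snippet")

def train_expert(snippet_id: str, total_timesteps: int = 1_000_000):
    env = make_snippet_tracking_env(snippet_id)
    # SB3's "MlpPolicy" uses a diagonal Gaussian over continuous actions,
    # matching the Gaussian policy framework described in item 1.
    model = PPO("MlpPolicy", env, verbose=0)
    model.learn(total_timesteps=total_timesteps)
    model.save(f"expert_{snippet_id}")
    return model
```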
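For the multi-clip policy (item 3), the encoder-decoder split is the key architectural idea: the encoder summarizes the upcoming MoCap reference into a low-dimensional skill latent, while the decoder acts from proprioception and that latent alone. The PyTorch sketch below is schematic; the layer widths, dimensions, and stochastic latent parameterization are illustrative assumptions, not the paper's values.

```python
import torch
import torch.nn as nn

class MultiClipPolicy(nn.Module):
    """Schematic encoder-decoder policy: the encoder maps the upcoming MoCap
    reference to a skill latent z; the decoder maps (proprioception, z) to an
    action. All sizes are illustrative."""
    def __init__(self, obs_dim=100, ref_dim=200, act_dim=56, latent_dim=20):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(obs_dim + ref_dim, 512), nn.ReLU(),
            nn.Linear(512, 2 * latent_dim),  # mean and log-std of z
        )
        self.decoder = nn.Sequential(
            nn.Linear(obs_dim + latent_dim, 512), nn.ReLU(),
            nn.Linear(512, act_dim), nn.Tanh(),
        )

    def forward(self, obs, reference):
        stats = self.encoder(torch.cat([obs, reference], dim=-1))
        mean, log_std = stats.chunk(2, dim=-1)
        z = mean + log_std.exp() * torch.randn_like(mean)  # reparameterized sample
        action = self.decoder(torch.cat([obs, z], dim=-1))
        return action, z
```

Because the decoder never sees the reference directly, it can be reused for downstream tasks by training a new high-level policy that emits the latent in place of the encoder.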
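For motion completion (item 4), control becomes sequence modeling: the model conditions on a sliding window of recent observations and autoregressively emits the next action, which is executed in the simulator to extend the motion. The rollout loop below is a sketch; `env` and `gpt_policy` are hypothetical stand-ins for the simulator and trained model, and the classic 4-tuple Gym step signature is assumed.

```python
import collections
import numpy as np

def complete_motion(env, gpt_policy, prompt_obs, horizon=500, context_len=32):
    """Roll out a GPT-style policy from a motion prompt. `env` and
    `gpt_policy` are hypothetical stand-ins; a classic Gym step signature
    (obs, reward, done, info) is assumed."""
    context = collections.deque(prompt_obs, maxlen=context_len)
    trajectory = []
    for _ in range(horizon):
        action = gpt_policy(np.stack(context))  # predict next action from context
        obs, reward, done, info = env.step(action)
        context.append(obs)
        trajectory.append((obs, action))
        if done:
            break
    return trajectory
```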

Results and Implications

Quantitatively, the trained experts achieve a mean normalized episode reward of 0.816, and the multi-clip policy performs at roughly 80-84% of the experts' level. The paper also corroborates the practical utility of the pre-trained low-level policy in new reinforcement learning setups, demonstrating faster convergence and more natural motion than training from scratch.

Furthermore, the research offers promising implications for the future of AI, particularly in learning-based humanoid control, hierarchical task learning, and motion imitation in robotics and animation industries. MoCapAct serves as a platform for further exploration into advanced AI techniques such as offline reinforcement learning and decision transformers.

Speculation on Future Developments

This paper sets the stage for several future research directions. The accessibility of MoCapAct could spur innovations in:

  • Hierarchical Task Learning: Enabling the seamless integration of multiple learned skills into complex, coordinated behaviors across adaptive environments.
  • Robot Interaction and Coordination: Leveraging MoCapAct to simulate and refine multi-agent interactions, advancing cooperative AI for humanoid robots.
  • Transfer Learning and Self-Supervised Learning: Utilizing the dataset to explore transfer and self-supervised learning paradigms, enhancing generalization capabilities of humanoid policies across varying contexts.

In conclusion, the release of MoCapAct is instrumental in reducing resource barriers in simulated humanoid control research. The dataset and accompanying methodologies empower a wider range of researchers to engage with complex motion control studies, promoting collaborative advancements in AI and robotics.