Learning Visuotactile Skills with Two Multifingered Hands

(2404.16823)
Published Apr 25, 2024 in cs.RO, cs.AI, cs.CV, and cs.LG

Abstract

Aiming to replicate human-like dexterity, perceptual experiences, and motion patterns, we explore learning from human demonstrations using a bimanual system with multifingered hands and visuotactile data. Two significant challenges exist: the lack of an affordable and accessible teleoperation system suitable for a dual-arm setup with multifingered hands, and the scarcity of multifingered hand hardware equipped with touch sensing. To tackle the first challenge, we develop HATO, a low-cost hands-arms teleoperation system that leverages off-the-shelf electronics, complemented with a software suite that enables efficient data collection; the comprehensive software suite also supports multimodal data processing, scalable policy learning, and smooth policy deployment. To tackle the latter challenge, we introduce a novel hardware adaptation by repurposing two prosthetic hands equipped with touch sensors for research. Using visuotactile data collected from our system, we learn skills to complete long-horizon, high-precision tasks which are difficult to achieve without multifingered dexterity and touch feedback. Furthermore, we empirically investigate the effects of dataset size, sensing modality, and visual input preprocessing on policy learning. Our results mark a promising step forward in bimanual multifingered manipulation from visuotactile data. Videos, code, and datasets can be found at https://toruowo.github.io/hato/ .

Learned skills in tasks needing bimanual dexterity, including handovers, stacking, pouring, and serving.

Overview

  • The paper introduces a bimanual teleoperation system called HATO, utilizing multifingered hands to enhance robotic dexterity through the integration of visual and tactile data.

  • HATO system employs UR5e robot arms and customized prosthetic hands equipped with tactile sensors, controlled via Meta Quest 2 VR controllers for intuitive human-like manipulation.

  • Experiments demonstrated successful execution of complex tasks, underscoring the importance of combined sensory inputs for advanced robotic handling and policy learning.

Exploring Bimanual Multifingered Manipulation Using Visuotactile Data

Introduction

In pursuit of enhanced robotic dexterity, this study introduces a bimanual system with multifingered hands that leverages both visual and tactile data. Addressing the lack of affordable teleoperation systems and the limited availability of multifingered hands equipped with tactile sensors, the work develops a teleoperation system named HATO. The system uses commercial VR hardware for efficient data collection and policy learning, aiming to emulate complex human-like manipulation skills.

System Development and Challenges

The work outlines two primary innovations: the HATO system and the adaptation of prosthetic hands for detailed tactile sensing.

HATO: Hands-Arms Teleoperation

  • Hardware Utilization: The system pairs two UR5e robot arms with repurposed prosthetic hands, each equipped with tactile sensors.
  • Control Scheme: Meta Quest 2 controllers drive the setup, with controller motions mapped to robot arm movements and button and trigger inputs mapped to hand joint commands, providing intuitive control for complex task requirements (a minimal illustrative mapping is sketched below).
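
A minimal sketch of this style of teleoperation mapping, not the authors' implementation: controller translation drives the end-effector target, and the trigger interpolates the hand between open and closed joint configurations. The helper functions, calibration offset, and joint counts below are illustrative assumptions.

```python
import numpy as np

# Placeholder I/O helpers -- stand-ins for whatever VR SDK and robot drivers are used.
def get_controller_state():
    """Return (position xyz, orientation quaternion wxyz, trigger in [0, 1])."""
    return np.zeros(3), np.array([1.0, 0.0, 0.0, 0.0]), 0.5

def send_arm_pose(position, orientation):
    """Command the arm end-effector to a Cartesian pose (placeholder)."""
    print("arm target:", position, orientation)

def send_hand_joints(joint_angles):
    """Command the multifingered hand to the given joint angles (placeholder)."""
    print("hand joints:", joint_angles)

# Assumed calibration offset between the VR frame and the robot base frame.
VR_TO_ROBOT_OFFSET = np.array([0.4, 0.0, 0.2])
FINGER_OPEN = np.zeros(6)        # fully open hand (6 actuated joints assumed)
FINGER_CLOSED = np.full(6, 1.4)  # fully closed hand

def teleop_step():
    """One teleoperation tick: controller pose -> arm pose, trigger -> grasp aperture."""
    pos, quat, trigger = get_controller_state()
    # Arm: controller translation maps directly to end-effector translation.
    send_arm_pose(pos + VR_TO_ROBOT_OFFSET, quat)
    # Hand: trigger value linearly interpolates between open and closed joint angles.
    joints = (1.0 - trigger) * FINGER_OPEN + trigger * FINGER_CLOSED
    send_hand_joints(joints)
```

In a real dual-arm setup, one such loop would run per controller/arm/hand pair.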

Multifingered Hands

  • Hand Design: Originally prosthetic devices, these hands are adapted with custom PCBs to facilitate research use, offering extensive touch sensitivity crucial for handling intricate tasks.

Methodology and Data Handling

The research team collected multimodal data using a comprehensive teleoperation setup, capturing precise robotic manipulations across various tasks.

Data Collection Process

  • Diverse sensory inputs, including proprioception, touch, and visual data, were synchronized and recorded at a fixed rate, ensuring comprehensive coverage of each manipulation (see the logging sketch below).
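
The sketch below illustrates one way time-aligned multimodal logging can be structured; the 10 Hz rate, the sensor-reading placeholders, and the on-disk format are assumptions for illustration, not the released pipeline or data format.

```python
import time
import numpy as np

RATE_HZ = 10  # assumed logging rate, for illustration only

def read_proprioception():
    """Placeholder: joint angles for both arms and both hands."""
    return np.zeros(12), np.zeros(12)

def read_touch():
    """Placeholder: per-fingertip tactile readings for both hands."""
    return np.zeros(10)

def read_cameras():
    """Placeholder: RGB frames from the mounted cameras."""
    return {"cam0": np.zeros((240, 320, 3), dtype=np.uint8)}

def record_episode(num_steps, out_path):
    """Record time-aligned proprioception, touch, and images at a fixed rate."""
    episode = {"timestamp": [], "proprio": [], "touch": [], "images": []}
    period = 1.0 / RATE_HZ
    for _ in range(num_steps):
        t0 = time.time()
        # Read all modalities back-to-back so each step shares one timestamp.
        episode["timestamp"].append(t0)
        episode["proprio"].append(read_proprioception())
        episode["touch"].append(read_touch())
        episode["images"].append(read_cameras())
        # Sleep off the remainder of the period to hold the target rate.
        time.sleep(max(0.0, period - (time.time() - t0)))
    np.save(out_path, episode, allow_pickle=True)
```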

Policy Learning

  • Policies are trained with a diffusion-based approach that models action sequences from the multimodal dataset, allowing them to predict manipulation actions that mimic human-like dexterity and responsiveness (a schematic sketch follows).
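
To make the idea concrete, here is a schematic sketch of diffusion-style action prediction: an observation embedding conditions an iterative denoising of a short action chunk. The network, noise schedule, dimensions, and DDIM-style sampler below are illustrative assumptions, not the paper's exact architecture or hyperparameters.

```python
import torch

# Illustrative dimensions -- not the paper's actual configuration.
OBS_DIM = 512      # fused vision + touch + proprioception embedding
ACT_DIM = 26       # combined action vector for two arms and two hands (assumed)
HORIZON = 16       # length of the predicted action chunk
NUM_STEPS = 50     # number of diffusion steps

class DenoiseNet(torch.nn.Module):
    """Predicts the noise added to an action chunk, conditioned on the observation."""
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(OBS_DIM + HORIZON * ACT_DIM + 1, 1024),
            torch.nn.ReLU(),
            torch.nn.Linear(1024, HORIZON * ACT_DIM),
        )

    def forward(self, obs_emb, noisy_actions, t):
        t_feat = t.float().unsqueeze(-1) / NUM_STEPS
        x = torch.cat([obs_emb, noisy_actions.flatten(1), t_feat], dim=-1)
        return self.net(x).view(-1, HORIZON, ACT_DIM)

@torch.no_grad()
def sample_actions(model, obs_emb, alphas_cumprod):
    """Iteratively denoise Gaussian noise into an action chunk (deterministic DDIM-style update)."""
    actions = torch.randn(obs_emb.shape[0], HORIZON, ACT_DIM)
    for t in reversed(range(NUM_STEPS)):
        t_batch = torch.full((obs_emb.shape[0],), t)
        eps = model(obs_emb, actions, t_batch)
        alpha_bar = alphas_cumprod[t]
        # Estimate the clean action chunk from the predicted noise.
        x0 = (actions - (1 - alpha_bar).sqrt() * eps) / alpha_bar.sqrt()
        if t > 0:
            alpha_bar_prev = alphas_cumprod[t - 1]
            actions = alpha_bar_prev.sqrt() * x0 + (1 - alpha_bar_prev).sqrt() * eps
        else:
            actions = x0
    return actions

# Example usage with a random observation embedding and a linear noise schedule.
betas = torch.linspace(1e-4, 0.02, NUM_STEPS)
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)
actions = sample_actions(DenoiseNet(), torch.randn(1, OBS_DIM), alphas_cumprod)
```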

Experimental Results and Discussion

The experiments involved four complex bimanual tasks, including slippery-object handover and tool-based tasks such as steak serving. These tasks tested the system’s ability to handle objects of varying textures, weights, and complexities.

Task Performance

  • The system demonstrated high success rates across most tasks, particularly in adaptive grasping and precise manipulation.

Impact of Sensory Modalities

  • Empirical evaluations showed that combining touch and vision was instrumental to effective learning and task robustness, underscoring the value of integrated sensory inputs for policy learning (one common way to set up such modality ablations is sketched below).
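
A minimal sketch of how such modality ablations are often configured (an assumption for illustration, not the paper's code): per-modality features are concatenated into the policy input, and ablating a modality simply drops its features.

```python
import torch

def fuse_observations(proprio, vision_emb=None, touch_emb=None):
    """Concatenate the enabled modality features into a single policy input.

    Passing None for a modality (e.g. touch-only or vision-only training)
    omits its features, so the same policy architecture can be trained and
    compared across sensing configurations.
    """
    parts = [proprio]
    if vision_emb is not None:
        parts.append(vision_emb)
    if touch_emb is not None:
        parts.append(touch_emb)
    return torch.cat(parts, dim=-1)

# Example: vision + touch vs. vision-only inputs for the same batch.
proprio = torch.randn(8, 24)
vision = torch.randn(8, 512)
touch = torch.randn(8, 20)
full_obs = fuse_observations(proprio, vision_emb=vision, touch_emb=touch)  # (8, 556)
vision_only = fuse_observations(proprio, vision_emb=vision)                # (8, 536)
```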

Conclusions and Future Work

The study verifies the effectiveness of a low-cost, multifingered, bimanual system in executing dexterous tasks that approach human-like precision. It opens avenues for future work on incorporating haptic feedback to enrich teleoperation realism and on improving generalization across more diverse settings.

The researchers advocate continuing this line of work, suggesting that it can expand the capabilities of robotic systems to tasks requiring nuanced, human-like dexterity and interaction. The open-source release of the hardware and software platforms used in this research aims to foster further exploration and collaboration within the field.
