
Learning Multimodal Latent Dynamics for Human-Robot Interaction (2311.16380v1)

Published 27 Nov 2023 in cs.RO, cs.HC, and cs.LG

Abstract: This article presents a method for learning well-coordinated Human-Robot Interaction (HRI) from Human-Human Interactions (HHI). We devise a hybrid approach using Hidden Markov Models (HMMs) as the latent space priors for a Variational Autoencoder to model a joint distribution over the interacting agents. We leverage the interaction dynamics learned from HHI to learn HRI and incorporate the conditional generation of robot motions from human observations into the training, thereby predicting more accurate robot trajectories. The generated robot motions are further adapted with Inverse Kinematics to ensure the desired physical proximity with a human, combining the ease of joint space learning and accurate task space reachability. For contact-rich interactions, we modulate the robot's stiffness using HMM segmentation for a compliant interaction. We verify the effectiveness of our approach deployed on a Humanoid robot via a user study. Our method generalizes well to various humans despite being trained on data from just two humans. We find that Users perceive our method as more human-like, timely, and accurate and rank our method with a higher degree of preference over other baselines.

Authors (6)

  1. Vignesh Prasad
  2. Lea Heitlinger
  3. Dorothea Koert
  4. Ruth Stock-Homburg
  5. Jan Peters
  6. Georgia Chalvatzaki
Citations (2)

Summary

  • The paper introduces a novel hybrid approach combining HMMs and VAEs to effectively learn and predict interactive motion dynamics.
  • It shows that adapting the generated motions with inverse kinematics ensures the desired physical proximity to the human, combining the ease of joint-space learning with accurate task-space reachability.
  • Results from a humanoid robot study reveal enhanced human-likeness, timeliness, and overall preference over baseline methods.

Understanding Human-Robot Interaction Dynamics

Conceptualization

The study of Human-Robot Interaction (HRI) is crucial for advancing collaborative and assistive robotics. A critical feature of effective HRI is robot motion that is synchronized and well-coordinated with human actions. Achieving this requires systems that can understand and predict human motion and generate corresponding responsive robot motions, a task often approached by learning from Human-Human Interactions (HHI). One such approach combines insights from HHI with machine learning models, grounding the learning process in a latent representation of the interaction dynamics between human and robot.

Methodology

The paper's hybrid approach integrates Hidden Markov Models (HMMs) with Variational Autoencoders (VAEs), using HMMs as latent-space priors within a VAE to model a joint distribution over the interacting agents. By incorporating the conditional generation of robot motions from human observations directly into training, the model predicts more accurate robot trajectories. In real-world deployment, however, the generated motions need fine-tuning to interact with a specific human: Inverse Kinematics (IK) adapts the robot's trajectories so that its movements are not only accurate in joint space but also reach the desired physical proximity to the human in task space. Additionally, for contact-rich interactions such as handshakes, the robot's stiffness is modulated using the HMM segmentation, yielding a more compliant and lifelike interaction.
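The core mechanism of conditionally generating robot motion from human observations under an HMM prior can be sketched as follows. This is a minimal, hypothetical illustration: the function and variable names are our own, and the paper actually performs this conditioning in a learned VAE latent space rather than on raw trajectories. It shows the standard pattern of an HMM forward pass over the human's part of the joint latent, followed by per-state Gaussian conditioning blended by the state responsibilities.

```python
# Hypothetical sketch: predicting the robot's latent trajectory from the
# human's observed latent trajectory under a joint HMM. All names here
# (means, covs, trans, init) are illustrative assumptions, not the paper's API.
import numpy as np

def gauss_pdf(x, mu, S):
    """Density of a multivariate Gaussian N(mu, S) at x."""
    d = x - mu
    return np.exp(-0.5 * d @ np.linalg.solve(S, d)) / np.sqrt(
        np.linalg.det(2 * np.pi * S))

def condition_robot_on_human(z_h, means, covs, trans, init):
    """Infer robot latents z_r given human latents z_h under a joint HMM.

    z_h   : (T, d_h) observed human latent trajectory
    means : (K, d_h + d_r) per-state means over the joint latent [z_h, z_r]
    covs  : (K, d_h + d_r, d_h + d_r) per-state joint covariances
    trans : (K, K) state transition matrix
    init  : (K,) initial state distribution
    """
    K, d_h = len(means), z_h.shape[1]
    T = len(z_h)
    alpha = np.zeros((T, K))                      # forward probabilities
    z_r = np.zeros((T, means.shape[1] - d_h))     # predicted robot latents
    for t in range(T):
        # Likelihood of the human observation under each state's marginal.
        lik = np.array([gauss_pdf(z_h[t], means[k][:d_h], covs[k][:d_h, :d_h])
                        for k in range(K)])
        pred = init if t == 0 else alpha[t - 1] @ trans
        alpha[t] = pred * lik
        alpha[t] /= alpha[t].sum()
        # Gaussian conditioning per state, blended by state responsibilities.
        for k in range(K):
            mu_h, mu_r = means[k][:d_h], means[k][d_h:]
            S_hh = covs[k][:d_h, :d_h]
            S_rh = covs[k][d_h:, :d_h]
            z_r[t] += alpha[t, k] * (
                mu_r + S_rh @ np.linalg.solve(S_hh, z_h[t] - mu_h))
    return z_r
```

The predicted robot latents would then be decoded to joint-space motion by the VAE decoder and refined with IK, as described above.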

Experimental Results

The paper reports on deploying this approach on a humanoid robot in a user study assessing the effectiveness of the trained model. Despite being trained on demonstrations from just two humans, the model generalizes well to various users and was preferred over the baseline methods. The hybrid training procedure pays dividends particularly in how human partners perceive the robot's movements: participants rated them as more human-like, timely, and accurate, and ranked the method with a higher degree of preference.
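The compliance behavior deployed in the contact-rich phases can likewise be sketched. The following is a hypothetical illustration of HMM-segment-based stiffness modulation: the specific states, gain values, and interface are our own assumptions, not the paper's controller, but the pattern — blending per-state stiffness by the HMM state responsibilities, with lower stiffness in contact segments — matches the idea described above.

```python
# Hypothetical sketch of HMM-segment-based stiffness modulation for a
# contact-rich interaction (e.g. a handshake). Gains and state roles are
# illustrative assumptions, not values from the paper.
import numpy as np

# Per-state stiffness scale: high while reaching in free space (states 0-1),
# low once the segmentation indicates contact (states 2-3).
STIFFNESS = {0: 1.0, 1: 1.0, 2: 0.35, 3: 0.35}

def stiffness_profile(state_probs):
    """Blend per-state stiffness by the HMM state responsibilities.

    state_probs : (T, K) forward probabilities from the HMM
    returns     : (T,) stiffness scaling per timestep
    """
    gains = np.array([STIFFNESS[k] for k in range(state_probs.shape[1])])
    return state_probs @ gains
```

Because the blend follows the soft state responsibilities rather than a hard segment switch, the commanded stiffness transitions smoothly as the interaction enters and leaves contact.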

Conclusions

This work marks a step forward in the nuanced field of HRI, showcasing a methodology that lets robots learn from HHI through a blend of statistical and deep learning methods. The success of the approach is evident in generated movements that humans perceive as more natural and cooperative within an interaction. The system's ability not only to execute learned motions but also to adapt them in real time to the specific human partner opens opportunities for more personalized and intuitive HRI.