Learning Multimodal Latent Dynamics for Human-Robot Interaction

(2311.16380)
Published Nov 27, 2023 in cs.RO, cs.HC, and cs.LG

Abstract

This article presents a method for learning well-coordinated Human-Robot Interaction (HRI) from Human-Human Interactions (HHI). We devise a hybrid approach using Hidden Markov Models (HMMs) as the latent space priors for a Variational Autoencoder to model a joint distribution over the interacting agents. We leverage the interaction dynamics learned from HHI to learn HRI and incorporate the conditional generation of robot motions from human observations into the training, thereby predicting more accurate robot trajectories. The generated robot motions are further adapted with Inverse Kinematics to ensure the desired physical proximity with a human, combining the ease of joint space learning and accurate task space reachability. For contact-rich interactions, we modulate the robot's stiffness using HMM segmentation for a compliant interaction. We verify the effectiveness of our approach deployed on a humanoid robot via a user study. Our method generalizes well to various humans despite being trained on data from just two humans. We find that users perceive our method as more human-like, timely, and accurate and rank our method with a higher degree of preference over other baselines.

Overview

  • The paper addresses the advancement of Human-Robot Interaction (HRI) by creating systems that can anticipate human motion and respond with appropriate robot movements.

  • A hybrid machine learning approach is employed, using Hidden Markov Models (HMMs) and Variational Autoencoders (VAEs) to learn the latent dynamics of HRI.

  • Training incorporates interaction dynamics learned from Human-Human Interaction (HHI) and uses HMMs as latent space priors within VAEs, resulting in more accurate robot trajectory predictions (a minimal code sketch of this idea follows the list below).

  • Inverse Kinematics (IK) adapts the generated joint-space motions so the robot reaches the desired physical proximity to the human, while stiffness modulation makes contact-rich interactions more compliant and natural.

  • A user study showed that the model, despite being trained on data from just two humans, generated movements that users perceived as human-like and preferred over baseline methods.
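
To make the HMM-as-prior idea concrete, here is a minimal, illustrative sketch (not the authors' released code) of a VAE whose latent prior is supplied by an HMM's state-conditional Gaussians rather than a fixed standard normal. The class name, network sizes, and the simplification of weighting per-state Gaussian parameters by the HMM state responsibilities are assumptions made purely for illustration.

```python
# Illustrative sketch (not the authors' code): a VAE whose latent prior comes
# from an HMM's state-conditional Gaussians instead of a fixed N(0, I).
# Dimensions, names, and the responsibility-weighted prior are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class HMMPriorVAE(nn.Module):
    def __init__(self, obs_dim=12, latent_dim=5, n_states=8, hidden=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(), nn.Linear(hidden, 2 * latent_dim)
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, hidden), nn.ReLU(), nn.Linear(hidden, obs_dim)
        )
        # HMM emission parameters in the latent space: one Gaussian per hidden state.
        self.prior_mu = nn.Parameter(torch.randn(n_states, latent_dim))
        self.prior_logvar = nn.Parameter(torch.zeros(n_states, latent_dim))

    def forward(self, x, state_resp):
        # x: (B, obs_dim) pose of one agent at a timestep;
        # state_resp: (B, n_states) HMM state responsibilities (e.g. forward variables).
        mu, logvar = self.encoder(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
        recon = self.decoder(z)
        # Prior parameters: responsibility-weighted average of the state Gaussians.
        p_mu = state_resp @ self.prior_mu
        p_logvar = state_resp @ self.prior_logvar
        # KL divergence between the encoder posterior and the HMM-defined prior.
        kl = 0.5 * (
            p_logvar - logvar
            + (logvar.exp() + (mu - p_mu) ** 2) / p_logvar.exp()
            - 1.0
        ).sum(-1)
        recon_err = F.mse_loss(recon, x, reduction="none").sum(-1)
        return (recon_err + kl).mean()
```

During interaction, the same HMM over the latent space is what allows the robot's latent trajectory to be generated conditionally from the observed human's latent trajectory, which is how human observations feed directly into the robot's motion prediction.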

Understanding Human-Robot Interaction Dynamics

Conceptualization

The study of Human-Robot Interaction (HRI) is central to collaborative and assistive robotics. Effective HRI requires robot motion that is synchronized and well coordinated with human actions, which in turn requires systems that can understand and predict human motion and generate responsive robot motions. A natural source for such coordination is Human-Human Interaction (HHI): the approach discussed here learns from HHI demonstrations with machine learning models, building a latent representation of the interaction dynamics between the two agents.

Methodology

The paper proposes a hybrid approach that integrates Hidden Markov Models (HMMs) with Variational Autoencoders (VAEs), using the HMM as the latent space prior of the VAE to capture a joint distribution over the interacting agents. Because the conditional generation of robot motions from human observations is incorporated directly into training, the model predicts more accurate robot trajectories. In deployment, the generated joint-space motions still need adaptation to interact with a real human: Inverse Kinematics (IK) modifies the robot's trajectories so that the desired physical proximity to the human is reached, combining the ease of joint space learning with task space reachability. For contact-rich interactions such as handshakes, the robot's stiffness is modulated based on the HMM's segmentation of the interaction, making the contact compliant and more lifelike.
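
As a rough illustration of this deployment-time adaptation (not the paper's implementation), the sketch below applies a single damped-least-squares IK correction so the end effector tracks the human's hand, and selects a stiffness gain from the most likely HMM segment. The `robot.jacobian` and `robot.forward_kinematics` interfaces and the per-state stiffness values are hypothetical.

```python
# Illustrative sketch only: IK-based adaptation of a decoded joint-space motion
# and HMM-segment-based stiffness modulation. The robot interface and the
# stiffness values are assumptions, not the paper's implementation.
import numpy as np

# Assumed per-HMM-state stiffness: softer gains during contact-rich segments.
STIFFNESS_PER_STATE = np.array([400.0, 400.0, 120.0, 120.0])


def ik_adapt(robot, q, target_pos, damping=1e-2, step=0.5):
    """One damped least-squares correction of joint configuration q so the
    end effector moves toward target_pos (e.g. the human's hand position)."""
    J = robot.jacobian(q)                      # (3, n_joints) positional Jacobian
    err = target_pos - robot.forward_kinematics(q)
    dq = J.T @ np.linalg.solve(J @ J.T + damping * np.eye(3), err)
    return q + step * dq


def stiffness_from_segment(state_probs):
    """Pick the impedance stiffness for the HMM segment the interaction is in."""
    return STIFFNESS_PER_STATE[int(np.argmax(state_probs))]
```

In this simplified view, the joint-space motion decoded from the latent model is corrected toward the human's measured hand position each control step, while the HMM segmentation decides when the robot should stiffen (free-space reaching) or soften (physical contact).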

Experimental Results

The paper reports a deployment of this approach on a humanoid robot, evaluated through a user study. Despite being trained on demonstrations from only two humans, the model generalized well to different users and was preferred over the baseline methods. Participants rated its movements as more human-like, timely, and accurate, and ranked the proposed method with a higher degree of preference than the baselines.

Conclusions

This work demonstrates a step forward in the nuanced field of HRI, showcasing a methodology that lets robots learn from HHI through a blend of statistical and machine learning methods. The success of such an approach is evident in the generation of movement that humans perceive as more natural and cooperative within an interaction scenario. The ability of the system to not only execute learned motions but also adapt these in real-time to the specific human it is interacting with ushers in opportunities for more personalized and intuitive HRI in the future.
