MILD: Multimodal Interactive Latent Dynamics for Learning Human-Robot Interaction

Vignesh Prasad; Dorothea Koert; Ruth Stock-Homburg; Jan Peters; Georgia Chalvatzaki

arXiv:2210.12418·cs.RO·January 24, 2023

Open Access

TL;DR

MILD is a novel multimodal learning framework that models interaction dynamics in human-robot interactions using deep representation learning combined with probabilistic models, enabling accurate trajectory generation from high-dimensional data.

Contribution

The paper introduces MILD, a method that couples deep latent space representations with Hidden Semi-Markov Models (HSMMs) to model interaction dynamics and generate adaptive robot trajectories in HRI from high-dimensional demonstrations.

Findings

01

MILD captures multimodal interaction dynamics effectively.

02

It generates more accurate robot trajectories conditioned on human actions.

03

It can learn directly from camera-based pose estimates, and the generated trajectories can be mapped to a humanoid robot without additional training.

Abstract

Modeling interaction dynamics to generate robot trajectories that enable a robot to adapt and react to a human's actions and intentions is critical for efficient and effective collaborative Human-Robot Interactions (HRI). Learning from Demonstration (LfD) methods from Human-Human Interactions (HHI) have shown promising results, especially when coupled with representation learning techniques. However, such methods for learning HRI either do not scale well to high dimensional data or cannot accurately adapt to changing via-poses of the interacting partner. We propose Multimodal Interactive Latent Dynamics (MILD), a method that couples deep representation learning and probabilistic machine learning to address the problem of two-party physical HRIs. We learn the interaction dynamics from demonstrations, using Hidden Semi-Markov Models (HSMMs) to model the joint distribution of the…
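The idea of generating one agent's motion from a learned joint distribution can be illustrated with the standard Gaussian conditioning step used in mixture-based LfD models. The following is a minimal sketch, not the authors' implementation: it assumes a single Gaussian component over a stacked vector of [human, robot] variables, and the function name `condition_gaussian` and the toy numbers are hypothetical. In an HSMM, a step like this would be applied per hidden state and the results combined using the state probabilities.

```python
import numpy as np


def condition_gaussian(mu, sigma, x_h, dim_h):
    """Condition a joint Gaussian over [human, robot] on an observed human part.

    mu:    joint mean, shape (dim_h + dim_r,)
    sigma: joint covariance, shape (dim_h + dim_r, dim_h + dim_r)
    x_h:   observed human variables, shape (dim_h,)
    Returns the mean and covariance of the robot part given x_h.
    """
    mu_h, mu_r = mu[:dim_h], mu[dim_h:]
    s_hh = sigma[:dim_h, :dim_h]   # human-human block
    s_rh = sigma[dim_h:, :dim_h]   # robot-human cross-covariance
    s_rr = sigma[dim_h:, dim_h:]   # robot-robot block

    # gain = s_rh @ inv(s_hh), computed with a linear solve for stability
    # (valid because s_hh is symmetric).
    gain = np.linalg.solve(s_hh, s_rh.T).T

    mu_cond = mu_r + gain @ (x_h - mu_h)
    sigma_cond = s_rr - gain @ s_rh.T
    return mu_cond, sigma_cond


# Toy example (assumed values): one human and one robot dimension, correlation 0.8.
mu = np.array([0.0, 0.0])
sigma = np.array([[1.0, 0.8],
                  [0.8, 1.0]])
mu_cond, sigma_cond = condition_gaussian(mu, sigma, np.array([1.0]), dim_h=1)
```

With these toy values the conditional robot mean shifts toward the observed human value in proportion to the correlation, and the conditional variance shrinks relative to the marginal variance.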

Taxonomy

Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Human Motion and Animation