Bidirectional Intent Communication: A Role for Large Foundation Models

Tim Schreiter; Rishi Hazra; Jens Rüppel; Andrey Rudenko

arXiv:2408.10589 · cs.RO · August 21, 2024

TL;DR

This paper presents Bident, a framework that enables robots to engage in bidirectional, multimodal interaction with humans in shared spaces, with potential assistive applications such as personalized education and healthcare.

Contribution

Bident introduces a novel multimodal, bidirectional interaction framework for robots, emphasizing human-robot cooperation in shared spaces with speech, gaze, gestures, and actions.

Findings

01 Supports verbal and physical interactions

02 Enhances human-robot cooperation in shared environments

03 Potential applications in education and healthcare

Abstract

Integrating multimodal foundation models has significantly enhanced autonomous agents' language comprehension, perception, and planning capabilities. However, while existing works adopt a task-centric approach with minimal human interaction, applying these models to developing assistive user-centric robots that can interact and cooperate with humans remains underexplored. This paper introduces "Bident", a framework designed to integrate robots seamlessly into shared spaces with humans. Bident enhances the interactive experience by incorporating multimodal inputs like speech and user gaze dynamics. Furthermore, Bident supports verbal utterances and physical actions like gestures, making it versatile for bidirectional human-robot interactions. Potential applications include personalized education, where robots can adapt to individual learning styles and paces, and…

Taxonomy

Topics: Semantic Web and Ontologies