I Can Tell What I am Doing: Toward Real-World Natural Language Grounding of Robot Experiences

Zihan Wang; Brian Liang; Varad Dhat; Zander Brumbaugh; Nick Walker; Ranjay Krishna; Maya Cakmak

arXiv:2411.12960·cs.RO·November 21, 2024·2 cites

Open Access · 1 dataset

TL;DR

This paper presents RONAR, an LLM-based system that translates multi-modal robot experiences into natural language, improving transparency, failure analysis, and human-robot interaction.

Contribution

Introduces RONAR, a novel multi-modal framework for natural language narration of robot experiences, along with a new real-robot dataset and empirical validation.

Findings

01. RONAR outperforms state-of-the-art methods across the evaluated scenarios.

02. RONAR improves failure recovery efficiency.

03. RONAR improves user experience with respect to system transparency.

Abstract

Understanding robot behaviors and experiences through natural language is crucial for developing intelligent and transparent robotic systems. Recent advances in large language models (LLMs) make it possible to translate complex, multi-modal robotic experiences into coherent, human-readable narratives. However, grounding real-world robot experiences in natural language is challenging for several reasons, including the multi-modal nature of the data, differing sample rates, and data volume. We introduce RONAR, an LLM-based system that generates natural language narrations from robot experiences, supporting behavior announcement, failure analysis, and human interaction to recover from failures. Evaluated across various scenarios, RONAR outperforms state-of-the-art methods and improves failure recovery efficiency. Our contributions include a multi-modal framework for robot experience narration, a…

Code & Models

Datasets

robonar/robonar
dataset · 398 downloads

Taxonomy

Topics: Multimodal Machine Learning Applications