Relational Future Captioning Model for Explaining Likely Collisions in Daily Tasks

Motonari Kambara; Komei Sugiura

arXiv:2207.09083·cs.RO·July 20, 2022

TL;DR

This paper introduces the Relational Future Captioning Model (RFCM), a crossmodal language generation model that produces captions describing likely future collisions in domestic robot tasks, so that a robot can explain collision risk before it acts.

Contribution

The paper presents RFCM, whose Relational Self-Attention Encoder extracts the relationships between events more effectively than the conventional self-attention used in transformers.

Findings

01

RFCM outperforms a baseline method on two datasets.

02

The Relational Self-Attention Encoder models relationships between events more effectively than conventional self-attention.

03

The generated captions explain collision risk before the robot performs an action.

Abstract

Domestic service robots that support daily tasks are a promising solution for elderly or disabled people. It is crucial for domestic service robots to explain the collision risk before they perform actions. In this paper, our aim is to generate a caption about a future event. We propose the Relational Future Captioning Model (RFCM), a crossmodal language generation model for the future captioning task. The RFCM has the Relational Self-Attention Encoder to extract the relationships between events more effectively than the conventional self-attention in transformers. We conducted comparison experiments, and the results show the RFCM outperforms a baseline method on two datasets.
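For readers unfamiliar with the baseline the abstract contrasts against, the following is a minimal sketch of conventional scaled dot-product self-attention over a short sequence of event features. It does not reproduce the Relational Self-Attention Encoder, whose design the abstract does not specify. The function names, the toy `events` vectors, and the use of identity projections (no learned query, key, or value weights) are assumptions made for brevity.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(events):
    """Conventional scaled dot-product self-attention.

    Assumption: queries, keys, and values are the raw event features
    (identity projections); a real transformer learns these projections.
    events: list of equal-length feature vectors, one per event.
    Returns (outputs, weights), where weights[i][j] is how strongly
    event i attends to event j.
    """
    d = len(events[0])
    scale = math.sqrt(d)
    outputs, weights = [], []
    for q in events:
        # Similarity of this event to every event, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in events]
        w = softmax(scores)
        # Weighted sum of all event features.
        out = [sum(w[j] * events[j][c] for j in range(len(events)))
               for c in range(d)]
        outputs.append(out)
        weights.append(w)
    return outputs, weights


# Hypothetical toy features for three consecutive events.
events = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(events)
```

Each row of `weights` is a probability distribution over events, which is the pairwise relationship signal that, according to the abstract, the Relational Self-Attention Encoder extracts more effectively.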

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Natural Language Processing Techniques