A Multi-layer LSTM-based Approach for Robot Command Interaction Modeling

Martino Mensio; Emanuele Bastianelli; Ilaria Tiddi; Giuseppe Rizzo

arXiv:1811.05242 · cs.CL · November 14, 2018 · 1 citation

Open Access

TL;DR

This paper proposes a multi-layer LSTM neural network with attention for semantic parsing of vocal commands to service robots, aiming to improve natural language understanding in human-robot interaction.

Contribution

It introduces a novel multi-layer LSTM-based approach with attention for semantic parsing in robot command understanding, trained on a dedicated corpus.

Findings

01

Preliminary results show promising parsing accuracy.

02

The approach is preliminarily compared with previous approaches.

03

The system demonstrates potential for real-world robot interaction.

Abstract

As the first robotic platforms slowly approach our everyday life, we can imagine a near future where service robots will be easily accessible to non-expert users through vocal interfaces. The capability of managing natural language would indeed speed up the process of integrating such platforms into ordinary life. Semantic parsing is a fundamental task of the Natural Language Understanding process, as it allows extracting the meaning of a user utterance to be used by a machine. In this paper, we present a preliminary study to semantically parse user vocal commands for a House Service robot, using a multi-layer Long Short-Term Memory neural network with an attention mechanism. The system is trained on the Human Robot Interaction Corpus, and it is preliminarily compared with previous approaches.

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings