Interpreting and learning voice commands with a Large Language Model for a robot system

Stanislau Stankevich; Wojciech Dudek

arXiv:2407.21512 · cs.RO · August 1, 2024 · 2 cites

Open Access

TL;DR

This paper explores integrating Large Language Models with robot systems to enhance voice command interpretation and decision-making, aiming to create more intuitive and adaptable human-robot communication interfaces.

Contribution

It introduces an approach that combines LLMs with databases so that a robot can better interpret voice commands and learn from them in real time.

Findings

01 Enhanced robot response accuracy to voice commands

02 Improved decision-making capabilities in robot systems

03 Successful integration of LLMs with database systems

Abstract

Robots are increasingly common in industry and daily life, such as in nursing homes where they can assist staff. A key challenge is developing intuitive interfaces for easy communication. The use of Large Language Models (LLMs) like GPT-4 has enhanced robot capabilities, allowing for real-time interaction and decision-making. This integration improves robots' adaptability and functionality. This project focuses on merging LLMs with databases to improve decision-making and enable knowledge acquisition for request interpretation problems.
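
The abstract does not spell out how the LLM and the database interact, so the following is only a minimal sketch of one plausible arrangement, not the authors' implementation. It assumes that previously interpreted phrases are stored in a database, that the database is consulted first, and that the LLM is called only for unseen phrases, with successful interpretations written back as acquired knowledge. The task names, the `learned` table, and the `keyword_llm` stand-in (used here in place of a real model such as GPT-4) are all hypothetical.

```python
import sqlite3
from typing import Callable, Optional, Sequence

# Hypothetical task names; the paper's actual task set is not given in the abstract.
KNOWN_TASKS = ("bring_item", "guide_person", "call_staff")

LLM = Callable[[str, Sequence[str]], Optional[str]]


def open_memory(path: str = ":memory:") -> sqlite3.Connection:
    """Open the (assumed) knowledge store of phrase -> task mappings."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS learned ("
        "phrase TEXT PRIMARY KEY, task TEXT NOT NULL)"
    )
    return conn


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so equivalent transcripts match."""
    return " ".join(text.lower().split())


def interpret(conn: sqlite3.Connection, command: str, llm: LLM) -> Optional[str]:
    """Return a task for the command, consulting the database before the LLM."""
    phrase = normalize(command)
    row = conn.execute(
        "SELECT task FROM learned WHERE phrase = ?", (phrase,)
    ).fetchone()
    if row is not None:
        return row[0]  # already learned, no LLM call needed

    task = llm(phrase, KNOWN_TASKS)
    if task in KNOWN_TASKS:
        # Store the new interpretation so it is available next time.
        conn.execute(
            "INSERT OR REPLACE INTO learned (phrase, task) VALUES (?, ?)",
            (phrase, task),
        )
        conn.commit()
        return task
    return None  # the robot would need to ask the user for clarification


def keyword_llm(phrase: str, tasks: Sequence[str]) -> Optional[str]:
    """Stand-in for a real LLM call (assumption); matches simple keywords."""
    if "bring" in phrase or "fetch" in phrase:
        return "bring_item"
    if "help" in phrase:
        return "call_staff"
    return None


if __name__ == "__main__":
    conn = open_memory()
    print(interpret(conn, "Please bring me water", keyword_llm))
    print(interpret(conn, "please  BRING me water", keyword_llm))  # served from the database
```

Checking the database first keeps repeated requests cheap and deterministic, while the write-back step is what lets the system accumulate knowledge over time. A real system would replace `keyword_llm` with a call to an actual model and would likely use fuzzier matching than exact normalized strings.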

Taxonomy

Topics: Speech and dialogue systems · Natural Language Processing Techniques

Methods: Linear Layer · Layer Normalization · Multi-Head Attention · Attention Is All You Need · Position-Wise Feed-Forward Layer · Adam · Byte Pair Encoding · Softmax · Absolute Position Encodings · Dense Connections