Using Large Language Models for Interpreting Autonomous Robots Behaviors
Miguel A. González-Santamarta, Laura Fernández-Becerra, David Sobrín-Hidalgo, Ángel Manuel Guerrero-Higueras, Irene González, Francisco J. Rodríguez Lera

TL;DR
This paper investigates using Large Language Models to analyze autonomous robot logs, demonstrating GPT-4's superior performance but highlighting limitations in explaining complex robot behaviors.
Contribution
It introduces a framework for categorizing robot logs with LLMs and evaluates multiple models, emphasizing GPT-4's effectiveness in log analysis tasks.
Findings
GPT-4 outperforms other models in log question answering
LLMs can categorize robot logs into different aspects
Insufficient verbosity limits the models' ability to answer why or how questions about robot behavior
Abstract
The deployment of autonomous robots in various domains has raised significant concerns about their trustworthiness and accountability. This study explores the potential of Large Language Models (LLMs) in analyzing ROS 2 logs generated by autonomous robots and proposes a framework for log analysis that categorizes log files into different aspects. The study evaluates the performance of three different language models in answering questions related to StartUp, Warning, and PDDL logs. The results suggest that GPT-4, a transformer-based model, outperforms the other models; however, their verbosity is not enough to answer why or how questions for all kinds of actors involved in the interaction.
Taxonomy
Topics: Data Quality and Management · Software System Performance and Reliability · Topic Modeling
