Diagnosing Robotics Systems Issues with Large Language Models
Jordis Emilia Herrmann, Aswath Mandakath Gopinath, Mikael Norrlof, Mark Niklas Müller

TL;DR
This paper explores the use of large language models for diagnosing issues in robotics systems, demonstrating that fine-tuned models can outperform larger models like GPT-4 in accuracy and cost-effectiveness.
Contribution
It introduces SYSDIAGBENCH, a new robotics diagnostics benchmark, and shows that fine-tuned LLMs can effectively perform root cause analysis in robotics.
Findings
QLoRA fine-tuning enables small models to outperform GPT-4 in diagnostics
A 7B-parameter model achieves accuracy comparable to GPT-4
Cost-effective LLMs can match human expert approval ratings
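The cost argument behind the QLoRA finding can be illustrated with rough arithmetic. The sketch below is not taken from the paper: the hidden size (4096), adapter rank (16), and the function names are assumed values chosen only for illustration. It relies on two general facts: a LoRA adapter for a d_out x d_in weight matrix trains two low-rank factors with rank * (d_in + d_out) parameters in total, and QLoRA keeps the frozen base weights in 4-bit precision. Quantization constants, activations, and optimizer state are ignored here.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters in one LoRA adapter.

    The adapter factors the weight update as B @ A, where
    A has shape (rank, d_in) and B has shape (d_out, rank).
    """
    return rank * (d_in + d_out)


def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate memory for the weights alone, in GB (1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


# Hypothetical values for illustration only; not reported in the paper.
HIDDEN = 4096      # assumed hidden size of one square projection matrix
RANK = 16          # assumed LoRA rank
N_PARAMS = 7e9     # a 7B-parameter model, as in the findings above

full = HIDDEN * HIDDEN
adapter = lora_trainable_params(HIDDEN, HIDDEN, RANK)

print(f"full matrix params: {full:,}")
print(f"adapter params:     {adapter:,} ({adapter / full:.2%} of the matrix)")
print(f"16-bit base weights: {weight_memory_gb(N_PARAMS, 16):.1f} GB")
print(f"4-bit base weights:  {weight_memory_gb(N_PARAMS, 4):.1f} GB")
```

Under these assumptions, the adapter trains under 1% of the parameters of each adapted matrix, and the 4-bit base weights of a 7B model need roughly 3.5 GB instead of about 14 GB at 16-bit precision.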
Abstract
Quickly resolving issues reported in industrial applications is crucial to minimize economic impact. However, the required data analysis makes diagnosing the underlying root causes a challenging and time-consuming task, even for experts. In contrast, large language models (LLMs) excel at analyzing large amounts of data. Indeed, prior work in AI-Ops demonstrates their effectiveness in analyzing IT systems. Here, we extend this work to the challenging and largely unexplored domain of robotics systems. To this end, we create SYSDIAGBENCH, a proprietary system diagnostics benchmark for robotics, containing over 2500 reported issues. We leverage SYSDIAGBENCH to investigate the performance of LLMs for root cause analysis, considering a range of model sizes and adaptation techniques. Our results show that QLoRA finetuning can be sufficient to let a 7B-parameter model outperform GPT-4 in terms…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
Methods: Attention Is All You Need · Adam · Dropout · Dense Connections · Layer Normalization · Residual Connection · Position-Wise Feed-Forward Layer · Linear Layer · Byte Pair Encoding · Absolute Position Encodings
