Can Large Language Models Assist the Comprehension of ROS2 Software Architectures?
Laura Duits, Bouazza El Moutaouakil, Ivano Malavolta

TL;DR
This study evaluates how well large language models comprehend complex ROS2 robotics architectures, through a controlled experiment that administers 1,230 prompts to 9 LLMs about 3 ROS2 systems.
Contribution
The paper introduces a generic algorithm for generating architecturally-relevant questions about a ROS2 system, and assesses both the accuracy of LLM answers and the quality of their explanations.
Findings
Almost all questions are answered correctly, with an average accuracy of 98.22%.
Gemini-2.5-pro achieves 100% accuracy across all prompts and systems.
LLMs show potential but have intrinsic limitations affecting their reliability.
Abstract
Context. The most used development framework for robotics software is ROS2. ROS2 architectures are highly complex, with thousands of components communicating in a decentralized fashion. Goal. We aim to evaluate how LLMs can assist in the comprehension of factual information about the architecture of ROS2 systems. Method. We conduct a controlled experiment where we administer 1,230 prompts to 9 LLMs containing architecturally-relevant questions about 3 ROS2 systems with incremental size. We provide a generic algorithm that systematically generates architecturally-relevant questions for a ROS2 system. Then, we (i) assess the accuracy of the answers of the LLMs against a ground truth established via running and monitoring the 3 ROS2 systems and (ii) qualitatively analyse the explanations provided by the LLMs. Results. Almost all questions are answered correctly across all LLMs…
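The method described in the abstract (generate questions from the architecture, then score answers against a ground truth) can be illustrated with a minimal sketch. This is not the authors' algorithm: the graph layout, the two question templates, the exact-match scoring rule, and the names `Question`, `generate_questions`, and `accuracy` are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    """One question plus its ground-truth answer (hypothetical structure)."""
    text: str
    expected: frozenset


def generate_questions(graph: dict) -> list:
    """Build one publish and one subscribe question per node.

    `graph` is an assumed format: node name -> {"publishes": [...], "subscribes": [...]}.
    """
    questions = []
    for node, io in sorted(graph.items()):
        questions.append(Question(
            f"Which topics does node {node} publish to?",
            frozenset(io.get("publishes", [])),
        ))
        questions.append(Question(
            f"Which topics does node {node} subscribe to?",
            frozenset(io.get("subscribes", [])),
        ))
    return questions


def accuracy(questions: list, answers: list) -> float:
    """Percentage of answers whose topic set exactly matches the ground truth."""
    if not questions or len(questions) != len(answers):
        raise ValueError("questions and answers must be non-empty and aligned")
    correct = sum(
        1 for q, a in zip(questions, answers) if frozenset(a) == q.expected
    )
    return 100.0 * correct / len(questions)


# Toy ground truth with two nodes (assumed data, not from the paper).
graph = {
    "talker": {"publishes": ["/chatter"], "subscribes": []},
    "listener": {"publishes": [], "subscribes": ["/chatter"]},
}
questions = generate_questions(graph)
# Simulated model answers; the last one is deliberately wrong.
answers = [[], ["/chatter"], ["/chatter"], ["/rosout"]]
score = accuracy(questions, answers)
```

In the study itself the ground truth comes from running and monitoring the ROS2 systems; here it is a hard-coded dictionary so the sketch stays self-contained.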
