ROSBag MCP Server: Analyzing Robot Data with LLMs for Agentic Embodied AI Applications
Lei Fu, Sahar Salimpour, Leonardo Militano, Harry Edelman, Jorge Peña Queralta, Giovanni Toffetti

TL;DR
This paper presents an MCP server for analyzing robot data stored in ROS bags with LLMs and VLMs. It lets users query, process, and visualize that data in natural language for embodied AI applications, and it includes an experimental benchmark of several models.
Contribution
Introduces a novel MCP server for robot data analysis with LLMs, supporting visualization, filtering, and benchmarking in embodied AI contexts.
Findings
Kimi K2 and Claude Sonnet 4 outperform others in tool calling.
Tool description schema and number of arguments influence success rates.
A large divide in tool-calling capabilities exists among state-of-the-art LLMs.
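To make the second finding concrete, here is a minimal sketch of what a tool description and its argument count look like. The layout (a name, a description, and a JSON Schema for inputs) follows the general shape of an MCP tool definition; the tool name `analyze_trajectory`, its arguments, and both helper functions are hypothetical and are not taken from the paper's server.

```python
# Hypothetical tool description shaped like an MCP tool definition.
# The tool name and all argument names are illustrative assumptions.
TRAJECTORY_TOOL = {
    "name": "analyze_trajectory",
    "description": "Compute path length and duration for a pose topic in a bag.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "bag_path": {"type": "string", "description": "Path to the bag."},
            "topic": {"type": "string", "description": "Pose or odometry topic."},
            "start_time": {"type": "number", "description": "Optional start (s)."},
            "end_time": {"type": "number", "description": "Optional end (s)."},
        },
        "required": ["bag_path", "topic"],
    },
}


def argument_counts(tool: dict) -> tuple[int, int]:
    """Return (total arguments, required arguments) for a tool description."""
    schema = tool.get("inputSchema", {})
    total = len(schema.get("properties", {}))
    required = len(schema.get("required", []))
    return total, required


def missing_required(tool: dict, call_args: dict) -> list[str]:
    """List the required arguments that a model's tool call left out."""
    required = tool.get("inputSchema", {}).get("required", [])
    return [name for name in required if name not in call_args]
```

A benchmark harness could use a check like `missing_required` to flag a failed tool call, and `argument_counts` to group results by how many arguments a tool expects. This is only one plausible way to measure it, not the paper's evaluation code.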
Abstract
Agentic AI systems and Physical or Embodied AI systems have been two key research verticals at the forefront of Artificial Intelligence and Robotics, with the Model Context Protocol (MCP) increasingly becoming a key component and enabler of agentic applications. However, the literature at the intersection of these verticals, i.e., Agentic Embodied AI, remains scarce. This paper introduces an MCP server for ROS and ROS 2 bags that lets users analyze, visualize, and process robot data in natural language through LLMs and VLMs. We describe specific tooling built with robotics domain knowledge, with our initial release focused on mobile robotics and natively supporting the analysis of trajectories, laser scan data, transforms, and time series data. This is in addition to providing an interface to standard ROS 2 CLI tools ("ros2 bag list" or "ros2 bag info"), as well as the…
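As an illustration of the kind of trajectory analysis the abstract mentions, the sketch below computes path length, duration, and mean speed from timestamped 2D positions. It assumes the positions have already been extracted from a bag as `(t, x, y)` tuples. The function names and the returned fields are hypothetical and do not reflect the server's actual tools.

```python
import math


def path_length(positions):
    """Sum of Euclidean distances between consecutive (x, y) positions."""
    return sum(math.dist(a, b) for a, b in zip(positions, positions[1:]))


def summarize_trajectory(samples):
    """Summarize a trajectory given as a list of (t, x, y) tuples.

    Samples are sorted by timestamp first. Returns path length in the
    units of x and y, duration in the units of t, and mean speed.
    """
    if not samples:
        return {"length": 0.0, "duration": 0.0, "mean_speed": 0.0}
    ordered = sorted(samples, key=lambda s: s[0])
    length = path_length([(x, y) for _, x, y in ordered])
    duration = ordered[-1][0] - ordered[0][0]
    # Avoid dividing by zero when all samples share one timestamp.
    mean_speed = length / duration if duration > 0 else 0.0
    return {"length": length, "duration": duration, "mean_speed": mean_speed}
```

For example, `summarize_trajectory([(0.0, 0.0, 0.0), (1.0, 3.0, 4.0)])` covers a single 3-4-5 segment in one second. A summary of this size is the sort of compact result a tool could hand back to an LLM in place of the raw message stream.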
Taxonomy
Topics: Multimodal Machine Learning Applications · Human Motion and Animation · Social Robot Interaction and HRI
