RoboSolver: A Multi-Agent Large Language Model Framework for Solving Robotic Arm Problems
Hamid Khabazi, Ali F. Meghdari, Alireza Taheri

TL;DR
This paper introduces RoboSolver, a multi-agent framework combining LLMs and VLMs to automatically analyze and solve complex robotic arm problems, including kinematics and motion control, with high accuracy across various benchmarks.
Contribution
The study presents a novel multi-agent framework that integrates LLMs and VLMs for robotic problem solving, demonstrating significant accuracy improvements over raw models.
Findings
GPT-4o achieved 0.97 accuracy in kinematics tasks
Framework outperformed raw models by 20% in visual input tasks
Achieved 0.97 accuracy in diverse robotic tasks
Abstract
This study proposes an intelligent multi-agent framework built on LLMs and VLMs and specifically tailored to robotics. The goal is to integrate the strengths of LLMs and VLMs with computational tools to automatically analyze and solve problems related to robotic manipulators. Our developed framework accepts both textual and visual inputs and can automatically perform forward and inverse kinematics, compute velocities and accelerations of key points, generate 3D simulations of the robot, and ultimately execute motion control within the simulated environment, all according to the user's query. To evaluate the framework, three benchmark tests were designed, each consisting of ten questions. In the first benchmark test, the framework was evaluated while connected to GPT-4o, DeepSeek-V3.2, and Claude-Sonnet-4.5, as well as their corresponding raw models. The objective was to extract the…
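To make the terms "forward kinematics" and "inverse kinematics" in the abstract concrete, here is a minimal sketch for a planar two-link arm. It is not taken from the paper and does not represent RoboSolver's implementation. The function names, the unit link lengths, and the choice of the positive-elbow-angle solution are all illustrative assumptions.

```python
import math


def forward_kinematics(theta1, theta2, l1=1.0, l2=1.0):
    """End-effector (x, y) of a planar 2-link arm (hypothetical helper).

    theta1 is the shoulder angle and theta2 the elbow angle, in radians;
    l1 and l2 are assumed link lengths.
    """
    x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
    y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
    return x, y


def inverse_kinematics(x, y, l1=1.0, l2=1.0):
    """Joint angles reaching (x, y), choosing the solution with theta2 >= 0.

    Raises ValueError if the target lies outside the reachable workspace.
    """
    # Law of cosines gives the cosine of the elbow angle.
    c2 = (x * x + y * y - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if abs(c2) > 1.0 + 1e-12:
        raise ValueError("target is out of reach")
    c2 = max(-1.0, min(1.0, c2))  # guard against rounding just past +/-1
    theta2 = math.acos(c2)
    theta1 = math.atan2(y, x) - math.atan2(
        l2 * math.sin(theta2), l1 + l2 * math.cos(theta2)
    )
    return theta1, theta2


if __name__ == "__main__":
    target = (1.2, 0.5)
    angles = inverse_kinematics(*target)
    # Feeding the recovered angles back through forward kinematics
    # should reproduce the target up to floating-point error.
    print(angles, forward_kinematics(*angles))
```

The round trip in the last lines is a common sanity check: an inverse-kinematics solution is valid when forward kinematics maps it back to the requested position.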
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Artificial Intelligence in Games
