LADEV: A Language-Driven Testing and Evaluation Platform for Vision-Language-Action Models in Robotic Manipulation
Zhijie Wang, Zhehua Zhou, Jiayang Song, Yuheng Huang, Zhan Shu, Lei Ma

TL;DR
LADEV is a platform that automates the testing and evaluation of vision-language-action (VLA) models for robotic manipulation. It generates test environments, paraphrases task instructions to probe robustness, and runs large-scale tests to make evaluation more efficient.
Contribution
This work introduces LADEV, a novel platform that automates environment creation, diversifies task instructions, and enables large-scale evaluation of VLA models in robotics.
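The three pieces named above (environment creation, instruction diversification, batch evaluation) can be pictured as a simple loop. The Python below is only an illustrative sketch and not LADEV's actual API: `Scene`, `Policy`, `paraphrase`, `evaluate`, and `brittle_policy` are all hypothetical names, the paraphraser is a fixed template list standing in for a language model, and the policy is a toy function rather than a real VLA model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Scene:
    """Hypothetical stand-in for one generated test environment."""
    name: str
    target: str  # object the instruction refers to


# Hypothetical policy interface: runs one episode and reports success.
Policy = Callable[[Scene, str], bool]


def paraphrase(instruction: str) -> List[str]:
    """Toy paraphraser; fixed templates stand in for a language model."""
    verb, _, rest = instruction.partition(" ")
    return [
        instruction,
        f"please {verb} {rest}",
        f"could you {verb} {rest}?",
    ]


def evaluate(policy: Policy, scenes: List[Scene]) -> Dict[str, float]:
    """Return the success rate per scene across all instruction variants."""
    report: Dict[str, float] = {}
    for scene in scenes:
        variants = paraphrase(f"pick the {scene.target}")
        successes = sum(policy(scene, text) for text in variants)
        report[scene.name] = successes / len(variants)
    return report


def brittle_policy(scene: Scene, text: str) -> bool:
    """Toy policy that fails whenever the instruction is a question."""
    return scene.target in text and not text.endswith("?")


scenes = [Scene("kitchen", "apple"), Scene("desk", "cup")]
report = evaluate(brittle_policy, scenes)
```

In this sketch, a per-scene success rate below 1.0 indicates that the policy is sensitive to how the instruction is worded, which is the kind of robustness signal that instruction paraphrasing is meant to expose.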
Findings
LADEV significantly improves testing efficiency for VLA models.
The platform provides a reliable baseline for model evaluation.
Experiments demonstrate LADEV's effectiveness on state-of-the-art models.
Abstract
Building on the advancements of Large Language Models (LLMs) and Vision Language Models (VLMs), recent research has introduced Vision-Language-Action (VLA) models as an integrated solution for robotic manipulation tasks. These models take camera images and natural language task instructions as input and directly generate control actions for robots to perform specified tasks, greatly improving both decision-making capabilities and interaction with human users. However, the data-driven nature of VLA models, combined with their lack of interpretability, makes the assurance of their effectiveness and robustness a challenging task. This highlights the need for a reliable testing and evaluation platform. For this purpose, in this work, we propose LADEV, a comprehensive and efficient platform specifically designed for evaluating VLA models. We first present a language-driven approach that…
Taxonomy
Topics: Robot Manipulation and Learning
