Sci-VLA: Agentic VLA Inference Plugin for Long-Horizon Tasks in Scientific Experiments
Yiwen Pang, Bo Zhou, Changjin Li, Xuanhao Wang, Shengxiang Xu, Deng-Bao Wang, Min-Ling Zhang, Shimin Di

TL;DR
This paper introduces an inference plugin that uses a language model to guide robotic laboratory tasks, enabling reliable execution of complex, long-horizon scientific experiments without additional training.
Contribution
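The paper proposes an LLM-based agentic inference mechanism that intervenes between atomic tasks during execution. It supplies the transitional operations that VLA models fail to produce because of the distributional mismatch between training-time atomic tasks and inference-time composite tasks.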
Findings
Increases success rate of atomic tasks by 42% during inference
Enables transfer from simulation to real laboratories
Operates efficiently without additional training
Abstract
Robotic laboratories play a critical role in autonomous scientific discovery by enabling scalable, continuous experimental execution. Recent vision-language-action (VLA) models offer a promising foundation for robotic laboratories. However, scientific experiments typically involve long-horizon tasks composed of multiple atomic tasks, posing a fundamental challenge to existing VLA models. While VLA models fine-tuned for scientific tasks can reliably execute atomic experimental actions seen during training, they often fail to perform composite tasks formed by reordering and composing these known atomic actions. This limitation arises from a distributional mismatch between training-time atomic tasks and inference-time composite tasks, which prevents VLA models from executing necessary transitional operations between atomic tasks. To address this challenge, we propose an Agentic VLA…
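The abstract is truncated before the method is described, so the following is only a minimal Python sketch of the general idea it states: a composite task is a sequence of known atomic tasks, and something must supply the transitional operations between them. Every name here (`AtomicTask`, `plan_transition`, `run_composite`, the state labels) is hypothetical and not taken from the paper. The rule-based `plan_transition` is a stand-in for the LLM agent, and no real VLA policy is called.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class AtomicTask:
    """Hypothetical atomic task the VLA policy was fine-tuned on."""
    name: str
    start_state: str  # state the policy saw at the start of this task in training
    end_state: str    # state the robot is left in after the task


def plan_transition(current_state: str, next_task: AtomicTask) -> List[str]:
    """Stand-in for the LLM agent (assumption): propose transitional
    operations that bring the robot from its current state to the state
    the next atomic task expects. Returns no operations if they match."""
    if current_state == next_task.start_state:
        return []
    return [f"move:{current_state}->{next_task.start_state}"]


def run_composite(
    tasks: List[AtomicTask],
    initial_state: str,
    planner: Callable[[str, AtomicTask], List[str]] = plan_transition,
) -> List[str]:
    """Execute atomic tasks in order, inserting planner-proposed
    transitions wherever the current state does not match the next
    task's expected start state. Returns a log of executed steps."""
    log: List[str] = []
    state = initial_state
    for task in tasks:
        # Bridge the gap between the previous end state and this task.
        log.extend(planner(state, task))
        # Placeholder for invoking the VLA policy on the atomic task.
        log.append(f"vla:{task.name}")
        state = task.end_state
    return log


if __name__ == "__main__":
    tasks = [
        AtomicTask("pick_beaker", "home", "holding_beaker"),
        AtomicTask("pour", "above_flask", "above_flask"),
    ]
    for step in run_composite(tasks, "home"):
        print(step)
```

In this toy setup the first task starts from the state it expects, so no transition is inserted. The second task does not, so one transitional operation is placed before it. A real system would replace the string labels with observations and the planner with a language-model call.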
Taxonomy
Topics: Multimodal Machine Learning Applications · Machine Learning in Materials Science · Robot Manipulation and Learning
