Sci-VLA: Agentic VLA Inference Plugin for Long-Horizon Tasks in Scientific Experiments

Yiwen Pang; Bo Zhou; Changjin Li; Xuanhao Wang; Shengxiang Xu; Deng-Bao Wang; Min-Ling Zhang; Shimin Di

arXiv:2602.09430 · cs.RO · February 11, 2026

TL;DR

This paper introduces an inference plugin that uses a language model to guide robotic laboratory tasks, enabling reliable execution of complex, long-horizon scientific experiments without additional training.

Contribution

It proposes an LLM-based agentic inference mechanism that intervenes during task execution to handle composite tasks, addressing distributional mismatches in VLA models.

Findings

01 Increases success rate of atomic tasks by 42% during inference

02 Enables transfer from simulation to real laboratories

03 Operates efficiently without additional training

Abstract

Robotic laboratories play a critical role in autonomous scientific discovery by enabling scalable, continuous experimental execution. Recent vision-language-action (VLA) models offer a promising foundation for robotic laboratories. However, scientific experiments typically involve long-horizon tasks composed of multiple atomic tasks, posing a fundamental challenge to existing VLA models. While VLA models fine-tuned for scientific tasks can reliably execute atomic experimental actions seen during training, they often fail to perform composite tasks formed by reordering and composing these known atomic actions. This limitation arises from a distributional mismatch between training-time atomic tasks and inference-time composite tasks, which prevents VLA models from executing necessary transitional operations between atomic tasks. To address this challenge, we propose an Agentic VLA…
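The abstract attributes composite-task failures to missing transitional operations between atomic tasks. The following is a minimal sketch of that idea only, not the authors' implementation: the `Step` type, the `build_plan` function, the `toy_transition_planner` stand-in for an LLM call, and the task names are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Step:
    kind: str         # "atomic" (seen in training) or "transition" (inserted at inference)
    instruction: str  # natural-language instruction that would be passed to the VLA policy


# Hypothetical planner interface: given the previous and next atomic task,
# return the transitional instructions to run between them.
TransitionPlanner = Callable[[str, str], List[str]]


def toy_transition_planner(prev_task: str, next_task: str) -> List[str]:
    """Rule-based stand-in for an LLM query (assumption, not the paper's prompt)."""
    return [
        f"retract gripper after '{prev_task}'",
        f"move to start pose for '{next_task}'",
    ]


def build_plan(atomic_tasks: List[str], planner: TransitionPlanner) -> List[Step]:
    """Interleave planner-generated transitions between consecutive atomic tasks."""
    plan: List[Step] = []
    for i, task in enumerate(atomic_tasks):
        if i > 0:
            # Bridge the gap between the end state of the previous atomic task
            # and the start state the next atomic task expects.
            for instruction in planner(atomic_tasks[i - 1], task):
                plan.append(Step("transition", instruction))
        plan.append(Step("atomic", task))
    return plan


if __name__ == "__main__":
    for step in build_plan(["pick up beaker", "pour liquid"], toy_transition_planner):
        print(f"{step.kind:10s} {step.instruction}")
```

In this sketch the atomic tasks are left untouched and only the gaps between them are filled, which mirrors the abstract's framing of a training-free intervention at inference time. A real system would replace `toy_transition_planner` with a model query and would need to check the robot's actual state before each step.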

Taxonomy

Topics: Multimodal Machine Learning Applications · Machine Learning in Materials Science · Robot Manipulation and Learning