Atomic Action Slicing: Planner-Aligned Options for Generalist VLA Agents
Stefan Tabakov, Asen Popov, Dimitar Dimitrov, S. Ensiye Kiyamousavi, Vladimir Hristov, Boris Kraychev

TL;DR
This paper introduces Atomic Action Slicing (AAS), a method that decomposes long-horizon demonstrations into short, typed atomic actions to improve planner alignment and generalization in vision-language-action (VLA) models. It is supported by a new dataset of atomic segments and by improved task success rates on LIBERO.
Contribution
The paper presents AAS, a planner-aligned approach for decomposing long-horizon demonstrations into atomic actions, along with a validated dataset of atomic segments and a CLIP-RT+ model fine-tuned on it that achieves higher task success on LIBERO.
Findings
AAS produces a validated dataset of 2,124 atomic segments.
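Fine-tuning CLIP-RT+ on the dataset improves task success from 94.2% to 95.3% on LIBERO-Goal and from 83.8% to 88.8% on LIBERO-Long.
A stronger segmenter (Gemini 2.5 Pro) closely matches planner-defined plans and remains robust under keyframe jitter, while smaller models perform worse on multi-object tasks.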
Abstract
Current vision-language-action (VLA) models generalize poorly, particularly when tasks require new compositions of skills or objects. We introduce Atomic Action Slicing (AAS), a planner-aligned approach that decomposes long-horizon demonstrations into short, typed atomic actions that are easier for planners to use and policies to learn. Using LIBERO demonstrations, AAS produces a validated dataset of 2,124 atomic segments labeled with action type, temporal span, and confidence. A stronger segmenter (Gemini 2.5 Pro) closely matches planner-defined plans and remains robust under keyframe jitter, while smaller models perform worse on multi-object tasks. Fine-tuning CLIP-RT+ on our atomic dataset improves task success from 94.2% to 95.3% on LIBERO-Goal and 83.8% to 88.8% on LIBERO-Long. We publicly release the GATE-VLAP dataset on…
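The abstract describes each atomic segment as carrying an action type, a temporal span, and a confidence. The sketch below illustrates one way such a record could be represented and used to cut a demonstration into clips. It is not the released GATE-VLAP schema: the class name `AtomicSegment`, its field names, the helper `slice_demo`, the example action labels, and the `min_conf` threshold are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class AtomicSegment:
    """One atomic action cut from a longer demonstration (hypothetical schema)."""
    action_type: str   # illustrative label such as "pick" or "place"
    start: int         # first frame index of the span (inclusive)
    end: int           # frame index where the span stops (exclusive)
    confidence: float  # segmenter confidence, assumed to lie in [0, 1]

    def __post_init__(self) -> None:
        # Reject empty or negative spans and out-of-range confidences.
        if not 0 <= self.start < self.end:
            raise ValueError("segment must satisfy 0 <= start < end")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must lie in [0, 1]")


def slice_demo(
    frames: Sequence,
    segments: List[AtomicSegment],
    min_conf: float = 0.5,  # assumed threshold, not taken from the paper
) -> List[Tuple[str, list]]:
    """Return (action_type, frames) pairs, in temporal order, for confident segments."""
    clips = []
    for seg in sorted(segments, key=lambda s: s.start):
        if seg.confidence < min_conf:
            continue  # drop low-confidence segments
        if seg.end > len(frames):
            raise ValueError("segment extends past the end of the demonstration")
        clips.append((seg.action_type, list(frames[seg.start:seg.end])))
    return clips


# Stand-in for a 10-frame demonstration; real frames would be observations.
demo = list(range(10))
segments = [
    AtomicSegment("place", 4, 9, 0.8),
    AtomicSegment("pick", 0, 4, 0.9),
    AtomicSegment("push", 9, 10, 0.2),  # filtered out by min_conf
]
clips = slice_demo(demo, segments)
```

Here `clips` holds the "pick" and "place" spans in temporal order, and the low-confidence "push" span is dropped. Filtering on confidence before training is a design choice of this sketch, and the paper may validate segments differently.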
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications
