SELF-VLA: A Skill Enhanced Agentic Vision-Language-Action Framework for Contact-Rich Disassembly

Chang Liu; Sibo Tian; Xiao Liang; Minghui Zheng

arXiv:2603.11080·cs.RO·March 13, 2026

Open Access

TL;DR

SELF-VLA is a skill-enhanced, agentic vision-language-action (VLA) framework for robotic disassembly. It targets the variability and complexity of contact-rich, long-horizon tasks and is reported to surpass existing models in accuracy and adaptability.

Contribution

The paper presents a novel agentic VLA framework with integrated disassembly skills, enhancing generalization and performance in complex robotic disassembly tasks.

Findings

01. Outperforms state-of-the-art VLA models on contact-rich tasks

02. Demonstrates improved generalization to variable EoL products

03. Achieves higher success rates in complex disassembly scenarios

Abstract

Disassembly automation has long been pursued to address the growing demand for efficient and proper recovery of valuable components from end-of-life (EoL) electronic products. Existing approaches have demonstrated promising and regimented performance by decomposing the disassembly process into different subtasks. However, each subtask typically requires extensive data preparation, model training, and system management. Moreover, these approaches are often task- and component-specific, which makes them poorly suited to handling the variability and uncertainty of EoL products and limits their generalization. Together, these factors restrict the practical deployment of current robotic disassembly systems and leave them highly reliant on human labor. With the recent development of foundation models in robotics, vision-language-action (VLA) models have shown impressive performance on…

Taxonomy

Topics: Manufacturing Process and Optimization · Robot Manipulation and Learning · 3D Shape Modeling and Analysis