NS-VLA: Towards Neuro-Symbolic Vision-Language-Action Models

Ziyue Zhu; Shangyang Wu; Shuai Zhao; Zhiqiu Zhao; Shengjie Li; Yi Wang; Fang Li; Haoran Luo

arXiv:2603.09542·cs.RO·March 11, 2026

Open Access

TL;DR

This paper introduces NS-VLA, a neuro-symbolic framework for vision-language-action tasks in robotics, combining symbolic encoding and reinforcement learning to improve data efficiency, generalization, and exploration.

Contribution

The paper presents a neuro-symbolic approach to VLA models trained with online RL, improving data efficiency, generalization, and primitive reuse in robotic manipulation.

Findings

01. Outperforms previous methods in one-shot training

02. Demonstrates superior zero-shot generalization

03. Achieves high data efficiency and expands exploration beyond demonstrations

Abstract

Vision-Language-Action (VLA) models are formulated to ground instructions in visual context and generate action sequences for robotic manipulation. Despite recent progress, VLA models still face challenges in learning related and reusable primitives, reducing reliance on large-scale data and complex architectures, and enabling exploration beyond demonstrations. To address these challenges, we propose a novel Neuro-Symbolic Vision-Language-Action (NS-VLA) framework via online reinforcement learning (RL). It introduces a symbolic encoder to embed vision and language features and extract structured primitives, utilizes a symbolic solver for data-efficient action sequencing, and leverages online RL to optimize generation via expansive exploration. Experiments on robotic manipulation benchmarks demonstrate that NS-VLA outperforms previous methods in both one-shot training and…
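
As a rough illustration of the encoder-then-solver pipeline the abstract describes, the toy sketch below turns an instruction into symbolic primitives and then expands them into an action sequence. It is not the authors' implementation: the `Primitive` type, the keyword-based `encode`, the `EXPANSIONS` table, and the action names are all hypothetical stand-ins, vision input is ignored, and the online RL stage is omitted entirely.

```python
from dataclasses import dataclass


# Hypothetical primitive type; the paper's actual primitive vocabulary is not given here.
@dataclass(frozen=True)
class Primitive:
    name: str    # e.g. "pick" or "place"
    target: str  # object or location the primitive applies to


def encode(instruction: str) -> list[Primitive]:
    """Stand-in for the symbolic encoder: a keyword parser, not a learned model."""
    words = instruction.lower().replace(",", " ").split()
    plan = []
    # Each known verb is paired with the word that follows it.
    for i, word in enumerate(words[:-1]):
        if word in ("pick", "place"):
            plan.append(Primitive(word, words[i + 1]))
    return plan


# Hypothetical table mapping each primitive to low-level action templates.
EXPANSIONS = {
    "pick": ["move_to({t})", "close_gripper()"],
    "place": ["move_to({t})", "open_gripper()"],
}


def solve(plan: list[Primitive]) -> list[str]:
    """Stand-in for the symbolic solver: expand primitives into an action sequence."""
    return [step.format(t=p.target) for p in plan for step in EXPANSIONS[p.name]]


actions = solve(encode("pick cup, place shelf"))
print(actions)
```

The point of the sketch is the reuse: both primitives share the `move_to` template, so a new instruction built from known primitives needs no new low-level data, which is the kind of data efficiency the abstract attributes to the symbolic solver.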

Taxonomy

Topics: Multimodal Machine Learning Applications · Robot Manipulation and Learning · Reinforcement Learning in Robotics