GraSP-VLA: Graph-based Symbolic Action Representation for Long-Horizon Planning with VLA Policies

Maëlic Neau; Zoe Falomir; Paulo E. Santos; Anne-Gwenn Bosser; Cédric Buche

arXiv:2511.04357 · cs.RO · November 7, 2025

TL;DR

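GraSP-VLA is a neuro-symbolic framework that combines scene graphs with VLA policies to improve long-horizon planning and skill learning from demonstrations in autonomous robots. It targets two limitations of existing methods: the lack of high-level symbolic planning in VLA models, and the limited generalization and scalability of symbolic Action Model Learning.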

Contribution

The paper proposes a neuro-symbolic approach that uses scene graphs for symbolic planning and VLA policies for execution, with the aim of improving scalability and generalization.

Findings

01. Effective in automatic planning domain generation from observations

02. Successful orchestration of low-level VLA policies in long-horizon tasks

03. Demonstrates potential in real-world robotic applications

Abstract

Deploying autonomous robots that can learn new skills from demonstrations is an important challenge in modern robotics. Existing solutions often apply end-to-end imitation learning with Vision-Language-Action (VLA) models or symbolic approaches with Action Model Learning (AML). On the one hand, current VLA models are limited by the lack of high-level symbolic planning, which hinders their abilities in long-horizon tasks. On the other hand, symbolic approaches in AML lack generalization and scalability. In this paper, we present a new neuro-symbolic approach, GraSP-VLA, a framework that uses a Continuous Scene Graph representation to generate a symbolic representation of human demonstrations. This representation is used to generate new planning domains during inference and serves as an orchestrator for low-level VLA policies, scaling up the number of actions that can be…
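
The pipeline described in the abstract (scene graphs from demonstrations, a symbolic planning domain, orchestration of low-level policies) can be illustrated with a toy sketch. This is not the authors' implementation. Every name here (`Fact`, `Action`, `learn_action`, `plan`, `execute`) is hypothetical, scene graphs are assumed to be sets of (subject, relation, object) triples, action learning is reduced to a before/after set difference, the planner is a plain breadth-first search, and the VLA policies are stand-in callables.

```python
from collections import deque
from dataclasses import dataclass

# Assumption: a scene-graph fact is a (subject, relation, object) triple.
Fact = tuple[str, str, str]


@dataclass(frozen=True)
class Action:
    """STRIPS-like symbolic action (hypothetical structure)."""
    name: str
    preconditions: frozenset
    add_effects: frozenset
    del_effects: frozenset


def learn_action(name, before, after):
    """Derive an action from scene graphs seen before and after a demonstration.

    Simplification: the preconditions are the facts the action removes.
    """
    before, after = frozenset(before), frozenset(after)
    removed = before - after
    return Action(name, removed, after - before, removed)


def plan(state, goal, actions):
    """Breadth-first search for a shortest action sequence reaching the goal."""
    start, goal = frozenset(state), frozenset(goal)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        current, steps = queue.popleft()
        if goal <= current:
            return steps
        for action in actions:
            if action.preconditions <= current:
                nxt = (current - action.del_effects) | action.add_effects
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, steps + [action.name]))
    return None  # no plan exists with the known actions


def execute(steps, policies):
    """Dispatch each symbolic step to its low-level policy (stub callables here)."""
    return [policies[step]() for step in steps]


# Two toy demonstrations, each given as (before, after) scene graphs.
actions = [
    learn_action(
        "pick_cup",
        {("cup", "on", "table"), ("gripper", "is", "empty")},
        {("gripper", "holding", "cup")},
    ),
    learn_action(
        "place_cup",
        {("gripper", "holding", "cup")},
        {("cup", "in", "sink"), ("gripper", "is", "empty")},
    ),
]

initial = {("cup", "on", "table"), ("gripper", "is", "empty")}
goal = {("cup", "in", "sink")}
steps = plan(initial, goal, actions)

# Stand-ins for VLA policies: each just reports that it ran.
policies = {a.name: (lambda n=a.name: f"ran {n}") for a in actions}
log = execute(steps, policies)
```

In this sketch the planner only sequences symbolic steps, and each step is handed to a separate policy. That division of labor is the point being illustrated; how GraSP-VLA actually builds its domains and invokes its policies is described in the paper itself.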

Taxonomy

Topics: Reinforcement Learning in Robotics · Multimodal Machine Learning Applications · AI-based Problem Solving and Planning