Affordance Field Intervention: Enabling VLAs to Escape Memory Traps in Robotic Manipulation
Siyu Xu, Zijian Wang, Yunke Wang, Chenghao Xia, Tao Huang, Chang Xu

TL;DR
This paper introduces Affordance Field Intervention (AFI), a hybrid framework that enhances vision-language-action models in robotic manipulation by using 3D affordance fields to improve adaptability and robustness in changing environments.
Contribution
The paper proposes a novel AFI framework that integrates 3D spatial affordance fields as a plug-in to guide and improve VLA models' performance under distribution shifts.
Findings
Achieves a 23.5% average improvement in out-of-distribution scenarios on real robots.
Improves robustness of VLA models by guiding actions with affordance-based cues.
Demonstrates a 20.2% performance boost on the LIBERO-Pro benchmark.
Abstract
Vision-Language-Action (VLA) models have shown great performance in robotic manipulation by mapping visual observations and language instructions directly to actions. However, they remain brittle under distribution shifts: when test scenarios change, VLAs often reproduce memorized trajectories instead of adapting to the updated scene, which is a failure mode we refer to as the "Memory Trap". This limitation stems from the end-to-end design, which lacks explicit 3D spatial reasoning and prevents reliable identification of actionable regions in unfamiliar environments. To compensate for this missing spatial understanding, 3D Spatial Affordance Fields (SAFs) can provide a geometric representation that highlights where interactions are physically feasible, offering explicit cues about regions the robot should approach or avoid. We therefore introduce Affordance Field Intervention (AFI), a…
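To make the idea of a spatial affordance field more concrete, here is a minimal toy sketch. It is not the paper's method, whose details are not given in the abstract excerpt above. It only assumes that a field can be modeled as a scalar function over 3D points that is high where interaction is feasible and low where it should be avoided, and that a waypoint proposed by a policy can be nudged uphill on that field. The names `affordance` and `nudge`, the Gaussian bumps, and all parameter values are hypothetical choices for illustration.

```python
import math

def affordance(p, target, obstacle, sigma=0.1):
    """Toy scalar field (assumption): high near the target, low near the obstacle."""
    def bump(center):
        d2 = sum((a - b) ** 2 for a, b in zip(p, center))
        return math.exp(-d2 / (2 * sigma ** 2))
    return bump(target) - bump(obstacle)

def nudge(waypoint, target, obstacle, step=0.01, iters=50, eps=1e-4):
    """Move a proposed waypoint uphill on the field with fixed-length steps.

    The gradient is estimated by central finite differences, so the field
    can be any black-box function of a 3D point.
    """
    p = list(waypoint)
    for _ in range(iters):
        grad = []
        for i in range(3):
            hi = p.copy()
            lo = p.copy()
            hi[i] += eps
            lo[i] -= eps
            diff = affordance(hi, target, obstacle) - affordance(lo, target, obstacle)
            grad.append(diff / (2 * eps))
        norm = math.sqrt(sum(g * g for g in grad))
        if norm < 1e-9:  # flat region: nothing to follow
            break
        p = [x + step * g / norm for x, g in zip(p, grad)]
    return tuple(p)

# Hypothetical scene: the object has moved to `target`, but the policy still
# proposes a waypoint near its old, memorized location.
target = (0.5, 0.0, 0.2)
obstacle = (0.3, 0.2, 0.2)
memorized = (0.4, 0.05, 0.2)
corrected = nudge(memorized, target, obstacle)
```

In this toy scene the corrected waypoint ends up close to the relocated target and scores higher on the field than the memorized one. A real system would derive the field from perception rather than from hand-placed Gaussians.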
Taxonomy
Topics: Multimodal Machine Learning Applications · Robot Manipulation and Learning · Robotic Path Planning Algorithms
