EchoVLA: Synergistic Declarative Memory for VLA-Driven Mobile Manipulation
Min Lin, Xiwen Liang, Bingqian Lin, Liu Jingzhi, Zijian Jiao, Kehan Li, Yu Sun, Weijia Liufu, Yuhan Ma, Yuecheng Liu, Shen Zhao, Yuzheng Zhuang, Xiaodan Liang

TL;DR
EchoVLA is a memory-enhanced vision-language-action (VLA) model for mobile manipulation that integrates scene and episodic memories to improve navigation and manipulation in changing environments.
Contribution
The paper presents EchoVLA, a novel memory-aware VLA model with a human-inspired declarative memory system for mobile manipulation tasks.
Findings
Achieves a success rate of 0.52 on manipulation/navigation tasks in simulation.
Outperforms the baseline by +0.20 and +0.11 in success rate.
Demonstrates effectiveness in both simulated and real-world environments.
Abstract
Recent progress in Vision-Language-Action (VLA) models has enabled embodied agents to interpret multimodal instructions and perform complex tasks. However, existing VLAs are mostly confined to short-horizon, table-top manipulation, lacking the memory and reasoning capability required for mobile manipulation, where agents must coordinate navigation and manipulation under changing spatial contexts. In this work, we present EchoVLA, a memory-aware VLA model for mobile manipulation. EchoVLA incorporates a synergistic declarative memory inspired by the human brain, consisting of a scene memory that maintains a collection of spatial-semantic maps and an episodic memory that stores task-level experiences with multimodal contextual features. The two memories are individually stored, updated, and retrieved based on current observations, task history, and instructions, and their retrieved…
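To make the two-memory idea concrete, here is a minimal sketch of separate scene and episodic stores that are written and queried independently. Everything in it is an assumption for illustration: the `MemoryStore` class, the cosine-similarity lookup, the toy 2-D embeddings, and the final list concatenation are hypothetical stand-ins, not the paper's actual storage, update, or retrieval mechanism, which this excerpt does not specify.

```python
import math
from dataclasses import dataclass, field


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


@dataclass
class MemoryStore:
    """Hypothetical key-value memory queried by embedding similarity."""
    keys: list = field(default_factory=list)
    values: list = field(default_factory=list)

    def write(self, key, value):
        # Append a new entry; a real system would also update or merge entries.
        self.keys.append(key)
        self.values.append(value)

    def retrieve(self, query, k=1):
        # Rank stored entries by similarity to the query and return the top k.
        ranked = sorted(
            range(len(self.keys)),
            key=lambda i: cosine(query, self.keys[i]),
            reverse=True,
        )
        return [self.values[i] for i in ranked[:k]]


# Two independent stores, mirroring the scene/episodic split (toy data).
scene_memory = MemoryStore()
episodic_memory = MemoryStore()

scene_memory.write([1.0, 0.0], "kitchen map")
scene_memory.write([0.0, 1.0], "living-room map")
episodic_memory.write([0.9, 0.1], "opened the fridge")
episodic_memory.write([0.1, 0.9], "picked up the remote")

# A query embedding standing in for current observation plus instruction.
query = [0.8, 0.2]

# Placeholder combination step: simple concatenation of the two results.
context = scene_memory.retrieve(query) + episodic_memory.retrieve(query)
print(context)
```

Keeping the stores separate lets each one use its own key type and update rule, which is the property the abstract emphasizes; the similarity search here is only one of many ways such a lookup could be done.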
Taxonomy
Topics: Multimodal Machine Learning Applications · Robot Manipulation and Learning · Robotic Path Planning Algorithms
