STaR: Scalable Task-Conditioned Retrieval for Long-Horizon Multimodal Robot Memory
Mingfeng Yuan, Hao Zhang, Mahan Mohammadi, Runhao Li, Jinjun Shan, Steven L. Waslander

TL;DR
STaR introduces a scalable, task-conditioned multimodal memory framework for long-horizon robot reasoning, enabling efficient retrieval and reasoning over diverse, dynamic environments for navigation and task execution.
Contribution
The paper presents STaR, a framework for robotic applications that constructs a generalizable, multimodal long-term memory and introduces a scalable retrieval algorithm based on the Information Bottleneck principle.
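For orientation, the generic Information Bottleneck objective that this kind of retrieval builds on can be written as below. This is the standard textbook form, not the paper's own formulation. The reading of the symbols is an assumption made for illustration: M stands for the long-term memory, Z for the compact retrieved subset, and Y for the task-relevant answer.

```latex
% Generic Information Bottleneck objective (standard form, not taken from the paper).
% Assumed roles: M = long-term memory, Z = compact retrieved representation,
% Y = task-relevant target, \beta > 0 = trade-off weight.
\min_{p(z \mid m)} \; I(M; Z) \;-\; \beta \, I(Z; Y)
```

The first term penalizes how much of the memory is kept, and the second rewards how informative the kept part is about the task. A larger β favors task relevance over compression.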
Findings
Outperforms baselines on NaVQA and WH-VQA benchmarks
Achieves higher success rates and lower spatial errors
Demonstrates robustness and scalability on real robots
Abstract
Mobile robots are often deployed over long durations in diverse open, dynamic scenes, including indoor settings such as warehouses and manufacturing facilities, and outdoor settings such as agricultural and roadway operations. A core challenge is to build a scalable long-horizon memory that supports an agentic workflow for planning, retrieval, and reasoning over open-ended instructions at variable granularity, while producing precise, actionable answers for navigation. We present STaR, an agentic reasoning framework that (i) constructs a task-agnostic, multimodal long-term memory that generalizes to unseen queries while preserving fine-grained environmental semantics (object attributes, spatial relations, and dynamic events), and (ii) introduces a Scalable Task Conditioned Retrieval algorithm based on the Information Bottleneck principle to extract from long-term memory a compact,…
Taxonomy
Topics: Multimodal Machine Learning Applications · Constraint Satisfaction and Optimization · Robotics and Sensor-Based Localization
