VehicleMemBench: An Executable Benchmark for Multi-User Long-Term Memory in In-Vehicle Agents

Yuhao Chen; Yi Xu; Xinyun Ding; Xiang Fang; Shuochen Liu; Luxi Lin; Qingyu Zhang; Ya Li; Quan Liu; Tong Xu

arXiv:2603.23840·cs.AI·March 26, 2026

TL;DR

VehicleMemBench is a new benchmark designed to evaluate multi-user long-term memory and decision-making in in-vehicle agents, addressing the limitations of existing static, single-user benchmarks.

Contribution

It introduces a comprehensive, executable simulation benchmark with multi-user memory modeling, tool interaction, and dynamic preference evolution, enabling more realistic evaluation of in-vehicle AI systems.

Findings

01. Powerful models excel at direct instructions but struggle with memory evolution.

02. Advanced memory systems face challenges in domain-specific, long-term memory tasks.

03. Dynamic user preferences significantly impact model performance.

Abstract

With the growing demand for intelligent in-vehicle experiences, vehicle-based agents are evolving from simple assistants to long-term companions. This evolution requires agents to continuously model multi-user preferences and make reliable decisions in the face of inter-user preference conflicts and changing habits over time. However, existing benchmarks are largely limited to single-user, static question-answer settings, failing to capture the temporal evolution of preferences and the multi-user, tool-interactive nature of real vehicle environments. To address this gap, we introduce VehicleMemBench, a multi-user long-context memory benchmark built on an executable in-vehicle simulation environment. The benchmark evaluates tool use and memory by comparing the post-action environment state with a predefined target state, enabling objective and reproducible evaluation without LLM-based or…
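
The state-comparison idea in the abstract can be illustrated with a minimal sketch. This is not the benchmark's actual code: the function names (`flatten`, `state_mismatches`, `task_passed`) and the state fields (`climate.driver_temp_c`, `media.volume`) are hypothetical, and the choice to check only the fields named in the target state is an assumption made for the example.

```python
from typing import Any, Dict, List, Tuple

State = Dict[str, Any]
_MISSING = object()  # sentinel for fields absent from the post-action state


def flatten(state: State, prefix: str = "") -> Dict[str, Any]:
    """Flatten a nested state dict into dotted paths, e.g. 'climate.fan'."""
    flat: Dict[str, Any] = {}
    for key, value in state.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def state_mismatches(post_action: State, target: State) -> List[Tuple[str, Any, Any]]:
    """Return (path, expected, actual) for every target field that differs.

    Assumption: only fields named in the target are checked, so extra
    fields in the post-action state are ignored.
    """
    actual = flatten(post_action)
    expected = flatten(target)
    mismatches = []
    for path in sorted(expected):
        found = actual.get(path, _MISSING)
        if found is _MISSING or found != expected[path]:
            mismatches.append((path, expected[path], None if found is _MISSING else found))
    return mismatches


def task_passed(post_action: State, target: State) -> bool:
    """A task passes when the post-action state matches every target field."""
    return not state_mismatches(post_action, target)


# Hypothetical example: the agent set the temperature correctly
# but left the media volume at the wrong level.
target_state = {"climate": {"driver_temp_c": 21}, "media": {"volume": 8}}
post_action_state = {
    "climate": {"driver_temp_c": 21, "fan": 2},
    "media": {"volume": 12},
}

print(state_mismatches(post_action_state, target_state))
print(task_passed(post_action_state, target_state))
```

Because the check is a deterministic comparison of two state snapshots, the same run always yields the same verdict, which is the property the abstract describes as objective and reproducible.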

Taxonomy

Topics: Autonomous Vehicle Technology and Safety · Social Robot Interaction and HRI · Human-Automation Interaction and Safety