SEER-VAR: Semantic Egocentric Environment Reasoner for Vehicle Augmented Reality
Yuzhi Lai, Shenghai Yuan, Peizheng Li, Jun Lou, Andreas Zell

TL;DR
SEER-VAR is an egocentric vehicle AR framework that combines semantic scene decomposition, dual SLAM tracking, and LLM-based recommendations, with the aim of improving scene awareness and driver assistance across diverse driving environments.
Contribution
The paper introduces a dynamic cabin/road scene separation method, two context-aware SLAM branches, and LLM-driven AR overlays, along with EgoSLAM-Drive, a new dataset for egocentric driving scenarios.
Findings
Achieves robust spatial alignment in varied environments
Enhances perceived scene understanding and overlay relevance
Improves driver ease and safety through AR overlays
Abstract
We present SEER-VAR, a novel framework for egocentric vehicle-based augmented reality (AR) that unifies semantic decomposition, Context-Aware SLAM Branches (CASB), and LLM-driven recommendation. Unlike existing systems that assume static or single-view settings, SEER-VAR dynamically separates cabin and road scenes via depth-guided vision-language grounding. Two SLAM branches track egocentric motion in each context, while a GPT-based module generates context-aware overlays such as dashboard cues and hazard alerts. To support evaluation, we introduce EgoSLAM-Drive, a real-world dataset featuring synchronized egocentric views, 6DoF ground-truth poses, and AR annotations across diverse driving scenarios. Experiments demonstrate that SEER-VAR achieves robust spatial alignment and perceptually coherent AR rendering across varied environments. As one of the first to explore LLM-based AR…
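To make the cabin/road separation idea concrete, here is a minimal sketch of how features could be routed to two tracking branches. It is not the authors' method: the abstract describes depth-guided vision-language grounding, whereas this sketch swaps in a plain depth threshold. The names `split_cabin_road` and `route_features` and the `cabin_max_depth_m` cutoff of 1.5 m are hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Tuple

Mask = List[List[bool]]
Pixel = Tuple[int, int]  # (row, col)


def split_cabin_road(
    depth_m: List[List[float]],
    cabin_max_depth_m: float = 1.5,  # assumed cutoff, not taken from the paper
) -> Tuple[Mask, Mask]:
    """Split a per-pixel depth map (metres) into cabin and road masks.

    Pixels with valid depth up to the cutoff are treated as cabin, pixels
    beyond it as road. Non-positive depth is treated as invalid and falls
    into neither mask.
    """
    cabin = [[0.0 < d <= cabin_max_depth_m for d in row] for row in depth_m]
    road = [[d > cabin_max_depth_m for d in row] for row in depth_m]
    return cabin, road


def route_features(
    points: List[Pixel],
    depth_m: List[List[float]],
    cabin_max_depth_m: float = 1.5,
) -> Dict[str, List[Pixel]]:
    """Assign each feature pixel to the 'cabin' or 'road' branch."""
    cabin, road = split_cabin_road(depth_m, cabin_max_depth_m)
    routed: Dict[str, List[Pixel]] = {"cabin": [], "road": []}
    for r, c in points:
        if cabin[r][c]:
            routed["cabin"].append((r, c))
        elif road[r][c]:
            routed["road"].append((r, c))
        # Features with invalid depth are dropped from both branches.
    return routed


# Toy 2x3 depth map: near pixels (dashboard), far pixels (road), one invalid.
depth = [
    [0.6, 0.8, 25.0],
    [0.0, 1.2, 40.0],
]
features = [(0, 0), (0, 2), (1, 0), (1, 1), (1, 2)]
routed = route_features(features, depth)
```

In a full system each routed feature set would feed its own SLAM branch, so that cabin-anchored overlays and road-anchored overlays are tracked against the appropriate reference frame.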
