Temporal-Guided Visual Foundation Models for Event-Based Vision

Ruihao Xia; Junhong Cai; Luziwei Leng; Liuyi Wang; Chengju Liu; Ran Cheng; Yang Tang; Pan Zhou

arXiv:2511.06238 · cs.CV · November 11, 2025

TL;DR

This paper introduces TGVFM, a framework that combines pretrained visual foundation models with temporal attention mechanisms to enhance event-based vision tasks like segmentation, depth estimation, and detection, achieving state-of-the-art results.

Contribution

The paper presents a novel framework integrating pretrained VFMs with a temporal context fusion block, enabling effective event-based vision processing with improved performance.

Findings

01 Achieves 16% improvement in semantic segmentation

02 Achieves 21% improvement in depth estimation

03 Achieves 16% improvement in object detection

Abstract

Event cameras offer unique advantages for vision tasks in challenging environments, yet processing asynchronous event streams remains an open challenge. While existing methods rely on specialized architectures or resource-intensive training, the potential of leveraging modern Visual Foundation Models (VFMs) pretrained on image data remains under-explored for event-based vision. To address this, we propose Temporal-Guided VFM (TGVFM), a novel framework that seamlessly integrates VFMs with our temporal context fusion block to bridge this gap. Our temporal block introduces three key components: (1) Long-Range Temporal Attention to model global temporal dependencies, (2) Dual Spatiotemporal Attention for multi-scale frame correlation, and (3) Deep Feature Guidance Mechanism to fuse semantic-temporal features. By retraining event-to-video models on real-world data and leveraging…
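The abstract names Long-Range Temporal Attention without specifying its form, so the following is only an illustrative sketch of the general idea of attending across time, not the authors' implementation. It assumes plain scaled dot-product self-attention applied to the feature vectors of one spatial location over T frames, with identity query/key/value projections. The function names `softmax` and `temporal_attention` and the toy inputs are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_attention(frames):
    """Scaled dot-product self-attention across time for one spatial location.

    frames: list of T feature vectors (each a list of C floats), one per frame.
    Assumption: queries, keys, and values are the raw features (identity
    projections); a trained model would use learned projections instead.
    Returns (outputs, weights): outputs has T vectors of length C, and
    weights[t] is the attention distribution of frame t over all T frames.
    """
    dim = len(frames[0])
    scale = 1.0 / math.sqrt(dim)
    outputs, weights = [], []
    for query in frames:
        # Similarity of this frame to every frame in the sequence.
        scores = [scale * sum(q * k for q, k in zip(query, key)) for key in frames]
        attn = softmax(scores)
        # Weighted sum of all frames' features, channel by channel.
        fused = [sum(a * frame[c] for a, frame in zip(attn, frames)) for c in range(dim)]
        outputs.append(fused)
        weights.append(attn)
    return outputs, weights


# Toy usage: three frames with two-channel features.
outputs, weights = temporal_attention([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

Because every frame attends to every other frame, each output vector can draw on the whole sequence rather than only on neighbouring frames, which is the sense in which such attention is "long-range".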

Taxonomy

Topics: Advanced Memory and Neural Computing · Ferroelectric and Negative Capacitance Devices · Advanced Neural Network Applications