Revealing Latent Information: A Physics-inspired Self-supervised Pre-training Framework for Noisy and Sparse Events

Lin Zhu; Ruonan Liu; Xiao Wang; Lizhi Wang; Hua Huang

arXiv:2508.05507 · cs.CV · August 8, 2025

TL;DR

This paper introduces a physics-inspired self-supervised pre-training framework for event camera data, effectively revealing latent information like edges and textures, and improving performance on multiple vision tasks despite noise and sparsity.

Contribution

The paper proposes a three-stage pre-training framework that improves feature extraction from noisy, sparse event data, with reported gains on object recognition, semantic segmentation, and optical flow estimation.

Findings

01 Outperforms state-of-the-art methods on object recognition

02 Improves semantic segmentation accuracy

03 Enhances optical flow estimation robustness

Abstract

The event camera, a novel neuromorphic vision sensor, records data with high temporal resolution and wide dynamic range, offering new possibilities for accurate visual representation in challenging scenarios. However, event data is inherently sparse and noisy and mainly reflects brightness changes, which complicates effective feature extraction. To address this, we propose a self-supervised pre-training framework that fully reveals latent information in event data, including edge information and texture cues. Our framework consists of three stages. Difference-guided Masked Modeling, inspired by the physical sampling process of events, reconstructs temporal intensity difference maps to extract enhanced information from raw event data. Backbone-fixed Feature Transition contrasts event and image features without updating the backbone, to preserve representations learned from masked modeling and…
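The idea behind the first stage can be illustrated with a minimal sketch. It assumes the idealized event generation model, in which each event signals a log-intensity change of one contrast threshold, so summing signed polarities over a time window approximates a temporal intensity difference map. The function names, the threshold value, the patch size, and the masked mean-squared-error objective are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def events_to_difference_map(xs, ys, polarities, height, width,
                             contrast_threshold=0.2):
    """Approximate a temporal log-intensity difference map from events.

    Assumption: each event marks a change of +/- contrast_threshold in
    log intensity, so the scaled per-pixel polarity sum approximates the
    difference over the time window. The threshold value is hypothetical.
    """
    diff = np.zeros((height, width), dtype=np.float32)
    signs = np.where(np.asarray(polarities) > 0, 1.0, -1.0).astype(np.float32)
    # np.add.at accumulates correctly when several events hit one pixel.
    np.add.at(diff, (np.asarray(ys), np.asarray(xs)), signs)
    return contrast_threshold * diff


def random_patch_mask(height, width, patch=4, mask_ratio=0.5, rng=None):
    """Return a boolean (height, width) mask; True marks masked pixels."""
    rng = np.random.default_rng() if rng is None else rng
    grid_h, grid_w = height // patch, width // patch
    n_patches = grid_h * grid_w
    n_masked = int(round(mask_ratio * n_patches))
    flat = np.zeros(n_patches, dtype=bool)
    flat[rng.choice(n_patches, size=n_masked, replace=False)] = True
    grid = flat.reshape(grid_h, grid_w)
    # Expand each patch-level decision to pixel resolution.
    return np.repeat(np.repeat(grid, patch, axis=0), patch, axis=1)


def masked_mse(prediction, target, mask):
    """Mean squared error computed only over the masked pixels."""
    return float(np.mean((prediction[mask] - target[mask]) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    h, w = 8, 8
    # Toy event stream: random pixel locations and binary polarities.
    xs = rng.integers(0, w, size=50)
    ys = rng.integers(0, h, size=50)
    ps = rng.integers(0, 2, size=50)
    target = events_to_difference_map(xs, ys, ps, h, w)
    mask = random_patch_mask(h, w, patch=4, mask_ratio=0.5, rng=rng)
    # Stand-in for a network output: an all-zero prediction.
    print(masked_mse(np.zeros_like(target), target, mask))
```

In an actual masked-modeling setup, a network would see only the unmasked event patches and be trained to predict the difference map at the masked locations; the all-zero prediction here only serves to exercise the loss.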
