TL;DR
This paper presents a self-supervised learning approach to enhance event-stream representations for event-based vision, improving quality and versatility across various tasks and cameras without additional fine-tuning.
Contribution
Introduces EvRepSL, a novel data-driven, self-supervised method that refines event-stream representations based on a new spatial-temporal statistic and theoretical insights.
Findings
Outperforms existing event-stream representations in classification and optical flow tasks.
Demonstrates versatility across different event cameras and vision tasks.
Eliminates the need for fine-tuning or retraining during deployment.
Abstract
Event-stream representation is the first step for many computer vision tasks using event cameras. It converts the asynchronous event-streams into a formatted structure so that conventional machine learning models can be applied easily. However, most of the state-of-the-art event-stream representations are manually designed and the quality of these representations cannot be guaranteed due to the noisy nature of event-streams. In this paper, we introduce a data-driven approach aiming at enhancing the quality of event-stream representations. Our approach commences with the introduction of a new event-stream representation based on spatial-temporal statistics, denoted as EvRep. Subsequently, we theoretically derive the intrinsic relationship between asynchronous event-streams and synchronous video frames. Building upon this theoretical relationship, we train a representation generator,…
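To make the idea of a statistics-based event-stream representation concrete, here is a minimal sketch of how an asynchronous stream could be binned into a fixed-size grid. It is not the paper's definition of EvRep, which the truncated abstract does not give. The three channels used here (per-pixel event count, net polarity, and spread of timestamps) are assumptions chosen for illustration, as are the function name `events_to_grid` and the `(x, y, t, p)` event layout.

```python
from statistics import pstdev


def events_to_grid(events, height, width):
    """Bin events into three per-pixel channels (illustrative, not EvRep itself).

    events: iterable of (x, y, t, p) tuples, with polarity p in {+1, -1}.
    Returns (count, polarity, spread), each a height x width nested list.
    """
    count = [[0] * width for _ in range(height)]
    polarity = [[0] * width for _ in range(height)]
    stamps = [[[] for _ in range(width)] for _ in range(height)]

    for x, y, t, p in events:
        count[y][x] += 1          # spatial statistic: how many events fired here
        polarity[y][x] += p       # net brightness change at this pixel
        stamps[y][x].append(t)    # keep timestamps for the temporal statistic

    # Temporal statistic: population standard deviation of the timestamps,
    # set to 0.0 for pixels with fewer than two events.
    spread = [
        [pstdev(ts) if len(ts) > 1 else 0.0 for ts in row]
        for row in stamps
    ]
    return count, polarity, spread


# Hypothetical three-event stream on a 1 x 2 sensor.
events = [(0, 0, 0.0, 1), (0, 0, 2.0, -1), (1, 0, 1.0, 1)]
count, polarity, spread = events_to_grid(events, height=1, width=2)
```

The output is a dense, fixed-shape structure that a conventional model can consume, which is the role the abstract assigns to an event-stream representation. A learned generator such as the one the abstract describes would take a representation of this kind as input and refine it.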
