TESPEC: Temporally-Enhanced Self-Supervised Pretraining for Event Cameras
Mohammad Mohammadi, Ziyi Wu, Igor Gilitschenski

TL;DR
TESPEC introduces a novel self-supervised pretraining framework for event cameras that leverages long-term event sequences and a new reconstruction target, significantly improving performance in perception tasks.
Contribution
TESPEC is the first framework to use long event sequences for pretraining, and it employs a new pseudo-video reconstruction target tailored to event-based data.
Findings
Achieves state-of-the-art results in object detection, segmentation, and depth estimation.
Robust to sensor noise and reduces motion blur in event data.
Effectively captures long-term temporal information for perception tasks.
Abstract
Long-term temporal information is crucial for event-based perception tasks, as raw events only encode pixel brightness changes. Recent works show that when trained from scratch, recurrent models achieve better results than feedforward models in these tasks. However, when leveraging self-supervised pre-trained weights, feedforward models can outperform their recurrent counterparts. Current self-supervised learning (SSL) methods for event-based pre-training largely mimic RGB image-based approaches. They pre-train feedforward models on raw events within a short time interval, ignoring the temporal information of events. In this work, we introduce TESPEC, a self-supervised pre-training framework tailored for learning spatio-temporal information. TESPEC is well-suited for recurrent models, as it is the first framework to leverage long event sequences during pre-training. TESPEC employs the…
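The contrast the abstract draws between pre-training on a single short time interval and pre-training on a long event sequence can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not taken from the paper: the `Event` record layout, the per-pixel polarity histogram representation, the helper names `events_to_histogram` and `split_into_windows`, and the 1000-microsecond window length.

```python
from collections import namedtuple

# Hypothetical event record (an assumption, not the paper's format):
# timestamp in microseconds, pixel coordinates, polarity (+1 or -1).
Event = namedtuple("Event", ["t", "x", "y", "p"])


def events_to_histogram(events, height, width):
    """Count positive and negative events per pixel: hist[channel][y][x]."""
    hist = [[[0] * width for _ in range(height)] for _ in range(2)]
    for e in events:
        channel = 0 if e.p > 0 else 1  # channel 0: positive, channel 1: negative
        hist[channel][e.y][e.x] += 1
    return hist


def split_into_windows(events, window_us):
    """Group time-sorted events into consecutive fixed-length windows."""
    if not events:
        return []
    t0 = events[0].t
    num_windows = (events[-1].t - t0) // window_us + 1
    windows = [[] for _ in range(num_windows)]
    for e in events:
        windows[(e.t - t0) // window_us].append(e)
    return windows


# A tiny synthetic stream on a 2x2 sensor.
stream = [
    Event(0, 0, 0, +1),
    Event(400, 1, 0, -1),
    Event(1200, 1, 1, +1),
    Event(2500, 0, 1, +1),
]

windows = split_into_windows(stream, window_us=1000)

# Short-interval input: one window only, as a feedforward model would see it.
short_input = events_to_histogram(windows[0], height=2, width=2)

# Long-sequence input: every window in order, as a recurrent model could consume it.
long_input = [events_to_histogram(w, height=2, width=2) for w in windows]
```

In this sketch `short_input` discards everything after the first window, while `long_input` keeps the ordered sequence that a recurrent model could process step by step.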
Taxonomy
Topics: Advanced Memory and Neural Computing · Ferroelectric and Negative Capacitance Devices · Domain Adaptation and Few-Shot Learning
