Event Camera Data Pre-training
Yan Yang, Liyuan Pan, Liu Liu

TL;DR
This paper introduces a self-supervised pre-training framework for event camera data that leverages paired RGB images and novel training strategies to improve downstream task performance.
Contribution
It presents a new self-supervised learning approach with event data augmentation, conditional masking, and contrastive learning, enhancing event camera data understanding.
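As a rough illustration of the conditional masking idea, the sketch below keeps a subset of patches from an event image, sampling them with probability proportional to how many events each patch contains. The activity-based condition, the function name `select_event_patches`, and the parameters `patch_size` and `keep_ratio` are assumptions made for illustration; the paper's actual sampling condition is not stated here.

```python
import numpy as np

def select_event_patches(event_image, patch_size=16, keep_ratio=0.25, rng=None):
    """Return sorted indices of patches to keep (all other patches are masked).

    Assumption: a patch is "informative" when it contains many events, so
    patches are sampled with probability proportional to their summed
    absolute event count. This is a stand-in, not the paper's rule.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = event_image.shape
    gh, gw = h // patch_size, w // patch_size

    # Crop to a whole number of patches, then view as a (gh, gw) grid of patches.
    cropped = event_image[: gh * patch_size, : gw * patch_size]
    patches = cropped.reshape(gh, patch_size, gw, patch_size)

    # Per-patch activity, flattened in row-major patch order.
    activity = np.abs(patches).sum(axis=(1, 3)).ravel()

    # A small constant keeps every patch selectable, even empty ones.
    weights = activity + 1e-6
    probs = weights / weights.sum()

    n_keep = max(1, int(round(keep_ratio * activity.size)))
    keep = rng.choice(activity.size, size=n_keep, replace=False, p=probs)
    return np.sort(keep)
```

Only the selected patches would be passed to the encoder, which is one plausible reading of how masking can shorten training: fewer tokens are processed per image.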
Findings
Achieves 64.83% top-1 accuracy on N-ImageNet.
Outperforms state-of-the-art methods in downstream tasks.
Introduces an embedding projection loss and a probability distribution alignment loss.
Abstract
This paper proposes a pre-trained neural network for handling event camera data. Our model is a self-supervised learning framework that uses paired event camera data and natural RGB images for training. Our method contains three modules connected in sequence: i) a family of event data augmentations, generating meaningful event images for self-supervised training; ii) a conditional masking strategy to sample informative event patches from event images, encouraging our model to capture the spatial layout of a scene and accelerating training; iii) a contrastive learning approach, enforcing the similarity of embeddings between matching event images, and between paired event and RGB images. An embedding projection loss is proposed to avoid model collapse when enforcing the event image embedding similarities. A probability distribution alignment loss is proposed to encourage the event…
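The contrastive step described in the abstract can be sketched with a generic InfoNCE objective, applied once between two augmented views of the same event image and once between an event image and its paired RGB image. InfoNCE is swapped in here as a common stand-in; the function names, the temperature value, and the way the two terms are summed are assumptions, and the paper's embedding projection loss and distribution alignment loss are not reproduced.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def info_nce(anchor, positive, temperature=0.1):
    """Generic InfoNCE loss over a batch of embeddings of shape (batch, dim).

    Row i of `anchor` is pulled toward row i of `positive`; every other
    row in `positive` serves as a negative for that anchor.
    """
    a = l2_normalize(anchor)
    p = l2_normalize(positive)
    logits = a @ p.T / temperature

    # Subtract the row-wise max before exponentiating, for numerical stability.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Matching pairs sit on the diagonal of the similarity matrix.
    return -np.mean(np.diag(log_prob))

def pretraining_loss(event_view_a, event_view_b, rgb, temperature=0.1):
    """Hypothetical combination: event-to-event term plus event-to-RGB term."""
    return (info_nce(event_view_a, event_view_b, temperature)
            + info_nce(event_view_a, rgb, temperature))
```

In this sketch the three inputs would be embeddings produced by the event and RGB encoders for the same batch of scenes, so that row i in each array refers to the same scene.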
Taxonomy
Topics: Advanced Memory and Neural Computing · Advanced MRI Techniques and Applications · Atomic and Subatomic Physics Research
Methods: Contrastive Learning
