TL;DR
This paper introduces ESS, a novel method for semantic segmentation using event cameras that transfers knowledge from labeled images to unlabeled events through unsupervised domain adaptation, supported by a new large-scale dataset.
Contribution
The paper presents ESS, a new approach for event-based semantic segmentation that does not require motion hallucination or pixel alignment, and introduces the DSEC-Semantic dataset.
Findings
Using image labels alone, ESS outperforms existing UDA methods.
ESS surpasses state-of-the-art supervised methods when combined with event labels.
The approach enables new research directions in event camera applications.
Abstract
Retrieving accurate semantic information in challenging high dynamic range (HDR) and high-speed conditions remains an open challenge for image-based algorithms due to severe image degradations. Event cameras promise to address these challenges since they feature a much higher dynamic range and are resilient to motion blur. Nonetheless, semantic segmentation with event cameras is still in its infancy, chiefly due to the lack of high-quality, labeled datasets. In this work, we introduce ESS (Event-based Semantic Segmentation), which tackles this problem by directly transferring the semantic segmentation task from existing labeled image datasets to unlabeled events via unsupervised domain adaptation (UDA). Compared to existing UDA methods, our approach aligns recurrent, motion-invariant event embeddings with image embeddings. For this reason, our method neither requires video data…
Taxonomy
Topics: Advanced Memory and Neural Computing · Advanced Neural Network Applications · Machine Learning and ELM
