Event-guided Low-light Video Semantic Segmentation
Zhen Yao, Mooi Choo Chuah

TL;DR
EVSNet is a lightweight, event-guided framework that significantly improves low-light video semantic segmentation by leveraging motion information from event cameras, achieving state-of-the-art performance with high efficiency.
Contribution
The paper introduces EVSNet, a novel, efficient architecture that uses event modality to enhance low-light video segmentation, addressing visibility and temporal consistency challenges.
Findings
Outperforms state-of-the-art methods on large-scale datasets.
Achieves up to 11x higher parameter efficiency.
Effectively leverages event data for illumination-invariant segmentation.
Abstract
Recent video semantic segmentation (VSS) methods have demonstrated promising results in well-lit environments. However, their performance drops significantly in low-light scenarios due to limited visibility and reduced contextual details. In addition, unfavorable low-light conditions make it harder to maintain temporal consistency across video frames and thus lead to video flickering effects. Compared with conventional cameras, event cameras can capture motion dynamics, filter out temporally redundant information, and are robust to lighting conditions. To this end, we propose EVSNet, a lightweight framework that leverages the event modality to guide the learning of a unified illumination-invariant representation. Specifically, we leverage a Motion Extraction Module to extract short-term and long-term temporal motions from the event modality and a Motion Fusion Module to integrate image…
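The abstract's claim that event cameras are robust to lighting conditions can be illustrated with a toy model. The sketch below is not part of EVSNet; it assumes the commonly described idealized event-camera behavior, where a pixel fires an event when its log intensity changes by more than a contrast threshold. The function names, the threshold value, and the tiny frames are all hypothetical and chosen only for illustration. Real sensors are noisier, especially in the dark, so this should be read as intuition rather than a sensor model.

```python
import math


def events_from_frames(prev, curr, threshold=0.2, eps=1e-6):
    """Idealized event generation between two intensity frames.

    Emits (row, col, polarity) wherever the change in log intensity
    reaches the contrast threshold. `threshold` and `eps` are
    illustrative assumptions, not values taken from the paper.
    """
    events = []
    for r, (prev_row, curr_row) in enumerate(zip(prev, curr)):
        for c, (p, q) in enumerate(zip(prev_row, curr_row)):
            delta = math.log(q + eps) - math.log(p + eps)
            if delta >= threshold:
                events.append((r, c, +1))   # pixel got brighter
            elif delta <= -threshold:
                events.append((r, c, -1))   # pixel got darker
    return events


def scale(frame, gain):
    """Simulate a global illumination change by a constant gain."""
    return [[v * gain for v in row] for row in frame]


# A bright vertical bar moves one pixel to the right.
prev_frame = [[10.0, 100.0, 10.0],
              [10.0, 100.0, 10.0]]
curr_frame = [[10.0, 10.0, 100.0],
              [10.0, 10.0, 100.0]]

bright_events = events_from_frames(prev_frame, curr_frame)
# Same scene at 5% of the original brightness.
dark_events = events_from_frames(scale(prev_frame, 0.05),
                                 scale(curr_frame, 0.05))

print(bright_events == dark_events)
```

Because a global gain adds the same constant to every log intensity, it cancels in the difference, so the bright and dark scenes produce the same events under this idealized model. Static pixels produce no events at all, which matches the abstract's point about filtering out temporally redundant information.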
Taxonomy
Topics: Digital Media Forensic Detection · Video Analysis and Summarization · Advanced Vision and Imaging
