LaSe-E2V: Towards Language-guided Semantic-Aware Event-to-Video Reconstruction

Kanghao Chen; Hangyu Li; JiaZhou Zhou; Zeyu Wang; Lin Wang

arXiv:2407.05547 · cs.CV · July 18, 2024 · 1 citation

TL;DR

This paper introduces LaSe-E2V, a novel framework that leverages language guidance and diffusion models to improve semantic consistency and quality in event-to-video reconstruction, addressing artifacts and regional blur.

Contribution

It proposes a language-guided, semantic-aware E2V reconstruction method with event-conditioned attention, new loss functions, and data augmentation strategies for training without paired data.

Findings

01. Outperforms existing methods on multiple challenging datasets.

02. Achieves higher semantic consistency and visual quality.

03. Effective in scenarios with fast motion and low light.

Abstract

Event cameras harness advantages such as low latency, high temporal resolution, and high dynamic range (HDR), compared to standard cameras. Due to the distinct imaging paradigm shift, a dominant line of research focuses on event-to-video (E2V) reconstruction to bridge event-based and standard computer vision. However, this task remains challenging due to its inherently ill-posed nature: event cameras only detect the edge and motion information locally. Consequently, the reconstructed videos are often plagued by artifacts and regional blur, primarily caused by the ambiguous semantics of event data. In this paper, we find language naturally conveys abundant semantic information, rendering it stunningly superior in ensuring semantic consistency for E2V reconstruction. Accordingly, we propose a novel framework, called LaSe-E2V, that can achieve semantic-aware high-quality E2V reconstruction…
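To make the ill-posedness mentioned in the abstract concrete, the following is a minimal sketch of the commonly used idealized event-generation model, in which a pixel fires an event when its log-intensity change exceeds a contrast threshold. It is an illustration only, not code from the paper: the function name `simulate_events`, the threshold value, and the toy frames are all assumptions chosen for the example.

```python
import math

def simulate_events(prev_frame, next_frame, threshold=0.2, eps=1e-3):
    """Idealized event model (assumed for illustration, not the paper's code).

    Returns a list of (row, col, polarity) for every pixel whose
    log-intensity change between the two frames reaches the threshold.
    """
    events = []
    for r, (prev_row, next_row) in enumerate(zip(prev_frame, next_frame)):
        for c, (p, n) in enumerate(zip(prev_row, next_row)):
            delta = math.log(n + eps) - math.log(p + eps)
            if abs(delta) >= threshold:
                events.append((r, c, 1 if delta > 0 else -1))
    return events

# A bright vertical bar moves one pixel to the right on a dark background.
frame_a = [[0.1, 0.9, 0.1, 0.1] for _ in range(3)]
frame_b = [[0.1, 0.1, 0.9, 0.1] for _ in range(3)]
moving_events = simulate_events(frame_a, frame_b)

# Two different static scenes: neither produces any events, so the event
# stream alone cannot tell them apart.
static_dark = [[0.1] * 4 for _ in range(3)]
static_bright = [[0.9] * 4 for _ in range(3)]
dark_events = simulate_events(static_dark, static_dark)
bright_events = simulate_events(static_bright, static_bright)
```

In this toy case events appear only at the two columns where the bar's edges move, while a static dark scene and a static bright scene yield the same empty event stream. That missing absolute-intensity and semantic information is the gap the abstract argues language guidance can help fill.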


Videos

LaSe-E2V: Towards Language-guided Semantic-aware Event-to-Video Reconstruction · SlidesLive

Taxonomy

Topics: Anomaly Detection Techniques and Applications

Methods: Softmax · Attention Is All You Need · Diffusion