OpenESS: Event-based Semantic Scene Understanding with Open Vocabularies

Lingdong Kong, Youquan Liu, Lai Xing Ng, Benoit R. Cottereau, and Wei Tsang Ooi

arXiv:2405.05259 · cs.CV · May 9, 2024 · 1 citation

TL;DR

OpenESS transfers CLIP knowledge from image-text pairs to event streams for event-based semantic segmentation, using frame-to-event contrastive distillation and text-to-event semantic consistency regularization so that training does not depend on dense event annotations.

Contribution

This work is the first to combine image, text, and event data for scalable, annotation-efficient semantic segmentation using CLIP knowledge transfer and novel regularization techniques.

Findings

01 Achieves 53.93% mIoU on DDD17 without labels.

02 Achieves 43.31% mIoU on DSEC-Semantic without labels.

03 Outperforms existing ESS methods on benchmark datasets.
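
The mIoU figures above are mean intersection-over-union scores. As a reminder of what the metric measures, here is a minimal sketch in plain Python. It assumes flattened per-pixel label lists and a hypothetical `ignore_index` of 255 for unlabeled pixels; the function name and signature are illustrative and are not taken from the OpenESS code, and benchmark evaluations typically accumulate counts over a whole dataset rather than a single sample.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over the classes that occur in pred or target.

    pred, target: flat sequences of integer class ids, one per pixel.
    ignore_index: label value skipped in the target (an assumption here).
    """
    inter = [0] * num_classes  # true positives per class
    union = [0] * num_classes  # TP + FP + FN per class
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        if p == t:
            inter[t] += 1
            union[t] += 1
        else:
            union[p] += 1  # false positive for the predicted class
            union[t] += 1  # false negative for the true class
    ious = [inter[c] / union[c] for c in range(num_classes) if union[c] > 0]
    return sum(ious) / len(ious) if ious else float("nan")


# Two classes, four pixels: IoU is 1/2 for class 0 and 2/3 for class 1.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
print(f"mIoU = {100 * score:.2f}%")
```

Classes that appear in neither the prediction nor the target are left out of the average here; other conventions exist, so scores from this sketch are not directly comparable to the reported numbers.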

Abstract

Event-based semantic segmentation (ESS) is a fundamental yet challenging task for event camera sensing. The difficulties in interpreting and annotating event data limit its scalability. While domain adaptation from images to event data can help to mitigate this issue, there exist data representational differences that require additional effort to resolve. In this work, for the first time, we synergize information from image, text, and event-data domains and introduce OpenESS to enable scalable ESS in an open-world, annotation-efficient manner. We achieve this goal by transferring the semantically rich CLIP knowledge from image-text pairs to event streams. To pursue better cross-modality adaptation, we propose a frame-to-event contrastive distillation and a text-to-event semantic consistency regularization. Experimental results on popular ESS benchmarks showed our approach outperforms…
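
The abstract names a frame-to-event contrastive distillation but does not spell out its form here. One common way to realize a contrastive objective between two modalities is an InfoNCE loss, sketched below in plain Python. This is an illustrative assumption, not the authors' implementation: the function names, the temperature of 0.07, and the pairing of `event_feats[i]` with `frame_feats[i]` as positives are all hypothetical choices made for the example.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def info_nce(event_feats, frame_feats, temperature=0.07):
    """Average InfoNCE loss over paired embeddings.

    event_feats[i] and frame_feats[i] are treated as a positive pair;
    every other frame embedding acts as a negative for event i.
    """
    ev = [l2_normalize(v) for v in event_feats]
    fr = [l2_normalize(v) for v in frame_feats]
    total = 0.0
    for i, e in enumerate(ev):
        # Cosine similarity of event i to every frame embedding, scaled.
        logits = [sum(a * b for a, b in zip(e, f)) / temperature for f in fr]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(z - m) for z in logits))
        total += log_denom - logits[i]  # -log softmax of the positive pair
    return total / len(ev)


# Toy check: matched pairs should score a lower loss than swapped pairs.
events = [[1.0, 0.0], [0.0, 1.0]]
matched = info_nce(events, [[1.0, 0.0], [0.0, 1.0]])
swapped = info_nce(events, [[0.0, 1.0], [1.0, 0.0]])
print(matched < swapped)
```

Minimizing a loss of this kind pulls each event embedding toward its paired frame embedding and pushes it away from the others, which is the general idea behind distilling image-side features into an event-side network.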

Code & Models

Repositories

ldkong1205/openess
Official

Taxonomy

Topics

Atomic and Subatomic Physics Research · Advanced Memory and Neural Computing · Radiation Detection and Scintillator Technologies

Methods

Contrastive Language-Image Pre-training