EZSR: Event-based Zero-Shot Recognition

Yan Yang; Liyuan Pan; Dongxu Li; Liu Liu

arXiv:2407.21616 · cs.CV · November 26, 2024

Open Access

TL;DR

This paper introduces EZSR, a novel event-based zero-shot object recognition method that leverages a new event encoder and data synthesis techniques to outperform existing approaches on benchmark datasets.

Contribution

The study develops an event encoder without reconstruction networks, analyzes performance bottlenecks, and proposes a scalar-wise modulation strategy for improved zero-shot recognition.

Findings

01

Achieves 47.84% zero-shot accuracy on N-ImageNet with a ViT-B/16 backbone.

02

Demonstrates superior performance over previous methods on standard benchmarks.

03

Shows effective scaling with increased parameters and synthesized data.

Abstract

This paper studies zero-shot object recognition using event camera data. Guided by CLIP, which is pre-trained on RGB images, existing approaches achieve zero-shot object recognition by optimizing embedding similarities between event data and RGB images respectively encoded by an event encoder and the CLIP image encoder. Alternatively, several methods learn RGB frame reconstructions from event data for the CLIP image encoder. However, they often result in suboptimal zero-shot performance. This study develops an event encoder without relying on additional reconstruction networks. We theoretically analyze the performance bottlenecks of previous approaches: the embedding optimization objectives are prone to suffer from the spatial sparsity of event data, causing semantic misalignments between the learned event embedding space and the CLIP text embedding space. To mitigate the issue, we…
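The embedding-similarity setup described in the abstract can be illustrated with a small sketch. It is not the paper's implementation: the function names (`alignment_loss`, `zero_shot_predict`), the cosine-based loss, and the three-dimensional toy vectors are all assumptions made for illustration. Real event, image, and text embeddings would come from an event encoder and the CLIP encoders.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def alignment_loss(event_emb, image_emb):
    """Hypothetical training objective: pull an event embedding toward the
    CLIP image embedding of the paired RGB frame (0 when perfectly aligned)."""
    return 1.0 - cosine_similarity(event_emb, image_emb)


def zero_shot_predict(event_emb, text_embs):
    """Pick the class whose text embedding is most similar to the event embedding.

    text_embs maps a class name to the embedding of its text prompt.
    """
    scores = {label: cosine_similarity(event_emb, emb)
              for label, emb in text_embs.items()}
    return max(scores, key=scores.get), scores


# Toy stand-ins for encoder outputs (assumed values, not real CLIP embeddings).
text_embs = {"cat": [1.0, 0.0, 0.0], "car": [0.0, 1.0, 0.0]}
event_emb = [0.9, 0.1, 0.0]
image_emb = [1.0, 0.0, 0.0]

label, scores = zero_shot_predict(event_emb, text_embs)
loss = alignment_loss(event_emb, image_emb)
print(label, round(loss, 4))
```

Because the event encoder is trained against image embeddings but evaluated against text embeddings, any drift between the two shows up only at prediction time. This is the kind of semantic misalignment the abstract attributes to the spatial sparsity of event data.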

Taxonomy

Topics: Medical Imaging Techniques and Applications · Radiation Detection and Scintillator Technologies · Nuclear Physics and Applications

Methods: Contrastive Language-Image Pre-training