Learning Bottleneck Transformer for Event Image-Voxel Feature Fusion based Classification

Chengguo Yuan, Yu Jin, Zongzhen Wu, Fanting Wei, Yangzirui Wang, Lan Chen, and Xiao Wang

arXiv:2308.11937·cs.CV·August 24, 2023·2 cites

Open Access · 1 Repo

TL;DR

This paper introduces a dual-stream framework that combines a Transformer and a structured GNN to fuse event images and event voxels, improving event-based object classification accuracy.

Contribution

It proposes a novel dual-stream architecture with a bottleneck Transformer for effective fusion of event images and voxels, enhancing feature representation.
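To make the fusion idea concrete, here is a minimal NumPy sketch of one common form of bottleneck fusion, in which two token streams exchange information only through a small set of shared bottleneck tokens. This is an illustrative assumption, not the paper's implementation: the function names (`attend`, `bottleneck_fusion_step`), the token counts, the averaging of the two bottleneck updates, and the omission of learned projections, multiple heads, and feed-forward layers are all choices made here for brevity.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def attend(queries, context):
    """Single-head scaled dot-product attention without learned projections."""
    scores = queries @ context.T / np.sqrt(queries.shape[-1])
    return softmax(scores) @ context

def bottleneck_fusion_step(image_tokens, voxel_tokens, bottleneck):
    """One fusion step (hypothetical): streams interact only via `bottleneck`."""
    # The bottleneck tokens read from each stream separately, then are averaged.
    b_img = attend(bottleneck, np.concatenate([bottleneck, image_tokens]))
    b_vox = attend(bottleneck, np.concatenate([bottleneck, voxel_tokens]))
    new_bottleneck = 0.5 * (b_img + b_vox)
    # Each stream attends to its own tokens plus the shared bottleneck tokens.
    new_image = attend(image_tokens, np.concatenate([image_tokens, new_bottleneck]))
    new_voxel = attend(voxel_tokens, np.concatenate([voxel_tokens, new_bottleneck]))
    return new_image, new_voxel, new_bottleneck

# Toy inputs: 16 image tokens, 10 voxel tokens, 4 bottleneck tokens, width 8.
rng = np.random.default_rng(0)
image_tokens = rng.normal(size=(16, 8))
voxel_tokens = rng.normal(size=(10, 8))
bottleneck = rng.normal(size=(4, 8))

new_image, new_voxel, new_bottleneck = bottleneck_fusion_step(
    image_tokens, voxel_tokens, bottleneck
)
print(new_image.shape, new_voxel.shape, new_bottleneck.shape)
```

Because the image and voxel tokens never attend to each other directly, the number of bottleneck tokens limits how much cross-stream information is exchanged in each step.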

Findings

01 Achieves state-of-the-art results on event classification datasets

02 Effectively models spatial and 3D stereo information separately

03 Demonstrates superior performance over existing methods

Abstract

Recognizing target objects with event-based cameras has drawn increasing attention in recent years. Existing works usually represent event streams as point clouds, voxels, images, etc., and learn feature representations using various deep neural networks. Their final results may be limited by two factors: monotonous (single-modality) representations and the design of the network structure. To address these challenges, this paper proposes a novel dual-stream framework for event representation, extraction, and fusion. The framework simultaneously models two common representations: event images and event voxels. By utilizing Transformer and Structured Graph Neural Network (GNN) architectures, spatial information and three-dimensional stereo information can be learned separately. Additionally, a bottleneck Transformer is introduced to facilitate the fusion of the dual-stream…
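As a rough illustration of the two representations named in the abstract, the NumPy sketch below converts a raw event stream into an event image and a coarse voxel grid. The event layout `(x, y, t, polarity)`, the function names, and the `bins` and `cell` parameters are assumptions made for this example. The paper's actual preprocessing, including any voxel selection or graph construction for the GNN stream, is not shown here.

```python
import numpy as np

def events_to_image(events, height, width):
    """Accumulate events into a 2-channel count image, one channel per polarity."""
    x = events[:, 0].astype(int)
    y = events[:, 1].astype(int)
    p = (events[:, 3] > 0).astype(int)  # 1 = positive polarity, 0 = negative
    image = np.zeros((2, height, width), dtype=np.float32)
    np.add.at(image, (p, y, x), 1.0)  # unbuffered add handles repeated pixels
    return image

def events_to_voxels(events, height, width, bins, cell):
    """Count events in a coarse (time, y, x) grid with `cell`-sized spatial cells."""
    t = events[:, 2]
    span = max(t.max() - t.min(), 1e-9)  # avoid division by zero
    ti = np.minimum(((t - t.min()) / span * bins).astype(int), bins - 1)
    yi = events[:, 1].astype(int) // cell
    xi = events[:, 0].astype(int) // cell
    grid_h = (height + cell - 1) // cell  # ceiling division
    grid_w = (width + cell - 1) // cell
    grid = np.zeros((bins, grid_h, grid_w), dtype=np.float32)
    np.add.at(grid, (ti, yi, xi), 1.0)
    return grid

# Four toy events as (x, y, t, polarity) on a 4x4 sensor.
events = np.array([
    [0, 0, 0.0, 1],
    [1, 0, 0.1, -1],
    [3, 3, 0.9, 1],
    [3, 3, 1.0, 1],
])
image = events_to_image(events, height=4, width=4)
grid = events_to_voxels(events, height=4, width=4, bins=2, cell=2)
print(image.shape, grid.shape)  # (2, 4, 4) (2, 2, 2)
```

The image keeps full spatial resolution but collapses time, while the voxel grid keeps a coarse time axis; this difference is what motivates processing the two representations in separate streams.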

Code & Models

Repositories

event-ahu/efv_event_classification
PyTorch · Official

Taxonomy

Topics: Advanced X-ray and CT Imaging · Medical Imaging Techniques and Applications · Radiation Detection and Scintillator Technologies

Methods: Multi-Head Attention · Attention Is All You Need · Graph Neural Network · Linear Layer · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Adam · 1x1 Convolution · Layer Normalization · Pointwise Convolution