Event Stream based Human Action Recognition: A High-Definition Benchmark Dataset and Algorithms
Xiao Wang, Shiao Wang, Pengpeng Shao, Bo Jiang, Lin Zhu, Yonghong Tian

TL;DR
This paper introduces CeleX-HAR, a large-scale high-definition event-based human action recognition dataset with 150 categories, and proposes a novel EVMamba model that effectively encodes spatio-temporal information for improved recognition performance.
Contribution
The paper presents a high-resolution event-based HAR dataset and a new EVMamba network with multi-directional spatial and voxel temporal scanning mechanisms.
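As a rough illustration of what "multi-directional spatial scanning" and "voxel temporal scanning" can mean in practice, the sketch below bins an event stream into a voxel grid and then flattens it in several orders. It is not the authors' implementation: the function names, the `(x, y, t, polarity)` event layout, the choice of four scan directions, and the per-pixel temporal scan are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# Hypothetical event layout: (x, y, timestamp, polarity in {-1, +1}).
Event = Tuple[int, int, float, int]


def events_to_voxels(events: Sequence[Event], num_bins: int,
                     height: int, width: int) -> List[List[List[int]]]:
    """Accumulate signed polarities into a [num_bins][height][width] grid."""
    grid = [[[0] * width for _ in range(height)] for _ in range(num_bins)]
    if not events:
        return grid
    t_min = min(e[2] for e in events)
    t_max = max(e[2] for e in events)
    span = (t_max - t_min) or 1.0  # guard against a zero-length time span
    for x, y, t, p in events:
        # Map the timestamp to a bin; the latest event is clamped into the last bin.
        b = min(int((t - t_min) / span * num_bins), num_bins - 1)
        grid[b][y][x] += p
    return grid


def four_direction_scans(frame: List[List[int]]) -> List[List[int]]:
    """Flatten one 2D frame into four 1D orderings (an assumed scan scheme):
    row-major, reversed row-major, column-major, reversed column-major."""
    rows, cols = len(frame), len(frame[0])
    row_major = [v for row in frame for v in row]
    col_major = [frame[r][c] for c in range(cols) for r in range(rows)]
    return [row_major, row_major[::-1], col_major, col_major[::-1]]


def temporal_scan(grid: List[List[List[int]]], y: int, x: int) -> List[int]:
    """Read one spatial location across all time bins, earliest bin first."""
    return [time_bin[y][x] for time_bin in grid]


if __name__ == "__main__":
    demo_events = [(0, 0, 0.0, 1), (1, 0, 0.5, -1), (1, 1, 1.0, 1)]
    voxels = events_to_voxels(demo_events, num_bins=2, height=2, width=2)
    print(four_direction_scans(voxels[1]))
    print(temporal_scan(voxels, y=0, x=1))
```

Each ordering would be fed to a sequence model as a separate 1D token stream. A real model would scan learned patch or voxel embeddings, not the raw counts used here.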
Findings
EVMamba outperforms existing models on multiple datasets.
CeleX-HAR provides diverse real-world scenarios for benchmarking.
Source code and dataset will be publicly released.
Abstract
Human Action Recognition (HAR) stands as a pivotal research domain in both computer vision and artificial intelligence, with RGB cameras dominating as the preferred tool for investigation and innovation in this field. However, in real-world applications, RGB cameras encounter numerous challenges, including light conditions, fast motion, and privacy concerns. Consequently, bio-inspired event cameras have garnered increasing attention due to their advantages of low energy consumption, high dynamic range, etc. Nevertheless, most existing event-based HAR datasets are low resolution. In this paper, we propose a large-scale, high-definition human action recognition dataset based on the CeleX-V event camera, termed CeleX-HAR. It encompasses 150 commonly occurring action categories, comprising a total of 124,625 video sequences. Various factors such as…
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Time Series Analysis and Forecasting · Human Pose and Action Recognition
Methods: Softmax · Attention Is All You Need · Mamba: Linear-Time Sequence Modeling with Selective State Spaces
