A Temporal Densely Connected Recurrent Network for Event-based Human Pose Estimation
Zhanpeng Shao, Wen Zhou, Wuzhen Wang, Jianyu Yang, Youfu Li

TL;DR
This paper introduces a densely connected recurrent network for human pose estimation using event camera data, effectively handling incomplete information by modeling temporal and geometric consistency, and provides a new large-scale dataset for evaluation.
Contribution
The paper proposes a novel densely connected recurrent architecture for event-based human pose estimation and introduces a large-scale multimodal dataset with pose annotations.
Findings
Effective pose estimation on public datasets
Outperforms existing methods in accuracy
Demonstrates robustness to incomplete event data
Abstract
Event cameras are emerging bio-inspired vision sensors that report per-pixel brightness changes asynchronously. They offer notable advantages of high dynamic range, high-speed response, and a low power budget, which enable them to capture local motions well in uncontrolled environments. This motivates us to unlock the potential of event cameras for human pose estimation, a problem that has rarely been explored. Because of the paradigm shift from conventional frame-based cameras, however, event signals in a time interval contain very limited information: event cameras capture only the moving body parts and ignore static ones, so some parts appear incomplete or even vanish within the time interval. This paper proposes a novel densely connected recurrent architecture to address the problem of incomplete information. By this recurrent…
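The abstract's point that events within a time interval cover only the moving parts of a scene can be illustrated with a minimal sketch. It assumes a hypothetical event format of (timestamp, x, y, polarity) tuples and a two-channel count image as the per-interval representation; the function name `accumulate_events` and the sample events are made up for illustration, and the paper's own event representation may differ.

```python
import numpy as np

def accumulate_events(events, height, width, t_start, t_end):
    """Count events per pixel within [t_start, t_end).

    Channel 0 counts positive-polarity events, channel 1 negative ones.
    The (t, x, y, polarity) layout is an assumption made for this sketch.
    """
    frame = np.zeros((2, height, width), dtype=np.int32)
    for t, x, y, p in events:
        if t_start <= t < t_end:
            channel = 0 if p > 0 else 1
            frame[channel, y, x] += 1
    return frame

# Hypothetical events: only pixels that saw a brightness change appear.
events = [
    (0.001, 2, 1, +1),
    (0.002, 3, 1, +1),
    (0.004, 2, 1, -1),
    (0.012, 0, 0, +1),  # falls outside the 10 ms window used below
]

frame = accumulate_events(events, height=4, width=5, t_start=0.0, t_end=0.010)

# Pixels with no motion in the window stay at zero, which is the
# "incomplete information" the abstract describes.
active = (frame.sum(axis=0) > 0).sum()
print(f"{active} of {frame.shape[1] * frame.shape[2]} pixels received events")
```

A static body part would generate no events in the window and so would be absent from `frame`, which is why a model that carries information across intervals, such as a recurrent one, is a natural fit.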
Taxonomy
Topics: Advanced Memory and Neural Computing · CCD and CMOS Imaging Sensors · EEG and Brain-Computer Interfaces
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
