PFM-VEPAR: Prompting Foundation Models for RGB-Event Camera based Pedestrian Attribute Recognition
Minghe Xu, Rouying Wu, ChiaWei Chu, Xiao Wang, Yu Li

TL;DR
This paper introduces PFM-VEPAR, a framework that combines lightweight frequency-domain event features with RGB data using a memory-augmented cross-attention mechanism to improve pedestrian attribute recognition in challenging conditions such as low light and motion blur.
Contribution
It proposes an Event Prompter that replaces the computationally expensive auxiliary event backbone with lightweight DCT/IDCT operations, together with an external memory bank queried through modern Hopfield networks, to enhance RGB-based pedestrian attribute recognition.
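The memory-bank idea can be illustrated with the standard modern Hopfield retrieval rule, in which a query is replaced by a softmax-weighted average of stored patterns. The following is a minimal sketch of that generic rule only, not the paper's implementation: the function names `softmax` and `hopfield_retrieve`, the inverse temperature `beta`, and the toy three-pattern memory are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def hopfield_retrieve(query, memory, beta=1.0):
    """One modern Hopfield update step (hypothetical helper).

    query  : list of floats, the feature vector to complete
    memory : list of stored patterns, each the same length as query
    beta   : inverse temperature; larger values give sharper retrieval
    Returns the retrieved vector and the attention weights over memory.
    """
    # Similarity of the query to every stored pattern, scaled by beta.
    scores = [beta * sum(q * m for q, m in zip(query, pattern))
              for pattern in memory]
    weights = softmax(scores)
    # Retrieved state is the weighted average of the stored patterns.
    retrieved = [sum(w * pattern[d] for w, pattern in zip(weights, memory))
                 for d in range(len(query))]
    return retrieved, weights


# Toy memory of three orthogonal patterns (illustrative data only).
memory = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
noisy_query = [0.9, 0.1, 0.0]
retrieved, weights = hopfield_retrieve(noisy_query, memory, beta=50.0)
```

With a large `beta` the update snaps the noisy query onto its nearest stored pattern; with a small `beta` it blends several patterns, which is closer to how a memory bank would supply soft contextual priors.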
Findings
Significant performance improvements on benchmark datasets
Reduced computational cost compared to existing methods
Effective fusion of event and RGB data for attribute recognition
Abstract
Event-based pedestrian attribute recognition (PAR) leverages motion cues to enhance RGB cameras in low-light and motion-blur scenarios, enabling more accurate inference of attributes like age and emotion. However, existing two-stream multimodal fusion methods introduce significant computational overhead and neglect the valuable guidance from contextual samples. To address these limitations, this paper proposes an Event Prompter. Discarding the computationally expensive auxiliary backbone, this module directly applies extremely lightweight and efficient Discrete Cosine Transform (DCT) and Inverse DCT (IDCT) operations to the event data. This design extracts frequency-domain event features at a minimal computational cost, thereby effectively augmenting the RGB branch. Furthermore, an external memory bank designed to provide rich prior knowledge, combined with modern Hopfield networks,…
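To make the DCT/IDCT idea concrete, the sketch below transforms a small event frame into the frequency domain, keeps only a low-frequency block of coefficients, and transforms back. It is an illustration under stated assumptions rather than the authors' Event Prompter: the abstract does not say which coefficients are kept or how the result is injected into the RGB branch, so the low-pass mask, the `keep` parameter, and all function names here are hypothetical.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n list of rows."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def dct2(frame):
    """2D DCT: C_h @ frame @ C_w^T."""
    ch, cw = dct_matrix(len(frame)), dct_matrix(len(frame[0]))
    return matmul(matmul(ch, frame), transpose(cw))


def idct2(coeffs):
    """Inverse 2D DCT: C_h^T @ coeffs @ C_w (the basis is orthogonal)."""
    ch, cw = dct_matrix(len(coeffs)), dct_matrix(len(coeffs[0]))
    return matmul(matmul(transpose(ch), coeffs), cw)


def low_pass_event_prompt(frame, keep):
    """Keep the top-left keep x keep DCT block, zero the rest, invert.

    The low-pass choice is an assumption made for illustration only.
    """
    coeffs = dct2(frame)
    kept = [[c if (u < keep and v < keep) else 0.0
             for v, c in enumerate(row)]
            for u, row in enumerate(coeffs)]
    return idct2(kept)


# Toy 4x4 event-count frame with a short vertical edge (illustrative data).
frame = [[0.0, 1.0, 0.0, 0.0],
         [0.0, 1.0, 0.0, 0.0],
         [0.0, 1.0, 0.0, 0.0],
         [0.0, 0.0, 0.0, 0.0]]
prompt = low_pass_event_prompt(frame, keep=2)
```

Because the transform is a pair of fixed matrix products with no learned weights, its cost is small compared with running a second backbone on the event stream, which is the efficiency argument the abstract makes.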
Taxonomy
Topics: Video Surveillance and Tracking Methods · Human Pose and Action Recognition · Advanced Neural Network Applications
