UniPAR: A Unified Framework for Pedestrian Attribute Recognition
Minghe Xu, Rouying Wu, Jiarui Xu, Minhao Sun, Zikang Yan, Xiao Wang, ChiaWei Chu, Yu Li

TL;DR
UniPAR introduces a unified Transformer-based framework for pedestrian attribute recognition that handles multiple datasets and modalities, improving cross-domain robustness and performance in challenging environments.
Contribution
The paper presents UniPAR, a novel unified model with a phased fusion encoder and dynamic classification head for diverse pedestrian attribute recognition tasks.
Findings
Achieves comparable performance to state-of-the-art methods on benchmark datasets.
Enhances cross-domain generalization through multi-dataset joint training.
Improves robustness in low-light and motion-blur conditions.
Abstract
Pedestrian Attribute Recognition (PAR) is a foundational computer vision task that provides essential support for downstream applications, including person retrieval in video surveillance and intelligent retail analytics. However, existing research is frequently constrained by the "one-model-per-dataset" paradigm and struggles to handle significant discrepancies across domains in terms of modalities, attribute definitions, and environmental scenarios. To address these challenges, we propose UniPAR, a unified Transformer-based framework for PAR. By incorporating a unified data scheduling strategy and a dynamic classification head, UniPAR enables a single model to simultaneously process diverse datasets from heterogeneous modalities, including RGB images, video sequences, and event streams. We also introduce an innovative phased fusion encoder that explicitly aligns visual features with…
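The abstract names a unified data scheduling strategy and a dynamic classification head but, in this excerpt, does not say how either works. The sketch below is one plausible reading and not the authors' implementation: batches from several datasets are interleaved round-robin, and the dataset name selects a per-dataset output layer on top of a shared feature. The dataset names, attribute counts, sample counts, feature size, and the `schedule_batches` and `DynamicHead` helpers are all hypothetical, and the phased fusion encoder is replaced by a constant stand-in feature.

```python
import itertools
import random

# Hypothetical registry: names, modalities, and sizes are illustrative assumptions.
DATASETS = {
    "rgb_set": {"modality": "rgb", "num_attrs": 4, "num_samples": 5},
    "video_set": {"modality": "video", "num_attrs": 3, "num_samples": 3},
    "event_set": {"modality": "event", "num_attrs": 2, "num_samples": 4},
}
FEATURE_DIM = 8  # assumed size of the shared encoder output


def schedule_batches(datasets, batch_size):
    """Yield (dataset_name, sample_indices) round-robin across datasets."""
    per_dataset = {}
    for name, info in datasets.items():
        idx = list(range(info["num_samples"]))
        per_dataset[name] = [
            idx[i:i + batch_size] for i in range(0, len(idx), batch_size)
        ]
    # zip_longest pads exhausted datasets with None, which we skip.
    for group in itertools.zip_longest(*per_dataset.values()):
        for name, batch in zip(per_dataset.keys(), group):
            if batch is not None:
                yield name, batch


class DynamicHead:
    """One linear output layer per dataset, chosen at call time by name."""

    def __init__(self, datasets, feature_dim, seed=0):
        rng = random.Random(seed)
        self.weights = {
            name: [
                [rng.uniform(-0.1, 0.1) for _ in range(feature_dim)]
                for _ in range(info["num_attrs"])
            ]
            for name, info in datasets.items()
        }

    def __call__(self, dataset_name, feature):
        # Each row of weights produces one attribute logit.
        rows = self.weights[dataset_name]
        return [sum(w * x for w, x in zip(row, feature)) for row in rows]


head = DynamicHead(DATASETS, FEATURE_DIM)
order = []
for name, batch in schedule_batches(DATASETS, batch_size=2):
    feature = [1.0] * FEATURE_DIM  # stand-in for the shared encoder output
    logits = head(name, feature)
    order.append((name, len(batch), len(logits)))
```

The point of the design is that the encoder parameters are shared across all datasets, while only the small output layer depends on which attribute vocabulary the current batch uses, so datasets with different attribute definitions can be trained jointly.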
Taxonomy
Topics: Video Surveillance and Tracking Methods · Advanced Neural Network Applications · Human Pose and Action Recognition
