Dynamic Patch-aware Enrichment Transformer for Occluded Person Re-Identification

Xin Zhang, Keren Fu, and Qijun Zhao

arXiv:2402.10435 · cs.CV · February 19, 2024 · 3 cites

Open Access

TL;DR

This paper introduces DPEFormer, a novel transformer-based model that automatically distinguishes occlusion-free human body features for person re-identification, improving robustness without external detectors or precise alignment.

Contribution

The paper proposes a dynamic patch token selection module, a feature blending module, and a realistic occlusion augmentation strategy, advancing occluded person re-ID methods.

Findings

01  Significantly outperforms existing state-of-the-art methods on occluded re-ID benchmarks.

02  Effectively distinguishes human body features from occlusions without external detectors.

03  Enhances feature representation through novel modules and data augmentation strategies.

Abstract

Person re-identification (re-ID) continues to pose a significant challenge, particularly in scenarios involving occlusions. Prior approaches aimed at tackling occlusions have predominantly focused on aligning physical body features through the utilization of external semantic cues. However, these methods tend to be intricate and susceptible to noise. To address the aforementioned challenges, we present an innovative end-to-end solution known as the Dynamic Patch-aware Enrichment Transformer (DPEFormer). This model effectively distinguishes human body information from occlusions automatically and dynamically, eliminating the need for external detectors or precise image alignment. Specifically, we introduce a dynamic patch token selection module (DPSM). DPSM utilizes a label-guided proxy token as an intermediary to identify informative occlusion-free tokens. These tokens are then selected…
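
The token selection idea described in the abstract can be illustrated with a minimal sketch. The abstract is truncated and does not state how tokens are scored or how many are kept, so everything concrete here is an assumption made for illustration: cosine similarity as the score, a fixed `keep` count, and the hypothetical names `cosine` and `select_tokens`. It is not the authors' DPSM, and the proxy token here is just a given vector rather than one learned under label guidance.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_tokens(patch_tokens, proxy_token, keep):
    """Keep the `keep` patch tokens most similar to the proxy token.

    Assumption: similarity to the proxy stands in for "informative and
    occlusion-free"; the paper's actual criterion is not given in the abstract.
    Returns the kept indices (in original patch order) and the kept tokens.
    """
    scores = [cosine(token, proxy_token) for token in patch_tokens]
    ranked = sorted(range(len(patch_tokens)), key=lambda i: scores[i], reverse=True)
    kept = sorted(ranked[:keep])
    return kept, [patch_tokens[i] for i in kept]


# Toy example: four 2-D "patch tokens" and a proxy pointing along the first axis.
patches = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
proxy = [1.0, 0.0]
kept_idx, kept_tokens = select_tokens(patches, proxy, keep=2)
print(kept_idx)  # [0, 1]
```

In this toy input the two patches aligned with the proxy are kept and the orthogonal and opposite ones are dropped, which mirrors the stated goal of discarding occluded regions without an external detector.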

Taxonomy

Topics: Video Surveillance and Tracking Methods · Gait Recognition and Analysis · Face Recognition and Analysis

Methods: Linear Layer · Absolute Position Encodings · Byte Pair Encoding · Position-Wise Feed-Forward Layer · Dense Connections · Label Smoothing · Adam · Softmax · Attention Is All You Need · Contrastive Learning