Semantically Self-Aligned Network for Text-to-Image Part-aware Person Re-identification

Zefeng Ding; Changxing Ding; Zhiyin Shao; Dacheng Tao

arXiv:2107.12666 · cs.CV · August 10, 2021 · 79 cites

TL;DR

This paper introduces SSAN, a network for text-to-image person re-identification that aligns semantic features across modalities, captures relationships between body parts, and reduces intra-class variance. It outperforms existing methods on benchmark datasets.

Contribution

The paper presents a semantically self-aligned network with part-level feature extraction, a multi-view non-local module, and a compound ranking loss, advancing text-to-image ReID methods.

Findings

1. SSAN outperforms existing methods on benchmark datasets.
2. The new ICFG-PEDES database facilitates future research.
3. The proposed components effectively reduce intra-class variance.

Abstract

Text-to-image person re-identification (ReID) aims to search for images containing a person of interest using textual descriptions. However, due to the significant modality gap and the large intra-class variance in textual descriptions, text-to-image ReID remains a challenging problem. Accordingly, in this paper, we propose a Semantically Self-Aligned Network (SSAN) to handle the above problems. First, we propose a novel method that automatically extracts semantically aligned part-level features from the two modalities. Second, we design a multi-view non-local network that captures the relationships between body parts, thereby establishing better correspondences between body parts and noun phrases. Third, we introduce a Compound Ranking (CR) loss that makes use of textual descriptions for other images of the same identity to provide extra supervision, thereby effectively reducing the…
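The idea behind the Compound Ranking loss mentioned in the abstract can be pictured with a minimal sketch. The code below is an illustration under assumptions, not the authors' implementation: the hinge form, the function names (`hinge`, `compound_ranking_loss`), the margin values, and the `weak_weight` parameter are all hypothetical. It only shows the general pattern of adding a second ranking term whose positive is a description written for a different image of the same identity.

```python
# Illustrative sketch only. Names, margins, and weighting are assumptions,
# not values or definitions taken from the paper.

def hinge(margin: float, pos_sim: float, neg_sim: float) -> float:
    """Standard hinge ranking term: penalize when the positive pair is not
    more similar than the negative pair by at least `margin`."""
    return max(0.0, margin - pos_sim + neg_sim)


def compound_ranking_loss(
    sim_pos: float,        # similarity(image, its own description)
    sim_same_id: float,    # similarity(image, description of another image, same identity)
    sim_neg: float,        # similarity(image, description of a different identity)
    margin_strong: float = 0.2,  # assumed margin for the exactly matched pair
    margin_weak: float = 0.1,    # assumed (smaller) margin for the same-identity pair
    weak_weight: float = 1.0,    # assumed weight on the extra supervision term
) -> float:
    """Ranking loss with an extra term that uses a same-identity description
    as an additional, weaker positive."""
    strong_term = hinge(margin_strong, sim_pos, sim_neg)
    weak_term = hinge(margin_weak, sim_same_id, sim_neg)
    return strong_term + weak_weight * weak_term


if __name__ == "__main__":
    # Well-separated case: both positives beat the negative by more than the margins.
    print(compound_ranking_loss(sim_pos=0.9, sim_same_id=0.5, sim_neg=0.3))
    # Harder case: the negative is too close, so both terms contribute.
    print(compound_ranking_loss(sim_pos=0.5, sim_same_id=0.3, sim_neg=0.4))
```

In this sketch the same-identity description is given a smaller margin than the exactly matched one, on the assumption that it describes the image less precisely. That design choice is an assumption made here for illustration.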

Code & Models

Repositories

zifyloo/SSAN (PyTorch, official)

Taxonomy

Topics: Video Surveillance and Tracking Methods · Face recognition and analysis · Human Pose and Action Recognition