Learning to Cluster Faces via Transformer

Jinxing Ye; Xioajiang Peng; Baigui Sun; Kai Wang; Xiuyu Sun; Hao Li; Hanqing Wu

arXiv:2104.11502·cs.CV·April 26, 2021·6 cites

Open Access

TL;DR

This paper introduces a Face Transformer model that improves face clustering accuracy by leveraging local context and relation encoding, achieving state-of-the-art results on benchmark datasets.

Contribution

The paper proposes a novel Face Transformer architecture that decomposes face clustering into relation encoding and linkage prediction, enhancing robustness and accuracy.

Findings

01. Achieves 91.12% pairwise F-score on MS-Celeb-1M

02. Outperforms existing methods on face clustering benchmarks

03. Demonstrates robustness to pose, occlusion, and image quality variations
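
The pairwise F-score named in the first finding is a standard clustering metric: every pair of samples counts as a positive if both fall in the same cluster, and precision and recall are computed over those pairs. The sketch below shows the usual definition on toy labels. The function name `pairwise_f_score` and the example labels are illustrative assumptions, not taken from the paper, and the quadratic loop is only practical for small inputs.

```python
from itertools import combinations


def pairwise_f_score(true_labels, pred_labels):
    """Return (precision, recall, F-score) computed over all sample pairs."""
    if len(true_labels) != len(pred_labels):
        raise ValueError("label lists must have the same length")

    tp = 0          # pairs grouped together in both ground truth and prediction
    pred_pairs = 0  # pairs grouped together by the prediction
    true_pairs = 0  # pairs grouped together by the ground truth

    for i, j in combinations(range(len(true_labels)), 2):
        same_true = true_labels[i] == true_labels[j]
        same_pred = pred_labels[i] == pred_labels[j]
        true_pairs += same_true
        pred_pairs += same_pred
        tp += same_true and same_pred

    precision = tp / pred_pairs if pred_pairs else 0.0
    recall = tp / true_pairs if true_pairs else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Toy example: one face of identity 0 is wrongly placed with identity 1.
print(pairwise_f_score([0, 0, 0, 1, 1], [0, 0, 1, 1, 1]))  # (0.5, 0.5, 0.5)
```

In the toy example, 2 of the 4 predicted same-cluster pairs are correct and 2 of the 4 true same-identity pairs are recovered, so precision, recall, and F-score are all 0.5.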

Abstract

Face clustering is a useful tool for applications like automatic face annotation and retrieval. The main challenge is that it is difficult to cluster images from the same identity with different face poses, occlusions, and image quality. Traditional clustering methods usually ignore the relationship between individual images and their neighbors which may contain useful context information. In this paper, we repurpose the well-known Transformer and introduce a Face Transformer for supervised face clustering. In Face Transformer, we decompose the face clustering into two steps: relation encoding and linkage predicting. Specifically, given a face image, a relation encoder module aggregates local context information from its neighbors and a linkage predictor module judges whether a pair of images belong to the same cluster or not. In the local linkage graph view, Face…
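
The two-step decomposition described in the abstract can be illustrated with a deliberately simplified sketch. It is not the paper's model: the learned Transformer is replaced by a single attention step with identity projections, the linkage predictor by a cosine-similarity threshold, and the final merging of linked pairs into clusters by union-find, which is an assumption added here for completeness. All function names and the parameters `k`, `temperature`, and `threshold` are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v] if norm else list(v)


def knn(features, i, k):
    """Indices of the k nearest neighbours of face i (cosine, unit-norm inputs)."""
    others = [j for j in range(len(features)) if j != i]
    others.sort(key=lambda j: dot(features[i], features[j]), reverse=True)
    return others[:k]


def encode_relation(features, i, k, temperature=0.1):
    """Toy relation encoder: one attention step over face i and its neighbours.

    Assumption: identity query/key/value projections stand in for learned weights.
    """
    context = [i] + knn(features, i, k)
    scores = [dot(features[i], features[j]) / temperature for j in context]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]  # numerically stable softmax
    total = sum(weights)
    dim = len(features[i])
    mixed = [
        sum(w / total * features[j][d] for w, j in zip(weights, context))
        for d in range(dim)
    ]
    return normalize(mixed)


def predict_linkage(encoded, i, j, threshold=0.8):
    """Toy linkage predictor: link two faces if their encodings are similar."""
    return dot(encoded[i], encoded[j]) >= threshold


def cluster_faces(raw_features, k=2, threshold=0.8):
    features = [normalize(f) for f in raw_features]
    encoded = [encode_relation(features, i, k) for i in range(len(features))]

    parent = list(range(len(features)))  # union-find over predicted links

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(len(features)):
        for j in knn(features, i, k):
            if predict_linkage(encoded, i, j, threshold):
                parent[find(i)] = find(j)

    relabel = {}
    return [relabel.setdefault(find(i), len(relabel)) for i in range(len(features))]


# Two well-separated toy "identities" in a 2-D feature space.
faces = [[1.0, 0.05], [0.95, 0.1], [1.0, 0.0], [0.05, 1.0], [0.0, 1.0], [0.1, 0.9]]
print(cluster_faces(faces))  # [0, 0, 0, 1, 1, 1]
```

The point of the sketch is the separation of concerns: features are first refined using their neighbours, and only then are pairs judged. In a learned model both stages would be trained on labelled identities rather than fixed by hand.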

Taxonomy

Topics: Face recognition and analysis · Face and Expression Recognition · Biometric Identification and Security

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Softmax · Layer Normalization · Residual Connection · Byte Pair Encoding · Adam