# Face Clustering: Representation and Pairwise Constraints

**Authors:** Yichun Shi, Charles Otto, Anil K. Jain

arXiv: 1706.05067 · 2018-07-30

## TL;DR

This paper pairs a ResNet-based face representation with Conditional Pairwise Clustering (ConPaC), a clustering algorithm formulated as a Conditional Random Field (CRF). ConPaC outperforms k-means, spectral clustering, and approximate rank-order on LFW and IJB-B, and it can incorporate pairwise constraints for semi-supervised clustering.

## Contribution

The paper presents ConPaC, a clustering algorithm that formulates clustering as a CRF and uses Loopy Belief Propagation for approximate inference, together with a ResNet-based face representation and a k-NN variant intended for large datasets.

## Key findings

- ConPaC outperforms k-means, spectral clustering, and approximate rank-order on the LFW and IJB-B datasets.
- The method naturally incorporates pairwise constraints, yielding a semi-supervised version with improved clustering performance.
- The k-NN variant of ConPaC has linear time complexity given a k-NN graph, which makes it suitable for large datasets.
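
To make the k-NN graph in the last finding concrete, the following is a minimal sketch of building one from feature vectors with NumPy. It is an illustration under assumptions, not the paper's code: the helper name `knn_graph`, the use of cosine similarity, and the random toy features are all hypothetical. The brute-force construction shown here is itself quadratic; the linear-time claim applies only once the graph is given.

```python
import numpy as np


def knn_graph(features, k):
    """Return an (n, k) array of each row's k most similar other rows.

    Hypothetical helper: brute-force cosine similarity, for illustration only.
    """
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)  # L2-normalize each row
    similarity = unit @ unit.T                     # cosine similarity matrix
    np.fill_diagonal(similarity, -np.inf)          # exclude self-matches
    # Sort each row by descending similarity and keep the top k indices.
    return np.argsort(-similarity, axis=1)[:, :k]


# Toy data (assumption): 100 random 16-dimensional "face features".
rng = np.random.default_rng(0)
features = rng.normal(size=(100, 16))
neighbors = knn_graph(features, k=5)

n = features.shape[0]
num_candidate_pairs = neighbors.size   # n * k pairs considered with the graph
num_all_pairs = n * (n - 1) // 2       # pairs considered without it
```

With the graph in hand, an algorithm only needs to reason about `n * k` candidate pairs instead of all `n * (n - 1) / 2` pairs, which is where the savings on large datasets come from.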

## Abstract

Clustering face images according to their identity has two important applications: (i) grouping a collection of face images when no external labels are associated with images, and (ii) indexing for efficient large-scale face retrieval. The clustering problem is composed of two key parts: face representation and choice of similarity for grouping faces. We first propose a representation based on ResNet, which has been shown to perform very well in image classification problems. Given this representation, we design a clustering algorithm, Conditional Pairwise Clustering (ConPaC), which directly estimates the adjacency matrix based only on the similarity between face images. This allows a dynamic selection of the number of clusters and retains pairwise similarity between faces. ConPaC formulates the clustering problem as a Conditional Random Field (CRF) model and uses Loopy Belief Propagation to find an approximate solution for maximizing the posterior probability of the adjacency matrix. Experimental results on two benchmark face datasets (LFW and IJB-B) show that ConPaC outperforms well-known clustering algorithms such as k-means, spectral clustering and approximate rank-order. Additionally, our algorithm can naturally incorporate pairwise constraints to obtain a semi-supervised version that leads to improved clustering performance. We also propose a k-NN variant of ConPaC, which has a linear time complexity given a k-NN graph, suitable for large datasets.
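
The idea that an adjacency matrix lets the number of clusters emerge from the data, rather than being fixed in advance, can be illustrated with a small sketch. Note the assumptions: the adjacency matrix below comes from a plain similarity threshold, which is a stand-in for ConPaC's CRF inference with Loopy Belief Propagation and not the paper's method. The function names, the threshold of 0.8, and the toy features are hypothetical, and reading clusters off as connected components is one simple choice among several.

```python
import numpy as np


def cosine_similarity_matrix(features):
    """Pairwise cosine similarity between rows of `features`."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)
    return unit @ unit.T


def clusters_from_adjacency(adjacency):
    """Label the connected components of a symmetric boolean adjacency matrix."""
    n = adjacency.shape[0]
    labels = np.full(n, -1, dtype=int)
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue  # already assigned to an earlier component
        labels[start] = current
        stack = [start]
        while stack:  # depth-first traversal of this component
            node = stack.pop()
            for neighbor in np.flatnonzero(adjacency[node]):
                if labels[neighbor] == -1:
                    labels[neighbor] = current
                    stack.append(neighbor)
        current += 1
    return labels


# Toy 2-D "face features" (assumption): two tight pairs and one outlier.
features = np.array([
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [0.1, 0.9],
    [-1.0, 0.0],
])

similarity = cosine_similarity_matrix(features)
# Stand-in for the inferred adjacency matrix: a simple threshold, NOT the CRF.
adjacency = similarity > 0.8
labels = clusters_from_adjacency(adjacency)
num_clusters = len(set(labels.tolist()))
```

No cluster count is passed in anywhere: `num_clusters` is determined entirely by which pairs the adjacency matrix links, which is the property the abstract attributes to estimating the adjacency matrix directly.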

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1706.05067/full.md

## Figures

53 figures with captions in the complete paper: https://tomesphere.com/paper/1706.05067/full.md

## References

66 references — full list in the complete paper: https://tomesphere.com/paper/1706.05067/full.md

---
Source: https://tomesphere.com/paper/1706.05067