Deep Clustering with Incomplete Noisy Pairwise Annotations: A Geometric Regularization Approach

Tri Nguyen; Shahana Ibrahim; Xiao Fu

arXiv:2305.19391 · cs.LG · June 1, 2023 · 2 cites

TL;DR

This paper analyzes the theoretical properties of deep constrained clustering with noisy pairwise annotations and introduces a geometric regularization method that guarantees data membership identification despite annotation noise.

Contribution

It provides a theoretical understanding of logistic loss in deep constrained clustering and proposes a new geometric regularization loss to handle noisy annotations effectively.

Findings

01 Logistic DCC loss ensures data membership identifiability under certain conditions.

02 The new geometric regularization loss is robust to unknown annotation confusions.

03 Experimental results validate the effectiveness of the proposed method on multiple datasets.

Abstract

The recent integration of deep learning and pairwise similarity annotation-based constrained clustering -- i.e., deep constrained clustering (DCC) -- has proven effective for incorporating weak supervision into massive data clustering: Less than 1% of pair similarity annotations can often substantially enhance the clustering accuracy. However, beyond empirical successes, there is a lack of understanding of DCC. In addition, many DCC paradigms are sensitive to annotation noise, but performance-guaranteed noisy DCC methods have been largely elusive. This work first takes a deep look into a recently emerged logistic loss function of DCC, and characterizes its theoretical properties. Our result shows that the logistic DCC loss ensures the identifiability of data membership under reasonable conditions, which may shed light on its effectiveness in practice. Building upon this…
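
As a rough illustration of the kind of logistic pairwise loss the abstract refers to, the sketch below scores each annotated pair by the inner product of two soft membership vectors and applies a binary cross-entropy against the similarity annotation. This is a minimal sketch under stated assumptions, not the authors' implementation: the inner-product pair model, the function names (`softmax`, `logistic_pairwise_loss`), and the toy data are all illustrative choices that do not come from this page, and the geometric regularization term is not shown.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax, so each row is a membership vector on the simplex."""
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def logistic_pairwise_loss(memberships, pairs, labels, eps=1e-12):
    """Binary cross-entropy over annotated pairs (hypothetical helper).

    Assumption: the probability that items i and j share a cluster is
    modeled as the inner product of their membership vectors.
    """
    i, j = pairs[:, 0], pairs[:, 1]
    p = np.sum(memberships[i] * memberships[j], axis=1)
    p = np.clip(p, eps, 1.0 - eps)  # keep log() finite
    return float(-np.mean(labels * np.log(p) + (1.0 - labels) * np.log(1.0 - p)))

# Toy data: 6 items, 3 clusters, 3 annotated pairs (1 = similar, 0 = dissimilar).
rng = np.random.default_rng(0)
M = softmax(rng.normal(size=(6, 3)))
pairs = np.array([[0, 1], [2, 3], [4, 5]])
labels = np.array([1.0, 0.0, 1.0])
loss = logistic_pairwise_loss(M, pairs, labels)

# Hard memberships that agree with every annotation drive the loss to ~0.
perfect = np.array([[1, 0, 0], [1, 0, 0], [0, 1, 0],
                    [0, 0, 1], [0, 1, 0], [0, 1, 0]], dtype=float)
perfect_loss = logistic_pairwise_loss(perfect, pairs, labels)
```

In a deep clustering setup the membership rows would come from a network's softmax output rather than random logits, and only the small annotated subset of pairs would enter the sum.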

Code & Models

Repositories

ductri/volmaxdcc
PyTorch · Official

Videos

Deep Clustering with Incomplete Noisy Pairwise Annotations: A Geometric Regularization Approach · SlidesLive

Taxonomy

Topics: Remote-Sensing Image Classification · Face and Expression Recognition · Video Surveillance and Tracking Methods