Deep Clustering with Incomplete Noisy Pairwise Annotations: A Geometric Regularization Approach
Tri Nguyen, Shahana Ibrahim, Xiao Fu

TL;DR
This paper analyzes the theoretical properties of deep constrained clustering with noisy pairwise annotations and introduces a geometric regularization method that guarantees data membership identification despite annotation noise.
Contribution
It provides a theoretical understanding of logistic loss in deep constrained clustering and proposes a new geometric regularization loss to handle noisy annotations effectively.
Findings
Logistic DCC loss ensures data membership identifiability under certain conditions.
The new geometric regularization loss is robust to unknown annotation confusions.
Experimental results validate the effectiveness of the proposed method on multiple datasets.
Abstract
The recent integration of deep learning and pairwise similarity annotation-based constrained clustering, i.e., deep constrained clustering (DCC), has proven effective for incorporating weak supervision into massive data clustering: less than 1% of pair similarity annotations can often substantially enhance the clustering accuracy. However, beyond empirical successes, there is a lack of understanding of DCC. In addition, many DCC paradigms are sensitive to annotation noise, but performance-guaranteed noisy DCC methods have been largely elusive. This work first takes a deep look into a recently emerged logistic loss function of DCC and characterizes its theoretical properties. Our result shows that the logistic DCC loss ensures the identifiability of data membership under reasonable conditions, which may shed light on its effectiveness in practice. Building upon this…
Taxonomy
Topics: Remote-Sensing Image Classification · Face and Expression Recognition · Video Surveillance and Tracking Methods
