Semi-Supervised Information-Maximization Clustering
Daniele Calandriello, Gang Niu, Masashi Sugiyama

TL;DR
This paper introduces a semi-supervised clustering algorithm based on information maximization that incorporates must-link and cannot-link constraints, obtains the clustering solution analytically via eigendecomposition, and allows systematic tuning of parameters such as the kernel width.
Contribution
It extends an existing unsupervised information-maximization clustering method, based on squared-loss mutual information, to the semi-supervised setting, while keeping an analytic eigendecomposition-based solution and systematic optimization of tuning parameters.
Findings
The method is computationally efficient because the clustering solution is obtained analytically via eigendecomposition.
It effectively incorporates must-link and cannot-link constraints.
Experiments demonstrate its practical usefulness.
Abstract
Semi-supervised clustering aims to introduce prior knowledge in the decision process of a clustering algorithm. In this paper, we propose a novel semi-supervised clustering algorithm based on the information-maximization principle. The proposed method is an extension of a previous unsupervised information-maximization clustering algorithm based on squared-loss mutual information to effectively incorporate must-links and cannot-links. The proposed method is computationally efficient because the clustering solution can be obtained analytically via eigendecomposition. Furthermore, the proposed method allows systematic optimization of tuning parameters such as the kernel width, given the degree of belief in the must-links and cannot-links. The usefulness of the proposed method is demonstrated through experiments.
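To make the idea in the abstract concrete, here is a minimal Python sketch of clustering by eigendecomposition of a kernel matrix that has been adjusted with must-links and cannot-links. It is not the authors' algorithm: the additive adjustment of kernel entries, the function names `gaussian_kernel` and `cluster_with_links`, and the parameters `sigma` (kernel width) and `gamma` (a stand-in for the degree of belief in the links) are all assumptions made for illustration, and the paper's actual objective and parameter selection are not reproduced.

```python
import numpy as np


def gaussian_kernel(X, sigma):
    """Gaussian kernel matrix with width sigma."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))


def cluster_with_links(X, n_clusters, must_links=(), cannot_links=(),
                       sigma=1.0, gamma=0.2):
    """Illustrative sketch only (hypothetical helper, not the paper's method).

    Assumption: link information is injected by raising kernel entries for
    must-links and lowering them for cannot-links by gamma.
    """
    K = gaussian_kernel(X, sigma)
    for i, j in must_links:
        K[i, j] += gamma
        K[j, i] += gamma
    for i, j in cannot_links:
        K[i, j] -= gamma
        K[j, i] -= gamma

    # eigh returns eigenvalues in ascending order, so the leading
    # eigenvectors are the last n_clusters columns.
    _, eigvecs = np.linalg.eigh(K)
    top = eigvecs[:, -n_clusters:]

    scores = np.empty_like(top)
    for k in range(n_clusters):
        v = top[:, k]
        if v.sum() < 0:          # resolve the sign ambiguity of eigenvectors
            v = -v
        v = np.maximum(v, 0.0)   # keep only the non-negative part
        scores[:, k] = v / max(v.sum(), 1e-12)

    # Each point goes to the cluster whose normalized score is largest.
    return np.argmax(scores, axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0.0, 0.1, (6, 2)),
                   rng.normal(10.0, 0.1, (4, 2))])
    print(cluster_with_links(X, 2, must_links=[(0, 1)], cannot_links=[(0, 6)]))
```

The point of the sketch is only that, once the links are folded into a symmetric matrix, the cluster assignment follows from a single call to an eigensolver, with no iterative optimization.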
Taxonomy
Topics: Advanced Clustering Algorithms Research · Face and Expression Recognition · Text and Document Classification Technologies
