Constraint-Based Clustering Selection

Toon Van Craenendonck; Hendrik Blockeel

arXiv:1609.07272·stat.ML·September 26, 2016

Open Access

TL;DR

This paper introduces a semi-supervised clustering selection method: user-provided pairwise constraints are used to choose among clusterings produced by different unsupervised algorithms and parameter settings. The authors report that this often outperforms existing semi-supervised clustering methods.

Contribution

Rather than adapting the clustering procedure or similarity metric of a single algorithm, the paper uses constraints to select among clusterings generated by different unsupervised algorithms.

Findings

01. The method often outperforms existing semi-supervised clustering techniques.

02. Using constraints for algorithm selection improves clustering quality.

03. Empirical results demonstrate the effectiveness of the proposed approach.

Abstract

Semi-supervised clustering methods incorporate a limited amount of supervision into the clustering process. Typically, this supervision is provided by the user in the form of pairwise constraints. Existing methods use such constraints in one of the following ways: they adapt their clustering procedure, their similarity metric, or both. All of these approaches operate within the scope of individual clustering algorithms. In contrast, we propose to use constraints to choose between clusterings generated by very different unsupervised clustering algorithms, run with different parameter settings. We empirically show that this simple approach often outperforms existing semi-supervised clustering methods.
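
The selection idea described in the abstract can be illustrated with a minimal sketch. It assumes the candidate clusterings have already been computed and that a candidate is scored by the number of pairwise constraints it satisfies. That scoring rule is an assumption made for illustration, since the abstract does not state the exact criterion. The function names, candidate names, label vectors, and constraints below are all hypothetical and hand-made, not results from the paper.

```python
from typing import Dict, Sequence, Tuple

Pair = Tuple[int, int]  # indices of two data points


def count_satisfied(labels: Sequence[int],
                    must_link: Sequence[Pair],
                    cannot_link: Sequence[Pair]) -> int:
    """Count the pairwise constraints that a clustering satisfies.

    A must-link pair is satisfied when both points share a cluster label;
    a cannot-link pair is satisfied when their labels differ.
    """
    ml = sum(1 for i, j in must_link if labels[i] == labels[j])
    cl = sum(1 for i, j in cannot_link if labels[i] != labels[j])
    return ml + cl


def select_clustering(candidates: Dict[str, Sequence[int]],
                      must_link: Sequence[Pair],
                      cannot_link: Sequence[Pair]) -> str:
    """Return the name of the candidate satisfying the most constraints.

    Ties go to the candidate that appears first in the dictionary.
    Assumed scoring rule: raw count of satisfied constraints.
    """
    return max(
        candidates,
        key=lambda name: count_satisfied(candidates[name], must_link, cannot_link),
    )


# Hypothetical label vectors standing in for the output of different
# unsupervised algorithms and parameter settings on six data points.
candidates = {
    "kmeans_k2": [0, 0, 0, 1, 1, 1],
    "kmeans_k3": [0, 0, 1, 1, 2, 2],
    "dbscan_eps0.5": [0, 0, 0, 0, 1, 1],
}

# Hypothetical user-provided constraints.
must_link = [(0, 1), (4, 5)]
cannot_link = [(2, 3), (0, 5)]

best = select_clustering(candidates, must_link, cannot_link)
print(best)
```

In this toy setup the clustering algorithms themselves are never modified. The constraints act only as a selection signal over finished clusterings, which is the contrast the abstract draws with methods that adapt the clustering procedure or the similarity metric.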

Taxonomy

Topics: Advanced Clustering Algorithms Research · Data Management and Algorithms · Data Mining Algorithms and Applications