Privileged Information for Data Clustering

Jan Feyereisl; Uwe Aickelin

arXiv:1305.7454·cs.LG·June 3, 2013

TL;DR

This paper explores the use of privileged information in unsupervised data clustering, proposing new methods to improve stability and accuracy, and demonstrating their effectiveness on artificial and real-world datasets.

Contribution

The paper introduces the aRi-MAX and P-Dot algorithms, which incorporate privileged information into clustering and extend Vapnik's supervised learning ideas to the unsupervised setting.
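This summary does not describe how either algorithm works internally. As an illustration only, the sketch below shows one plausible way privileged information could guide a clustering algorithm such as KMeans: given several candidate labelings of the same samples (for example, from differently initialised runs) and a labeling obtained in the privileged space, keep the candidate that agrees most with the privileged labeling under the adjusted Rand index. The function names `adjusted_rand_index` and `select_by_privileged_agreement`, and the selection rule itself, are assumptions made for this example and should not be read as the paper's definition of aRi-MAX or P-Dot.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two flat labelings of the same samples."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    n = len(labels_a)
    # Contingency counts: how many samples share each (label_a, label_b) pair.
    cell_counts = Counter(zip(labels_a, labels_b))
    sum_cells = sum(comb(c, 2) for c in cell_counts.values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    total_pairs = comb(n, 2)
    expected = sum_a * sum_b / total_pairs if total_pairs else 0.0
    max_index = 0.5 * (sum_a + sum_b)
    if max_index == expected:
        # Degenerate case: both labelings are trivial (one cluster or all singletons).
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


def select_by_privileged_agreement(candidates, privileged_labels):
    """Hypothetical selection rule (an assumption, not the paper's algorithm):
    return the candidate labeling with the highest adjusted Rand index
    against a labeling computed in the privileged space."""
    return max(candidates, key=lambda c: adjusted_rand_index(c, privileged_labels))


if __name__ == "__main__":
    # Toy labelings for six samples; the values are made up for illustration.
    candidate_runs = [[0, 1, 0, 1, 0, 1], [0, 0, 1, 1, 2, 2]]
    privileged = [1, 1, 0, 0, 2, 2]
    print(select_by_privileged_agreement(candidate_runs, privileged))
```

The adjusted Rand index is invariant to how cluster labels are numbered, so a candidate that matches the privileged labeling up to a relabeling scores as a perfect match.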

Findings

01. aRi-MAX improves KMeans stability on artificial data

02. P-Dot fuses privileged and technical data for better clustering

03. Application to digit recognition confirms effectiveness

Abstract

Many machine learning algorithms assume that all input samples are drawn independently and identically from some common distribution on either the input space X, in the case of unsupervised learning, or the input and output space X × Y, in the case of supervised and semi-supervised learning. In recent years the relaxation of this assumption has been explored, and the importance of incorporating additional information into machine learning algorithms has become more apparent. Traditionally, such fusion of information was the domain of semi-supervised learning. More recently, the inclusion of knowledge from separate hypothetical spaces has been proposed by Vapnik as part of the supervised setting. In this work we are interested in exploring Vapnik's idea of master-class learning and the associated learning using privileged information, however within the unsupervised…
