Verbal Characterization of Probabilistic Clusters using Minimal Discriminative Propositions
Yoshitaka Kameya, Satoru Nakamura, Tatsuya Iwasaki, Taisuke Sato

TL;DR
This paper introduces a method for automatically verbalizing and interpreting probabilistic clusters by extracting minimal discriminative propositions, aiding understanding and evaluation of clustering results across diverse datasets.
Contribution
It proposes a novel approach to verbalize clusters using conjunctions of attribute-value propositions, enhancing interpretability and evaluation in mixture model clustering.
Findings
Effective verbalization of clusters demonstrated on standard datasets
Method handles continuous attributes and missing values
Feedback from interpretation improves clustering evaluation
Abstract
In a knowledge discovery process, interpretation and evaluation of the mined results are indispensable in practice. In the case of data clustering, however, it is often difficult to see in what aspect each cluster has been formed. This paper proposes a method for automatic and objective characterization or "verbalization" of the clusters obtained by mixture models, in which we collect conjunctions of propositions (attribute-value pairs) that help us interpret or evaluate the clusters. The proposed method provides us with a new, in-depth and consistent tool for cluster interpretation/evaluation, and works for various types of datasets including continuous attributes and missing values. Experimental results with a couple of standard datasets exhibit the utility of the proposed method, and the importance of feedback from the interpretation/evaluation step.
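To make the idea of characterizing a cluster by attribute-value propositions concrete, here is a minimal Python sketch. It is not the authors' algorithm: the abstract does not give the scoring criterion or the search for minimal conjunctions, so the function name `score_propositions`, the frequency-difference score, and the toy data are all assumptions made for illustration. It ranks single propositions only, weights each record by its soft cluster membership (as a mixture model would supply), and skips missing values.

```python
from collections import defaultdict

def score_propositions(records, memberships, cluster):
    """Rank (attribute, value) pairs by how much more often they hold
    inside `cluster` than outside it. Hypothetical scoring, not the paper's."""
    inside = defaultdict(float)   # membership-weighted counts within the cluster
    outside = defaultdict(float)  # membership-weighted counts in all other clusters
    w_in = w_out = 0.0
    for record, probs in zip(records, memberships):
        p = probs[cluster]        # soft membership of this record in the cluster
        w_in += p
        w_out += 1.0 - p
        for attr, value in record.items():
            if value is None:     # missing value: contributes to no proposition
                continue
            inside[(attr, value)] += p
            outside[(attr, value)] += 1.0 - p
    scores = {}
    for prop in inside:
        freq_in = inside[prop] / w_in if w_in > 0 else 0.0
        freq_out = outside[prop] / w_out if w_out > 0 else 0.0
        scores[prop] = freq_in - freq_out
    # Most discriminative propositions first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy data (invented): four records, two clusters, one missing value.
records = [
    {"color": "red", "size": "small"},
    {"color": "red", "size": "large"},
    {"color": "blue", "size": "small"},
    {"color": "blue", "size": None},
]
memberships = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]

ranked = score_propositions(records, memberships, cluster=0)
for prop, score in ranked:
    print(prop, round(score, 3))
```

On this toy data the top-ranked proposition for cluster 0 is `color = red`, which could be read as a one-line verbal description of that cluster. Extending the sketch to conjunctions of propositions, and to discretized continuous attributes, would be needed to approach what the abstract describes.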
Taxonomy
Topics: Data Mining Algorithms and Applications · Data Management and Algorithms · Advanced Clustering Algorithms Research
