# Mutual Information based labelling and comparing clusters

**Authors:** Rob Koopman, Shenghui Wang

arXiv: 1702.08199 · 2017-02-28

## TL;DR

This paper labels clusters of journal articles with the topical terms that have the highest Normalised Mutual Information (NMI) with each cluster, and compares clusters through lexical fingerprints built on a common set of topical terms.

## Contribution

The paper proposes a Mutual Information based approach to cluster labelling and introduces lexical fingerprints for comparing clusters, either within one clustering solution or across different ones.

## Key findings

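- Topical terms with the highest NMI with a cluster are selected as that cluster's labels.
- Discussion with a domain expert was used to check that the labels discriminate semantically as well as lexically.
- Lexical fingerprints built on a common set of topical terms allow clusters to be visualised and compared.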
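
The labelling idea can be illustrated with a minimal sketch. It is not the authors' implementation: the normalisation (mutual information divided by the geometric mean of the two entropies), the binary "term occurs in document" encoding, the restriction to terms that are over-represented in the cluster, and all function names (`nmi`, `label_cluster`, `is_overrepresented`) are assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(counts, total):
    """Shannon entropy (in bits) of a distribution given as raw counts."""
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def nmi(term_flags, cluster_flags):
    """NMI between two binary sequences: 'term occurs in doc' and 'doc is in cluster'.

    Assumption: normalise by sqrt(H(T) * H(C)); the paper may normalise differently.
    """
    n = len(term_flags)
    joint = Counter(zip(term_flags, cluster_flags))
    t_counts = Counter(term_flags)
    c_counts = Counter(cluster_flags)
    h_t = entropy(t_counts.values(), n)
    h_c = entropy(c_counts.values(), n)
    if h_t == 0 or h_c == 0:
        return 0.0  # a constant variable carries no information
    mi = 0.0
    for (t, c), n_tc in joint.items():
        p_tc = n_tc / n
        # p(t, c) / (p(t) * p(c)) written with counts
        mi += p_tc * math.log2(p_tc * n * n / (t_counts[t] * c_counts[c]))
    return mi / math.sqrt(h_t * h_c)


def is_overrepresented(term_flags, in_cluster):
    """True if the term is more frequent inside the cluster than overall (assumed filter)."""
    n_in = sum(in_cluster)
    hits_in = sum(t and c for t, c in zip(term_flags, in_cluster))
    return n_in > 0 and hits_in / n_in > sum(term_flags) / len(term_flags)


def label_cluster(docs, assignments, cluster_id, top_k=3):
    """Return the top_k terms with the highest NMI with the given cluster.

    docs: list of sets of topical terms; assignments: cluster id per document.
    """
    in_cluster = [a == cluster_id for a in assignments]
    scores = {}
    for term in sorted({t for doc in docs for t in doc}):
        flags = [term in doc for doc in docs]
        if is_overrepresented(flags, in_cluster):
            scores[term] = nmi(flags, in_cluster)
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Toy data, invented for this sketch.
docs = [{"galaxy", "star"}, {"galaxy", "telescope"}, {"protein", "cell"}, {"protein", "gene"}]
assignments = [0, 0, 1, 1]
print(label_cluster(docs, assignments, cluster_id=0, top_k=1))
```

The over-representation filter is there because NMI is symmetric: a term that never appears in a cluster can score as highly as one that always does, and only the latter is useful as a label.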

## Abstract

After a clustering solution is generated automatically, labelling these clusters becomes important to help understand the results. In this paper, we propose to use a Mutual Information based method to label clusters of journal articles. Topical terms which have the highest Normalised Mutual Information (NMI) with a certain cluster are selected to be the labels of the cluster. Discussion of the labelling technique with a domain expert was used as a check that the labels are discriminating not only lexically but also semantically. Based on a common set of topical terms, we also propose to generate lexical fingerprints as a representation of individual clusters. Eventually, we visualise and compare these fingerprints of different clusters from either one clustering solution or different ones.
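
As a rough illustration of the fingerprint idea, the sketch below represents each cluster as a vector over a shared, ordered list of topical terms and compares two clusters by cosine similarity. The abstract does not say how fingerprint values are computed or compared, so both choices here (within-cluster document frequency as the value, cosine as the comparison) and the names `fingerprint` and `cosine` are assumptions; per-term NMI scores could be substituted for the frequencies.

```python
import math


def fingerprint(cluster_docs, common_terms):
    """Vector with one entry per common term: share of the cluster's docs containing it.

    Assumption: document frequency stands in for whatever weighting the paper uses.
    """
    n = len(cluster_docs)
    if n == 0:
        return [0.0] * len(common_terms)
    return [sum(term in doc for doc in cluster_docs) / n for term in common_terms]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy data, invented for this sketch: clusters from two different solutions.
common_terms = ["galaxy", "protein", "star"]
cluster_a = fingerprint([{"galaxy", "star"}, {"galaxy"}], common_terms)
cluster_b = fingerprint([{"galaxy"}, {"galaxy", "star"}], common_terms)
cluster_c = fingerprint([{"protein"}], common_terms)

print(cosine(cluster_a, cluster_b))  # same term profile, so close to 1
print(cosine(cluster_a, cluster_c))  # no shared terms, so 0
```

Because every fingerprint is defined over the same term list, clusters from different clustering solutions can be compared directly, even when the solutions partition the documents differently.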

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1702.08199/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/1702.08199/full.md

## References

6 references — full list in the complete paper: https://tomesphere.com/paper/1702.08199/full.md

---
Source: https://tomesphere.com/paper/1702.08199