Ultrametric Component Analysis with Application to Analysis of Text and of Emotion
Fionn Murtagh

TL;DR
This paper reviews methods for identifying ultrametric structures in data sets, develops a new consensus clustering approach, and applies these techniques to analyze emotional content in narratives.
Contribution
It introduces a novel consensus of hierarchical clusterings for ultrametric analysis and applies it to interpret emotional content in text data.
Findings
Developed a framework for visualizing ultrametric parts of data
Proposed a new ultrametricity coefficient based on triangle angles
Applied ultrametric analysis to quantify emotions in narratives
Abstract
We review the theory and practice of determining which parts of a data set are ultrametric. We assume that the data set is initially endowed with a metric, and we discuss how a metric can be obtained when only a dissimilarity is available. Whether part of the metric-endowed data set is ultrametric is determined by considering triplets of the observables (vectors). We develop a novel consensus of hierarchical clusterings in order to provide a framework, including visualization and support for interpretation, for the parts of the data that are determined to be ultrametric. Furthermore, a major objective is to determine locally ultrametric relationships as opposed to non-local ultrametric relationships. As part of this work, we also study a particular property of our ultrametricity coefficient, namely, it being a function of the difference of angles of the base angles of…
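The triplet-based idea in the abstract can be illustrated with a small sketch. In an ultrametric space every triangle is isosceles with a small base, or equilateral, so the two angles at the base are equal. The Python below counts the fraction of point triplets whose two largest angles agree within a tolerance. It is only an illustration under stated assumptions, not the paper's coefficient: the function name `ultrametric_fraction`, the Euclidean distance, and the 2-degree tolerance `tol_deg` are all choices made here, since the abstract as shown is cut off before the coefficient is defined.

```python
import itertools
import math


def angle_opposite(opposite, s1, s2):
    """Angle in radians opposite the side `opposite`, by the law of cosines."""
    cos_value = (s1 * s1 + s2 * s2 - opposite * opposite) / (2.0 * s1 * s2)
    # Clamp to [-1, 1] to guard against floating-point drift.
    return math.acos(max(-1.0, min(1.0, cos_value)))


def ultrametric_fraction(points, tol_deg=2.0):
    """Fraction of triplets that look ultrametric (hypothetical helper).

    Assumption: a triangle counts as ultrametric when its two largest
    angles (the base angles, opposite the two longest sides) differ by
    at most `tol_deg` degrees. Distances are Euclidean.
    Returns 0.0 when no non-degenerate triplet exists.
    """
    tol = math.radians(tol_deg)
    n_ok = 0
    n_total = 0
    for p, q, r in itertools.combinations(points, 3):
        a = math.dist(q, r)
        b = math.dist(p, r)
        c = math.dist(p, q)
        if min(a, b, c) == 0.0:
            continue  # coincident points: angles are undefined
        angles = sorted(
            [angle_opposite(a, b, c), angle_opposite(b, a, c), angle_opposite(c, a, b)]
        )
        n_total += 1
        # angles[0] is the smallest (apex) angle; the other two are base angles.
        if angles[2] - angles[1] <= tol:
            n_ok += 1
    return n_ok / n_total if n_total else 0.0
```

For example, three points forming an equilateral triangle give a fraction of 1.0, while a 3-4-5 right triangle or three collinear points give 0.0. The loop visits every triplet, so the cost grows cubically with the number of points; sampling triplets would be a natural substitute for larger data sets.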
Taxonomy
Topics: advanced mathematical theories · Advanced Data Compression Techniques · Topological and Geometric Data Analysis
