Generalizing the Mean and Variance to Categorical Data Using Metrics
Roger Bilisoly

TL;DR
This paper extends the concepts of mean and variance to categorical data using metric-based methods, demonstrating applications in linguistic variability and group theory.
Contribution
It introduces a general framework for defining mean and variance on categorical data, broadening their applicability beyond numerical datasets.
Findings
Quantified spelling variability in Middle English.
Defined and applied variability in finite groups.
Abstract
Researchers have developed ways to generalize the mean and variance to situations in which a data metric is available. We apply the tools developed in Pennec (2006) to categorical data, and show the generality of this approach by considering two quite different applications. First, spelling variability in Middle English is quantified. Second, variability of a finite group (in the sense of group theory) is defined and applied to an example.
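As a rough illustration of the metric-based idea in the abstract, the sketch below computes a Fréchet-style mean and variance for categorical data: the mean is taken to be the candidate value that minimizes the average squared distance to the sample, and the variance is that minimum value. This is an assumption about the general form of the definitions, not the paper's exact procedure. The choice of Levenshtein edit distance as the spelling metric, the function names, and the example spellings are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def frechet_mean_and_variance(sample, dist, candidates=None):
    """Return (list of minimizers, minimum mean squared distance).

    Assumption: the mean is searched for among `candidates`, which defaults
    to the distinct values observed in the sample. Ties are all returned,
    since a metric-based mean need not be unique.
    """
    counts = Counter(sample)
    n = len(sample)
    if candidates is None:
        candidates = list(counts)

    def mean_sq_dist(c):
        return sum(k * dist(c, x) ** 2 for x, k in counts.items()) / n

    variance = min(mean_sq_dist(c) for c in candidates)
    means = [c for c in candidates if mean_sq_dist(c) == variance]
    return means, variance


# Hypothetical spelling variants, for illustration only (not data from the paper).
spellings = ["thurgh", "thurgh", "thorgh", "thurh", "through"]
means, variance = frechet_mean_and_variance(spellings, levenshtein)
print(means, variance)
```

Because the function only needs a distance function, the same sketch could in principle be reused with a metric on the elements of a finite group, although the metric used for that application is not stated here.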
Taxonomy
Topics: Rough Sets and Fuzzy Logic · Advanced Statistical Methods and Models · Data Management and Algorithms
