Improved classification for compositional data using the $\alpha$-transformation
Michail Tsagris, Simon Preston, Andrew T.A. Wood

TL;DR
This paper introduces an $\alpha$-transformation-based method for classifying compositional data, demonstrating that intermediate $\alpha$ values can outperform traditional approaches in various datasets.
Contribution
It proposes a flexible $\alpha$-transformation approach for compositional data classification, unifying and extending existing methods with empirical validation.
Findings
Intermediate $\alpha$ values can improve classification accuracy.
Performance varies depending on dataset characteristics.
Using the $\alpha$-transformation offers a versatile framework.
Abstract
In compositional data analysis an observation is a vector containing non-negative values, only the relative sizes of which are considered to be of interest. Without loss of generality, a compositional vector can be taken to be a vector of proportions that sum to one. Data of this type arise in many areas including geology, archaeology, biology, economics and political science. In this paper we investigate methods for classification of compositional data. Our approach centres on the idea of using the $\alpha$-transformation to transform the data and then to classify the transformed data via regularised discriminant analysis and the k-nearest neighbours algorithm. Using the $\alpha$-transformation generalises two rival approaches in compositional data analysis, one (when $\alpha = 1$) that treats the data as though they were Euclidean, ignoring the compositional constraint, and another…
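As a rough illustration of the transform-then-classify idea described in the abstract, the Python sketch below applies a power-type transformation to each composition and then takes a plain k-nearest neighbours vote. The abstract does not state the formula. The form used here (re-closed powers, centred and scaled by $1/\alpha$, with the centred log-ratio as the $\alpha = 0$ limit) is an assumption based on the usual definition of the $\alpha$-transformation, and the final Helmert-type projection is left out on the assumption that it does not change Euclidean distances. The function names, the toy data and the choice k = 3 are hypothetical, and regularised discriminant analysis is not shown.

```python
import math
from collections import Counter


def alpha_transform(x, alpha):
    """Map a composition x (parts summing to 1) to unconstrained coordinates.

    Assumed form, not taken from the abstract: (D * u - 1) / alpha, where
    u is the re-closed power transform of x and D is the number of parts.
    The alpha = 0 case is handled as the centred log-ratio limit and
    requires strictly positive parts.
    """
    d = len(x)
    if alpha == 0:
        logs = [math.log(v) for v in x]
        mean_log = sum(logs) / d
        return [value - mean_log for value in logs]
    powered = [v ** alpha for v in x]
    total = sum(powered)
    return [(d * p / total - 1.0) / alpha for p in powered]


def knn_predict(train_x, train_y, query, alpha, k=3):
    """Classify `query` by majority vote among its k nearest training
    compositions, using Euclidean distance after the transformation."""
    z_train = [alpha_transform(x, alpha) for x in train_x]
    z_query = alpha_transform(query, alpha)
    order = sorted(range(len(z_train)),
                   key=lambda i: math.dist(z_train[i], z_query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Made-up three-part compositions, for illustration only.
train_x = [
    [0.70, 0.20, 0.10], [0.60, 0.30, 0.10], [0.65, 0.25, 0.10],
    [0.10, 0.20, 0.70], [0.20, 0.10, 0.70], [0.15, 0.15, 0.70],
]
train_y = ["a", "a", "a", "b", "b", "b"]
query = [0.60, 0.25, 0.15]

# In practice alpha would be tuned, e.g. by cross-validation over a grid.
for alpha in (0, 0.5, 1):
    print(alpha, knn_predict(train_x, train_y, query, alpha))
```

Treating $\alpha$ as a tuning parameter chosen by held-out accuracy is what allows an intermediate value to be compared against the two endpoint approaches on a given dataset.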
Taxonomy
Topics: Geochemistry and Geologic Mapping · Hydrocarbon exploration and reservoir analysis
