Amalgamations in a hierarchy as a way of variable selection in compositional data analysis

Michael Greenacre; Martin Graeve

arXiv:2511.14622 · stat.ME · November 27, 2025

Open Access

TL;DR

This paper introduces a hierarchical amalgamation approach to variable selection in compositional data analysis: parts are combined into subsets based on domain knowledge, the subsets are transformed to logratios, and each logratio is assessed by the percentage of total logratio variance it explains.

Contribution

The paper proposes a hierarchical amalgamation method for variable selection in compositional data and demonstrates it on fatty acid compositions from a sample of marine organisms.

Findings

01. Hierarchical amalgamations explain significant logratio variance.

02. Method provides an alternative to traditional variable selection.

03. Application to marine fatty acids illustrates practical utility.

Abstract

In certain fields where compositional data are studied, the compositional components, called parts, can be combined into certain subsets, called amalgamations, that are based on domain knowledge. Furthermore, these subsets can form a natural hierarchy of amalgamations subdividing into sub-amalgamations. The authors, a statistician and a biochemist, demonstrate how to create a hierarchy of amalgamations in the context of fatty acid compositions in a sample of marine organisms. Following a tradition in compositional data analysis, these amalgamations are transformed to logratios, and their usefulness as new variables is quantified by the percentage of total logratio variance that they explain. This method is proposed as an alternative method of variable selection in compositional data analysis.
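
The idea of scoring amalgamation logratios by explained logratio variance can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that total logratio variance is measured on the centered logratios (CLRs) of all parts, unweighted, and that "explained" means the share of that variance fitted by a least-squares regression of the CLRs on the chosen logratios. The data are synthetic, and the part groupings (`group_a`, `group_b`, `sub_a1`, `sub_a2`) and function names are hypothetical.

```python
import numpy as np


def clr(X):
    """Centered logratio transform of a (samples x parts) matrix of positive values."""
    log_x = np.log(X)
    return log_x - log_x.mean(axis=1, keepdims=True)


def amalgamation_logratio(X, num_idx, den_idx):
    """Logratio of two amalgamations, each amalgamation being a sum of parts."""
    return np.log(X[:, num_idx].sum(axis=1) / X[:, den_idx].sum(axis=1))


def percent_variance_explained(X, predictors):
    """Percentage of total (unweighted) logratio variance explained by the
    given logratios, via least-squares regression of the CLRs on them."""
    Y = clr(X)
    Y = Y - Y.mean(axis=0)              # center the responses
    Z = np.column_stack(predictors)
    Z = Z - Z.mean(axis=0)              # center the explanatory logratios
    coef, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    fitted = Z @ coef
    return 100.0 * (fitted ** 2).sum() / (Y ** 2).sum()


# Synthetic compositions: 50 samples, 6 parts, rows sum to 1 (illustrative only).
rng = np.random.default_rng(0)
X = rng.dirichlet(np.full(6, 5.0), size=50)

# Hypothetical two-level hierarchy: parts split into A and B, then A into A1 and A2.
group_a, group_b = [0, 1, 2], [3, 4, 5]
sub_a1, sub_a2 = [0, 1], [2]

top = amalgamation_logratio(X, group_a, group_b)   # log(A / B)
sub = amalgamation_logratio(X, sub_a1, sub_a2)     # log(A1 / A2)

pct_top = percent_variance_explained(X, [top])
pct_both = percent_variance_explained(X, [top, sub])
print(f"log(A/B) alone:        {pct_top:.1f}% of logratio variance")
print(f"log(A/B) + log(A1/A2): {pct_both:.1f}% of logratio variance")
```

Descending one level in the hierarchy adds a logratio to the explanatory set, so the explained percentage can only stay the same or increase. With real data, the groupings would come from domain knowledge rather than from arbitrary index lists.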

Taxonomy

Topics: Geochemistry and Geologic Mapping · Geological and Geochemical Analysis · Hydrocarbon exploration and reservoir analysis