Tree Estimation and Saddlepoint-Based Diagnostics for the Nested Dirichlet Distribution: Application to Compositional Behavioral Data
Jacob A. Turner, Monnie McGee, Bianca A. Luedeker

TL;DR
This paper introduces a data-driven tree-finding algorithm and saddlepoint-based diagnostic tools for the Nested Dirichlet Distribution, making it more practical for compositional data by identifying tree structures and assessing model fit.
Contribution
We develop a greedy algorithm for data-driven tree structure discovery and propose saddlepoint-based diagnostics for evaluating fitted Nested Dirichlet Distribution (NDD) models.
Findings
Effective tree structure identification from data
Accurate diagnostics for model fit and influential observations
Successful application to behavioral data from mouse experiments
Abstract
The Nested Dirichlet Distribution (NDD) provides a flexible alternative to the Dirichlet distribution for modeling compositional data, relaxing constraints on component variances and correlations through a hierarchical tree structure. While theoretically appealing, the NDD is underused in practice due to two main limitations: the need to predefine the tree structure and the lack of diagnostics for evaluating model fit. This paper addresses both issues. First, we introduce a data-driven, greedy tree-finding algorithm that identifies plausible NDD tree structures from observed data. Second, we propose novel diagnostic tools, including pseudo-residuals based on a saddlepoint approximation to the marginal distributions and a likelihood displacement measure to detect influential observations. These tools provide accurate and computationally tractable assessments of model fit, even when…
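To make the hierarchical tree structure mentioned in the abstract concrete, the following is a minimal sketch of sampling from a nested Dirichlet on a small two-level tree. The tree shape (a root that splits into one leaf and one internal node, which in turn splits into two leaves), the function name `sample_nested_dirichlet`, and all parameter values are illustrative assumptions; none of them come from the paper, and this is not the authors' tree-finding algorithm or diagnostics.

```python
import numpy as np

def sample_nested_dirichlet(alpha_root, alpha_sub, n, seed=0):
    """Draw n compositions from a hypothetical two-level nested Dirichlet.

    Assumed tree: root -> {leaf A, internal node B}; B -> {leaf B1, leaf B2}.
    Each internal node splits its mass with its own Dirichlet draw.
    """
    rng = np.random.default_rng(seed)
    root = rng.dirichlet(alpha_root, size=n)  # shape (n, 2): mass for A and for B
    sub = rng.dirichlet(alpha_sub, size=n)    # shape (n, 2): split of B into B1, B2
    # A leaf's proportion is the product of branch proportions along its path.
    return np.column_stack([
        root[:, 0],              # leaf A
        root[:, 1] * sub[:, 0],  # leaf B1
        root[:, 1] * sub[:, 1],  # leaf B2
    ])

# Illustrative parameters only.
samples = sample_nested_dirichlet([2.0, 3.0], [5.0, 1.0], n=1000)
```

Each row of `samples` is a three-part composition that sums to one. Because each internal node carries its own parameter vector, the leaves under node B can have variances and correlations that a single flat Dirichlet over three components would not allow.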
Taxonomy
Topics: Geochemistry and Geologic Mapping · Bayesian Methods and Mixture Models · Morphological variations and asymmetry
