Consensus Tree Estimation with False Discovery Rate Control via Partially Ordered Sets
Maria Alejandra Valdez Cabrera, Amy D Willis, Armeen Taeb

TL;DR
This paper introduces a method for consensus tree estimation that uses a partial order on trees to define and control false discoveries. It accommodates unequal leaf sets and non-binary trees, provides finite-sample guarantees, and has applications in biology and beyond.
Contribution
The paper presents the first estimator for consensus trees that offers finite-sample, model-free false discovery rate control, achieved by framing the problem as feature selection over a partially ordered set.
Findings
Controls false discovery rate at a nominal level
Handles unequal leaf sets and non-binary trees
Provides uncertainty quantification for tree features
Abstract
Connected acyclic graphs (trees) are data objects that hierarchically organize categories. Collections of trees arise in a diverse variety of fields, including evolutionary biology, public health, machine learning, social sciences and anatomy. Summarizing a collection of trees by a single representative is challenging, in part due to the dimension of both the sample and parameter space. We frame consensus tree estimation as a structured feature-selection problem, where leaves and edges are the features. We introduce a partial order on leaf-labeled trees, use it to define true and false discoveries for a candidate summary tree, and develop an estimation algorithm that controls the false discovery rate at a nominal level for a broad class of non-parametric generative models. Furthermore, using the partial order structure, we assess the stability of each feature in a selected tree.…
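To make the feature-selection framing concrete, here is a minimal Python sketch of counting false discoveries in a candidate summary tree. It is an illustration under stated assumptions, not the paper's construction: trees are encoded as nested tuples, each node is summarized by the set of leaves below it (a simple stand-in for the leaf and edge features), and a feature counts as a false discovery when it is absent from a reference tree. The names `leaves`, `features`, and `false_discovery_proportion` are hypothetical, and the paper's partial order may differ from this set-inclusion comparison. The false discovery rate is the expectation of the proportion computed here.

```python
from typing import FrozenSet, Set, Tuple, Union

# Assumed encoding: a tree is a leaf label or a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def leaves(tree: Tree) -> FrozenSet[str]:
    """Return the set of leaf labels at or below this node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def features(tree: Tree) -> Set[FrozenSet[str]]:
    """Collect one feature per node: the set of leaves below it."""
    found = {leaves(tree)}
    if not isinstance(tree, str):
        for child in tree:
            found |= features(child)
    return found


def false_discovery_proportion(candidate: Tree, reference: Tree) -> float:
    """Fraction of the candidate's features that are absent from the reference."""
    selected = features(candidate)
    false_discoveries = selected - features(reference)
    return len(false_discoveries) / max(1, len(selected))


# Illustrative trees on four leaves (made-up data).
reference = (("A", "B"), ("C", "D"))
coarse = (("A", "B"), "C", "D")        # non-binary: leaves C and D unresolved
conflicting = (("A", "C"), ("B", "D"))  # groups leaves differently

fdp_coarse = false_discovery_proportion(coarse, reference)            # 0.0
fdp_conflicting = false_discovery_proportion(conflicting, reference)  # 2/7
```

In this sketch the less resolved tree makes no false discoveries because every feature it asserts also appears in the reference, while the conflicting tree asserts two groupings the reference does not contain. This mirrors the intuition that a coarser summary tree can trade resolution for fewer false discoveries.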
Taxonomy
Topics: Topological and Geometric Data Analysis · Ecosystem dynamics and resilience · Genomics and Phylogenetic Studies
