Distance Assisted Recursive Testing

Xuechan Li; Anthony Sung; Jichun Xie

arXiv:2103.11085·stat.ME·September 28, 2021·1 cites

TL;DR

DART is a novel recursive testing framework that leverages known feature distances to improve feature selection accuracy, demonstrated through theoretical analysis, simulations, and a clinical microbiota study.

Contribution

The paper introduces a two-stage method: it first transforms the feature distance matrix into an aggregation tree, then performs multiple testing on the tree's nodes and maps the rejections back to individual features.

Findings

01

Under mild assumptions, the false discovery proportion of DART converges to the desired level with high probability.

02

DART outperforms existing methods in simulations under various models.

03

Application to clinical data identified microbiota impacted by treatment.

Abstract

In many applications, a large number of features are collected with the goal of identifying a few important ones. Sometimes, these features lie in a metric space with a known distance matrix, which partially reflects their co-importance pattern. Proper use of the distance matrix can boost the power to identify important features. Hence, we develop a new multiple testing framework named Distance Assisted Recursive Testing (DART). DART has two stages. In stage 1, we transform the distance matrix into an aggregation tree, where each node represents a set of features. In stage 2, based on the aggregation tree, we set up dynamic node hypotheses and perform multiple testing on the tree. All rejections are mapped back to the features. Under mild assumptions, the false discovery proportion of DART converges to the desired level with probability converging to one. We illustrate by theory…
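The two stages described in the abstract can be illustrated with a deliberately simplified sketch. It is not the authors' algorithm and carries none of its guarantees; it only shows the general idea of testing single features first, then grouping the still-unrejected features that lie close together and testing the groups. Everything here is an assumption made for illustration: the function names (`bh_reject`, `group_by_radius`, `dart_sketch`), the greedy grouping rule, the `radii` parameter, the use of Benjamini-Hochberg at each layer, and the combination of z-scores by Stouffer's method (which presumes independent, one-sided z-scores).

```python
import math
from statistics import NormalDist

_norm = NormalDist()  # standard normal, used to turn z-scores into p-values


def bh_reject(pvals, alpha):
    """Benjamini-Hochberg step-up: return the set of rejected indices."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= alpha * rank / m:
            k = rank  # largest rank that passes its threshold
    return set(order[:k])


def group_by_radius(features, dist, radius):
    """Greedy grouping (illustrative assumption): a feature joins the first
    node whose members are all within `radius` of it, else starts a new node."""
    nodes = []
    for f in features:
        for node in nodes:
            if all(dist[f][g] <= radius for g in node):
                node.append(f)
                break
        else:
            nodes.append([f])
    return nodes


def dart_sketch(z, dist, radii, alpha=0.05):
    """Layer 1 tests single features; each later layer aggregates the
    still-unrejected features into nodes and tests the nodes."""
    pvals = [1 - _norm.cdf(zi) for zi in z]
    rejected = bh_reject(pvals, alpha)
    for radius in radii:
        remaining = [i for i in range(len(z)) if i not in rejected]
        nodes = [n for n in group_by_radius(remaining, dist, radius) if len(n) > 1]
        if not nodes:
            continue
        # Stouffer combination of the member z-scores (assumes independence).
        node_p = [1 - _norm.cdf(sum(z[i] for i in n) / math.sqrt(len(n)))
                  for n in nodes]
        for j in bh_reject(node_p, alpha):
            rejected.update(nodes[j])  # map the node rejection back to features
    return rejected


# Toy data: features 1 and 2 sit close together and carry moderate signals.
positions = [0.0, 10.0, 10.1, 20.0, 30.0, 40.0]
dist = [[abs(a - b) for b in positions] for a in positions]
z = [4.0, 1.8, 1.9, 0.1, -0.3, 0.2]

single_layer = dart_sketch(z, dist, radii=[])
two_layers = dart_sketch(z, dist, radii=[0.5])
print(sorted(single_layer), sorted(two_layers))
```

In this toy example, features 1 and 2 are too weak to be rejected individually, but their aggregated statistic is rejected once the distance information groups them into one node. That is the kind of power gain the abstract attributes to using the distance matrix, shown here only under the simplifying assumptions stated above.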

Code & Models

Repositories

xxli8080/DART_Code
Official

Taxonomy

Topics: Gene expression and cancer classification · Statistical Methods in Clinical Trials · Molecular Biology Techniques and Applications