Two-sample comparison through additive tree models for density ratios

Naoki Awaya, Yuliang Xu, and Li Ma

arXiv:2508.03059·stat.ME·April 23, 2026

TL;DR

This paper introduces additive tree models for two-sample density ratio estimation, trained with a new balancing loss. The approach supports efficient fitting and Bayesian uncertainty quantification, with an application to microbiome data.

Contribution

It proposes the balancing loss, a new loss function for fitting additive tree models to density ratios, together with a generalized Bayesian framework that uses the loss as a pseudo-likelihood to quantify uncertainty in the estimated ratio.

Findings

01 Achieves accurate density ratio estimation with computational efficiency.

02 Provides Bayesian uncertainty quantification for high-dimensional data.

03 Demonstrates application to microbiome data quality assessment.

Abstract

The ratio of two densities provides a direct characterization of their differences. We consider the two-sample comparison problem by estimating this ratio given i.i.d. observations from two distributions. To this end, we propose additive tree models for density ratio estimation along with efficient algorithms using a new loss function, the balancing loss. The loss allows tree-based models to be trained using several algorithms originally designed for supervised learning, such as forward-stagewise optimization and gradient boosting. Moreover, the balancing loss resembles an exponential family kernel, and it can serve as a pseudo-likelihood with conjugate priors. This property enables generalized Bayesian inference on the density ratio using backfitting samplers designed for Bayesian additive regression trees (BART). Our Bayesian strategy provides uncertainty quantification for the…
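
To make the idea of a density ratio concrete, the sketch below estimates a piecewise-constant ratio on a fixed partition of the real line, which can be read as a single tree with hand-picked leaves. It is not the paper's method: it does not use the balancing loss, boosting, or BART, and the names `leaf_index`, `piecewise_ratio`, `cuts`, and `smoothing` are hypothetical choices made only for this illustration. Each leaf's ratio is the smoothed fraction of the first sample in that leaf divided by the smoothed fraction of the second sample.

```python
import random


def leaf_index(x, cuts):
    """Return the index of the interval (leaf) containing x, given sorted cut points."""
    idx = 0
    for c in cuts:
        if x >= c:
            idx += 1
    return idx


def piecewise_ratio(sample_p, sample_q, cuts, smoothing=1.0):
    """Estimate a piecewise-constant density ratio p/q, one value per leaf.

    This is a plain histogram-ratio baseline (an assumption of this sketch),
    with additive smoothing so that empty leaves do not cause division by zero.
    """
    n_leaves = len(cuts) + 1
    count_p = [0] * n_leaves
    count_q = [0] * n_leaves
    for x in sample_p:
        count_p[leaf_index(x, cuts)] += 1
    for x in sample_q:
        count_q[leaf_index(x, cuts)] += 1
    total_p = len(sample_p) + smoothing * n_leaves
    total_q = len(sample_q) + smoothing * n_leaves
    return [
        ((count_p[k] + smoothing) / total_p) / ((count_q[k] + smoothing) / total_q)
        for k in range(n_leaves)
    ]


# Two i.i.d. samples: P = N(0.5, 1) and Q = N(0, 1), so p/q increases in x.
rng = random.Random(0)
sample_p = [rng.gauss(0.5, 1.0) for _ in range(5000)]
sample_q = [rng.gauss(0.0, 1.0) for _ in range(5000)]

cuts = [-1.0, 0.0, 1.0]  # hand-picked leaf boundaries, not learned
ratios = piecewise_ratio(sample_p, sample_q, cuts)
for k, r in enumerate(ratios):
    print(f"leaf {k}: estimated ratio {r:.3f}")
```

A ratio above 1 in a leaf indicates a region where the first distribution places more mass than the second, and a ratio below 1 indicates the opposite. A learned additive tree model would choose the partitions from the data and combine many trees rather than rely on fixed cut points.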
