Data-driven multinomial random forest: A new random forest variant with strong consistency

JunHao Chen

arXiv:2211.15154·cs.LG·October 17, 2023

TL;DR

This paper introduces the data-driven multinomial random forest (DMRF), a random forest variant that satisfies strong consistency with probability 1 at the same complexity as BreimanRF. It outperforms earlier RF variants that satisfy only weak consistency on classification and regression tasks, and in most cases surpasses BreimanRF on classification.

Contribution

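The paper proposes DMRF, a random forest variant that satisfies strong consistency with probability 1 while matching the complexity of BreimanRF. It also reworks the proofs of several previously weakly consistent RF variants to establish strong consistency, and improves the data utilization of those variants.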

Findings

01

DMRF achieves strong consistency with probability 1.

02

DMRF outperforms previous weakly consistent RF variants.

03

In most cases, DMRF surpasses BreimanRF in classification tasks.

Abstract

In this paper, we strengthen the proofs for several random forest variants previously shown to be only weakly consistent so that they establish strong consistency, and we improve the data utilization of these variants to obtain better theoretical properties and experimental performance. In addition, we propose a data-driven multinomial random forest (DMRF), which has the same complexity as BreimanRF (proposed by Breiman) while satisfying strong consistency with probability 1. It performs better on classification and regression problems than previous RF variants that satisfy only weak consistency, and in most cases even surpasses BreimanRF on classification tasks. To the best of our knowledge, DMRF is currently a low-complexity, high-performing variant of random forests that achieves strong consistency with probability 1.
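
For readers unfamiliar with the distinction the abstract relies on, the standard textbook definitions of weak and strong consistency for a classifier can be written as follows. The notation is an assumption for illustration and is not taken from the paper: $g_n$ is the classifier built from the training sample $D_n$ of size $n$, $L_n$ is its conditional error probability, and $L^*$ is the Bayes risk.

```latex
% Conditional error of the classifier g_n trained on D_n, and the Bayes risk
% (illustrative notation, not the paper's own)
L_n = \mathbb{P}\bigl(g_n(X) \neq Y \mid D_n\bigr),
\qquad
L^* = \inf_{g} \mathbb{P}\bigl(g(X) \neq Y\bigr).

% Weak consistency: the expected error converges to the Bayes risk
\lim_{n \to \infty} \mathbb{E}[L_n] = L^*.

% Strong consistency: the error converges to the Bayes risk almost surely,
% i.e. with probability 1 over the training sequence
\mathbb{P}\Bigl(\lim_{n \to \infty} L_n = L^*\Bigr) = 1.
```

Because $L^* \le L_n \le 1$, almost sure convergence implies convergence in expectation by dominated convergence, so strong consistency implies weak consistency but not the reverse. The regression case is analogous, with the squared error of the estimate in place of the misclassification probability.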

Taxonomy

Topics: Natural Language Processing Techniques · Bayesian Modeling and Causal Inference · Topic Modeling