Generative Forests
Richard Nock, Mathieu Guillame-Bert

TL;DR
This paper introduces a new class of forest-based generative models for tabular data, with a simple training algorithm that guarantees convergence and outperforms existing methods in data generation, imputation, and density estimation.
Contribution
A novel forest-based generative modeling framework with a convergent boosting algorithm tailored for tabular data, enhancing data quality and practical applicability.
Findings
Significant improvements in generated data quality over state-of-the-art methods.
Models effectively perform missing data imputation and density estimation.
Competitive with diverse models like neural nets, kernels, and graphical models.
Abstract
We focus on generative AI for a type of data that still represents one of the most prevalent forms of data: tabular data. Our paper introduces two key contributions: a new powerful class of forest-based models fit for such tasks and a simple training algorithm with strong convergence guarantees in a boosting model that parallels that of the original weak / strong supervised learning setting. This algorithm can be implemented by a few tweaks to the most popular induction scheme for decision tree induction (i.e. supervised learning) with two classes. Experiments on the quality of generated data display substantial improvements compared to the state of the art. The losses our algorithm minimizes and the structure of our models make them practical for related tasks that require fast estimation of a density given a generative model and an observation (even partially specified): such tasks…
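To make the idea of a tree-structured generative model more concrete, here is a minimal sketch of a piecewise-constant density built from an axis-aligned partition. It is not the paper's algorithm: it uses a single partition with median splits, no boosting, and no forest. The function names (`build_leaves`, `density`, `sample`), the half-open domain `[lo, hi)`, and the toy data are all assumptions made for illustration. It only shows why such a structure allows fast sampling and fast density evaluation, including for an observation with missing features (marked `None`).

```python
import random

def build_leaves(points, box, depth):
    """Split `box` (a list of (lo, hi) per feature) at the median of its widest side.
    Returns a list of (box, count) leaves. Illustrative only, not the paper's induction scheme."""
    if depth == 0 or len(points) < 2:
        return [(box, len(points))]
    dim = max(range(len(box)), key=lambda d: box[d][1] - box[d][0])
    values = sorted(p[dim] for p in points)
    cut = values[len(values) // 2]
    lo, hi = box[dim]
    if not lo < cut < hi:  # degenerate split: keep this box as a leaf
        return [(box, len(points))]
    left_box = box[:dim] + [(lo, cut)] + box[dim + 1:]
    right_box = box[:dim] + [(cut, hi)] + box[dim + 1:]
    left = [p for p in points if p[dim] < cut]
    right = [p for p in points if p[dim] >= cut]
    return (build_leaves(left, left_box, depth - 1)
            + build_leaves(right, right_box, depth - 1))

def density(x, leaves, n):
    """Density of x under the uniform-per-leaf model. Features set to None are
    marginalized out, so a partially specified observation is handled too."""
    total = 0.0
    for box, count in leaves:
        observed_volume = 1.0
        inside = True
        for xi, (lo, hi) in zip(x, box):
            if xi is None:
                continue  # missing feature: integrate over the whole side
            if not lo <= xi < hi:
                inside = False
                break
            observed_volume *= hi - lo
        if inside:
            total += count / (n * observed_volume)
    return total

def sample(leaves, rng):
    """Pick a leaf with probability proportional to its count, then sample uniformly in it."""
    box, _ = rng.choices(leaves, weights=[c for _, c in leaves])[0]
    return [rng.uniform(lo, hi) for lo, hi in box]

# Toy data on the unit square (an assumption for this sketch).
rng = random.Random(0)
points = [[rng.random() ** 2, rng.random()] for _ in range(500)]
domain = [(0.0, 1.0), (0.0, 1.0)]
leaves = build_leaves(points, domain, depth=4)
n = len(points)

print(density([0.1, 0.5], leaves, n))   # fully specified observation
print(density([0.1, None], leaves, n))  # second feature missing
print(sample(leaves, rng))              # one generated row
```

Because each leaf carries a constant density, a density query is a lookup over leaves and a missing feature only changes which side lengths enter the volume. The models described in the abstract combine several trees and are trained with a boosting procedure, which this sketch does not attempt to reproduce.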
Taxonomy
Topics: Neural Networks and Applications · Machine Learning and Data Classification · Generative Adversarial Networks and Image Synthesis
