# Fréchet random forests for metric space valued regression with non euclidean predictors

**Authors:** Louis Capitaine, Jérémie Bigot, Rodolphe Thiébaut, Robin Genuer

arXiv: 1906.01741 · 2024-02-19

## TL;DR

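This paper introduces Fréchet trees and Fréchet random forests, which extend random forests to input and output variables taking values in general metric spaces, such as curves, images, and shapes. It gives a consistency theorem for the Fréchet regressogram predictor and illustrates the method on simulated heterogeneous data and one real air quality dataset.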

## Contribution

It develops a novel splitting and prediction framework for random forests in metric spaces, enabling analysis of heterogeneous data types with theoretical consistency results.

## Key findings

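- Handles heterogeneous data combining longitudinal, image, and scalar variables, studied through several simulation scenarios.
- A consistency theorem for the Fréchet regressogram predictor using data-driven partitions, applied to Fréchet purely uniformly random trees.
- Illustration of the method on one real air quality dataset.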

## Abstract

Random forests are a statistical learning method widely used in many areas of scientific research because of their ability to learn complex relationships between input and output variables and their capacity to handle high-dimensional data. However, current random forest approaches are not flexible enough to handle heterogeneous data such as curves, images and shapes. In this paper, we introduce Fréchet trees and Fréchet random forests, which can handle data for which input and output variables take values in general metric spaces. To this end, a new way of splitting the nodes of trees is introduced and the prediction procedures of trees and forests are generalized. Then, the random forest out-of-bag error and variable importance score are naturally adapted. A consistency theorem for the Fréchet regressogram predictor using data-driven partitions is given and applied to Fréchet purely uniformly random trees. The method is studied through several simulation scenarios on heterogeneous data combining longitudinal, image and scalar data. Finally, one real dataset about air quality is used to illustrate the use of the proposed method in practice.
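The abstract's generalized prediction step rests on the Fréchet mean, the point that minimizes the average squared distance to a set of observations. The following is a minimal Python sketch of that general notion, not the authors' implementation. It assumes the minimizer is searched only among the observed points, a simplification chosen for illustration. The names `frechet_variance`, `sample_frechet_mean`, and `circle_distance` are hypothetical and do not come from the paper.

```python
import math
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def frechet_variance(points: Sequence[T], center: T,
                     dist: Callable[[T, T], float]) -> float:
    """Average squared distance from `center` to each point."""
    return sum(dist(center, p) ** 2 for p in points) / len(points)


def sample_frechet_mean(points: Sequence[T],
                        dist: Callable[[T, T], float]) -> T:
    """Observed point with the smallest Fréchet variance.

    Assumption: the search is restricted to the sample itself, which
    only approximates a minimizer taken over the whole metric space.
    """
    return min(points, key=lambda c: frechet_variance(points, c, dist))


def circle_distance(a: float, b: float) -> float:
    """Arc-length distance between two angles (radians) on the unit circle."""
    d = abs(a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)


# Three angles clustered around 0 on the circle; 6.2 is just below 2*pi.
angles = [0.1, 0.2, 6.2]
circle_mean = sample_frechet_mean(angles, circle_distance)
line_mean = sample_frechet_mean(angles, lambda a, b: abs(a - b))
```

With the circle metric, the angle 6.2 is close to 0.1, so the selected point differs from the one selected when the same numbers are treated as points on the real line. Changing only the `dist` argument changes the prediction, which is why a metric-space formulation can cover curves, images, and shapes with a single procedure.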

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1906.01741/full.md

## Figures

18 figures with captions in the complete paper: https://tomesphere.com/paper/1906.01741/full.md

## References

34 references — full list in the complete paper: https://tomesphere.com/paper/1906.01741/full.md

---
Source: https://tomesphere.com/paper/1906.01741