Ordered Correlation Forest
Riccardo Di Francesco

TL;DR
This paper introduces the ordered correlation forest, a flexible non-parametric estimator for ordered categorical outcomes that improves prediction accuracy and provides valid confidence intervals without restrictive assumptions.
Contribution
The paper develops the ordered correlation forest, a method that handles non-linearities and does not assume a specific error-term distribution, relaxing the parametric and distributional restrictions that ordered logit and ordered probit models may impose.
Findings
Outperforms alternative forest estimators in prediction accuracy
Provides valid confidence intervals for covariates' marginal effects
Demonstrates superior performance on synthetic data
Abstract
Empirical studies in various social sciences often involve categorical outcomes with inherent ordering, such as self-evaluations of subjective well-being and self-assessments in health domains. While ordered choice models, such as the ordered logit and ordered probit, are popular tools for analyzing these outcomes, they may impose restrictive parametric and distributional assumptions. This paper introduces a novel estimator, the ordered correlation forest, that can naturally handle non-linearities in the data and does not assume a specific error term distribution. The proposed estimator modifies a standard random forest splitting criterion to build a collection of forests, each estimating the conditional probability of a single class. Under an "honesty" condition, predictions are consistent and asymptotically normal. The weights induced by each forest are used to obtain standard errors…
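The abstract's idea of a collection of forests, each estimating the probability of a single class, can be illustrated with a minimal sketch. It assumes that each forest's prediction at a point can be written as a weighted average of the class indicators 1(Y_i = m) over the training observations, which is a common way to represent forest predictions. The weights below are written by hand rather than produced by any forest, the function name predict_class_probabilities is hypothetical, and the final normalization step is an assumption that the abstract does not state. The sketch does not implement the paper's modified splitting criterion.

```python
def predict_class_probabilities(weights_by_class, outcomes):
    """Combine per-class weighted averages into class probabilities.

    weights_by_class: dict mapping each class m to a list of weights,
        one per training observation (hypothetical stand-ins for the
        weights a class-specific forest would induce at a query point).
    outcomes: list of observed ordered classes, one per observation.
    """
    raw = {}
    for m, weights in weights_by_class.items():
        if len(weights) != len(outcomes):
            raise ValueError("each weight vector must match the sample size")
        # Weighted average of the indicator 1(Y_i = m).
        raw[m] = sum(w for w, y in zip(weights, outcomes) if y == m)

    total = sum(raw.values())
    if total == 0:
        raise ValueError("all raw predictions are zero; cannot normalize")
    # Assumed step: rescale so the class probabilities sum to one,
    # since separately estimated probabilities need not do so.
    return {m: p / total for m, p in raw.items()}


# Four training observations with ordered outcomes in {1, 2, 3}.
outcomes = [1, 2, 2, 3]
# Hand-written weights, one vector per class-specific estimator.
weights_by_class = {
    1: [0.5, 0.5, 0.0, 0.0],
    2: [0.0, 0.5, 0.5, 0.0],
    3: [0.0, 0.0, 0.5, 0.5],
}
probs = predict_class_probabilities(weights_by_class, outcomes)
print(probs)
```

Because each class uses its own weights, the raw predictions here sum to 2.0 rather than 1.0, which is why the sketch rescales them before returning.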
Taxonomy
Topics: Psychological Well-being and Life Satisfaction · Health disparities and outcomes · Income, Poverty, and Inequality
