Generalized Linear Tree Space Nearest Neighbor

Michael Kim

arXiv:2103.16408 · cs.LG · March 31, 2021

TL;DR

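This paper introduces GLTSNN, a stacking method that projects decision trees into a one nearest neighbor (1NN) space and combines the 1NN predictions through a linear model. It is competitive with Random Forest on MSE, and the author conjectures that a classifier built on it would share the asymptotic error bound of 1NN.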

Contribution

The paper proposes GLTSNN, a new stacking approach that projects decision trees into an ordered time split out-of-fold 1NN space, combines the 1NN predictions with a linear model, and averages many repetitions to reduce variance.

Findings

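01. GLTSNN is competitive with Random Forest on MSE across several publicly available datasets.

02. Repeating the projection-and-stacking procedure many times and averaging the results reduces variance.

03. The author conjectures that a classifier based on GLTSNN would have an error asymptotically bounded by twice the Bayes error rate, as with 1NN.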
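
For context, the classical 1NN result that the conjecture refers to (Cover and Hart, 1967) can be written as below. Here R^* denotes the Bayes error rate, R_{1NN} the asymptotic 1NN error rate, and M the number of classes. These symbols are chosen for this note and are not taken from the paper, and the bound is proven for 1NN only; whether it carries over to a GLTSNN-based classifier remains a conjecture.

```latex
R^{*} \;\le\; R_{\mathrm{1NN}} \;\le\; R^{*}\left(2 - \frac{M}{M-1}\,R^{*}\right) \;\le\; 2R^{*}
```

In the two-class case (M = 2) the middle term becomes 2R^*(1 - R^*), which is where the "twice the Bayes error rate" phrasing comes from.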

Abstract

We present a novel method of stacking decision trees by projection into an ordered time split out-of-fold (OOF) one nearest neighbor (1NN) space. The predictions of these one nearest neighbors are combined through a linear model. This process is repeated many times and averaged to reduce variance. Generalized Linear Tree Space Nearest Neighbor (GLTSNN) is competitive with respect to Mean Squared Error (MSE) compared to Random Forest (RF) on several publicly available datasets. Some of the theoretical and applied advantages of GLTSNN are discussed. We conjecture a classifier based upon the GLTSNN would have an error that is asymptotically bounded by twice the Bayes error rate like k = 1 Nearest Neighbor.
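
The pipeline described in the abstract can be loosely sketched in plain Python. This is a minimal illustration under stated assumptions, not the author's implementation: the abstract does not specify how trees are projected into "tree space", so each projection is stood in for by a hypothetical precomputed 1-D score `z` per row; rows are assumed to be sorted by time; and, for brevity, each linear model uses a single 1NN feature rather than several. The function names (`time_ordered_folds`, `oof_1nn`, `fit_line`, `averaged_stack`) are invented for this example.

```python
def time_ordered_folds(n, k):
    """Split row indices 0..n-1 into k consecutive blocks.
    Assumes rows are already sorted by time and n >= k."""
    bounds = [i * n // k for i in range(k + 1)]
    return [list(range(bounds[i], bounds[i + 1])) for i in range(k)]


def oof_1nn(z, y, k=4):
    """Out-of-fold 1NN feature: each row copies the target of its nearest
    neighbor in z, searching only rows from strictly earlier folds."""
    folds = time_ordered_folds(len(z), k)
    preds = [None] * len(z)  # the first fold has no past, so it stays None
    for j in range(1, k):
        past = [i for fold in folds[:j] for i in fold]
        for i in folds[j]:
            nearest = min(past, key=lambda p: abs(z[p] - z[i]))
            preds[i] = y[nearest]
    return preds


def fit_line(x, y):
    """Ordinary least squares for y ~ a + b * x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx if sxx else 0.0
    return my - b * mx, b


def averaged_stack(projections, y, k=4):
    """Fit one linear model per projection on its OOF 1NN feature and
    average the fitted values over all projections (variance reduction).
    Returns fitted values for rows n // k onward (rows with a feature)."""
    n = len(y)
    start = n // k  # first row that has an out-of-fold feature
    avg = [0.0] * (n - start)
    for z in projections:
        feat = oof_1nn(z, y, k)[start:]
        a, b = fit_line(feat, y[start:])
        for t, f in enumerate(feat):
            avg[t] += (a + b * f) / len(projections)
    return avg


if __name__ == "__main__":
    # Toy data: two hypothetical projections of eight time-ordered rows.
    y = [10, 20, 30, 40, 11, 19, 31, 41]
    z1 = [0.0, 1.0, 2.0, 3.0, 0.1, 0.9, 2.2, 2.9]
    z2 = [0.2, 1.1, 1.9, 3.1, 0.0, 1.0, 2.0, 3.0]
    print(averaged_stack([z1, z2], y, k=2))
```

Restricting the neighbor search to earlier folds keeps each 1NN feature free of its own target and of future rows, which is one reading of "ordered time split out-of-fold" in the abstract.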

Taxonomy

Topics: Advanced Statistical Methods and Models · Statistical Methods and Inference · Machine Learning and Data Classification