KNN Ensembles for Tweedie Regression: The Power of Multiscale Neighborhoods

Colleen M. Farrelly

arXiv:1708.02122·stat.ML·August 8, 2017·5 cites

Open Access

TL;DR

This paper investigates multiscale KNN ensembles for Tweedie regression, built by varying k and by bagging features or observations, and reports better performance than existing methods in simulations and on real datasets.

Contribution

The paper introduces KNN ensemble algorithms that combine varied k with bagged features and bagged observations, evaluates them on Tweedie regression tasks, and relates varying k to multiscale methods from topological data analysis.

Findings

01

Varying k improves prediction beyond bagging features or samples.

02

KNN ensembles outperform state-of-the-art models in Tweedie regression.

03

Ensembles are robust to the curse of dimensionality.

Abstract

Very few K-nearest-neighbor (KNN) ensembles exist, despite the efficacy of this approach in regression, classification, and outlier detection. Those that do exist focus on bagging features rather than on varying k or bagging observations; it is unknown whether varying k or bagging observations can improve prediction. Given recent studies from topological data analysis, varying k may function like multiscale topological methods, providing stability and better prediction, as well as increased ensemble diversity. This paper explores 7 KNN ensemble algorithms combining bagged features, bagged observations, and varied k to understand how each of these contributes to model fit. Specifically, these algorithms are tested on Tweedie regression problems through simulations and 6 real datasets; results are compared to state-of-the-art machine learning models including extreme learning machines,…
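To make the three ingredients named in the abstract concrete (bagged features, bagged observations, and varied k), here is a minimal sketch in Python using only NumPy. It is not the paper's implementation: the function names, the grid of k values, the feature fraction, the simple averaging of base predictions, and the simulated compound Poisson-gamma outcome are all assumptions made for illustration.

```python
import numpy as np


def knn_predict(X_train, y_train, X_test, k):
    """Plain KNN regression: mean target of the k nearest training points (Euclidean)."""
    # Pairwise squared distances, shape (n_test, n_train).
    d2 = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    k = min(k, X_train.shape[0])
    # Indices of the k smallest distances per test row (order within the k is irrelevant).
    nearest = np.argpartition(d2, k - 1, axis=1)[:, :k]
    return y_train[nearest].mean(axis=1)


def knn_ensemble_predict(X_train, y_train, X_test, ks=(1, 3, 5, 10, 20),
                         n_models_per_k=5, feature_frac=0.7,
                         bag_observations=True, seed=0):
    """Hypothetical multiscale KNN ensemble: average base learners over several k.

    Each base learner sees a random feature subset (bagged features) and,
    optionally, a bootstrap resample of the rows (bagged observations).
    Averaging the base predictions is an assumption, not the paper's stated rule.
    """
    rng = np.random.default_rng(seed)
    n, p = X_train.shape
    n_feat = max(1, int(round(feature_frac * p)))
    preds = []
    for k in ks:  # varied k: one "scale" of neighborhood per value
        for _ in range(n_models_per_k):
            feats = rng.choice(p, size=n_feat, replace=False)
            rows = rng.integers(0, n, size=n) if bag_observations else np.arange(n)
            preds.append(knn_predict(X_train[rows][:, feats], y_train[rows],
                                     X_test[:, feats], k))
    return np.mean(preds, axis=0)


# Simulated zero-inflated, non-negative outcome (compound Poisson-gamma),
# the kind of target Tweedie regression is typically used for. Illustrative only.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 6))
mu = np.exp(0.5 * X[:, 0] - 0.3 * X[:, 1])
counts = rng.poisson(mu)
y = np.zeros(200)
pos = counts > 0
y[pos] = rng.gamma(shape=2.0 * counts[pos], scale=0.5)

X_train, y_train, X_test = X[:150], y[:150], X[150:]
y_hat = knn_ensemble_predict(X_train, y_train, X_test)
```

Because every base learner averages observed non-negative targets, the ensemble prediction stays within the range of the training outcomes, which suits a non-negative response. Setting `bag_observations=False` or `feature_frac=1.0` switches off one ingredient at a time, which is one way to probe how much each contributes.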

Taxonomy

Topics: Topological and Geometric Data Analysis · Sparse and Compressive Sensing Techniques · Domain Adaptation and Few-Shot Learning