Gradient Boosting Mapping for Dimensionality Reduction and Feature Extraction
Anri Patron, Ayush Prasad, Hoang Phuc Hau Luu, Kai Puolamäki

TL;DR
This paper introduces Gradient Boosting Mapping (GBMAP), a supervised dimensionality reduction technique that creates effective features and distance measures, improving model interpretability, reducing overfitting, and detecting out-of-distribution data efficiently.
Contribution
GBMAP is a novel supervised dimensionality reduction method using weak learners to generate embeddings that enhance learning performance and enable principled distance measurement.
Findings
GBMAP produces features that improve supervised learning performance.
It enables detection of out-of-distribution data points.
The method is fast, scalable, and competitive with state-of-the-art models.
Abstract
A fundamental problem in supervised learning is to find a good set of features or distance measures. If the new features are of lower dimensionality and can be obtained by a simple transformation of the original data, they can make the model understandable, reduce overfitting, and even help to detect distribution drift. We propose a supervised dimensionality reduction method, Gradient Boosting Mapping (GBMAP), in which the outputs of weak learners, defined as one-layer perceptrons, form the embedding. We show that the embedding coordinates provide better features for the supervised learning task, making simple linear models competitive with state-of-the-art regressors and classifiers. We also use the embedding to find a principled distance measure between points. The features and distance measures automatically ignore directions irrelevant to the supervised learning task. We…
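The idea in the abstract, boosted one-layer perceptrons whose outputs serve as embedding coordinates for a simple linear model, can be illustrated with a minimal sketch. This is not the authors' implementation: the tanh activation, the squared loss, the plain gradient-descent fitting, and the names `fit_weak_learner`, `fit_boosted_embedding`, and `embed` are all assumptions made for illustration.

```python
import numpy as np

def fit_weak_learner(X, r, rng, steps=200, lr=0.05):
    """Fit h(x) = a * tanh(x @ w + b) to residuals r (assumed form, squared loss)."""
    n, d = X.shape
    w = rng.normal(scale=0.1, size=d)
    b, a = 0.0, 0.1
    for _ in range(steps):
        z = np.tanh(X @ w + b)
        err = a * z - r                  # error of this weak learner on the residuals
        dz = err * a * (1.0 - z ** 2)    # chain rule through tanh
        a -= lr * np.mean(err * z)
        w -= lr * (X.T @ dz) / n
        b -= lr * np.mean(dz)
    return w, b, a

def fit_boosted_embedding(X, y, n_learners=3, seed=0):
    """Fit weak learners sequentially, each on the residual left by the previous ones."""
    rng = np.random.default_rng(seed)
    residual = y - y.mean()
    learners = []
    for _ in range(n_learners):
        w, b, a = fit_weak_learner(X, residual, rng)
        residual = residual - a * np.tanh(X @ w + b)
        learners.append((w, b))
    return learners

def embed(X, learners):
    """Embedding coordinates: one column per weak learner's perceptron output."""
    return np.column_stack([np.tanh(X @ w + b) for w, b in learners])

# Synthetic data (illustrative only): columns 2 and 3 do not influence y.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]

learners = fit_boosted_embedding(X, y, n_learners=3)
Z = embed(X, learners)                   # 200 points mapped from 4 to 3 dimensions

# Simple linear model (least squares with intercept) on the embedding coordinates.
A = np.column_stack([Z, np.ones(len(Z))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
mse = np.mean((A @ coef - y) ** 2)
```

Because every weak learner is trained against the target's residuals, directions of the input that carry no information about `y` receive little gradient signal, which is the intuition behind the claim that the embedding ignores task-irrelevant directions. Euclidean distance between rows of `Z` would be one natural, though here assumed, way to turn the embedding into a distance measure.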
Taxonomy
Topics: Image Retrieval and Classification Techniques
Methods: Sparse Evolutionary Training
