A First Step Towards Distribution Invariant Regression Metrics

Mario Michael Krell; Bilal Wehbe

arXiv:2009.05176·cs.LG·September 14, 2020·1 cites

Open Access

TL;DR

This paper introduces distribution-invariant regression metrics that adjust for varying data distributions, enabling fairer comparisons across datasets and revealing overfitting in regression models.

Contribution

It proposes a novel weighting scheme using Gaussian kernel density estimators to modify classical regression metrics for distribution invariance.
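The weighting idea can be sketched in a few lines. This listing does not give the paper's exact weighting function, so the sketch below makes an assumption: each sample is weighted by the inverse of a Gaussian kernel density estimate of the target values, and the weights are applied to the mean absolute error. The function names and the fixed `bandwidth` parameter are hypothetical and chosen for illustration only.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a density function built from Gaussian kernels centred on `samples`."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(y):
        return norm * sum(
            math.exp(-0.5 * ((y - s) / bandwidth) ** 2) for s in samples
        )

    return density


def mae(y_true, y_pred):
    """Classical (unweighted) mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def density_weighted_mae(y_true, y_pred, bandwidth=1.0):
    """MAE with inverse-density weights (an assumed weighting, not the paper's exact one)."""
    density = gaussian_kde(y_true, bandwidth)
    # Rare target values get large weights, frequent ones small weights.
    weights = [1.0 / density(t) for t in y_true]
    total = sum(weights)
    return sum(w * abs(t - p) for w, t, p in zip(weights, y_true, y_pred)) / total


# Toy data: eight samples near 0 predicted perfectly, two samples at 10 off by 1.
y_true = [0.0] * 8 + [10.0] * 2
y_pred = [0.0] * 8 + [9.0] * 2

plain = mae(y_true, y_pred)                      # 0.2
weighted = density_weighted_mae(y_true, y_pred)  # approximately 0.5
```

In this toy case the plain MAE is dominated by the frequent, well-predicted targets, while the weighted variant gives the dense and the sparse target regions roughly equal influence.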

Findings

1. Classical metrics are sensitive to distribution shifts.
2. Weighted metrics reduce sensitivity to distribution changes.
3. New metrics help compare results across datasets with different distributions.
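The first finding can be illustrated with a small sketch. The model and the two "recording sessions" below are hypothetical: the same fixed model, which underestimates every target by 10%, is evaluated on two sets of target values that differ only in how often large values occur.

```python
def mae(y_true, y_pred):
    """Classical (unweighted) mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def predict(target):
    """Hypothetical fixed model: always underestimates the target by 10%."""
    return 0.9 * target


# Two hypothetical sessions with the same value range but different distributions.
session_a = [1.0, 1.0, 1.0, 1.0, 10.0]    # mostly small targets
session_b = [1.0, 10.0, 10.0, 10.0, 10.0]  # mostly large targets

mae_a = mae(session_a, [predict(t) for t in session_a])
mae_b = mae(session_b, [predict(t) for t in session_b])
```

Here `mae_a` is about 0.28 and `mae_b` about 0.82, although the model and its per-value error are identical in both sessions. The difference comes only from the target distribution, which is why unweighted scores from differently distributed datasets are hard to compare.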

Abstract

Regression evaluation has been performed for decades. Some metrics have been shown to be robust against shifting and scaling of the data, but accounting for different data distributions (the imbalance problem) is much more difficult, even though it strongly affects the comparability of evaluations on different datasets. In classification, it has been stated repeatedly that performance metrics such as the F-measure and accuracy depend heavily on the class distribution and that comparisons between datasets with different distributions are impossible. We show that the same problem exists in regression. For example, the distribution of odometry parameters in robotic applications can vary widely between recording sessions. Here, we need regression algorithms that either perform equally well for all function values, or that focus on certain boundary…

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Imbalanced Data Classification Techniques · Machine Learning and Data Classification