A Systematic Bias of Machine Learning Regression Models and Its Correction: an Application to Imaging-based Brain Age Prediction

Hwiyoung Lee; Shuo Chen

arXiv:2405.15950 · stat.ML · September 5, 2024 · 3 cites

TL;DR

This paper identifies a linear bias common to machine learning regression models, in which outcomes far above the mean are underestimated and outcomes far below it are overestimated. It proposes a correction method and demonstrates it in simulations and in neuroimaging-based brain age prediction.

Contribution

The paper introduces a general constrained optimization approach to correct systematic bias in machine learning regression models, validated through simulations and neuroimaging data.

Findings

01 Bias persists across various models

02 Proposed correction effectively removes bias

03 Unbiased brain age predictions achieved

Abstract

Machine learning models for continuous outcomes often yield systematically biased predictions, particularly for values that largely deviate from the mean. Specifically, predictions for large-valued outcomes tend to be negatively biased (underestimating actual values), while those for small-valued outcomes are positively biased (overestimating actual values). We refer to this linear central tendency warped bias as the "systematic bias of machine learning regression". In this paper, we first demonstrate that this systematic prediction bias persists across various machine learning regression models, and then delve into its theoretical underpinnings. To address this issue, we propose a general constrained optimization approach designed to correct this bias and develop computationally efficient implementation algorithms. Simulation results indicate that our correction method effectively…
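The bias pattern described in the abstract can be reproduced with a small simulation. The sketch below is an illustration only and is not taken from the paper or its repository: the data-generating process, the sample size, and the helper name `bias_slope` are all assumptions. It fits ordinary least squares with NumPy, then regresses the prediction error on the true outcome. For in-sample OLS with an intercept, that slope equals R² − 1, so it is negative whenever the fit is imperfect.

```python
import numpy as np


def bias_slope(y_true, y_pred):
    """Slope of the prediction error (y_pred - y_true) regressed on y_true.

    Hypothetical helper for this illustration; a negative slope means large
    outcomes are underestimated and small outcomes are overestimated.
    """
    error = y_pred - y_true
    slope, _intercept = np.polyfit(y_true, error, deg=1)
    return slope


# Assumed toy data: a linear signal plus Gaussian noise (not the paper's data).
rng = np.random.default_rng(0)
n, p = 2000, 5
X = rng.normal(size=(n, p))
beta = rng.normal(size=p)
y = 50.0 + X @ beta + rng.normal(scale=2.0, size=n)

# Ordinary least squares with an intercept column.
X1 = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
y_hat = X1 @ coef

# In-sample coefficient of determination.
r2 = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

slope = bias_slope(y, y_hat)
print("R^2:", r2)
print("slope of (prediction - truth) on truth:", slope)

# Mean error in the top and bottom deciles of the true outcome.
error = y_hat - y
top = y >= np.quantile(y, 0.9)
bottom = y <= np.quantile(y, 0.1)
print("mean error, largest 10% of outcomes:", error[top].mean())
print("mean error, smallest 10% of outcomes:", error[bottom].mean())
```

Swapping the OLS fit for another regressor and calling `bias_slope` on held-out predictions is one way to check whether the same pattern appears for a given model. The identity with R² − 1 holds only for the in-sample OLS case shown here.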


Code & Models

Repositories

hwiyoungstat/sbmr (Official)


Taxonomy

Topics

Health, Environment, Cognitive Aging · Machine Learning in Healthcare