Improving predictions by nonlinear regression models from outlying input data

William W. Hsieh

arXiv:2003.07926·cs.LG·March 19, 2020·5 cites


Open Access

TL;DR

This study shows that nonlinear regression models perform poorly on outlying environmental data, but a hybrid approach using linear extrapolation improves prediction reliability for outliers.

Contribution

The paper introduces NLR$_{\text{OR}}$, a method combining nonlinear and linear extrapolation to enhance prediction accuracy on outlier data in environmental sciences.

Findings

01

NLR outperforms LR on non-outliers.

02

NLR often underperforms LR on outliers.

03

NLR$_{\text{OR}}$ reduces poor extrapolations and improves outlier predictions.

Abstract

When applying machine learning/statistical methods to the environmental sciences, nonlinear regression (NLR) models often perform only slightly better and occasionally worse than linear regression (LR). The proposed reason for this conundrum is that NLR models can give predictions much worse than LR when given input data that lie outside the domain used in model training. Continuous unbounded variables are widely used in environmental sciences, so it is not uncommon for new input data to lie far outside the training domain. For six environmental datasets, inputs in the test data were classified as "outliers" and "non-outliers" based on the Mahalanobis distance from the training input data. The prediction scores (mean absolute error, Spearman correlation) showed NLR to outperform LR for the non-outliers, but often underperform LR for the outliers. An approach based on Occam's Razor (OR)…
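
The outlier classification described in the abstract can be illustrated with a minimal sketch. This is not the paper's code. The function names `mahalanobis_distances` and `split_outliers`, the synthetic data, and in particular the threshold rule (the 0.99 quantile of the training distances) are assumptions made for illustration; the abstract does not state which cutoff was used.

```python
import numpy as np

def mahalanobis_distances(X_train, X):
    """Mahalanobis distance of each row of X from the training inputs."""
    mean = X_train.mean(axis=0)
    cov = np.atleast_2d(np.cov(X_train, rowvar=False))
    cov_inv = np.linalg.pinv(cov)  # pseudo-inverse guards against a singular covariance
    diff = X - mean
    d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return np.sqrt(np.maximum(d2, 0.0))

def split_outliers(X_train, X_test, quantile=0.99):
    """Flag test inputs lying farther from the training data than a cutoff.

    Assumption: the cutoff is a quantile of the training distances.
    The source does not specify the threshold.
    """
    threshold = np.quantile(mahalanobis_distances(X_train, X_train), quantile)
    d_test = mahalanobis_distances(X_train, X_test)
    return d_test > threshold, d_test, threshold

# Synthetic illustration: one test point near the training mean, one far away.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 3))
X_test = np.vstack([np.zeros((1, 3)), np.full((1, 3), 10.0)])
is_outlier, d_test, threshold = split_outliers(X_train, X_test)
print(is_outlier)
```

With such a mask, prediction scores can be computed separately on the flagged and unflagged test points, which is the kind of comparison between NLR and LR that the abstract reports.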

Taxonomy

Topics: Hydrological Forecasting Using AI · Advanced Statistical Methods and Models · Data Analysis with R

Methods: Linear Regression