Pearson's correlation under the scope: Assessment of the efficiency of Pearson's correlation to select predictor variables for linear models

Mustafa Attallah

arXiv:2409.01295·stat.ME·September 10, 2024

TL;DR

This paper critically evaluates the effectiveness of Pearson's correlation in selecting predictor variables for linear models, demonstrating its limitations through empirical analysis on real datasets.

Contribution

It provides an empirical assessment of how reliable Pearson's correlation is for predictor selection, highlighting its potential pitfalls in linear modeling.

Findings

1. Pearson's correlation can be misleading when used to select predictors.

2. A linear model built on the more highly correlated predictor may still have higher error.

3. A higher correlation value does not always translate into better model performance.

Abstract

This article examines the limitations of Pearson's correlation as a criterion for selecting predictor variables for linear models. Using the mtcars and iris datasets from R, it shows how this correlation measure can fail to identify the better independent variable for modeling miles per gallon (mpg) in mtcars and petal length in iris. For each response variable, the paper reports Pearson's correlation with two candidate predictor variables, then builds a linear model that predicts the response from each predictor. The error metrics of each model are then reported to evaluate how reliable Pearson's correlation is for selecting the best predictor variable. The results show that Pearson's correlation can be deceiving when used to select the predictor variable for a linear model of a dependent variable.
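The procedure the abstract describes (report Pearson's correlation for each candidate predictor, fit a one-predictor linear model, compare error metrics) might be sketched roughly as follows. This is a minimal illustration and not the paper's code: the data are synthetic stand-ins for mtcars and iris, the helper names `pearson_r`, `fit_line`, and `evaluate_predictor` are hypothetical, and the choice of RMSE and MAE on a held-out split is an assumption, since the abstract does not name the metrics or the evaluation setup.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson's correlation coefficient between two 1-D arrays."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

def fit_line(x, y):
    """Least-squares slope and intercept for y ~ intercept + slope * x."""
    slope, intercept = np.polyfit(x, y, deg=1)  # highest degree first
    return float(slope), float(intercept)

def evaluate_predictor(x_train, y_train, x_test, y_test):
    """Pearson's r on the training rows, plus held-out RMSE and MAE."""
    r = pearson_r(x_train, y_train)
    slope, intercept = fit_line(x_train, y_train)
    predictions = intercept + slope * np.asarray(x_test, dtype=float)
    residuals = np.asarray(y_test, dtype=float) - predictions
    rmse = float(np.sqrt(np.mean(residuals ** 2)))
    mae = float(np.mean(np.abs(residuals)))
    return {"r": r, "rmse": rmse, "mae": mae}

# Synthetic data (assumption): two candidate predictors and one response.
rng = np.random.default_rng(0)
n = 60
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 2.0 * x1 - 1.0 * x2 + rng.normal(scale=0.5, size=n)

# Assumed evaluation setup: first 40 rows for fitting, last 20 held out.
train, test = slice(0, 40), slice(40, n)
results = {
    name: evaluate_predictor(x[train], y[train], x[test], y[test])
    for name, x in {"x1": x1, "x2": x2}.items()
}
for name, metrics in results.items():
    print(name, metrics)
```

One design point worth noting: for a one-predictor least-squares fit with an intercept, the in-sample R² equals r², so ranking predictors by |r| and by in-sample squared error gives the same order. Disagreement between the two rankings can only show up on held-out data or under a different error metric, which is why the sketch scores each model on rows it was not fitted to.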

Taxonomy

Topics: Advanced Statistical Methods and Models · Neural Networks and Applications