Inject Machine Learning into Significance Test for Misspecified Linear Models
Jiaye Teng, Yang Yuan

TL;DR
This paper introduces an assumption-free method that integrates machine learning with significance testing to improve linear approximation and significance assessment in both linear and non-linear models.
Contribution
The paper proposes a machine learning-based approach to significance testing that does not rely on a linearity assumption about the ground truth function and that outperforms traditional linear regression when the ground truth is non-linear.
Findings
Outperforms linear regression in non-linear ground truth cases
Provides theoretical guarantees for the estimator's properties
Offers a more reliable significance test in complex models
Abstract
Owing to its strong interpretability, linear regression is widely used in the social sciences, where significance tests provide the significance level of models or coefficients in traditional statistical inference. However, linear regression relies on the assumption that the ground truth function is linear, which does not necessarily hold in practice. As a result, even in simple non-linear cases, linear regression may fail to report the correct significance level. In this paper, we present a simple and effective assumption-free method for linear approximation in both linear and non-linear scenarios. First, we apply a machine learning method to fit the ground truth function on the training set and calculate its linear approximation. Afterward, we obtain the estimator by adding adjustments based on the validation set. We prove the concentration inequalities and asymptotic properties of…
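The two-step procedure described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the "machine learning method" is a hypothetical stand-in (least squares on polynomial features), the linear approximation is computed over a pool of unlabeled reference covariates, and the validation "adjustment" is a residual-based correction of the kind used in debiased estimators. The paper's actual estimator and its significance test may differ.

```python
import numpy as np


def with_intercept(X):
    # Prepend a column of ones so the linear fit includes an intercept.
    return np.hstack([np.ones((X.shape[0], 1)), X])


def fit_ml(X_train, y_train):
    # Hypothetical stand-in for a machine learning model:
    # least squares on intercept, linear, and squared features.
    def features(X):
        return np.hstack([np.ones((X.shape[0], 1)), X, X ** 2])

    coef, *_ = np.linalg.lstsq(features(X_train), y_train, rcond=None)
    return lambda X: features(X) @ coef


def linear_approximation(f_hat, X_ref):
    # Best linear fit to the fitted model f_hat over reference covariates.
    beta, *_ = np.linalg.lstsq(with_intercept(X_ref), f_hat(X_ref), rcond=None)
    return beta


def adjusted_estimator(f_hat, X_ref, X_val, y_val):
    # Assumed adjustment (not the paper's formula): project the validation
    # residuals y - f_hat(x) onto the linear features and add the result
    # to the linear approximation of f_hat.
    A_ref = with_intercept(X_ref)
    A_val = with_intercept(X_val)
    second_moment = A_ref.T @ A_ref / A_ref.shape[0]
    residual = y_val - f_hat(X_val)
    correction = np.linalg.solve(second_moment, A_val.T @ residual / A_val.shape[0])
    return linear_approximation(f_hat, X_ref) + correction


def run_demo(seed=0):
    # Synthetic non-linear ground truth: y = x^2 + 2x + noise, x ~ N(0, 1).
    # Its best linear approximation has intercept 1 and slope 2.
    rng = np.random.default_rng(seed)

    def sample(n):
        X = rng.normal(size=(n, 1))
        y = X[:, 0] ** 2 + 2.0 * X[:, 0] + rng.normal(scale=0.5, size=n)
        return X, y

    X_train, y_train = sample(2000)
    X_val, y_val = sample(2000)
    X_ref = rng.normal(size=(20000, 1))  # unlabeled covariates (assumed available)

    f_hat = fit_ml(X_train, y_train)
    return adjusted_estimator(f_hat, X_ref, X_val, y_val)


if __name__ == "__main__":
    print(run_demo())
```

In this toy setup the returned vector holds the intercept and slope of the linear approximation. The correction term is small when the fitted model is accurate and grows when it is not, which is the intuition behind adjusting on a held-out validation set.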
Taxonomy
Topics: Statistical Methods and Inference · Advanced Statistical Methods and Models · Face and Expression Recognition
Methods: Linear Regression
