Standing on the Shoulders of Machine Learning: Can We Improve Hypothesis Testing?

Gary Cornwall; Jeff Chen; Beau Sauley

arXiv:2103.01368 · econ.EM · March 3, 2021 · 1 citation

TL;DR

This paper integrates modern machine learning classification models into hypothesis testing, enabling more flexible, powerful, and context-aware tests, demonstrated through improved unit root testing in time series econometrics.

Contribution

It introduces the use of machine learning algorithms as mapping functions for hypothesis testing, creating pseudo-composite tests and improving power and accuracy.

Findings

01. Boosted decision stumps recover the full size-power trade-off for a single test statistic.

02. More complex algorithms, such as random forests, allow several test statistics to be evaluated in one test.

03. Applied to unit root testing, the approach improves accuracy by 17% and sensitivity by 36%.
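The first finding can be illustrated with a small simulation. The sketch below is not the authors' code: it fits a single, unboosted decision stump to a Dickey-Fuller-type statistic, and the AR(1) design (series length 100, stationary alternative with coefficient 0.9, 500 series per class), the no-constant regression, and the helper names `simulate_ar1`, `df_statistic`, and `fit_stump` are all assumptions made for illustration. The learned split plays the role of a critical value, and its in-sample false-positive and true-positive rates are the empirical size and power.

```python
import math
import random


def simulate_ar1(rho, length, rng):
    """Simulate y_t = rho * y_{t-1} + e_t with standard normal shocks (assumed design)."""
    y = [0.0]
    for _ in range(length):
        y.append(rho * y[-1] + rng.gauss(0.0, 1.0))
    return y


def df_statistic(y):
    """t-statistic on gamma from regressing diff(y) on lagged y, with no constant."""
    lag = y[:-1]
    diff = [b - a for a, b in zip(y[:-1], y[1:])]
    sxx = sum(v * v for v in lag)
    gamma = sum(v * d for v, d in zip(lag, diff)) / sxx
    resid_ss = sum((d - gamma * v) ** 2 for v, d in zip(lag, diff))
    sigma2 = resid_ss / (len(diff) - 1)
    return gamma / math.sqrt(sigma2 / sxx)


def fit_stump(stats, labels):
    """One-split classifier: predict 'stationary' (label 1) when stat < cutoff.

    Returns the cutoff with the fewest training misclassifications.
    """
    pairs = sorted(zip(stats, labels))
    errors = sum(labels)  # cutoff below every point: all stationary series are missed
    best_errors, best_cutoff = errors, pairs[0][0] - 1.0
    for i, (stat, label) in enumerate(pairs):
        # Moving the cutoff past this point classifies it as 'stationary'.
        errors += -1 if label == 1 else 1
        upper = pairs[i + 1][0] if i + 1 < len(pairs) else stat + 1.0
        if errors < best_errors:
            best_errors, best_cutoff = errors, (stat + upper) / 2.0
    return best_cutoff


rng = random.Random(0)
n_series = 500
null_stats = [df_statistic(simulate_ar1(1.0, 100, rng)) for _ in range(n_series)]
alt_stats = [df_statistic(simulate_ar1(0.9, 100, rng)) for _ in range(n_series)]

cutoff = fit_stump(null_stats + alt_stats, [0] * n_series + [1] * n_series)

# In-sample size: share of unit-root series wrongly classified as stationary.
size = sum(s < cutoff for s in null_stats) / n_series
# In-sample power: share of stationary series correctly classified.
power = sum(s < cutoff for s in alt_stats) / n_series

print(f"cutoff={cutoff:.3f} size={size:.3f} power={power:.3f}")
```

With equal class weights the stump picks one point on the size-power curve; reweighting the two classes would move the cutoff along that curve.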

Abstract

In this paper we update the hypothesis testing framework by drawing upon modern computational power and classification models from machine learning. First, we show that a simple classification algorithm such as a boosted decision stump can recover the full size-power trade-off for any single test statistic. This recovery implies an equivalence, under certain conditions, between the basic building block of modern machine learning and hypothesis testing. Second, we show that more complex algorithms such as the random forest and gradient boosted machine can serve as mapping functions in place of the traditional null distribution. This allows multiple test statistics and other information to be evaluated simultaneously, forming a pseudo-composite hypothesis test. Moreover, we show how practitioners can make explicit the relative costs of Type I and Type II errors to…
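The point about making the relative costs of Type I and Type II errors explicit can be sketched with a toy example. This is an illustration under stated assumptions, not the paper's procedure: the test statistic is standard normal under the null and shifted by 2 under the alternative, the cutoff is chosen by grid search (the same split a class-weighted stump would look for), and the names `simulate_statistics`, `rejection_rate`, and `cost_weighted_cutoff` are hypothetical.

```python
import random


def simulate_statistics(n_draws, shift, seed):
    """Draw a toy test statistic under the null (mean 0) and the alternative (mean shift)."""
    rng = random.Random(seed)
    null_draws = [rng.gauss(0.0, 1.0) for _ in range(n_draws)]
    alt_draws = [rng.gauss(shift, 1.0) for _ in range(n_draws)]
    return null_draws, alt_draws


def rejection_rate(draws, cutoff):
    """Share of draws for which the rule 'reject when statistic > cutoff' rejects."""
    return sum(s > cutoff for s in draws) / len(draws)


def cost_weighted_cutoff(null_draws, alt_draws, cost_type1, cost_type2):
    """Grid-search the cutoff minimizing cost_type1 * size + cost_type2 * (1 - power)."""
    grid = [i / 50.0 for i in range(-100, 251)]  # candidate cutoffs from -2.0 to 5.0

    def expected_cost(cutoff):
        size = rejection_rate(null_draws, cutoff)
        miss = 1.0 - rejection_rate(alt_draws, cutoff)
        return cost_type1 * size + cost_type2 * miss

    return min(grid, key=expected_cost)


null_draws, alt_draws = simulate_statistics(2000, 2.0, seed=0)

# Each row: (Type I cost, chosen cutoff, empirical size, empirical power).
curve = []
for cost_type1 in (1.0, 5.0, 20.0):
    cutoff = cost_weighted_cutoff(null_draws, alt_draws, cost_type1, 1.0)
    curve.append(
        (cost_type1, cutoff,
         rejection_rate(null_draws, cutoff),
         rejection_rate(alt_draws, cutoff))
    )

for cost_type1, cutoff, size, power in curve:
    print(f"cost_type1={cost_type1:>4} cutoff={cutoff:.2f} size={size:.3f} power={power:.3f}")
```

Raising the Type I cost pushes the cutoff outward, trading power for a smaller size, so sweeping the cost ratio traces out points on the size-power curve instead of fixing a conventional significance level in advance.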

Code & Models

Repositories

DataScienceForPublicPolicy/hypML
Official

Taxonomy

Topics: Forecasting Techniques and Applications · Statistical Methods and Inference · Statistical Mechanics and Entropy