# Goodness-of-fit testing in high-dimensional generalized linear models

**Authors:** Jana Janková, Rajen D. Shah, Peter Bühlmann, Richard J. Samworth

arXiv: 1908.03606 · 2019-11-14

## TL;DR

This paper introduces a flexible family of goodness-of-fit tests for high-dimensional generalized linear models. The tests look for signal left over in the residuals of an initial fit, using machine learning methods to predict it, and control type I error asymptotically.

## Contribution

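It presents a testing framework that can serve as an omnibus goodness-of-fit test, or be directed at specific non-linearities, interaction effects, or the significance of groups of variables. The test statistic has a Gaussian limiting distribution under the null hypothesis, and the paper establishes a power guarantee under a local alternative.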

## Key findings

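- Detects model misspecification in simulated logistic regression examples
- Controls type I error asymptotically under the null hypothesis that the generalized linear model is correct
- Comes with a power guarantee under a local alternative
- Demonstrates utility on real data examples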

## Abstract

We propose a family of tests to assess the goodness-of-fit of a high-dimensional generalized linear model. Our framework is flexible and may be used to construct an omnibus test or directed against testing specific non-linearities and interaction effects, or for testing the significance of groups of variables. The methodology is based on extracting left-over signal in the residuals from an initial fit of a generalized linear model. This can be achieved by predicting this signal from the residuals using modern flexible regression or machine learning methods such as random forests or boosted trees. Under the null hypothesis that the generalized linear model is correct, no signal is left in the residuals and our test statistic has a Gaussian limiting distribution, translating to asymptotic control of type I error. Under a local alternative, we establish a guarantee on the power of the test. We illustrate the effectiveness of the methodology on simulated and real data examples by testing goodness-of-fit in logistic regression models. Software implementing the methodology is available in the R package `GRPtests`.
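
The idea in the abstract (fit the model, then check whether anything can still be predicted from the residuals) can be illustrated with a deliberately simplified, low-dimensional sketch. It is not the `GRPtests` implementation and does not attempt the paper's high-dimensional setting. It assumes a single covariate, an unpenalized logistic fit, sample splitting, and a binned-mean regressor standing in for a random forest. All function names are hypothetical. The statistic is a classical score-type statistic for the learned direction, which is approximately standard normal under the null in this low-dimensional setting.

```python
import math
import random


def sigmoid(z):
    # Clamp to avoid overflow in exp for extreme linear predictors.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def solve_2x2(a11, a12, a22, b1, b2):
    # Solve the symmetric system [[a11, a12], [a12, a22]] u = [b1, b2].
    det = a11 * a22 - a12 * a12
    return (a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det


def fit_logistic(x, y, iters=25):
    # Newton-Raphson for logit P(y = 1) = b0 + b1 * x (assumed toy model).
    b0 = b1 = 0.0
    for _ in range(iters):
        p = [sigmoid(b0 + b1 * xi) for xi in x]
        d = [pi * (1.0 - pi) for pi in p]
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum(xi * (yi - pi) for xi, yi, pi in zip(x, y, p))
        h00 = sum(d)
        h01 = sum(di * xi for di, xi in zip(d, x))
        h11 = sum(di * xi * xi for di, xi in zip(d, x))
        s0, s1 = solve_2x2(h00, h01, h11, g0, g1)
        b0, b1 = b0 + s0, b1 + s1
    return b0, b1


def fit_binned_mean(x, r, lo=-3.0, hi=3.0, bins=12):
    # Stand-in for a flexible learner: average residual within each x-bin.
    width = (hi - lo) / bins
    sums, counts = [0.0] * bins, [0] * bins

    def index(xi):
        return min(bins - 1, max(0, int((xi - lo) / width)))

    for xi, ri in zip(x, r):
        sums[index(xi)] += ri
        counts[index(xi)] += 1
    means = [s / c if c else 0.0 for s, c in zip(sums, counts)]
    return lambda xi: means[index(xi)]


def residual_prediction_test(x, y):
    half = len(x) // 2
    xa, ya, xb, yb = x[:half], y[:half], x[half:], y[half:]

    # Auxiliary half: fit the GLM and learn to predict its residuals.
    a0, a1 = fit_logistic(xa, ya)
    ra = [yi - sigmoid(a0 + a1 * xi) for xi, yi in zip(xa, ya)]
    predict = fit_binned_mean(xa, ra)

    # Test half: refit, then correlate residuals with the learned direction.
    b0, b1 = fit_logistic(xb, yb)
    p = [sigmoid(b0 + b1 * xi) for xi in xb]
    r = [yi - pi for yi, pi in zip(yb, p)]
    d = [pi * (1.0 - pi) for pi in p]
    w = [predict(xi) for xi in xb]

    # Weighted projection of w off the fitted design (intercept and x),
    # which accounts for the parameters having been estimated.
    c0, c1 = solve_2x2(
        sum(d),
        sum(di * xi for di, xi in zip(d, xb)),
        sum(di * xi * xi for di, xi in zip(d, xb)),
        sum(di * wi for di, wi in zip(d, w)),
        sum(di * xi * wi for di, xi, wi in zip(d, xb, w)),
    )
    w_t = [wi - c0 - c1 * xi for wi, xi in zip(w, xb)]
    var = sum(wt * wt * di for wt, di in zip(w_t, d))
    if var <= 0.0:
        return 0.0, 0.5
    t = sum(wt * ri for wt, ri in zip(w_t, r)) / math.sqrt(var)
    # One-sided p-value from the standard normal upper tail.
    return t, 0.5 * math.erfc(t / math.sqrt(2.0))


def simulate(n, quad, seed):
    # quad != 0 adds a quadratic effect that the linear logistic fit misses.
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [1 if rng.random() < sigmoid(-1.0 + 0.5 * xi + quad * xi * xi) else 0
         for xi in x]
    return x, y


if __name__ == "__main__":
    for quad in (0.0, 1.0):
        t, p_value = residual_prediction_test(*simulate(2000, quad, seed=1))
        print(f"quad={quad}: statistic={t:.2f}, p-value={p_value:.3g}")
```

When a strong quadratic effect is missing from the fitted model, the statistic should be large and the one-sided p-value small. When the model is correctly specified, the statistic should behave roughly like a standard normal draw.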

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1908.03606/full.md

## Figures

21 figures with captions in the complete paper: https://tomesphere.com/paper/1908.03606/full.md

## References

42 references — full list in the complete paper: https://tomesphere.com/paper/1908.03606/full.md

---
Source: https://tomesphere.com/paper/1908.03606