A Note on High Dimensional Linear Regression with Interactions
Ning Hao, Hao Helen Zhang

TL;DR
This paper clarifies fundamental issues in high-dimensional linear regression with interactions, evaluates two-stage methods, and proposes new strategies for interaction selection under the marginality principle.
Contribution
It provides theoretical justification for two-stage interaction selection methods and introduces new strategies aligned with the marginality principle.
Findings
Two-stage methods can be theoretically justified in high-dimensional settings.
The counterexample of Turlach (2004) is revisited to support two-stage approaches.
New strategies for interaction selection are proposed based on the marginality principle.
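To make the two-stage idea concrete, here is a minimal sketch. It is not the authors' procedure: it swaps in plain marginal-correlation screening where the literature would typically use a penalized fit, and the function names (`corr`, `two_stage_screen`), the simulated data, and the cutoff `k` are all hypothetical choices made for illustration. Stage 1 keeps the `k` main effects most correlated with the response; stage 2 considers only interactions between kept main effects, which is the restriction that strong heredity imposes.

```python
import random
from itertools import combinations


def corr(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    s_ab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    s_aa = sum((x - mean_a) ** 2 for x in a)
    s_bb = sum((y - mean_b) ** 2 for y in b)
    return s_ab / (s_aa * s_bb) ** 0.5


def two_stage_screen(X, y, k):
    """X is a list of p predictor columns, each of length n (hypothetical layout)."""
    p = len(X)
    # Stage 1: rank main effects by absolute marginal correlation with y.
    ranked = sorted(range(p), key=lambda j: -abs(corr(X[j], y)))
    main = sorted(ranked[:k])
    # Stage 2: score only products of kept main effects (strong heredity).
    scores = {}
    for i, j in combinations(main, 2):
        product = [a * b for a, b in zip(X[i], X[j])]
        scores[(i, j)] = abs(corr(product, y))
    best_pair = max(scores, key=scores.get)
    return main, best_pair, scores


# Simulated data (an assumption for illustration): only x0, x1 and x0*x1 matter.
random.seed(0)
n, p = 400, 8
X = [[random.gauss(0, 1) for _ in range(n)] for _ in range(p)]
y = [
    2 * X[0][i] + 2 * X[1][i] + 3 * X[0][i] * X[1][i] + random.gauss(0, 0.5)
    for i in range(n)
]

main, best_pair, scores = two_stage_screen(X, y, k=3)
print("kept main effects:", main)
print("top interaction:", best_pair)
```

With `k` kept main effects, stage 2 examines only k(k-1)/2 candidate interactions instead of p(p-1)/2, which is the computational appeal of two-stage methods when p is large.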
Abstract
The problem of interaction selection has recently attracted much attention in high dimensional data analysis. This note aims to address and clarify several fundamental issues in interaction selection for linear regression models, especially when the input dimension p is much larger than the sample size n. We first discuss issues such as a valid way of defining importance for the main effects and interaction effects, the invariance principle, and the strong heredity condition. Then we focus on two-stage methods, which are computationally attractive for large p problems but regarded as heuristic in the literature. We revisit the counterexample of Turlach (2004) and provide new insight to justify two-stage methods from a theoretical perspective. Finally, we suggest some new strategies for interaction selection under the marginality principle, followed by a numerical example.
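The role of invariance can be seen in a standard two-predictor illustration. This is a textbook-style derivation, not taken from the paper's text, and the symbols $\beta_0, \beta_1, \beta_2, \beta_{12}$ and the shift $c$ are notation assumed here. Recoding $x_1$ by a location shift changes the main-effect coefficient of $x_2$ but leaves the interaction coefficient unchanged:

```latex
\begin{align*}
y &= \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2 + \varepsilon,
  \qquad x_1 = \tilde{x}_1 + c, \\
y &= (\beta_0 + \beta_1 c) + \beta_1 \tilde{x}_1
   + (\beta_2 + \beta_{12} c)\, x_2 + \beta_{12}\, \tilde{x}_1 x_2 + \varepsilon .
\end{align*}
```

When $\beta_{12} \neq 0$, whether the coefficient of $x_2$ is zero depends on the arbitrary choice of $c$, so a statement such as "$x_2$ has no main effect" is not invariant to recoding. This is one common motivation for keeping both parent main effects in the model whenever their interaction is included, as the strong heredity condition requires.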
Taxonomy
Topics: Statistical Methods and Inference · Control Systems and Identification · Advanced Statistical Methods and Models
