Robust linear regression under latent group heterogeneity
Xifeng Li, Shuzhen Yang

TL;DR
This paper introduces a robust linear regression method that accounts for mean uncertainty in the random intercept and variance uncertainty in the error term, without requiring prior group information, and improves estimation accuracy on heterogeneous data.
Contribution
It proposes Expectation-Maximization with Moving Block (EMMB), a novel two-step estimation approach inspired by sublinear expectation theory, for modeling latent heterogeneity in linear regression.
Findings
The method outperforms ordinary least squares in simulations.
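In a real-data application to PM2.5 concentrations in Beijing, it captures substantial intercept heterogeneity that ordinary least squares overlooks.
Theoretical properties of the estimators are established under mild regularity conditions.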
Abstract
Uncertainty is ubiquitous in real-world data, and the assumptions underlying classical linear regression models are often violated in practice. Inspired by the theory of sublinear expectation, we consider a linear regression model where the random intercept term has mean uncertainty and the error term has variance uncertainty. We develop a novel two-step approach, named Expectation-Maximization with Moving Block (EMMB), to estimate the model parameters. The proposed method requires no prior knowledge of group structures or change points. Theoretical properties of the estimators are established under mild regularity conditions. Simulation studies and a real-data application to PM2.5 concentration modeling in Beijing demonstrate the superiority of the proposed method: it captures substantial intercept heterogeneity overlooked by ordinary least squares and yields more accurate and…
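To make the setting concrete, here is a minimal Python sketch of data with latent intercept and variance heterogeneity, followed by a plain OLS fit and a simple moving-block summary of the intercept range. This is not the authors' EMMB procedure: the block layout, the intercept values (-1 and 1), the noise levels (0.5 and 1.5), the window length, and the helper names `simulate`, `ols`, and `moving_block_range` are all assumptions made for illustration.

```python
import numpy as np

def simulate(n_blocks=40, block_len=50, beta=2.0, seed=0):
    """Hypothetical data: intercept mean and noise sd change by hidden block."""
    rng = np.random.default_rng(seed)
    mus = rng.choice([-1.0, 1.0], size=n_blocks)   # latent intercept means (assumed)
    sds = rng.choice([0.5, 1.5], size=n_blocks)    # latent noise sds (assumed)
    mu = np.repeat(mus, block_len)
    sd = np.repeat(sds, block_len)
    x = rng.normal(size=mu.size)
    y = mu + beta * x + sd * rng.normal(size=mu.size)
    return x, y

def ols(x, y):
    """Ordinary least squares with a single intercept; returns (intercept, slope)."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def moving_block_range(values, window):
    """Min and max of the means over all overlapping windows of the given length."""
    c = np.concatenate([[0.0], np.cumsum(values)])
    means = (c[window:] - c[:-window]) / window
    return means.min(), means.max()

x, y = simulate()
a_hat, b_hat = ols(x, y)

# Remove the fitted slope effect; what remains is intercept plus noise.
partial = y - b_hat * x
lo, hi = moving_block_range(partial, window=50)

print(f"OLS intercept: {a_hat:.3f}, slope: {b_hat:.3f}")
print(f"moving-block intercept range: [{lo:.3f}, {hi:.3f}]")
```

OLS reports one intercept, which is the average of `partial`, while the moving-block range indicates how far the local intercept level drifts around that average. The window length is a tuning choice in this sketch: short windows inflate the range with noise, and long windows smooth over real shifts.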
