Detecting confounding in multivariate linear models via spectral analysis
Dominik Janzing, Bernhard Schoelkopf

TL;DR
This paper introduces a spectral method that detects and quantifies confounding in multivariate linear models, using concentration-of-measure results for high dimensions and the orientation of the regression vector relative to the eigenspaces of the predictors' covariance matrix.
Contribution
The paper presents a spectral approach for detecting and quantifying confounding in multivariate linear models, based on concentration-of-measure results for large dimension d.
Findings
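Without confounding, the vector of regression coefficients typically has generic orientation relative to the eigenspaces of the covariance matrix of X; confounding distorts this orientation.
For a scalar confounder, the distortion can be used to estimate the amount of confounding quantitatively.
The distortion follows a characteristic spectral pattern that indicates the presence of confounding.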
Abstract
We study a model where one target variable Y is correlated with a vector X:=(X_1,...,X_d) of predictor variables being potential causes of Y. We describe a method that infers to what extent the statistical dependences between X and Y are due to the influence of X on Y and to what extent due to a hidden common cause (confounder) of X and Y. The method relies on concentration of measure results for large dimensions d and an independence assumption stating that, in the absence of confounding, the vector of regression coefficients describing the influence of each X_j on Y typically has 'generic orientation' relative to the eigenspaces of the covariance matrix of X. For the special case of a scalar confounder we show that confounding typically spoils this generic orientation in a characteristic way that can be used to quantitatively estimate the amount of confounding.
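The idea of orientation relative to the eigenspaces can be made concrete with a small population-level sketch. It is an illustration under stated assumptions, not the authors' estimator: it assumes a linear model X = E + bZ, Y = a^T X + cZ + noise with a unit-variance scalar confounder Z independent of E. The names `simulate`, `spectral_weights`, `cov_E`, `a`, `b`, and `c` are hypothetical choices made for this example. The sketch computes the regression vector from the population covariances and decomposes it over the eigenvectors of the covariance matrix of X.

```python
import numpy as np


def spectral_weights(cov, vec):
    """Eigenvalues of cov and the normalized squared overlaps of vec
    with the corresponding eigenvectors (the weights sum to 1)."""
    eigvals, eigvecs = np.linalg.eigh(cov)  # ascending eigenvalues, orthonormal columns
    overlaps = eigvecs.T @ vec
    weights = overlaps**2 / np.dot(vec, vec)
    return eigvals, weights


def simulate(d=200, c=1.0, seed=0):
    """Population covariances for an assumed toy model (not taken from the paper):
    X = E + b*Z,  Y = a^T X + c*Z + noise,  Var(Z) = 1,  Z independent of E."""
    rng = np.random.default_rng(seed)
    G = rng.standard_normal((d, 2 * d))
    cov_E = G @ G.T / (2 * d)                # covariance of the unconfounded part E
    a = rng.standard_normal(d) / np.sqrt(d)  # causal coefficients, random direction
    b = rng.standard_normal(d) / np.sqrt(d)  # effect of the confounder Z on X
    cov_X = cov_E + np.outer(b, b)           # Cov(X) = Cov(E) + b b^T
    cov_XY = cov_X @ a + c * b               # Cov(X, Y) = Cov(X) a + c b
    a_hat = np.linalg.solve(cov_X, cov_XY)   # regression vector Cov(X)^{-1} Cov(X, Y)
    return cov_X, a, b, a_hat


if __name__ == "__main__":
    for c in (0.0, 1.0, 3.0):
        cov_X, a, b, a_hat = simulate(c=c)
        lam, w = spectral_weights(cov_X, a_hat)
        # Weighted mean eigenvalue under the spectral weights of a_hat,
        # compared with the average eigenvalue (normalized trace).
        print(c, float(np.sum(lam * w)), float(lam.mean()))
```

In this toy model the regression vector equals a + c Cov(X)^{-1} b. Since Cov(X)^{-1} rescales each eigen-direction by the inverse of its eigenvalue, the confounding term contributes weights that depend on the eigenvalues, whereas a randomly drawn a has no preferred eigen-direction. Comparing the printed weighted mean eigenvalue with the normalized trace for different c is one rough way to look at this effect.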
Taxonomy
Topics: Statistical Methods and Inference · Random Matrices and Applications · Markov Chains and Monte Carlo Methods
