Interpretation of High-Dimensional Linear Regression: Effects of Nullspace and Regularization Demonstrated on Battery Data
Joachim Schaeffer, Eric Lenz, William C. Chueh, Martin Z. Bazant, Rolf Findeisen, Richard D. Braatz

TL;DR
This paper explores how nullspace and regularization affect interpretability in high-dimensional linear regression, especially in chemical and biological data, and proposes methods to improve understanding of model coefficients.
Contribution
It introduces a nullspace-based optimization approach to compare regression coefficients with physical knowledge, enhancing interpretability in high-dimensional settings.
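One simple way to picture a nullspace-based comparison is to split the difference between two coefficient vectors into a part that lies in the nullspace (and so cannot change predictions) and a remainder. The sketch below is an illustration on random data with an exact nullspace, not the paper's optimization formulation; `X`, `w_reg`, and `w_phys` are hypothetical stand-ins for the data matrix, the regression coefficients, and the physics-based coefficients.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: more features than samples, so X has a nontrivial nullspace.
n_samples, n_features = 20, 50
X = rng.standard_normal((n_samples, n_features))

# Hypothetical coefficient vectors to compare.
w_reg = rng.standard_normal(n_features)   # stand-in for regression coefficients
w_phys = rng.standard_normal(n_features)  # stand-in for physics-based coefficients

# Orthonormal basis of the nullspace of X, taken from the SVD.
_, s, Vt = np.linalg.svd(X, full_matrices=True)
rank = int(np.sum(s > 1e-10 * s[0]))
N = Vt[rank:].T  # columns span {w : X w = 0}

# Split the coefficient difference into nullspace and row-space parts.
diff = w_phys - w_reg
diff_null = N @ (N.T @ diff)   # changes no prediction on X
diff_row = diff - diff_null    # the part that does change predictions

# Moving w_reg along the nullspace brings it closer to w_phys
# without changing its predictions on X.
w_reg_shifted = w_reg + diff_null

print("norm of nullspace part:", np.linalg.norm(diff_null))
print("norm of row-space part:", np.linalg.norm(diff_row))
```

If most of the difference sits in `diff_null`, the two coefficient vectors disagree mainly in directions the data cannot distinguish. Real measurements of smooth processes typically have small but nonzero singular values rather than an exact nullspace, which is one reason the paper speaks of differences being "close to" the nullspace.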
Findings
Regularization and z-scoring are design choices; when they are chosen to match prior physical knowledge, they improve interpretability.
The interplay between the nullspace and regularization can hinder or help interpretability, depending on design choices.
Methods such as the fused lasso, whose coefficients are not constrained to be orthogonal to the nullspace, can enhance interpretability.
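To make the point about z-scoring as a design choice concrete, the following sketch fits ridge regression twice on the same synthetic smooth curves, once on centered features and once on z-scored features, and maps both coefficient vectors back to the original feature units. The data, the helper `ridge`, and the value of `lam` are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical smooth data: each row is a bump-shaped curve sampled at 100 points.
n_samples, n_features = 30, 100
t = np.linspace(0.0, 1.0, n_features)
amplitudes = rng.uniform(0.5, 1.5, size=(n_samples, 1))
shifts = rng.uniform(-0.1, 0.1, size=(n_samples, 1))
X = amplitudes * np.exp(-((t - 0.5 - shifts) ** 2) / 0.1)

# Hypothetical response: a linear functional of the curves plus small noise.
y = X @ np.sin(2 * np.pi * t) / n_features + 0.01 * rng.standard_normal(n_samples)


def ridge(X, y, lam):
    """Closed-form ridge regression for centered data (no intercept term)."""
    A = X.T @ X + lam * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ y)


Xc = X - X.mean(axis=0)
yc = y - y.mean()
std = Xc.std(axis=0)

lam = 1e-2
w_centered = ridge(Xc, yc, lam)
# Fit on z-scored features, then divide by std to return to original units.
w_zscored = ridge(Xc / std, yc, lam) / std

print("max |w_centered|:", np.abs(w_centered).max())
print("max |w_zscored| :", np.abs(w_zscored).max())
```

In original units, the z-scored fit is equivalent to a ridge penalty weighted by each feature's variance, so low-variance regions of the curve are penalized less. The two coefficient vectors therefore differ in shape even though the data and `lam` are the same.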
Abstract
High-dimensional linear regression is important in many scientific fields. This article considers discrete measured data of underlying smooth latent processes, as is often obtained from chemical or biological systems. Interpretation in high dimensions is challenging because the nullspace and its interplay with regularization shape the regression coefficients. The data's nullspace contains all coefficients $\mathbf{w}$ that satisfy $\mathbf{X}\mathbf{w} = \mathbf{0}$, where $\mathbf{X}$ is the data matrix, thus allowing very different coefficients to yield identical predictions. We developed an optimization formulation to compare regression coefficients with coefficients obtained from physical engineering knowledge, in order to understand which part of the coefficient differences is close to the nullspace. This nullspace method is tested on a synthetic example and lithium-ion battery data. The case studies show that regularization and z-scoring are design choices…
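The claim that very different coefficients can yield identical predictions follows in one line. Writing $\mathbf{X}$ for the data matrix, $\mathbf{w}$ for any coefficient vector, and $\mathbf{v}$ for any vector in the nullspace (symbols chosen here for illustration):

```latex
\mathbf{X}\mathbf{v} = \mathbf{0}
\quad\Longrightarrow\quad
\mathbf{X}(\mathbf{w} + \mathbf{v})
  = \mathbf{X}\mathbf{w} + \mathbf{X}\mathbf{v}
  = \mathbf{X}\mathbf{w}.
```

Since $\mathbf{v}$ can be arbitrarily large, $\mathbf{w}$ and $\mathbf{w} + \mathbf{v}$ may look nothing alike while giving the same predictions on the measured data.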
Taxonomy
Topics: Machine Learning in Materials Science · Fault Detection and Control Systems · Computational Drug Discovery Methods
Methods: Linear Regression
