A Linear Approach to Data Poisoning
Donald Flynn, Diego Granziol

TL;DR
This paper develops a theoretical framework using random matrix theory to analyze the effects of data poisoning attacks on high-dimensional linear models, providing explicit formulas for the impact of poisoning strategies.
Contribution
It introduces a closed-form analytical approach to quantify data poisoning effects in high-dimensional ridge regression models, linking poisoning strength, regularization, and overparameterization.
Findings
Derives explicit formulas for poisoned scores in high-dimensional ridge regression.
Recovers the interpolation threshold in the ridgeless limit as the ratio of features to samples, \(c = p/n\), approaches one.
Shows that model weights align with the poisoning direction under attack.
Abstract
Backdoor and data-poisoning attacks can flip predictions with tiny training corruptions, yet a sharp theory linking poisoning strength, overparameterization, and regularization is lacking. We analyze ridge least squares with an unpenalized intercept in the high-dimensional regime \(p,n\to\infty\), \(p/n\to c\). Targeted poisoning is modelled by shifting a \(\theta\)-fraction of one class by a direction \(\mathbf{v}\) and relabelling. Using resolvent techniques and deterministic equivalents from random matrix theory, we derive closed-form limits for the poisoned score explicit in the model parameters. The formulas yield scaling laws, recover the interpolation threshold as \(c\to1\) in the ridgeless limit, and show that the weights align with the poisoning direction. Synthetic experiments match theory across sweeps of the parameters and MNIST backdoor tests show qualitatively consistent…
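The setup described in the abstract can be illustrated with a small finite-sample simulation. The following is a minimal sketch, not the authors' code. It assumes a two-class isotropic Gaussian mixture with labels ±1, a class mean `mu` orthogonal to the poisoning direction `v`, and a ridge objective normalised by 1/n. These choices, the specific values of n, p, θ, and λ, and all function names are illustrative assumptions that do not come from the paper.

```python
import numpy as np

def make_data(n, p, mu, rng):
    # Assumed data model: labels are +/-1, features are y * mu + standard Gaussian noise.
    y = rng.choice([-1.0, 1.0], size=n)
    X = y[:, None] * mu + rng.standard_normal((n, p))
    return X, y

def poison(X, y, v, theta, rng, source=-1.0):
    # Shift a theta-fraction of the source class by v and relabel it to the other class.
    Xp, yp = X.copy(), y.copy()
    idx = np.flatnonzero(y == source)
    k = int(round(theta * idx.size))
    chosen = rng.choice(idx, size=k, replace=False)
    Xp[chosen] += v
    yp[chosen] = -source
    return Xp, yp

def ridge_fit(X, y, lam):
    # Ridge least squares with an unpenalized intercept:
    # centering X and y removes the intercept from the penalized problem.
    n, p = X.shape
    x_bar, y_bar = X.mean(axis=0), y.mean()
    Xc, yc = X - x_bar, y - y_bar
    w = np.linalg.solve(Xc.T @ Xc / n + lam * np.eye(p), Xc.T @ yc / n)
    b = y_bar - x_bar @ w
    return w, b

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)
n, p, theta, lam = 2000, 200, 0.2, 0.1   # illustrative values, c = p/n = 0.1

mu = np.zeros(p); mu[0] = 1.0            # class mean along the first coordinate
v = np.zeros(p); v[1] = 4.0              # poisoning direction, orthogonal to mu

X, y = make_data(n, p, mu, rng)
Xp, yp = poison(X, y, v, theta, rng)

w_clean, b_clean = ridge_fit(X, y, lam)
w_pois, b_pois = ridge_fit(Xp, yp, lam)

# Alignment of the learned weights with the poisoning direction.
cos_clean = cosine(w_clean, v)
cos_pois = cosine(w_pois, v)

# Change in score when the trigger v is added to any test point: f(x + v) - f(x) = w . v
score_shift = float(w_pois @ v)

print(f"cos(w, v): clean={cos_clean:.3f}, poisoned={cos_pois:.3f}")
print(f"score shift from trigger under poisoned model: {score_shift:.3f}")
```

In this sketch the cosine between the fitted weights and `v` is near zero for the clean model and clearly positive for the poisoned one, which is the qualitative alignment effect the abstract describes. Sweeping `theta`, `lam`, or the ratio `p / n` in the same script gives a rough empirical counterpart to the parameter sweeps mentioned there, although it says nothing about the exact closed-form limits.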
