Non-omniscient backdoor injection with one poison sample: Proving the one-poison hypothesis for linear regression, linear classification, and 2-layer ReLU neural networks

Thorsten Peinemann; Paula Arnold; Sebastian Berndt; Thomas Eisenbarth; Esfandiar Mohammadi

arXiv:2508.05600·cs.LG·January 6, 2026

TL;DR

This paper proves that a single poisoned data point suffices to inject a backdoor into linear regression, linear classification, and 2-layer ReLU neural networks, with limited impact on benign task performance.

Contribution

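The paper formulates the one-poison hypothesis and proves it for linear regression, linear classification, and 2-layer ReLU neural networks: one carefully crafted sample, built with limited background knowledge, is enough to inject a backdoor. The proofs are accompanied by experimental validation.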

Findings

01

A single poison sample can inject a backdoor with zero backdooring-error.

02

When the poison sample lies along a direction that the benign data does not use, the poisoned model behaves like a clean model on the benign task.

03

Impact on benign task performance remains limited in most cases.
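
The "unused direction" finding can be illustrated with a toy experiment. The following is a minimal sketch, not the paper's construction or proof setting: it assumes unregularized minimum-norm least squares, benign data whose last feature is always zero, and hypothetical names (`trigger`, `target`, `benign_gap`, `backdoor_shift`) chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Benign data: the last coordinate is always zero, so it is an
# "unused direction" (assumption made for this toy example).
n, d = 200, 5
X = rng.normal(size=(n, d))
X[:, -1] = 0.0
w_true = np.array([1.0, -2.0, 0.5, 3.0, 0.0])
y = X @ w_true + 0.01 * rng.normal(size=n)

# One poison sample that lies entirely along the unused direction,
# labelled with the attacker's target value (hypothetical choice).
trigger = np.zeros(d)
trigger[-1] = 1.0
target = 10.0
X_poisoned = np.vstack([X, trigger])
y_poisoned = np.append(y, target)

# Minimum-norm least-squares fits on clean and poisoned data.
w_clean = np.linalg.lstsq(X, y, rcond=None)[0]
w_poisoned = np.linalg.lstsq(X_poisoned, y_poisoned, rcond=None)[0]

# A fresh benign input (no component along the unused direction).
x_test = rng.normal(size=d)
x_test[-1] = 0.0

# Difference between poisoned and clean predictions on benign input.
benign_gap = abs(x_test @ w_poisoned - x_test @ w_clean)

# Change in the poisoned model's prediction when the trigger is added.
backdoor_shift = (x_test + trigger) @ w_poisoned - x_test @ w_poisoned

print(f"benign prediction gap: {benign_gap:.2e}")
print(f"backdoor shift:        {backdoor_shift:.4f}")
```

In this sketch the poison is orthogonal to every benign sample, so the normal equations decouple: the weights on the used features match the clean fit, and only the weight on the unused direction changes. Benign predictions are therefore unchanged up to floating-point error, while adding the trigger shifts the prediction by the attacker's target.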

Abstract

Backdoor poisoning attacks are a threat to machine learning models trained on large data collected from untrusted sources; these attacks enable attackers to inject malicious behavior into the model that can be triggered by specially crafted inputs. Prior work has established bounds on the success of backdoor attacks and their impact on the benign learning task; however, an open question is how much poison data is needed for a successful backdoor attack. Typical attacks either use few samples but need much information about the data points, or need to poison many data points. In this paper, we formulate the one-poison hypothesis: An adversary with one poison sample and limited background knowledge can inject a backdoor with zero backdooring-error and without significantly impacting the benign learning task performance. Moreover, we prove the one-poison hypothesis for linear…
