Gray-Box Poisoning of Continuous Malware Ingestion Pipelines
Jan Dolejš, Martin Jureček, Róbert Lórencz

TL;DR
This paper explores gray-box poisoning attacks on malware detection pipelines, demonstrating how subtle adversarial modifications can significantly reduce detection effectiveness and proposing an ensemble-based defense to mitigate such threats.
Contribution
It introduces a realistic poisoning threat model for continuous malware ingestion systems and evaluates a defense mechanism using ensemble filtering.
Findings
Adversarial IAT manipulations significantly lower detection recall.
The ensemble defense filters up to 95.6% of poisoning samples.
Keeping adversarial perturbations low-visibility while preserving high poisoning efficacy remains an inherent challenge.
Abstract
Modern malware detection pipelines rely on continuous data ingestion and machine learning to counter the high volume of novel threats. This work investigates a realistic gray-box poisoning threat model targeting these pipelines. Using the secml_malware framework, we generate problem-space adversarial binaries through functionality-preserving manipulations, specifically Import Address Table (IAT) and section injections. We evaluate the impact of these poisoned samples when ingested into a defender's training set for a LightGBM malware detection model. Our empirical results demonstrate that subtle IAT-based perturbations enable compact poisoning samples that significantly degrade detection recall. These findings illustrate the inherent challenge of developing low-visibility adversarial perturbations that maintain high poisoning efficacy within continuous learning systems. We further…
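The ingestion-time filtering idea can be illustrated with a minimal sketch. The excerpt above does not describe how the paper's ensemble is built, so everything here is an assumption for illustration: the feature names (`entropy`, `suspicious_imports`, `section_count`), the three threshold detectors, and the rule that a sample is quarantined when the ensemble's majority vote disagrees with its submitted label. It uses only the Python standard library and does not call secml_malware or LightGBM.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

Features = Dict[str, float]
Detector = Callable[[Features], int]  # 1 = malicious, 0 = benign


@dataclass(frozen=True)
class Sample:
    features: Features
    label: int  # label attached at ingestion time (1 = malicious, 0 = benign)


def ensemble_vote(detectors: Sequence[Detector], features: Features) -> int:
    """Majority vote over the detectors; ties count as malicious."""
    votes = sum(d(features) for d in detectors)
    return 1 if 2 * votes >= len(detectors) else 0


def filter_batch(
    detectors: Sequence[Detector], batch: Sequence[Sample]
) -> Tuple[List[Sample], List[Sample]]:
    """Split a batch into samples to ingest and samples to quarantine."""
    accepted: List[Sample] = []
    quarantined: List[Sample] = []
    for sample in batch:
        if ensemble_vote(detectors, sample.features) == sample.label:
            accepted.append(sample)
        else:
            quarantined.append(sample)
    return accepted, quarantined


def recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Detection recall: TP / (TP + FN) over the malicious class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if tp + fn else 0.0


# Hypothetical reference detectors (illustrative thresholds, not from the paper).
detectors: List[Detector] = [
    lambda f: int(f["entropy"] > 7.0),
    lambda f: int(f["suspicious_imports"] >= 3),
    lambda f: int(f["section_count"] > 8),
]

# Toy incoming batch; the third sample looks malicious but arrives labeled benign.
batch = [
    Sample({"entropy": 7.5, "suspicious_imports": 5, "section_count": 4}, 1),
    Sample({"entropy": 5.0, "suspicious_imports": 0, "section_count": 4}, 0),
    Sample({"entropy": 7.4, "suspicious_imports": 4, "section_count": 9}, 0),
]

accepted, quarantined = filter_batch(detectors, batch)
print(len(accepted), len(quarantined))
```

Only `accepted` would be appended to the training set before the next retraining round. The `recall` helper shows the metric the findings refer to: a poisoned model that misses more malicious samples has more false negatives and therefore lower recall.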
