Gray-Box Poisoning of Continuous Malware Ingestion Pipelines

Jan Dolejš; Martin Jureček; Róbert Lórencz

arXiv:2605.04698·cs.CR·May 7, 2026

TL;DR

This paper explores gray-box poisoning attacks on malware detection pipelines, demonstrating how subtle adversarial modifications can significantly reduce detection effectiveness and proposing an ensemble-based defense to mitigate such threats.

Contribution

It introduces a realistic poisoning threat model for continuous malware ingestion systems and evaluates a defense mechanism using ensemble filtering.
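The idea of ensemble filtering can be illustrated with a minimal sketch. This is not the paper's defense: the feature names (`import_count`, `section_count`, `entropy`), the thresholds, and the majority-vote rule are all hypothetical, chosen only to show how several independent checks might gate what enters a training set.

```python
# Toy ensemble filter for an ingestion pipeline.
# ASSUMPTION: features, thresholds, and voting rule are illustrative only;
# none of them are taken from the paper.

def many_imports(sample):
    # Flags samples with an unusually large import table (hypothetical cutoff).
    return sample["import_count"] > 300

def many_sections(sample):
    # Flags samples with an unusually high section count (hypothetical cutoff).
    return sample["section_count"] > 8

def high_entropy(sample):
    # Flags samples with high byte entropy (hypothetical cutoff).
    return sample["entropy"] > 7.2

def ensemble_filter(candidates, detectors, min_votes=2):
    """Split candidates into (accepted, rejected) by counting detector votes."""
    accepted, rejected = [], []
    for sample in candidates:
        votes = sum(1 for detect in detectors if detect(sample))
        if votes >= min_votes:
            rejected.append(sample)   # held out of the training set
        else:
            accepted.append(sample)   # allowed into the training set
    return accepted, rejected

incoming = [
    {"name": "a", "import_count": 120, "section_count": 5, "entropy": 6.1},
    {"name": "b", "import_count": 450, "section_count": 11, "entropy": 6.4},
    {"name": "c", "import_count": 90, "section_count": 4, "entropy": 7.6},
]
accepted, rejected = ensemble_filter(
    incoming, [many_imports, many_sections, high_entropy]
)
```

Requiring at least two votes means a single noisy detector cannot block a sample on its own, at the cost of letting through samples that trip only one check.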

Findings

01

Compact poisoning samples built from subtle IAT-based perturbations significantly degrade detection recall.

02

The ensemble defense filters up to 95.6% of poisoning samples.

03

Keeping adversarial perturbations low-visibility while maintaining high poisoning efficacy remains inherently difficult.

Abstract

Modern malware detection pipelines rely on continuous data ingestion and machine learning to counter the high volume of novel threats. This work investigates a realistic gray-box poisoning threat model targeting these pipelines. Using the secml_malware framework, we generate problem-space adversarial binaries through functionality-preserving manipulations, specifically Import Address Table (IAT) and section injections. We evaluate the impact of these poisoned samples when ingested into a defender's training set for a LightGBM malware detection model. Our empirical results demonstrate that subtle IAT-based perturbations enable compact poisoning samples that significantly degrade detection recall. These findings illustrate the inherent challenge of developing low-visibility adversarial perturbations that maintain high poisoning efficacy within continuous learning systems. We further…
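To make the effect of poisoned training data on recall concrete, here is a toy sketch. It does not use LightGBM, secml_malware, or real binaries: the detector is a hypothetical one-dimensional threshold model, and it assumes the poisoned samples enter the training set carrying a benign label, which the text above does not state.

```python
from statistics import mean

# Toy stand-in for a detector. ASSUMPTION: a 1-D threshold model replaces
# the LightGBM model, and poisoned samples are ingested with a benign label.

def fit_threshold(samples):
    """Place the decision threshold halfway between the two class means."""
    benign = [x for x, y in samples if y == 0]
    malware = [x for x, y in samples if y == 1]
    return (mean(benign) + mean(malware)) / 2

def recall(threshold, malware_scores):
    """Fraction of true malware samples scored above the threshold."""
    detected = sum(1 for x in malware_scores if x > threshold)
    return detected / len(malware_scores)

# Clean training data: benign scores in 0.0..3.9, malware scores in 6.0..9.9.
clean_train = [(x / 10, 0) for x in range(0, 40)] + \
              [(x / 10, 1) for x in range(60, 100)]

# Poison: malware-like scores that were ingested with a benign label.
poison = [(7.0, 0)] * 40

# Held-out malware used to measure recall.
test_malware = [5.5, 6.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0]

recall_clean = recall(fit_threshold(clean_train), test_malware)
recall_poisoned = recall(fit_threshold(clean_train + poison), test_malware)
print(recall_clean, recall_poisoned)
```

In this toy setting the mislabeled samples pull the benign class mean toward the malware region, which raises the threshold and causes some held-out malware to be missed.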
