Backdoor Mitigation in Object Detection via Adversarial Fine-Tuning
Kealan Dunnett, Reza Arablouei, Dimity Miller, Volkan Dedeoglu, Raja Jurdak

TL;DR
This paper presents a detection-aware adversarial fine-tuning method to mitigate backdoor attacks in object detection models, effectively reducing attack success while maintaining detection performance.
Contribution
It introduces a framework that combines soft-branch minimisation with a dual-objective loss to mitigate backdoors in object detectors without requiring knowledge of the attack objective.
Findings
Reduces attack success more effectively than baseline defences.
Preserves true detection performance on clean data.
Applicable to CNN- and Transformer-based detectors.
Abstract
Backdoor attacks can implant malicious behaviours into deep models while preserving performance on clean data, posing a serious threat to safety-critical vision systems. Although backdoor mitigation has been studied extensively for image classification, defenses for object detection remain comparatively underdeveloped. Adversarial fine-tuning is a common backdoor mitigation approach in classification, but adapting it to detection is nontrivial as classification-oriented adversarial generation does not match the detection attack space, where attacks may cause object misclassification or disappearance, and standard detection losses can dilute the repair signal across many predictions. We address these challenges through a detection-aware adversarial fine-tuning framework for mitigating object-detection backdoors when the defender has access only to a compromised detector and a small clean…
