Transformer-Encoder Detector Module: Using Context to Improve Robustness to Adversarial Attacks on Object Detection
Faisal Alamri, Sinan Kalkan, Nicolas Pugeault

TL;DR
This paper introduces the Transformer-Encoder Detector Module, a context module that improves an object detector's accuracy and its robustness to adversarial attacks by incorporating contextual information. It outperforms the baseline Faster-RCNN detector on mAP, F1, and AUC.
Contribution
The paper presents a novel context module for object detectors that improves detection performance and adversarial robustness by integrating scene context and visual features.
Findings
Up to 13% higher mAP, F1, and AUC scores than the baseline Faster-RCNN detector.
mAP 8 points higher on images subjected to FFF or UAP adversarial attacks.
Simple context module significantly enhances detector reliability.
Abstract
Deep neural network approaches have demonstrated high performance in object recognition (CNN) and detection (Faster-RCNN) tasks, but experiments have shown that such architectures are vulnerable to adversarial attacks (FFF, UAP): low-amplitude perturbations, barely perceptible to the human eye, can lead to a drastic reduction in labeling performance. This article proposes a new context module, called Transformer-Encoder Detector Module, that can be applied to an object detector to (i) improve the labeling of object instances; and (ii) improve the detector's robustness to adversarial attacks. The proposed model achieves mAP, F1, and average AUC scores up to 13% higher than the baseline Faster-RCNN detector, and an mAP score 8 points higher on images subjected to FFF or UAP attacks, due to the inclusion of both contextual and visual features extracted from scene…
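As a rough illustration of how a context module can let each candidate region draw on the rest of the scene, the sketch below applies one self-attention step (the core operation of a Transformer encoder) to a set of per-region feature vectors. It is not the authors' architecture: the function name `contextualize`, the single attention head, the residual connection, the feature dimension, and the random weights are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def contextualize(region_feats, w_q, w_k, w_v):
    """Hypothetical single-head self-attention over region features.

    region_feats: (num_regions, dim) array, one visual feature vector per
    candidate region. Returns context-aware features of the same shape and
    the (num_regions, num_regions) attention matrix.
    """
    q = region_feats @ w_q
    k = region_feats @ w_k
    v = region_feats @ w_v
    # Scaled dot-product scores: how much each region attends to every other.
    scores = q @ k.T / np.sqrt(k.shape[-1])
    attn = softmax(scores, axis=-1)
    # Residual connection keeps the original visual features and adds context.
    return region_feats + attn @ v, attn


# Illustrative sizes and random weights (assumptions, not values from the paper).
rng = np.random.default_rng(0)
num_regions, dim = 5, 16
feats = rng.normal(size=(num_regions, dim))
w_q, w_k, w_v = (rng.normal(scale=dim ** -0.5, size=(dim, dim)) for _ in range(3))

ctx_feats, attn = contextualize(feats, w_q, w_k, w_v)
print(ctx_feats.shape)   # same shape as the input features
print(attn.sum(axis=1))  # each row of attention weights sums to 1
```

In a detector, the context-aware features would typically feed a classification head that re-labels each region; that step is omitted here because the abstract does not specify it.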
