Detection Transformers Under the Knife: A Neuroscience-Inspired Approach to Ablations
Nils Hütten, Florian Hölken, Hasan Tercan, Tobias Meisen

TL;DR
Inspired by ablation studies in neuroscience, this paper systematically ablates key components of detection transformers, revealing their roles in model performance and suggesting avenues for simplification and improved transparency.
Contribution
It introduces a neuroscience-inspired ablation methodology for detection transformers and provides detailed insights into component importance and redundancy.
Findings
DETR is sensitive to encoder MHSA and decoder MHCA ablations
DDETR's deformable attention enhances robustness
DINO's look-forward update rule increases resilience
Abstract
In recent years, Explainable AI has gained traction as an approach to enhancing model interpretability and transparency, particularly in complex models such as detection transformers. Despite rapid advancements, a substantial research gap remains in understanding the distinct roles of internal components, knowledge that is essential for improving transparency and efficiency. Inspired by neuroscientific ablation studies, which investigate the functions of brain regions through selective impairment, we systematically analyze the impact of ablating key components in three state-of-the-art detection transformer models: detection transformer (DETR), deformable detection transformer (DDETR), and DETR with improved denoising anchor boxes (DINO). The ablations target query embeddings, encoder and decoder multi-head self-attentions (MHSA) as well as decoder multi-head cross-attention (MHCA)…
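To make the idea of ablating attention heads concrete, here is a minimal NumPy sketch of multi-head self-attention in which selected heads can be switched off. It assumes that "ablation" means zeroing a head's output before the output projection; the excerpt above does not state the paper's exact procedure, so treat that choice, along with the function name `mhsa` and the `ablated_heads` parameter, as illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def mhsa(x, w_q, w_k, w_v, w_o, num_heads, ablated_heads=()):
    """Multi-head self-attention with optional head ablation (hypothetical helper).

    x: (tokens, d_model); each weight matrix: (d_model, d_model).
    ablated_heads: indices of heads whose output is zeroed (assumed ablation rule).
    """
    n, d_model = x.shape
    d_head = d_model // num_heads

    def split(t):
        # (tokens, d_model) -> (num_heads, tokens, d_head)
        return t.reshape(n, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(x @ w_q), split(x @ w_k), split(x @ w_v)
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(d_head))
    heads = attn @ v  # (num_heads, tokens, d_head)

    # Ablate: multiply the chosen heads' outputs by zero.
    keep = np.ones(num_heads)
    for h in ablated_heads:
        keep[h] = 0.0
    heads = heads * keep[:, None, None]

    # Concatenate heads and apply the output projection.
    merged = heads.transpose(1, 0, 2).reshape(n, d_model)
    return merged @ w_o
```

Comparing a model's detection metric with and without a given head ablated would then give a per-head importance estimate. Zeroing is only one option; replacing a head's output with its mean activation is another common choice.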
