Focused Decoding Enables 3D Anatomical Detection by Transformers
Bastian Wittmann, Fernando Navarro, Suprosanna Shit, Bjoern Menze

TL;DR
This paper introduces Focused Decoder, a novel 3D anatomical detection transformer that uses anatomical region atlases to improve detection accuracy and explainability in medical imaging with limited annotated data.
Contribution
The paper presents a new Detection Transformer architecture tailored for 3D medical imaging that reduces data requirements and enhances interpretability using anatomical region information.
Findings
Achieves strong detection performance on CT datasets.
Reduces the need for large annotated datasets.
Provides intuitive explainability through attention weights.
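To make the explainability finding concrete, here is a minimal sketch of how attention weights can be read as a per-position importance score. It uses generic scaled dot-product attention in plain Python and is not the Focused Decoder implementation. The function name `attention_weights` and the toy query and key vectors are assumptions introduced only for illustration.

```python
import math


def attention_weights(query, keys):
    """Scaled dot-product attention weights of one query over several keys.

    Generic illustration only (assumption): not the paper's implementation.
    """
    d = len(query)
    # Similarity between the query and every key, scaled by sqrt(d).
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
        for key in keys
    ]
    # Softmax with max subtraction for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical toy data: one object query and three feature positions.
query = [1.0, 0.0]
keys = [[2.0, 0.0], [0.0, 2.0], [-2.0, 0.0]]

weights = attention_weights(query, keys)
# Index of the position the query attends to most strongly.
top = max(range(len(weights)), key=weights.__getitem__)
print(weights, top)
```

Because the weights are non-negative and sum to one, they can be inspected or visualized directly to see which positions contributed most to a given query's prediction.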
Abstract
Detection Transformers represent end-to-end object detection approaches based on a Transformer encoder-decoder architecture, exploiting the attention mechanism for global relation modeling. Although Detection Transformers deliver results on par with or even superior to their highly optimized CNN-based counterparts operating on 2D natural images, their success is closely coupled to access to a vast amount of training data. This, however, restricts the feasibility of employing Detection Transformers in the medical domain, as access to annotated data is typically limited. To tackle this issue and facilitate the advent of medical Detection Transformers, we propose a novel Detection Transformer for 3D anatomical structure detection, dubbed Focused Decoder. Focused Decoder leverages information from an anatomical region atlas to simultaneously deploy query anchors and restrict the…
Taxonomy
Topics: AI in cancer detection · Medical Imaging and Analysis · Advanced Neural Network Applications
Methods: Attention Is All You Need · Linear Layer · Dropout · Multi-Head Attention · Absolute Position Encodings · Layer Normalization · Position-Wise Feed-Forward Layer · Softmax · Byte Pair Encoding · Adam
