Simulation-Driven Training of Vision Transformers Enabling Metal Segmentation in X-Ray Images
Fuxin Fan, Ludwig Ritschl, Marcel Beister, Ramyar Biniazan, Björn Kreher, Tristan M. Gottschalk, Steffen Kappler, Andreas Maier

TL;DR
This paper introduces a simulation-based training method for vision transformers to improve metal segmentation in X-ray images, demonstrating superior performance over CNNs and highlighting the potential of CAD-generated data for clinical applications.
Contribution
The study presents a novel simulation-driven training approach using CAD models and vision transformers for metal segmentation in X-ray images, outperforming traditional CNNs.
Findings
A vision transformer with a dual decoder achieves a Dice score of 0.90.
CAD-based data generation enhances training flexibility and generalization.
Transformer-based model outperforms CNN encoder-based networks.
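The Dice score quoted in the findings measures the overlap between a predicted mask and a reference mask. The following is a minimal sketch of how such a score is commonly computed for binary masks, not the authors' evaluation code. The function name `dice_score`, the use of NumPy, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
import numpy as np


def dice_score(pred, target):
    """Dice coefficient 2*|A & B| / (|A| + |B|) for two binary masks.

    Hypothetical helper for illustration; the paper's exact evaluation
    protocol (thresholding, per-image vs. pooled averaging) is not given here.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    if pred.shape != target.shape:
        raise ValueError("pred and target must have the same shape")

    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()

    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return float(2.0 * intersection / total)


# Toy example: 2 predicted metal pixels, 1 true metal pixel, 1 in common.
pred_mask = [[1, 1], [0, 0]]
true_mask = [[1, 0], [0, 0]]
score = dice_score(pred_mask, true_mask)  # 2 * 1 / (2 + 1)
```

A score of 1.0 means the two masks coincide exactly, and 0.0 means they share no foreground pixels.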
Abstract
In several image acquisition and processing steps of X-ray radiography, knowledge of the existence of metal implants and their exact position is highly beneficial (e.g., for dose regulation and image contrast adjustment). Another application that would benefit from accurate metal segmentation is cone beam computed tomography (CBCT), which is based on 2D X-ray projections. Due to the high attenuation of metals, severe artifacts occur in the 3D X-ray acquisitions. Metal segmentation in CBCT projections usually serves as a prerequisite for metal artifact avoidance and reduction algorithms. Since the generation of high-quality clinical training data is a constant challenge, this study proposes to generate simulated X-ray images based on CT data sets combined with self-designed computer-aided design (CAD) implants and to make use of a convolutional neural network (CNN) and a vision transformer (ViT) for…
Taxonomy
Topics: Advanced X-ray and CT Imaging · Advanced X-ray Imaging Techniques · Mineral Processing and Grinding
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Max Pooling · Convolution · Concatenated Skip Connection · Layer Normalization · Softmax · Dense Connections
