Calibrated Self-supervised Vision Transformers Improve Intracranial Arterial Calcification Segmentation from Clinical CT Head Scans
Benjamin Jin, Grant Mair, Joanna M. Wardlaw, Maria del C. Valdés Hernández

TL;DR
This paper shows that calibrated Vision Transformers (ViTs), pre-trained in a self-supervised manner within the masked autoencoder (MAE) framework, outperform an nnU-Net baseline at segmenting intracranial arterial calcification (IAC) on clinical CT head scans, with improved robustness and clinical risk classification.
Contribution
First application of MAE pre-trained ViTs to IAC segmentation, with performance gains over the nnU-Net baseline.
Findings
ViTs outperform the nnU-Net baseline by 3.2 Dice points.
Small patch sizes are essential for effective ViT segmentation.
ViTs enhance robustness to slice thickness and improve risk classification by 46%.
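For readers unfamiliar with the metric behind the "Dice points" figure, the following is a minimal sketch of the standard Dice similarity coefficient on binary masks. It is not the authors' evaluation code: the function name `dice_score`, the flattened-list input, and the convention for two empty masks are assumptions made for illustration, and the example masks are made up.

```python
from typing import Sequence


def dice_score(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice similarity coefficient between two flattened binary masks.

    Dice = 2 * |pred AND target| / (|pred| + |target|), in the range [0, 1].
    """
    if len(pred) != len(target):
        raise ValueError("masks must contain the same number of voxels")
    # Voxels labelled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground voxels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * intersection / total


# Hypothetical toy masks (not data from the paper).
predicted = [0, 1, 1, 0, 1, 0]
reference = [0, 1, 0, 0, 1, 1]
score = dice_score(predicted, reference)
print(f"Dice = {score:.3f} ({100 * score:.1f} Dice points)")
```

Dice is often reported on a 0 to 100 scale, so a gain of 3.2 Dice points corresponds to a 0.032 increase in the coefficient itself.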
Abstract
Vision Transformers (ViTs) have gained significant popularity in the natural image domain but have been less successful in 3D medical image segmentation. Nevertheless, 3D ViTs are particularly interesting for large medical imaging volumes due to their efficient self-supervised training within the masked autoencoder (MAE) framework, which enables the use of imaging data without the need for expensive manual annotations. Intracranial arterial calcification (IAC) is an imaging biomarker visible on routinely acquired CT scans linked to neurovascular diseases such as stroke and dementia, and automated IAC quantification could enable their large-scale risk assessment. We pre-train ViTs with MAE and fine-tune them for IAC segmentation for the first time. To develop our models, we use highly heterogeneous data from a large clinical trial, the third International Stroke Trial (IST-3). We…
