DINO-LG: Enhancing Vision Transformers with Label Guidance for Coronary Artery Calcium Detection

Mahmut S. Gokmen; Caner Ozcan; Moneera N. Haque; Steve W. Leung; C. Seth Parker; W. Brent Seales; Cody Bumgardner

arXiv:2411.07976 · eess.IV · February 11, 2026

TL;DR

This paper introduces DINO-LG, a label-guided self-supervised learning method for vision transformers that improves coronary artery calcium detection and scoring from CT scans, addressing data scarcity and imbalance issues.

Contribution

DINO-LG is a novel extension of DINO that incorporates label guidance and targeted augmentation, significantly enhancing CAC detection and scoring accuracy in medical imaging.

Findings

1. Achieved 89% sensitivity and 90% specificity in CAC slice detection.

2. Reduced false negatives by 49% and false positives by 57%.

3. Attained 90% accuracy in CAC risk classification.
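
For readers less familiar with the metrics in these findings, the sketch below shows how sensitivity, specificity, and a relative error reduction are usually computed from slice-level counts. The function names and the counts in the usage lines are illustrative assumptions, not the paper's code or data.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Share of CAC-positive slices that were flagged: TP / (TP + FN)."""
    if tp + fn == 0:
        raise ValueError("no positive slices to evaluate")
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Share of CAC-negative slices that were cleared: TN / (TN + FP)."""
    if tn + fp == 0:
        raise ValueError("no negative slices to evaluate")
    return tn / (tn + fp)


def relative_reduction(before: int, after: int) -> float:
    """Fractional drop in an error count relative to a baseline count."""
    if before <= 0:
        raise ValueError("baseline count must be positive")
    return (before - after) / before


# Hypothetical counts chosen only to make the arithmetic easy to follow.
print(sensitivity(tp=89, fn=11))
print(specificity(tn=90, fp=10))
print(relative_reduction(before=100, after=51))
```

A reduction figure such as "49% fewer false negatives" is only meaningful against a stated baseline model, which is why `relative_reduction` takes the baseline count explicitly.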

Abstract

Coronary artery disease (CAD), one of the leading causes of mortality worldwide, necessitates effective risk assessment strategies, with coronary artery calcium (CAC) scoring via computed tomography (CT) being a key method for prevention. Traditional methods, primarily based on U-Net architectures implemented on pre-built models, face challenges like the scarcity of annotated CT scans containing CAC and imbalanced datasets, leading to reduced performance in segmentation and scoring tasks. In this study, we address these limitations by introducing DINO-LG, a novel label-guided extension of DINO (self-distillation with no labels) that incorporates targeted augmentation on annotated calcified regions during self-supervised pre-training. Our three-stage pipeline integrates Vision Transformer (ViT-Base/8) feature extraction via DINO-LG trained on 914 CT scans comprising 700 gated and 214…
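
The abstract does not spell out how the targeted augmentation is implemented. As a minimal sketch of one plausible reading, the code below picks a local crop window centered on the annotated region of a slice and falls back to a random window when the slice has no annotation. The names `annotated_bbox` and `guided_crop_box`, the binary-mask input, and the centering rule are all assumptions made for illustration, not the authors' method.

```python
import random
from typing import List, Optional, Tuple

Mask = List[List[int]]            # 1 marks an annotated (calcified) pixel
Box = Tuple[int, int, int, int]   # (top, left, height, width)


def annotated_bbox(mask: Mask) -> Optional[Tuple[int, int, int, int]]:
    """Inclusive (top, left, bottom, right) of annotated pixels, or None."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return rows[0], cols[0], rows[-1], cols[-1]


def guided_crop_box(mask: Mask, crop: int, rng: random.Random) -> Box:
    """Square crop window centered on the annotation, clamped to the image.

    Hypothetical rule: slices without annotation get a uniformly random
    window, as in ordinary local-crop augmentation.
    """
    height, width = len(mask), len(mask[0])
    if crop > height or crop > width:
        raise ValueError("crop size exceeds image size")
    bbox = annotated_bbox(mask)
    if bbox is None:
        return rng.randint(0, height - crop), rng.randint(0, width - crop), crop, crop
    top_a, left_a, bottom_a, right_a = bbox
    center_y = (top_a + bottom_a) // 2
    center_x = (left_a + right_a) // 2
    # Clamp so the window stays fully inside the image.
    top = min(max(center_y - crop // 2, 0), height - crop)
    left = min(max(center_x - crop // 2, 0), width - crop)
    return top, left, crop, crop


# Toy 8x8 mask with two annotated pixels near the right edge.
toy_mask = [[0] * 8 for _ in range(8)]
toy_mask[5][6] = 1
toy_mask[6][6] = 1
print(guided_crop_box(toy_mask, crop=4, rng=random.Random(0)))
```

In a DINO-style setup, a window like this would be used to cut the local views fed to the student network, so that calcified regions appear in training crops far more often than uniform random cropping would allow.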

Taxonomy

Methods: Attention Is All You Need · Softmax · Linear Layer · Dense Connections · Layer Normalization · Multi-Head Attention · Residual Connection · Concatenated Skip Connection · Max Pooling · Focus