VL-OrdinalFormer: Vision Language Guided Ordinal Transformers for Interpretable Knee Osteoarthritis Grading
Zahid Ullah, Jihie Kim

TL;DR
VL-OrdinalFormer is a vision-language guided ordinal transformer framework that improves the accuracy and interpretability of automated knee osteoarthritis grading by incorporating clinical textual concepts and advanced training strategies.
Contribution
The paper introduces VL-OrdinalFormer, which integrates CLIP-driven semantic alignment with ordinal regression for more accurate and interpretable KOA severity assessment from radiographs.
Findings
Achieves state-of-the-art macro F1 score and accuracy on the OAI dataset.
Significantly improves classification of the intermediate grades KL1 and KL2.
Provides clinically relevant interpretability through Grad-CAM and CLIP maps.
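For readers unfamiliar with the headline metric, macro F1 is the unweighted mean of the per-class F1 scores, so each KL grade counts equally regardless of how many radiographs fall into it. The sketch below is a minimal illustration and is not taken from the paper; the function name `macro_f1` and the toy labels are hypothetical.

```python
def macro_f1(y_true, y_pred, num_classes):
    """Unweighted mean of per-class F1 scores (illustrative helper, not from the paper)."""
    f1_scores = []
    for c in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # A class with no true or predicted samples contributes 0 here (a convention).
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / num_classes


# Toy labels, made up for illustration only (three classes instead of five KL grades).
toy_true = [0, 1, 2, 2]
toy_pred = [0, 2, 2, 2]
toy_score = macro_f1(toy_true, toy_pred, num_classes=3)
```

Because every class has equal weight, a model that misses a rare or hard grade is penalized as heavily as one that misses a common grade, which is why the metric is sensitive to performance on grades such as KL1 and KL2.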
Abstract
Knee osteoarthritis (KOA) is a leading cause of disability worldwide, and accurate severity assessment using the Kellgren-Lawrence (KL) grading system is critical for clinical decision making. However, radiographic distinctions between early disease stages, particularly KL1 and KL2, are subtle and frequently lead to inter-observer variability among radiologists. To address these challenges, we propose VL-OrdinalFormer, a vision-language guided ordinal learning framework for fully automated KOA grading from knee radiographs. The proposed method combines a ViT-L/16 backbone with CORAL-based ordinal regression and a Contrastive Language-Image Pretraining (CLIP) driven semantic alignment module, allowing the model to incorporate clinically meaningful textual concepts related to joint space narrowing, osteophyte formation, and subchondral sclerosis. To improve robustness and mitigate…
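The CORAL-based ordinal regression named in the abstract can be sketched as follows. This is a minimal, framework-free illustration of the general CORAL idea, not the authors' implementation: a single shared score is compared against K-1 thresholds (one bias each), the grade is encoded as K-1 binary "is the grade greater than k?" targets, and the loss is the sum of the binary cross-entropies. The bias values, the scalar `score`, and all function names are assumptions made for illustration; in the paper's setting the score would come from the ViT backbone.

```python
import math

NUM_GRADES = 5  # KL0 to KL4


def encode_levels(grade, num_grades=NUM_GRADES):
    """Extended binary target: entry k is 1 if grade > k, else 0."""
    return [1 if grade > k else 0 for k in range(num_grades - 1)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def coral_logits(score, biases):
    """One shared score plus a separate bias per threshold (weight sharing)."""
    return [score + b for b in biases]


def coral_loss(logits, levels):
    """Sum of binary cross-entropies over the K-1 threshold tasks."""
    loss = 0.0
    for z, t in zip(logits, levels):
        p = sigmoid(z)
        loss -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return loss


def predict_grade(logits):
    """Predicted grade = number of thresholds passed with probability > 0.5."""
    return sum(1 for z in logits if sigmoid(z) > 0.5)


# Hypothetical, hand-picked biases (decreasing, so the thresholds are ordered).
example_biases = [2.0, 0.5, -0.5, -2.0]
example_logits = coral_logits(0.0, example_biases)
example_grade = predict_grade(example_logits)
```

Sharing one score across all thresholds is what makes the output ordinal: raising the score can only move the prediction to a higher grade, which suits adjacent, easily confused grades such as KL1 and KL2.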
Taxonomy
Topics: Osteoarthritis Treatment and Mechanisms · Total Knee Arthroplasty Outcomes · Domain Adaptation and Few-Shot Learning
