VL-OrdinalFormer: Vision Language Guided Ordinal Transformers for Interpretable Knee Osteoarthritis Grading

Zahid Ullah; Jihie Kim

arXiv:2601.00879·cs.CV·January 6, 2026

TL;DR

VL-OrdinalFormer is a vision-language-guided ordinal transformer framework that improves the accuracy and interpretability of automated knee osteoarthritis grading by incorporating clinical textual concepts and advanced training strategies.

Contribution

The paper introduces VL-OrdinalFormer, which integrates CLIP-driven semantic alignment with ordinal regression for improved, interpretable knee osteoarthritis (KOA) severity assessment from radiographs.

Findings

01. Achieves state-of-the-art macro F1 score and accuracy on the Osteoarthritis Initiative (OAI) dataset.

02. Significantly improves classification of the intermediate grades KL1 and KL2.

03. Provides clinically relevant interpretability through Grad-CAM and CLIP maps.
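
For readers unfamiliar with the metric in the first finding, macro F1 averages the per-class F1 scores with equal weight, so weak performance on a hard class such as KL1 or KL2 lowers the score no matter how many samples that class has. The sketch below is a generic pure-Python illustration, not the authors' evaluation code; the function name `macro_f1` and the toy labels are hypothetical.

```python
def macro_f1(y_true, y_pred, num_classes=5):
    """Unweighted mean of per-class F1 scores (5 classes assumed: KL0..KL4)."""
    scores = []
    for c in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); a class with no true or predicted
        # samples is scored 0.0 here (a convention, chosen for this sketch).
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / num_classes


# Toy two-class example (made-up labels, not results from the paper).
toy_true = [0, 0, 1, 1]
toy_pred = [0, 0, 1, 0]
toy_score = macro_f1(toy_true, toy_pred, num_classes=2)
```

In the toy example, class 0 has F1 = 0.8 and class 1 has F1 = 2/3, so the macro average is their plain mean.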

Abstract

Knee osteoarthritis (KOA) is a leading cause of disability worldwide, and accurate severity assessment using the Kellgren-Lawrence (KL) grading system is critical for clinical decision-making. However, radiographic distinctions between early disease stages, particularly KL1 and KL2, are subtle and frequently lead to inter-observer variability among radiologists. To address these challenges, we propose VL-OrdinalFormer, a vision-language-guided ordinal learning framework for fully automated KOA grading from knee radiographs. The proposed method combines a ViT-L/16 backbone with CORAL-based ordinal regression and a Contrastive Language-Image Pretraining (CLIP) driven semantic alignment module, allowing the model to incorporate clinically meaningful textual concepts related to joint space narrowing, osteophyte formation, and subchondral sclerosis. To improve robustness and mitigate…
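
The CORAL-based ordinal regression named in the abstract can be illustrated with a minimal sketch. CORAL turns a K-grade problem into K-1 binary "is the grade greater than k?" tasks that share one score and differ only in their bias terms. The code below assumes five KL grades (KL0 to KL4) and uses a single scalar as a stand-in for the backbone output; the bias values and all function names are hypothetical, and nothing here is taken from the authors' implementation.

```python
import math

NUM_GRADES = 5  # KL0..KL4


def coral_levels(grade, num_grades=NUM_GRADES):
    """Encode an ordinal grade as K-1 binary targets: target k is 1 iff grade > k."""
    return [1.0 if grade > k else 0.0 for k in range(num_grades - 1)]


def coral_logits(shared_score, biases):
    """One shared score plus a separate bias per threshold (the CORAL head)."""
    return [shared_score + b for b in biases]


def coral_loss(logits, levels):
    """Sum of binary cross-entropies over the K-1 thresholds.

    Uses the numerically stable form max(z, 0) - z*y + log(1 + exp(-|z|)).
    """
    total = 0.0
    for z, y in zip(logits, levels):
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total


def coral_predict(logits):
    """Predicted grade = number of thresholds passed (sigmoid(z) > 0.5 iff z > 0)."""
    return sum(1 for z in logits if z > 0.0)


# Hypothetical values: a shared score of 0.5 and four decreasing biases.
example_logits = coral_logits(0.5, [2.0, 1.0, -1.0, -2.0])
example_grade = coral_predict(example_logits)  # -> 2, i.e. KL2
example_loss = coral_loss(example_logits, coral_levels(2))
```

Because every threshold shares the same score, the binary outputs stay rank-consistent when the biases are ordered, which is the property that makes this formulation suited to graded scales such as KL.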

Taxonomy

Topics: Osteoarthritis Treatment and Mechanisms · Total Knee Arthroplasty Outcomes · Domain Adaptation and Few-Shot Learning