Rethinking Evaluation of Multiple Sclerosis (MS) Lesion Segmentation Models
Abdul Basit, Ashir Rashid, Muhammad Abdullah Hanif, Muhammad Shafique

TL;DR
This paper argues for a reevaluation of how MS lesion segmentation models are assessed, emphasizing metrics that reflect clinical relevance and real-world applicability beyond traditional Dice scores.
Contribution
It presents detailed problem fingerprinting aligned with neurologists' needs and evaluates current models on open datasets using the clinically relevant metrics it identifies.
Findings
Most current models are evaluated using only the Dice score.
Neurologists focus on lesion detection and disease progression metrics.
State-of-the-art models show limitations in clinically relevant performance.
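To make the gap between Dice and lesion-wise detection concrete, here is a minimal sketch on a toy 2D mask. It is not the paper's evaluation code: the helper names (`dice`, `lesions`, `lesion_recall`), the 4-connectivity rule, and the "at least one overlapping voxel counts as detected" criterion are all assumptions chosen for illustration.

```python
from collections import deque


def dice(pred, truth):
    """Voxel-wise Dice overlap between two binary masks (lists of rows)."""
    inter = sum(p & t for pr, tr in zip(pred, truth) for p, t in zip(pr, tr))
    total = sum(map(sum, pred)) + sum(map(sum, truth))
    return 2 * inter / total if total else 1.0


def lesions(mask):
    """Split a 2D binary mask into 4-connected components (assumed lesion definition)."""
    seen, comps = set(), []
    rows, cols = len(mask), len(mask[0])
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and (r, c) not in seen:
                comp, queue = set(), deque([(r, c)])
                seen.add((r, c))
                while queue:  # breadth-first flood fill
                    y, x = queue.popleft()
                    comp.add((y, x))
                    for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            queue.append((ny, nx))
                comps.append(comp)
    return comps


def lesion_recall(pred, truth):
    """Fraction of ground-truth lesions hit by at least one predicted voxel (assumed criterion)."""
    truth_lesions = lesions(truth)
    if not truth_lesions:
        return 1.0
    hit = sum(any(pred[y][x] for y, x in les) for les in truth_lesions)
    return hit / len(truth_lesions)


# Toy ground truth: one large lesion (9 voxels) and two single-voxel lesions.
truth = [
    [1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 1, 0],
    [1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 0],
]
# Toy prediction: segments the large lesion perfectly, misses both small ones.
pred = [
    [1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

print(f"Dice: {dice(pred, truth):.2f}")                    # Dice: 0.90
print(f"Lesion recall: {lesion_recall(pred, truth):.2f}")  # Lesion recall: 0.33
```

In this toy case the prediction scores a Dice of 0.90 while detecting only one of three lesions, because the large lesion dominates the voxel count. Real evaluations would work on 3D volumes and may use a stricter overlap threshold or a different connectivity rule.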
Abstract
Multiple Sclerosis (MS) is a chronic autoimmune disease that can significantly reduce the quality of life of a patient. Existing treatment options can only help slow down the progression of the disease. Therefore, early detection and precise monitoring of disease progression are important. Deep learning offers state-of-the-art models for detecting and segmenting MS lesions in brain MRI scans. However, most of these models are evaluated using the Dice score, without accounting for lesion-wise detection and segmentation performance or other metrics that quantify model performance in cases that are complex or confusing for human annotators, or in cases that are essential for disease detection and progression monitoring. In this paper, we highlight the need to rethink the evaluation of MS lesion segmentation models. In this context, we first present problem fingerprinting in detail to…
