# Grading Diabetic Retinopathy Using Comparative Assessment: A Pilot Study Comparing Paired Image Comparisons With Direct Grading

**Authors:** Mohammed Al-Roubaie

PMC · DOI: 10.7759/cureus.102559 · Cureus · 2026-01-29

## TL;DR

In a pilot study of 90 retinal fundus photographs graded twice by a single junior clinician, paired image comparison detected the presence of diabetic retinopathy with higher accuracy, higher specificity, and greater consistency between rounds than direct categorical grading.

## Contribution

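This study evaluates comparative assessment (paired image comparison) as an alternative or adjunct to direct categorical grading of diabetic retinopathy, and reports better performance in discriminating between the presence and absence of disease when applied by a junior grader.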

## Key findings

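- Comparative assessment showed higher overall accuracy and specificity than direct grading across repeated grading rounds, while maintaining high sensitivity.
- Paired image comparisons were more consistent between grading attempts, whereas direct grading showed greater variability.
- Differences in classification performance between the two methods were statistically significant (McNemar’s test).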

## Abstract

Introduction

Accurate grading of diabetic retinopathy is essential for effective screening, clinical decision-making, and evaluation of automated diagnostic systems. Conventional grading relies on categorical severity scales, which are subject to inter- and intra-observer variability, particularly among less-experienced or junior graders and in cases with subtle disease features. Comparative assessment using paired image comparisons may offer a complementary approach by reframing grading as a relative severity judgement and potentially reducing grading variability.
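One way to picture relative severity judgement is as an ordering built from many pairwise decisions. The Python sketch below is illustrative only: the abstract does not describe how the paired comparisons were aggregated, so the win-count rule, the function name `rank_by_wins`, and the image ids are all assumptions.

```python
from collections import defaultdict


def rank_by_wins(comparisons):
    """Order image ids from least to most severe by counting pairwise "wins".

    comparisons: iterable of (image_a, image_b, more_severe) tuples, where
    more_severe is whichever of the two ids the grader judged to be worse.
    The win-count rule is an assumption, not the study's stated method.
    """
    wins = defaultdict(int)
    for image_a, image_b, more_severe in comparisons:
        if more_severe not in (image_a, image_b):
            raise ValueError("more_severe must be one of the compared images")
        # Touch both ids so images that never win still appear in the ranking.
        wins[image_a] += 0
        wins[image_b] += 0
        wins[more_severe] += 1
    # Ties are broken by id so the ordering is deterministic.
    return sorted(wins, key=lambda image_id: (wins[image_id], image_id))


# Hypothetical judgements on three images.
example = [
    ("img1", "img2", "img2"),
    ("img2", "img3", "img3"),
    ("img1", "img3", "img3"),
]
ranking = rank_by_wins(example)
```

A simple count like this ignores how informative each comparison is; probabilistic models for paired comparisons exist, but the abstract does not say whether one was used.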

Methods

This pilot study evaluated retinal fundus photographs obtained from a publicly available dataset. Ninety images spanning the spectrum of diabetic retinopathy severity were graded using two approaches: direct grading according to the International Clinical Diabetic Retinopathy Severity Scale and comparative assessment using paired image comparisons. Both methods were performed twice by a junior clinician following structured training to assess repeatability. Classification performance for discrimination between the presence and absence of diabetic retinopathy was compared using confusion matrices and McNemar’s test.
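The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code: the function names are hypothetical, and the exact (binomial) form of McNemar’s test is an assumption, since the abstract does not state which variant was used.

```python
from math import comb


def binary_metrics(truth, predicted):
    """Accuracy, sensitivity and specificity for DR present (True) vs absent (False)."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t and p)          # true positives
    tn = sum(1 for t, p in pairs if not t and not p)  # true negatives
    fp = sum(1 for t, p in pairs if not t and p)      # false positives
    fn = sum(1 for t, p in pairs if t and not p)      # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }


def mcnemar_exact(truth, method_a, method_b):
    """Two-sided exact McNemar p-value computed from the discordant images."""
    rows = list(zip(truth, method_a, method_b))
    # b: method A correct and method B wrong; c: the reverse.
    b = sum(1 for t, a, m in rows if a == t and m != t)
    c = sum(1 for t, a, m in rows if a != t and m == t)
    n = b + c
    if n == 0:
        return 1.0  # the methods never disagree on correctness
    # Under the null hypothesis, discordant images split 50/50 between b and c.
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

Only images on which exactly one method is correct contribute to the test, which is why it suits paired designs where both methods grade the same images.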

Results

Comparative assessment demonstrated higher overall accuracy and improved specificity compared with direct grading across repeated grading rounds, while maintaining high sensitivity. Paired image comparison showed greater consistency between grading attempts, whereas direct grading exhibited greater variability. Differences in classification performance between methods were statistically significant.
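Consistency between two grading rounds can be quantified in several ways. The sketch below uses Cohen's kappa on binary labels as one common choice; this is an assumption for illustration, as the abstract does not name the agreement statistic the study used, and `cohen_kappa` is a hypothetical helper.

```python
def cohen_kappa(round_one, round_two):
    """Cohen's kappa for two rounds of binary labels (True = DR present)."""
    if len(round_one) != len(round_two) or not round_one:
        raise ValueError("rounds must be non-empty and of equal length")
    n = len(round_one)
    # Observed agreement: fraction of images given the same label in both rounds.
    observed = sum(1 for a, b in zip(round_one, round_two) if a == b) / n
    # Agreement expected by chance from each round's marginal label rates.
    rate_one = sum(1 for a in round_one if a) / n
    rate_two = sum(1 for b in round_two if b) / n
    expected = rate_one * rate_two + (1 - rate_one) * (1 - rate_two)
    if expected == 1.0:
        # Both rounds used a single identical label; kappa is undefined,
        # so this sketch returns 1.0 by convention.
        return 1.0
    return (observed - expected) / (1 - expected)
```

A value near 1 indicates that the grader reproduced their own labels across rounds, while a value near 0 indicates agreement no better than chance.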

Conclusion

In this pilot study, comparative assessment using paired image comparisons outperformed conventional direct grading for discrimination between the presence and absence of diabetic retinopathy when applied by a junior grader. These findings suggest that relative severity judgement may represent a viable alternative or adjunct to traditional categorical grading systems, particularly in contexts where grading variability is a concern. Larger studies involving multiple graders and real-world screening images are required to validate these findings and define the clinical role of comparative assessment.

## Linked entities

- **Diseases:** diabetic retinopathy (MONDO:0005266)

## Full-text entities

- **Diseases:** Diabetic (MESH:D003920), visual impairment (MESH:D014786), NPDR (OMIM:603933), PDR (MESH:C564461), DR (MESH:D003930)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12949601/full.md

## References

15 references — full list in the complete paper: https://tomesphere.com/paper/PMC12949601/full.md

---
Source: https://tomesphere.com/paper/PMC12949601