# Enhancing accessibility: a multi-level platform for visual question answering in diabetic retinopathy for individuals with disabilities

**Authors:** Sarah Alotaibi, Suheer Al-Hadhrami, Saad Al-Ahmadi

PMC · DOI: 10.3389/frai.2025.1646176 · Frontiers in Artificial Intelligence · 2025-11-03

## TL;DR

This paper introduces a multi-level VQA system for visually impaired individuals that routes each question to a specialized model by question type, improving accuracy in answering visual questions about diabetic retinopathy.

## Contribution

A bi-level VQA framework that improves accuracy by routing questions to specialized models based on type.

## Key findings

- The bi-level VQA model raised overall accuracy from the state-of-the-art 87.41% to 88.41%.
- Using multiple specialized models for different question types enhances system performance.
- The framework shows promise for future development of advanced multi-level VQA systems.
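
To put the reported accuracy figures in context, the short calculation below derives the absolute gain and the relative reduction in error rate. Only the two accuracy values come from the paper; the derived quantities and all variable names are illustrative and are not reported by the authors.

```python
# Accuracy values reported in the paper (percent).
baseline_acc = 87.41   # state of the art
bilevel_acc = 88.41    # bi-level VQA model

# Absolute gain in percentage points.
absolute_gain = bilevel_acc - baseline_acc

# Error rates, and the share of baseline errors that the new model removes.
# These derived figures are our own arithmetic, not results from the paper.
baseline_err = 100.0 - baseline_acc
bilevel_err = 100.0 - bilevel_acc
relative_error_reduction = (baseline_err - bilevel_err) / baseline_err

print(f"absolute gain: {absolute_gain:.2f} percentage points")
print(f"relative error reduction: {relative_error_reduction:.1%}")
```

Under these numbers the gain is 1.00 percentage point, which corresponds to removing roughly 7.9% of the baseline's errors.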

## Abstract

Individuals with visual disabilities have impairments that affect their ability to perceive visual information, ranging from partial to complete vision loss. Visual disabilities affect about 2.2 billion people globally. In this paper, we introduce a new multi-level Visual Question Answering (VQA) framework for visually disabled people that combines the strengths of several VQA models, one per component, to enhance system performance. The model relies on a bi-level architecture with two distinct layers. In the first level, the model classifies the question type. This classification routes the visual question to the appropriate component model in the second level. The architecture incorporates a switch function that lets the system select the most suitable VQA model for each question, thereby enhancing overall accuracy. The experimental findings indicate that the multi-level VQA technique is effective: the bi-level VQA model raises overall accuracy over the state of the art from 87.41% to 88.41%. This finding suggests that using multiple levels with different models can boost VQA system performance. This research presents a promising direction for developing advanced multi-level VQA systems. Future work may explore optimizing and experimenting with various model levels to enhance performance further.
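
The bi-level routing described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the question-type labels (`yes_no`, `counting`, `open_ended`), the keyword-based classifier, the `BiLevelVQA` class, and the stub models are all hypothetical stand-ins, since this summary does not describe the actual classifier or component models.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# A VQA model here is any callable mapping (image bytes, question) -> answer.
VQAModel = Callable[[bytes, str], str]


def classify_question_type(question: str) -> str:
    """Level 1 (hypothetical): assign a coarse question type.

    A simple keyword rule stands in for the paper's classifier.
    """
    q = question.strip().lower()
    if q.startswith(("is ", "are ", "does ", "do ", "can ")):
        return "yes_no"
    if q.startswith("how many"):
        return "counting"
    return "open_ended"


@dataclass
class BiLevelVQA:
    """Level 2: one specialized model per question type, chosen by a switch."""
    models: Dict[str, VQAModel]
    default_type: str = "open_ended"

    def switch(self, question_type: str) -> VQAModel:
        # Fall back to the default model for unknown question types.
        return self.models.get(question_type, self.models[self.default_type])

    def answer(self, image: bytes, question: str) -> str:
        question_type = classify_question_type(question)
        return self.switch(question_type)(image, question)


# Stub models (assumptions) that only report which component was selected.
system = BiLevelVQA(models={
    "yes_no": lambda image, question: "yes_no model",
    "counting": lambda image, question: "counting model",
    "open_ended": lambda image, question: "open_ended model",
})

print(system.answer(b"", "Is there a hemorrhage in the image?"))
print(system.answer(b"", "How many lesions are visible?"))
print(system.answer(b"", "What stage of retinopathy is shown?"))
```

Keeping the switch as a dictionary lookup means a new question type can be supported by registering another model, without changing the routing logic.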

## Linked entities

- **Diseases:** diabetic retinopathy (MONDO:0005266)

## Full-text entities

- **Diseases:** Visual disabilities (MESH:D014786), diabetic retinopathy (MESH:D003930)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12621105/full.md

## Figures

11 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12621105/full.md

## References

96 references — full list in the complete paper: https://tomesphere.com/paper/PMC12621105/full.md

---
Source: https://tomesphere.com/paper/PMC12621105