Not There Yet: Evaluating Vision Language Models in Simulating the Visual Perception of People with Low Vision

Rosiana Natalie; Wenqian Xu; Ruei-Che Chang; Rada Mihalcea; Anhong Guo

arXiv:2508.10972·cs.CV·August 18, 2025

TL;DR

This study assesses how well vision language models (VLMs) can simulate the visual perception of people with low vision. Agreement with participants' responses is low under minimal prompts (0.59) and rises to 0.70 when prompts combine vision information with example image responses.

Contribution

The paper introduces a benchmark dataset and evaluation framework for simulating low vision perception with VLMs, and identifies which prompt contents most influence simulation accuracy.

Findings

01

With minimal prompts, agreement between VLM-generated and participant responses is low (0.59).

02

Combining vision information and example image responses in the prompt raises agreement to 0.70.

03

A single example combining both response types (open-ended and multiple-choice) significantly outperforms either type alone.
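To make the agreement figures above concrete, here is a minimal sketch of one way such a score could be computed. It assumes agreement is the plain fraction of images on which the simulated agent's multiple-choice answer matches the participant's; the summary does not state the paper's exact metric, so treat `agreement_rate` and the sample answers as hypothetical illustrations rather than the authors' method or data.

```python
from typing import Sequence


def agreement_rate(simulated: Sequence[str], observed: Sequence[str]) -> float:
    """Fraction of items where the simulated answer equals the participant's.

    Assumption: simple exact-match agreement after trimming whitespace and
    ignoring case. The paper's actual metric may differ.
    """
    if len(simulated) != len(observed):
        raise ValueError("response lists must have the same length")
    if not observed:
        raise ValueError("need at least one response pair")
    matches = sum(
        s.strip().lower() == o.strip().lower()
        for s, o in zip(simulated, observed)
    )
    return matches / len(observed)


# Hypothetical multiple-choice answers for five images (not data from the paper).
participant = ["A", "C", "B", "D", "A"]
agent = ["A", "C", "D", "D", "B"]

score = agreement_rate(agent, participant)
print(f"agreement = {score:.2f}")
```

Under this assumed definition, a score of 0.59 would mean the agent matched the participant on roughly six of every ten answers.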

Abstract

Advances in vision language models (VLMs) have enabled the simulation of general human behavior through their reasoning and problem solving capabilities. However, prior research has not investigated such simulation capabilities in the accessibility domain. In this paper, we evaluate the extent to which VLMs can simulate the visual perception of low vision individuals when interpreting images. We first compile a benchmark dataset through a survey study with 40 low vision participants, collecting their brief and detailed vision information and both open-ended and multiple-choice image perception and recognition responses to up to 25 images. Using these responses, we construct prompts for VLMs (GPT-4o) to create simulated agents of each participant, varying whether vision information and example image responses are included. We evaluate the agreement between VLM-generated responses…
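The prompt variations described in the abstract can be sketched as a small prompt builder. This is an illustrative sketch only: the `Participant` fields, the `build_prompt` function, the condition names, and the prompt wording are all assumptions, since the summary does not reproduce the paper's actual prompts. No model is called here; the output is just the text that would accompany an image.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Participant:
    """Hypothetical record of one survey participant."""
    vision_info: str                                    # self-reported vision information
    examples: List[str] = field(default_factory=list)   # prior image responses


def build_prompt(p: Participant, question: str,
                 use_vision_info: bool, use_examples: bool) -> str:
    """Assemble a text prompt for one simulated agent (wording is assumed)."""
    parts = ["You are simulating how a person with low vision perceives an image."]
    if use_vision_info:
        parts.append(f"Vision information: {p.vision_info}")
    if use_examples and p.examples:
        parts.append("Example responses from this person:")
        parts.extend(f"- {e}" for e in p.examples)
    parts.append(f"Question: {question}")
    return "\n".join(parts)


# Four assumed conditions: (include vision info, include example responses).
CONDITIONS: Dict[str, Tuple[bool, bool]] = {
    "minimal": (False, False),
    "vision_only": (True, False),
    "examples_only": (False, True),
    "combined": (True, True),
}

# Hypothetical participant; not drawn from the paper's dataset.
person = Participant(
    vision_info="Central vision loss; relies on peripheral vision.",
    examples=["Image 1: I can see a large dark shape but not what it is."],
)

prompts = {
    name: build_prompt(person, "Can you recognize the object in this image?", v, e)
    for name, (v, e) in CONDITIONS.items()
}
print(prompts["combined"])
```

Keeping the conditions in one table makes it easy to run the same participant and question through every variant and compare agreement per condition.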
