Context-Aware Image Descriptions for Web Accessibility

Ananya Gubbi Mohanbabu; Amy Pavel

arXiv:2409.03054·cs.HC·September 6, 2024

TL;DR

This paper presents a Chrome extension that incorporates webpage context into image descriptions for blind and low vision (BLV) users. In a user study, participants rated the resulting context-aware descriptions higher in quality and relevance than context-free ones.

Contribution

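The paper introduces a method for generating context-aware image descriptions with GPT-4V: a Chrome extension automatically extracts context from the surrounding webpage and uses it to inform the generated description, with the aim of improving accessibility for BLV users.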

Findings

01. Participants preferred context-aware descriptions over context-free ones.

02. Context-aware descriptions scored higher in quality, imaginability, relevance, and plausibility.

03. Participants expressed interest in using context-aware descriptions across various online platforms.

Abstract

Blind and low vision (BLV) internet users access images on the web via text descriptions. New vision-to-language models such as GPT-4V, Gemini, and LLaVA can now provide detailed image descriptions on-demand. While prior research and guidelines state that BLV audiences' information preferences depend on the context of the image, existing tools for accessing vision-to-language models provide only context-free image descriptions by generating descriptions for the image alone without considering the surrounding webpage context. To explore how to integrate image context into image descriptions, we designed a Chrome Extension that automatically extracts webpage context to inform GPT-4V-generated image descriptions. We gained feedback from 12 BLV participants in a user study comparing typical context-free image descriptions to context-aware image descriptions. We then further evaluated our…
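
To make the difference between context-free and context-aware prompting concrete, here is a minimal TypeScript sketch. It is not the authors' implementation: this summary does not specify what context the extension extracts or how the prompt is worded, so the PageContext fields, the prompt text, and the 500-character budget are all assumptions chosen for illustration. The sketch only assembles prompt text and makes no model call.

```typescript
// Hypothetical shape for context pulled from a page. The field names are
// illustrative assumptions, not taken from the paper.
interface PageContext {
  pageTitle: string;
  altText: string | null; // existing alt text, if the page supplies any
  surroundingText: string; // text found near the image
}

// Assumed per-field character budget so long pages do not swamp the prompt.
const MAX_CONTEXT_CHARS = 500;

// Collapse runs of whitespace and cut the text to at most maxChars characters.
function truncate(text: string, maxChars: number): string {
  const collapsed = text.replace(/\s+/g, " ").trim();
  return collapsed.length <= maxChars
    ? collapsed
    : collapsed.slice(0, maxChars) + "...";
}

// Context-free baseline: the model is given the image and this instruction only.
function buildContextFreePrompt(): string {
  return "Describe this image for a blind or low vision reader.";
}

// Context-aware variant: the same instruction plus details from the webpage.
function buildContextAwarePrompt(ctx: PageContext): string {
  const lines: string[] = [
    buildContextFreePrompt(),
    "Use the webpage context below to decide which details are relevant.",
    `Page title: ${truncate(ctx.pageTitle, MAX_CONTEXT_CHARS)}`,
  ];
  // Only mention alt text when the page actually provides some.
  if (ctx.altText !== null && ctx.altText.trim() !== "") {
    lines.push(`Existing alt text: ${truncate(ctx.altText, MAX_CONTEXT_CHARS)}`);
  }
  lines.push(
    `Text near the image: ${truncate(ctx.surroundingText, MAX_CONTEXT_CHARS)}`
  );
  return lines.join("\n");
}
```

In a real extension, the PageContext values would come from the page's DOM and the resulting string would be sent to a vision-to-language model together with the image. Keeping the prompt builder a pure function makes it easy to compare the two conditions side by side.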
