Inferring Alt-text For UI Icons With Large Language Models During App Development
Sabrina Haque, Christoph Csallner

TL;DR
This paper presents IconDesc, a novel approach using large language models to generate meaningful alt-text for UI icons during app development, improving accessibility without requiring complete UI screens.
Contribution
Introducing IconDesc, a method that fine-tunes LLMs with partial UI data and contextual icon information to generate alt-text, addressing limitations of prior deep learning and vision-language models.
Findings
Significant improvement in relevant alt-text generation
Effective during iterative app development phases
User study confirms enhanced accessibility support
Abstract
Ensuring accessibility in mobile applications remains a significant challenge, particularly for visually impaired users who rely on screen readers. User interface icons are essential for navigation and interaction, yet they often lack meaningful alt-text, creating barriers to effective use. Traditional deep learning approaches for generating alt-text require extensive datasets and struggle with the diversity and imbalance of icon types. More recent Vision Language Models (VLMs) require complete UI screens, which can be impractical during the iterative phases of app development. To address these issues, we introduce a novel method using Large Language Models (LLMs) to autonomously generate informative alt-text for mobile UI icons with partial UI data. By incorporating icon context, including class, resource ID, bounds, OCR-detected text, and contextual information from parent and sibling…
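To make the kind of icon context named in the abstract concrete, here is a minimal sketch of how such fields might be collected and flattened into a text prompt for an LLM. The `IconContext` class, its field names, the `build_prompt` helper, and the prompt wording are all hypothetical illustrations. They are not the paper's actual data format, prompt, or fine-tuning setup, none of which appear in this excerpt.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class IconContext:
    """Hypothetical container for the partial UI data listed in the abstract."""
    class_name: str                      # e.g. an Android widget class
    resource_id: str                     # developer-assigned resource ID
    bounds: Tuple[int, int, int, int]    # assumed (left, top, right, bottom) in pixels
    ocr_text: str = ""                   # text detected on the icon, if any
    parent_class: Optional[str] = None   # assumed summary of the parent node
    sibling_texts: List[str] = field(default_factory=list)  # assumed sibling labels


def build_prompt(icon: IconContext) -> str:
    """Flatten the icon context into a plain-text prompt (format is an assumption)."""
    lines = [
        "Generate short alt-text for this mobile UI icon.",
        f"class: {icon.class_name}",
        f"resource-id: {icon.resource_id}",
        f"bounds: {list(icon.bounds)}",
    ]
    # Optional fields are included only when present, so the prompt
    # still works when only part of the UI is available.
    if icon.ocr_text:
        lines.append(f"ocr-text: {icon.ocr_text}")
    if icon.parent_class:
        lines.append(f"parent: {icon.parent_class}")
    if icon.sibling_texts:
        lines.append("siblings: " + ", ".join(icon.sibling_texts))
    return "\n".join(lines)


if __name__ == "__main__":
    icon = IconContext(
        class_name="android.widget.ImageButton",
        resource_id="com.example.app:id/btn_share",
        bounds=(24, 48, 96, 120),
        parent_class="android.widget.LinearLayout",
        sibling_texts=["Like", "Comment"],
    )
    print(build_prompt(icon))
```

Because every field beyond class, resource ID, and bounds is optional in this sketch, the same helper could be called while a screen is still incomplete, which is the development-time scenario the abstract describes.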
Taxonomy
Topics: Web Data Mining and Analysis
