AltGen: AI-Driven Alt Text Generation for Enhancing EPUB Accessibility
Yixian Shen, Hang Zhang, Yanxin Shen, Lun Wang, Chuanqi Shi, Shaoshuai Du, Yiyi Tao

TL;DR
AltGen is an AI-powered pipeline that automates the generation of descriptive alt text for images in EPUB files, significantly improving accessibility for visually impaired users through advanced generative models and multimodal analysis.
Contribution
This paper introduces AltGen, a novel AI-driven system that combines computer vision and language models to automatically produce high-quality alt text for EPUB images, enhancing accessibility at scale.
Findings
Achieved a 97.5% reduction in accessibility errors.
Outperformed existing methods in accuracy and relevance.
Received positive user feedback on usability improvements.
Abstract
Digital accessibility is a cornerstone of inclusive content delivery, yet many EPUB files fail to meet fundamental accessibility standards, particularly in providing descriptive alt text for images. Alt text plays a critical role in enabling visually impaired users to understand visual content through assistive technologies. However, generating high-quality alt text at scale is a resource-intensive process, creating significant challenges for organizations aiming to ensure accessibility compliance. This paper introduces AltGen, a novel AI-driven pipeline designed to automate the generation of alt text for images in EPUB files. By integrating state-of-the-art generative models, including advanced transformer-based architectures, AltGen achieves contextually relevant and linguistically coherent alt text descriptions. The pipeline encompasses multiple stages, starting with data…
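To make the accessibility gap described in the abstract concrete, the following is a minimal sketch of how images lacking alt text might be located in an EPUB. It is not the AltGen pipeline. It relies only on the fact that an EPUB is a ZIP archive of XHTML content documents and uses the Python standard library. The names `ImgAltAuditor`, `find_images_without_alt`, and `build_demo_epub` are hypothetical, and treating a blank `alt` as missing is an assumption made here (an empty `alt` can be legitimate for purely decorative images).

```python
import io
import zipfile
from html.parser import HTMLParser


class ImgAltAuditor(HTMLParser):
    """Collects the src of every <img> whose alt attribute is missing or blank."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names; self-closing
        # <img ... /> tags are routed here as well.
        if tag != "img":
            return
        attr_map = dict(attrs)
        alt = attr_map.get("alt")
        # Assumption: a blank alt is reported too, although it may be
        # intentional for decorative images.
        if alt is None or not alt.strip():
            self.missing.append(attr_map.get("src") or "")


def find_images_without_alt(epub_file):
    """Return {content document name: [image src, ...]} for images lacking alt text.

    epub_file may be a filesystem path or a binary file-like object.
    """
    missing = {}
    with zipfile.ZipFile(epub_file) as archive:
        for name in archive.namelist():
            if not name.lower().endswith((".xhtml", ".html", ".htm")):
                continue
            auditor = ImgAltAuditor()
            auditor.feed(archive.read(name).decode("utf-8", errors="replace"))
            auditor.close()
            if auditor.missing:
                missing[name] = auditor.missing
    return missing


def build_demo_epub():
    """Build a tiny in-memory archive that mimics an EPUB, for demonstration only."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("mimetype", "application/epub+zip")
        archive.writestr(
            "OEBPS/ch1.xhtml",
            '<html><body><img src="a.png" alt="A cat"/>'
            '<img src="b.png"/></body></html>',
        )
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(find_images_without_alt(build_demo_epub()))
```

In a setting like the one the abstract describes, the images reported by such a scan would be the candidates handed to a captioning model; how AltGen itself performs this step is not stated in the text above.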
Taxonomy
Topics: Text Readability and Simplification
Methods: Contrastive Language-Image Pre-training
