DescribePro: Collaborative Audio Description with Human-AI Interaction
Maryam Cheema, Sina Elahimanesh, Samuel Martin, Pooyan Fazli, and Hasti Seifi

TL;DR
DescribePro is a collaborative system that combines human expertise and AI to improve the efficiency and quality of audio description creation for visually impaired audiences, supporting iterative refinement and community collaboration.
Contribution
The paper introduces DescribePro, a novel collaborative platform that integrates multimodal large language models and manual editing for improved audio description authoring.
Findings
AI support reduces repetitive work for describers
AI support helps professionals preserve their stylistic choices
AI support eases the cognitive load for novice describers
Abstract
Audio description (AD) makes video content accessible to millions of blind and low vision (BLV) users. However, creating high-quality AD involves a trade-off between the precision of human-crafted descriptions and the efficiency of AI-generated ones. To address this, we present DescribePro, a collaborative AD authoring system that enables describers to iteratively refine AI-generated descriptions through multimodal large language model prompting and manual editing. DescribePro also supports community collaboration by allowing users to fork and edit existing ADs, enabling the exploration of different narrative styles. We evaluate DescribePro with 18 describers (9 professionals and 9 novices) using quantitative and qualitative methods. Results show that AI support reduces repetitive work while helping professionals preserve their stylistic choices and easing the cognitive load for novices.…
