DescribePro: Collaborative Audio Description with Human-AI Interaction

Maryam Cheema, Sina Elahimanesh, Samuel Martin, Pooyan Fazli, and Hasti Seifi

arXiv:2508.01092·cs.HC·August 5, 2025

TL;DR

DescribePro is a collaborative system that combines human expertise and AI to improve the efficiency and quality of audio description creation for visually impaired audiences, supporting iterative refinement and community collaboration.

Contribution

The paper introduces DescribePro, a novel collaborative platform that integrates multimodal large language models and manual editing for improved audio description authoring.

Findings

01

AI support reduces repetitive work for describers

02

Helps professionals preserve stylistic choices

03

Eases cognitive load for novice describers

Abstract

Audio description (AD) makes video content accessible to millions of blind and low vision (BLV) users. However, creating high-quality AD involves a trade-off between the precision of human-crafted descriptions and the efficiency of AI-generated ones. To address this, we present DescribePro, a collaborative AD authoring system that enables describers to iteratively refine AI-generated descriptions through multimodal large language model prompting and manual editing. DescribePro also supports community collaboration by allowing users to fork and edit existing ADs, enabling the exploration of different narrative styles. We evaluate DescribePro with 18 describers (9 professionals and 9 novices) using quantitative and qualitative methods. Results show that AI support reduces repetitive work while helping professionals preserve their stylistic choices and easing the cognitive load for novices.…
