AuDirector: A Self-Reflective Closed-Loop Framework for Immersive Audio Storytelling

Yiming Ren; Xuenan Xu; Ziyang Zhang; Wen Wu; Baoxiang Li; Chao Zhang

arXiv:2605.11866 · cs.SD · May 21, 2026

TL;DR

AuDirector is a self-reflective, closed-loop multi-agent framework for immersive audio storytelling that aims to improve coherence, expressiveness, and user interactivity.

Contribution

AuDirector introduces a self-reflective multi-agent framework with modules for character-aware synthesis, self-correction, and human-guided refinement in audio storytelling.

Findings

01 Achieves superior coherence, expressiveness, and fidelity over baselines.

02 Effectively integrates natural language feedback for interactive refinement.

03 Demonstrates improved audio quality through systematic self-correction.

Abstract

Despite advances in text and visual generation, creating coherent long-form audio narratives remains challenging. Existing frameworks often suffer from mismatches between character settings and voice performance, insufficient self-correction mechanisms, and limited human interactivity. To address these challenges, we propose AuDirector, a self-reflective closed-loop multi-agent framework. Specifically, it includes an Identity-Aware Pre-production mechanism that transforms narrative texts into character profiles and utterance-level emotional instructions, which are used to retrieve suitable voice candidates and guide expressive speech synthesis, thereby promoting context-aligned voice adaptation. To enhance quality, a Collaborative Synthesis and Correction module introduces a closed-loop self-correction mechanism that systematically audits and regenerates defective audio components. Furthermore, a…
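
The closed-loop self-correction idea in the abstract (audit each audio component, regenerate the defective ones) can be pictured as a simple audit-and-regenerate loop. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: the `Segment` structure, the `synthesize` and `audit` callables, and the `max_rounds` retry cap are all hypothetical names introduced here, and audio is represented by a placeholder string.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Segment:
    """One audio component of a story (hypothetical structure)."""
    segment_id: str
    instruction: str   # e.g. an utterance-level emotional instruction
    audio: str         # placeholder standing in for synthesized audio
    attempts: int = 0  # number of regenerations performed so far


def closed_loop_correct(
    segments: List[Segment],
    synthesize: Callable[[Segment], str],
    audit: Callable[[Segment], bool],
    max_rounds: int = 3,
) -> Dict[str, bool]:
    """Audit every segment and regenerate only those that fail.

    Returns a map from segment_id to whether the segment finally
    passed the audit. The retry cap is an assumption of this sketch.
    """
    passed: Dict[str, bool] = {}
    for seg in segments:
        ok = audit(seg)
        # Regenerate a defective segment until it passes or the cap is hit.
        while not ok and seg.attempts < max_rounds:
            seg.audio = synthesize(seg)
            seg.attempts += 1
            ok = audit(seg)
        passed[seg.segment_id] = ok
    return passed
```

In a real system the `audit` callable would presumably be a model-based checker and `synthesize` a speech or sound generator; here both are left abstract so that only the control flow of the loop is shown. Segments that already pass are never regenerated, which keeps the correction targeted at defective components.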

Code & Models

Repositories

https://anonymous-itsh.github.io
github
