AI4Reading: Chinese Audiobook Interpretation System Based on Multi-Agent Collaboration

Minjiang Huang; Jipeng Qiang; Yi Zhu; Chaowei Zhang; Xiangyu Zhao; Kui Yu

arXiv:2512.23300·cs.CL·December 30, 2025




TL;DR

AI4Reading is a multi-agent system that uses large language models and speech synthesis to automate the creation of Chinese audiobook interpretations, aiming for accuracy and clarity.

Contribution

This paper introduces a multi-agent framework that uses LLMs and speech synthesis to automate the creation of audiobook interpretations, with the aim of improving production efficiency and content quality.

Findings

01

Generated scripts are simpler and more accurate than expert interpretations.

02

The system achieves key objectives of content preservation, clarity, and logical structure.

03

Speech quality still needs improvement compared to human narration.

Abstract

Audiobook interpretations are attracting increasing attention, as they provide accessible and in-depth analyses of books that offer readers practical insights and intellectual inspiration. However, their manual creation process remains time-consuming and resource-intensive. To address this challenge, we propose AI4Reading, a multi-agent collaboration system leveraging large language models (LLMs) and speech synthesis technology to generate podcast-like audiobook interpretations. The system is designed to meet three key objectives: accurate content preservation, enhanced comprehensibility, and a logical narrative structure. To achieve these goals, we develop a framework composed of 11 specialized agents, including topic analysts, case analysts, editors, a narrator, and proofreaders, that work in concert to explore themes, extract real-world cases, refine content organization, and…
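The role-based collaboration described in the abstract can be pictured as a chain of agents that each transform a shared draft. The following is a minimal sketch under stated assumptions, not the authors' implementation: the `Draft` type, the agent functions, the ordering of roles, and the string-manipulation stubs standing in for LLM calls are all hypothetical, and only five of the roles named in the abstract are represented.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Draft:
    """Shared state handed from agent to agent (hypothetical structure)."""
    source_text: str
    topics: List[str] = field(default_factory=list)
    cases: List[str] = field(default_factory=list)
    script: str = ""
    log: List[str] = field(default_factory=list)


Agent = Callable[[Draft], Draft]


def topic_analyst(d: Draft) -> Draft:
    # Stub: a real agent would prompt an LLM; here each sentence is a "topic".
    d.topics = [s.strip() for s in d.source_text.split(".") if s.strip()]
    d.log.append("topic_analyst")
    return d


def case_analyst(d: Draft) -> Draft:
    # Stub: attach one placeholder case per topic.
    d.cases = [f"[case illustrating: {t}]" for t in d.topics]
    d.log.append("case_analyst")
    return d


def editor(d: Draft) -> Draft:
    # Stub: organise topics and cases into one line per topic.
    d.script = "\n".join(f"{t}: {c}" for t, c in zip(d.topics, d.cases))
    d.log.append("editor")
    return d


def narrator(d: Draft) -> Draft:
    # Stub: add a spoken-style opening line.
    d.script = "Welcome to today's interpretation.\n" + d.script
    d.log.append("narrator")
    return d


def proofreader(d: Draft) -> Draft:
    # Stub: trim whitespace and drop empty lines.
    d.script = "\n".join(
        line.strip() for line in d.script.splitlines() if line.strip()
    )
    d.log.append("proofreader")
    return d


def run_pipeline(text: str, agents: List[Agent]) -> Draft:
    """Pass a draft through each agent in order."""
    draft = Draft(source_text=text)
    for agent in agents:
        draft = agent(draft)
    return draft


PIPELINE: List[Agent] = [topic_analyst, case_analyst, editor, narrator, proofreader]

if __name__ == "__main__":
    result = run_pipeline("Habits compound. Small steps matter.", PIPELINE)
    print(result.script)
```

In a real system each stub would wrap an LLM call, and the final script would be handed to a speech synthesis step; neither is modelled here.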


Taxonomy

Topics: Topic Modeling · Advanced Text Analysis Techniques · Speech and dialogue systems