S-VoCAL: A Dataset and Evaluation Framework for Inferring Speaking Voice Character Attributes in Literature

Abigail Berthe-Pardo (1), Gaspard Michel (1, 2), Elena V. Epure (2, 3), Christophe Cerisara (1). Affiliations: (1) LORIA, Vandœuvre-lès-Nancy, France; (2) Deezer Research, Paris, France; (3) Idiap Research Institute, Switzerland

arXiv:2603.00958 · cs.CL · March 3, 2026

TL;DR

S-VoCAL introduces a novel dataset and evaluation framework for inferring fictional character voice attributes from literature, enabling better characterization in synthetic narration systems.

Contribution

S-VoCAL provides the first dedicated dataset and evaluation tools for extracting voice-related character attributes from literary texts, including a new similarity metric based on language model embeddings.
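The general idea behind an embedding-based similarity metric can be sketched as follows. This is a minimal illustration only, not the metric defined in the paper: the listing above does not specify the embedding model or the scoring formula. The sketch assumes cosine similarity as the comparison, and `toy_embed` is a hypothetical stand-in (character-trigram counts) for a real language model embedding, used only so the example runs without external dependencies. The function names `toy_embed`, `cosine_similarity`, and `attribute_similarity` are likewise invented for illustration.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for a language-model embedding.

    Maps a string to a sparse vector of character-trigram counts.
    A real system would call an embedding model here instead.
    """
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def attribute_similarity(predicted: str, reference: str) -> float:
    """Score a predicted attribute value against a reference value."""
    return cosine_similarity(toy_embed(predicted), toy_embed(reference))


if __name__ == "__main__":
    # Compare a predicted Origin value against two candidate references.
    print(attribute_similarity("Scottish", "Scotland"))
    print(attribute_similarity("Scottish", "Peru"))
```

A soft score of this kind gives partial credit to near-matching free-text answers (for example, differently worded Origin values) where exact string matching would count them as wrong. With a real embedding model in place of `toy_embed`, the comparison would reflect meaning and not surface spelling.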

Findings

01 A RAG pipeline reliably infers the Age and Gender attributes.

02 The same pipeline struggles to accurately infer Origin and Physical Health.

03 The dataset and code are publicly available.

Abstract

With recent advances in Text-to-Speech (TTS) systems, synthetic audiobook narration has seen increased interest, reaching unprecedented levels of naturalness. However, large gaps remain in synthetic narration systems' ability to impersonate fictional characters and to convey complex emotions or prosody. A promising direction to enhance character identification is the assignment of a plausible voice to each fictional character in a book. This step typically requires complex inference of attributes in book-length contexts, such as a character's age, gender, origin or physical health, which in turn requires dedicated benchmark datasets to evaluate the performance of extraction systems. We present S-VoCAL (Speaking Voice Character Attributes in Literature), the first dataset and evaluation framework dedicated to evaluating the inference of voice-related fictional character attributes. S-VoCAL…

Taxonomy

Topics: Authorship Attribution and Profiling · Topic Modeling · Mental Health via Writing