EMNS /Imz/ Corpus: An emotive single-speaker dataset for narrative storytelling in games, television and graphic novels

Kari Ali Noriy; Xiaosong Yang; Jian Jun Zhang

arXiv:2305.13137·cs.CL·May 26, 2023·2 cites

TL;DR

The EMNS corpus provides a high-quality, emotive speech dataset with labeled emotional states, enhancing naturalness and expressiveness in text-to-speech applications for narrative storytelling in media.

Contribution

The paper introduces a novel, labeled emotive speech dataset with high authenticity, supporting improved emotional expressiveness in TTS systems for storytelling.

Findings

01 Achieved the highest average scores for accurately conveying emotions and demonstrating expressiveness among the datasets evaluated

02 Outperformed other datasets in conveying shared emotions

03 Participants recognized recordings as genuine and expressive

Abstract

The increasing adoption of text-to-speech technologies has led to a growing demand for natural and emotive voices that adapt to a conversation's context and emotional tone. The Emotive Narrative Storytelling (EMNS) corpus is a unique speech dataset created to enhance conversations' expressiveness and emotive quality in interactive narrative-driven systems. The corpus consists of a 2.3-hour recording featuring a female speaker delivering labelled utterances. It encompasses eight acted emotional states, evenly distributed with a variance of 0.68%, along with expressiveness levels and natural language descriptions with word emphasis labels. The evaluation of audio samples from different datasets revealed that the EMNS corpus achieved the highest average scores in accurately conveying emotions and demonstrating expressiveness. It outperformed other datasets in conveying shared emotions and…
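The abstract reports that the eight emotional states are evenly distributed but does not say how the 0.68% variance figure is computed. As a minimal sketch of one plausible balance check, the snippet below computes each emotion's percentage share of utterances and the population variance of those shares. The function names, the label names, and the toy counts are all hypothetical and are not taken from the corpus.

```python
from collections import Counter
from statistics import pvariance


def emotion_shares(labels):
    """Return each emotion's share of all utterances, in percent."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {emotion: 100.0 * n / total for emotion, n in counts.items()}


def share_spread(labels):
    """Population variance of the per-emotion percentage shares.

    This is only one possible reading of "variance"; the abstract
    does not define the measure it reports.
    """
    return pvariance(list(emotion_shares(labels).values()))


# Hypothetical toy labels for illustration only (not corpus data).
toy_labels = ["happy"] * 5 + ["sad"] * 5 + ["angry"] * 6 + ["neutral"] * 4

shares = emotion_shares(toy_labels)
spread = share_spread(toy_labels)

for emotion, share in sorted(shares.items()):
    print(f"{emotion}: {share:.1f}%")
print(f"variance of shares: {spread:.2f}")
```

A perfectly balanced label set would give a spread of zero, so smaller values indicate a more even distribution across emotions.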

Code & Models

Repositories

knoriy/emns-dct
none · Official

Datasets

amu-cai/CAMEO
dataset · 356 dl

Taxonomy

Topics: Speech and dialogue systems · Humor Studies and Applications · Subtitles and Audiovisual Media