Automatic Annotation of Direct Speech in Written French Narratives

Noé Durandard; Viet-Anh Tran; Gaspard Michel; Elena V. Epure

arXiv:2306.15634 · cs.CL · January 24, 2025

TL;DR

This paper presents a comprehensive framework for automatic annotation of direct speech in French narratives, including dataset creation, baseline adaptation, and evaluation, highlighting challenges and future directions.

Contribution

It introduces the largest-to-date French narrative dataset annotated with direct speech per word, together with a unified framework for designing and evaluating automatic annotation of direct speech (AADS) models in French.

Findings

1. Baseline models show limited generalisation performance.
2. Characteristics of different models influence annotation accuracy.
3. The dataset and framework facilitate future research in French AADS.

Abstract

The automatic annotation of direct speech (AADS) in written text has often been used in computational narrative understanding. Methods based on either rules or deep neural networks have been explored, in particular for English and German. Yet for French, our target language, few works exist. Our goal is to create a unified framework to design and evaluate AADS models in French. For this, we consolidated the largest-to-date French narrative dataset annotated with direct speech (DS) per word; we adapted various baselines for sequence labelling or from AADS in other languages; and we designed and conducted an extensive evaluation focused on generalisation. Results show that the task still requires substantial effort and highlight the characteristics of each baseline. Although this framework could be improved, it is a step towards encouraging more research on the topic.
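To make "annotated with DS per word" concrete, the sketch below shows what a per-word binary labelling could look like, using a deliberately naive rule that marks tokens enclosed in French guillemets. This is an illustration only, not one of the paper's baselines: the function name `label_direct_speech`, the tokenisation regex, and the choice to label the quotation marks themselves as narration (0) are all assumptions made for this example.

```python
import re

# Assumption: direct speech is delimited by French guillemets only.
OPEN_MARK, CLOSE_MARK = "«", "»"


def label_direct_speech(text: str) -> list[tuple[str, int]]:
    """Return (token, label) pairs, where label 1 means 'inside direct speech'.

    Hypothetical rule-based sketch: tokens strictly between « and » get 1,
    everything else (including the marks themselves) gets 0.
    """
    # Split into word tokens and single punctuation tokens; whitespace is dropped.
    tokens = re.findall(r"\w+|[^\w\s]", text)
    labelled = []
    inside = False
    for token in tokens:
        if token == OPEN_MARK:
            inside = True
            labelled.append((token, 0))
        elif token == CLOSE_MARK:
            inside = False
            labelled.append((token, 0))
        else:
            labelled.append((token, int(inside)))
    return labelled


if __name__ == "__main__":
    for token, label in label_direct_speech("Elle dit : « Bonjour ! » puis sortit."):
        print(f"{token}\t{label}")
```

A rule like this only covers one typographic convention. French narratives also commonly introduce dialogue with dashes and no closing mark, which is one reason such rules may generalise poorly across texts and why the task is often framed as sequence labelling instead.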

Code & Models

Repositories

deezer/aads_french (Official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems