Coconstructions in spoken data: UD annotation guidelines and first results

Ludovica Pannitto; Sylvain Kahane; Kaja Dobrovoljc; Elena Battaglia; Bruno Guillaume; Caterina Mauri; Eleonora Zucchini

arXiv:2603.28261 · cs.CL · March 31, 2026

TL;DR

This paper introduces Universal Dependencies (UD) annotation guidelines for syntactic dependencies that cross speaker turns in spoken language. It proposes two representations of these dependencies, along with new proposals for distinguishing reformulations from repairs and for promoting elements in unfinished phrases.

Contribution

The paper contributes annotation guidelines and two representations for cross-turn dependencies, extending the syntactic annotation of spoken language within the Universal Dependencies framework.

Findings

01. Proposed a speaker-based and a dependency-based representation for cross-turn dependencies.

02. Proposed a distinction between reformulations and repairs, and the promotion of elements in unfinished phrases.

03. Provided annotation guidelines for cross-turn phenomena (coconstructions proper, wh-question answers, and backchannels) in spoken language treebanks.

Abstract

The paper proposes annotation guidelines for syntactic dependencies that span speaker turns (including collaborative coconstructions proper, wh-question answers, and backchannels) in spoken language treebanks within the Universal Dependencies framework. Two representations are proposed: a speaker-based representation that follows the segmentation into speech turns, and a dependency-based representation in which dependencies cross speech turns. New proposals are also put forward to distinguish between reformulations and repairs, and to promote elements in unfinished phrases.
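
The relation between the two representations can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `Token`, `Turn`, and `CrossTurnLink` classes, the `merge_turns` function, the invented two-turn example, and the way the cross-turn link is encoded are all hypothetical. The sketch stores one tree per speech turn (a speaker-based view) and derives a single tree in which the second speaker's contribution attaches to the first speaker's verb (a dependency-based view).

```python
from dataclasses import dataclass


@dataclass
class Token:
    idx: int      # 1-based position within its own turn
    form: str
    head: int     # head position within the same turn; 0 marks the turn's root
    deprel: str


@dataclass
class Turn:
    speaker: str
    tokens: list


@dataclass
class CrossTurnLink:
    # Hypothetical encoding: the root of turn `dep_turn` depends on
    # token `head_idx` of the earlier turn `head_turn`.
    dep_turn: int
    head_turn: int
    head_idx: int
    deprel: str


def merge_turns(turns, links):
    """Derive one tree over all turns from per-turn trees plus cross-turn links."""
    # Offset of each turn's first token in the merged numbering.
    offsets, total = [], 0
    for turn in turns:
        offsets.append(total)
        total += len(turn.tokens)

    link_by_turn = {link.dep_turn: link for link in links}
    merged = []
    for t, turn in enumerate(turns):
        for tok in turn.tokens:
            if tok.head == 0 and t in link_by_turn:
                # The turn's root is re-attached across the turn boundary.
                link = link_by_turn[t]
                head = offsets[link.head_turn] + link.head_idx
                deprel = link.deprel
            elif tok.head == 0:
                head, deprel = 0, tok.deprel
            else:
                head, deprel = offsets[t] + tok.head, tok.deprel
            merged.append((offsets[t] + tok.idx, tok.form, head, deprel, turn.speaker))
    return merged


# Invented example: speaker A starts a clause, speaker B supplies the object.
turns = [
    Turn("A", [Token(1, "she", 2, "nsubj"), Token(2, "bought", 0, "root")]),
    Turn("B", [Token(1, "a", 2, "det"), Token(2, "car", 0, "root")]),
]
links = [CrossTurnLink(dep_turn=1, head_turn=0, head_idx=2, deprel="obj")]

merged = merge_turns(turns, links)
for row in merged:
    print(row)
```

In the per-turn view, "car" is the root of speaker B's turn; in the merged view it becomes the `obj` of "bought" from speaker A's turn, so the whole exchange has a single root. The relation labels used here (`nsubj`, `det`, `obj`, `root`) are standard UD relations, but which label the guidelines assign to a given cross-turn link is not stated in the abstract.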
