Coconstructions in spoken data: UD annotation guidelines and first results
Ludovica Pannitto, Sylvain Kahane, Kaja Dobrovoljc, Elena Battaglia, Bruno Guillaume, Caterina Mauri, Eleonora Zucchini

TL;DR
This paper introduces Universal Dependencies (UD) annotation guidelines for syntactic dependencies that cross speaker turns in spoken language. It proposes two representations, one speaker-based and one dependency-based, along with new distinctions for reformulations, repairs, and unfinished phrases.
Contribution
The paper defines how to annotate cross-turn dependencies in UD spoken language treebanks, covering collaborative coconstructions, answers to wh-questions, and backchannels, and offers two alternative representations for them.
Findings
Two representations are proposed for cross-turn dependencies: a speaker-based one that follows the segmentation into speech turns, and a dependency-based one in which dependencies cross speech turns.
New proposals distinguish reformulations from repairs.
New proposals describe how to promote elements in unfinished phrases.
Abstract
The paper proposes annotation guidelines for syntactic dependencies that span speaker turns (including collaborative coconstructions proper, wh-question answers, and backchannels) in spoken language treebanks within the Universal Dependencies framework. Two representations are proposed: a speaker-based representation following the segmentation into speech turns, and a dependency-based representation with dependencies across speech turns. New proposals are also put forward to distinguish between reformulations and repairs, and to promote elements in unfinished phrases.
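The contrast between the two representations can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the `Token` class, the example exchange ("I went to the" / "market"), and the helper functions are hypothetical and are not the paper's actual annotation format. In particular, the speaker-based view here simply detaches cross-turn dependents, whereas the guidelines propose specific rules for promoting elements in unfinished phrases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    idx: int      # position in the whole exchange (1-based)
    form: str
    turn: int     # index of the speech turn the token belongs to
    speaker: str
    head: int     # idx of the syntactic head, 0 for the root
    deprel: str


# Hypothetical coconstruction: speaker A starts a phrase, speaker B completes it.
# This is the dependency-based view: one tree spanning both turns.
EXCHANGE = [
    Token(1, "I", 1, "A", 2, "nsubj"),
    Token(2, "went", 1, "A", 0, "root"),
    Token(3, "to", 1, "A", 5, "case"),
    Token(4, "the", 1, "A", 5, "det"),
    Token(5, "market", 2, "B", 2, "obl"),
]


def cross_turn_links(tokens):
    """Return (dependent, head, deprel) for dependencies that cross a turn boundary."""
    by_idx = {t.idx: t for t in tokens}
    return [
        (t.idx, t.head, t.deprel)
        for t in tokens
        if t.head != 0 and by_idx[t.head].turn != t.turn
    ]


def speaker_based(tokens):
    """Group tokens by turn, detaching any token whose head lies in another turn.

    Simplification: detached tokens just get head 0 within their own turn.
    """
    by_idx = {t.idx: t for t in tokens}
    turns = {}
    for t in tokens:
        crosses = t.head != 0 and by_idx[t.head].turn != t.turn
        turns.setdefault(t.turn, []).append((t.idx, t.form, 0 if crosses else t.head))
    return turns


if __name__ == "__main__":
    print(cross_turn_links(EXCHANGE))
    print(speaker_based(EXCHANGE))
```

In this sketch the dependency-based view keeps "market" attached to "went" across the turn boundary, while the speaker-based view keeps each turn as a separate unit and would need to record the cross-turn links some other way.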
