Mapping the Dialog Act Annotations of the LEGO Corpus into the Communicative Functions of ISO 24617-2
Eugénio Ribeiro, Ricardo Ribeiro, David Martins de Matos

TL;DR
This paper develops strategies to convert dialog act annotations from the LEGO corpus into the ISO 24617-2 standard, increasing the amount of standardized annotated data for dialog research.
Contribution
It introduces a mapping approach that expands annotated data for the ISO 24617-2 standard using the LEGO corpus, addressing data scarcity and domain dependency issues.
Findings
Added 347 dialogs annotated with ISO 24617-2
Enhanced data availability for dialog research
Facilitated standardization of dialog act annotations
Abstract
In this paper, we present strategies for mapping the dialog act annotations of the LEGO corpus into the communicative functions of the ISO 24617-2 standard. Using these strategies, we obtained an additional 347 dialogs annotated according to the standard. This is particularly important because the standard is recent and little data annotated according to it exists. Furthermore, these dialogs come from a corpus that is widely explored for dialog-related tasks. However, its dialog act annotations have been neglected because of their high domain dependency, which makes them of little use outside the context of the corpus. Thus, through our mapping process, we both obtain more data annotated according to a recent standard and provide useful dialog act annotations for a corpus that is widely explored in dialog research.
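As a rough illustration of what a label-to-function mapping can look like, the sketch below uses a lookup table from domain-dependent labels to ISO 24617-2 communicative functions. The source labels (`ask_departure_place` and so on) and the helper names `map_dialog` and `unmapped` are hypothetical and are not taken from the LEGO corpus or from the paper. Only the target names (`setQuestion`, `checkQuestion`, `inform`, `request`) are communicative functions defined by the standard. The paper's actual strategies may be context-dependent rather than a plain table.

```python
from collections import Counter

# Hypothetical domain-dependent labels (NOT the real LEGO label set),
# each mapped to an ISO 24617-2 general-purpose communicative function.
LABEL_TO_ISO = {
    "ask_departure_place": "setQuestion",
    "confirm_departure_place": "checkQuestion",
    "give_bus_schedule": "inform",
    "ask_repeat": "request",
}


def map_dialog(labels, mapping=LABEL_TO_ISO, fallback=None):
    """Map each source label to an ISO function; unknown labels get `fallback`."""
    return [mapping.get(label, fallback) for label in labels]


def unmapped(labels, mapping=LABEL_TO_ISO):
    """Count labels that have no entry in the mapping, for manual review."""
    return Counter(label for label in labels if label not in mapping)


if __name__ == "__main__":
    dialog = ["ask_departure_place", "give_bus_schedule", "greet_user"]
    print(map_dialog(dialog))
    print(unmapped(dialog))
```

Counting the labels that fall through the table is one simple way to see which parts of a domain-dependent annotation scheme still need a mapping rule.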
Taxonomy
Topics: Speech and dialogue systems · Natural Language Processing Techniques · Multi-Agent Systems and Negotiation
