ISO-Standard Domain-Independent Dialogue Act Tagging for Conversational Agents
Stefano Mezza, Alessandra Cervone, Giuliano Tortoreto, Evgeny A. Stepanov, Giuseppe Riccardi

TL;DR
This paper introduces a methodology for mapping several publicly available dialogue act datasets to a subset of the ISO standard, enabling the training of a domain-independent dialogue act tagger for open-domain conversational agents.
Contribution
It proposes a method for mapping multiple publicly available datasets to a subset of the ISO standard, creating a large, task-independent training corpus for dialogue act classification.
Findings
Training on combined corpora improves robustness across DA categories.
The ISO-standard DA tagger performs well on out-of-domain data.
Mapping datasets to ISO standard enables cross-domain applicability.
Abstract
Dialogue Act (DA) tagging is crucial for spoken language understanding systems, as it provides a general representation of speakers' intents, not bound to a particular dialogue system. Unfortunately, publicly available data sets with DA annotation are all based on different annotation schemes and thus incompatible with each other. Moreover, their schemes often do not cover all aspects necessary for open-domain human-machine interaction. In this paper, we propose a methodology to map several publicly available corpora to a subset of the ISO standard, in order to create a large task-independent training corpus for DA classification. We show the feasibility of using this corpus to train a domain-independent DA tagger, testing it on out-of-domain conversational data, and argue the importance of training on multiple corpora to achieve robustness across different DA categories.
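The corpus-mapping idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' mapping: the tables below are hypothetical and deliberately tiny, the target label names are only written in the style of ISO communicative functions, `toy_corpus` and its tags are invented for illustration, and dropping utterances whose tag has no mapping is an assumption of this sketch rather than something the abstract states.

```python
# Illustrative sketch only: hypothetical mapping tables, not the paper's actual ones.

# A few Switchboard-style source tags mapped to ISO-style target labels (assumed).
SWBD_TO_ISO = {
    "sd": "inform",                 # statement, non-opinion
    "sv": "inform",                 # statement, opinion
    "qy": "propositionalQuestion",  # yes/no question
    "qw": "setQuestion",            # wh-question
}

# An invented second corpus with its own, incompatible tag names.
TOY_TO_ISO = {
    "statement": "inform",
    "wh_question": "setQuestion",
}

CORPUS_MAPS = {"swda": SWBD_TO_ISO, "toy_corpus": TOY_TO_ISO}


def map_corpus(name, utterances):
    """Relabel (text, source_tag) pairs with the shared target labels.

    Utterances whose tag has no entry in the table are skipped
    (an assumption of this sketch).
    """
    table = CORPUS_MAPS[name]
    mapped = []
    for text, tag in utterances:
        target = table.get(tag)
        if target is not None:
            mapped.append((text, target))
    return mapped


def build_unified(corpora):
    """Concatenate all mapped corpora into one training set."""
    unified = []
    for name, utterances in corpora.items():
        unified.extend(map_corpus(name, utterances))
    return unified


corpora = {
    "swda": [("I live in Trento.", "sd"), ("Do you like it?", "qy"), ("uh-huh", "unmapped_tag")],
    "toy_corpus": [("Where is it?", "wh_question")],
}
unified = build_unified(corpora)
```

Once every corpus shares one label set, the combined list can be fed to any standard text classifier, which is what makes training on multiple corpora at once possible.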
Taxonomy
Topics: Speech and dialogue systems · Natural Language Processing Techniques · Topic Modeling
