TL;DR
This paper introduces a hierarchical context tagging approach for utterance rewriting that improves coreference resolution and grammaticality by predicting slotted rules and filling them with context spans, outperforming existing methods.
Contribution
The proposed hierarchical context tagger (HCT) combines slotted-rule prediction with span filling, so a rewrite can include tokens that do not appear in the dialogue context, such as prepositions needed for grammaticality.
Findings
HCT outperforms state-of-the-art systems by ~2 BLEU points.
HCT handles rewrites that need out-of-context tokens or more than one context span.
Clustering rules reduces the long tail of the rule distribution.
Abstract
Utterance rewriting aims to recover coreferences and omitted information from the latest turn of a multi-turn dialogue. Recently, methods that tag rather than linearly generate sequences have proven stronger in both in- and out-of-domain rewriting settings. This is due to a tagger's smaller search space as it can only copy tokens from the dialogue context. However, these methods may suffer from low coverage when phrases that must be added to a source utterance cannot be covered by a single context span. This can occur in languages like English that introduce tokens such as prepositions into the rewrite for grammaticality. We propose a hierarchical context tagger (HCT) that mitigates this issue by predicting slotted rules (e.g., "besides_") whose slots are later filled with context spans. HCT (i) tags the source string with token-level edit actions and slotted rules and (ii) fills in the…
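The two-step idea described in the abstract (tag the source with edit actions and slotted rules, then fill the slots with context spans) can be illustrated with a small sketch. This is not the authors' implementation: the names `fill_rule` and `rewrite`, the tag layout `(keep, rule, spans)`, writing each slot as a standalone `_` token, and inserting the filled rule before the tagged token are all assumptions made for illustration. The tags are supplied by hand here, whereas HCT predicts them with a model.

```python
from typing import List, Optional, Tuple

# Hypothetical span format: [start, end) indices into the context tokens.
Span = Tuple[int, int]
# Hypothetical tag format: (keep the source token?, slotted rule or None, spans for its slots).
Tag = Tuple[bool, Optional[str], List[Span]]


def fill_rule(rule: str, context: List[str], spans: List[Span]) -> List[str]:
    """Replace each '_' slot in the rule, left to right, with a context span."""
    filled: List[str] = []
    remaining = iter(spans)
    for piece in rule.split():
        if piece == "_":
            start, end = next(remaining)
            filled.extend(context[start:end])
        else:
            # Out-of-context token carried by the rule itself (e.g. "besides").
            filled.append(piece)
    return filled


def rewrite(source: List[str], context: List[str], tags: List[Tag]) -> List[str]:
    """Apply token-level keep/delete actions and insert filled rules before tagged tokens."""
    result: List[str] = []
    for token, (keep, rule, spans) in zip(source, tags):
        if rule is not None:
            result.extend(fill_rule(rule, context, spans))
        if keep:
            result.append(token)
    return result


# Toy dialogue: the context mentions "jazz"; the latest turn omits it.
context = "do you like jazz".split()
source = "what else do you enjoy ?".split()
tags: List[Tag] = [(True, None, [])] * 5 + [(True, "besides _", [(3, 4)])]

rewritten = rewrite(source, context, tags)
print(" ".join(rewritten))
```

In this toy case the word "besides" never appears in the context, so a tagger that can only copy context spans could not produce it; the rule supplies that token and the slot supplies "jazz".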
