Schema-Guided Semantic Accuracy: Faithfulness in Task-Oriented Dialogue Response Generation
Jinghong Chen, Weizhe Lin, Bill Byrne

TL;DR
This paper introduces Schema-Guided Semantic Accuracy (SGSAcc), a new evaluation metric for task-oriented dialogue generation that assesses faithfulness across categorical and non-categorical slots, and proposes methods to improve generation fidelity.
Contribution
The paper proposes SGSAcc, a novel evaluation metric based on textual entailment, and demonstrates how prefix tuning and ensemble methods improve faithful utterance generation.
Findings
SGSAcc aligns well with human judgment.
Prefix tuning improves categorical slot generation.
Ensemble models achieve the lowest SER and high SGSAcc.
Abstract
Ensuring that generated utterances are faithful to dialogue actions is crucial for Task-Oriented Dialogue Response Generation. Slot Error Rate (SER) only partially measures generation quality in that it solely assesses utterances generated from non-categorical slots whose values are expected to be reproduced exactly. Utterances generated from categorical slots, which are more variable, are not assessed by SER. We propose Schema-Guided Semantic Accuracy (SGSAcc) to evaluate utterances generated from both categorical and non-categorical slots by recognizing textual entailment. We show that SGSAcc can be applied to evaluate utterances generated from a wide range of dialogue actions in the Schema Guided Dialogue (SGD) dataset with good agreement with human judgment. We also identify a previously overlooked weakness in generating faithful utterances from categorical slots in unseen domains.…
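The contrast the abstract draws between the two metrics can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes SER counts an utterance as an error when any non-categorical slot value fails to appear verbatim (case-insensitive here), and that an entailment-based score checks whether each utterance entails a set of hypothesis sentences describing its dialogue actions. The field names (`utterance`, `non_categorical`, `hypotheses`), the example data, and the `entails` callable are all hypothetical; in practice `entails` would wrap a textual entailment model.

```python
from typing import Callable, Dict, List


def missing_slots(utterance: str, non_categorical: Dict[str, str]) -> List[str]:
    """Return the non-categorical slots whose values do not appear in the utterance."""
    text = utterance.lower()
    return [slot for slot, value in non_categorical.items()
            if value.lower() not in text]


def slot_error_rate(examples: List[dict]) -> float:
    """Fraction of utterances missing at least one non-categorical slot value.

    Assumption: exact (case-insensitive) substring matching stands in for
    the matching rule used by SER.
    """
    if not examples:
        return 0.0
    wrong = sum(1 for ex in examples
                if missing_slots(ex["utterance"], ex["non_categorical"]))
    return wrong / len(examples)


def entailment_accuracy(examples: List[dict],
                        entails: Callable[[str, str], bool]) -> float:
    """Fraction of utterances that entail every hypothesis for their actions.

    `entails(premise, hypothesis)` is a caller-supplied placeholder for a
    textual entailment model; it is not part of any real library.
    """
    if not examples:
        return 0.0
    correct = sum(1 for ex in examples
                  if all(entails(ex["utterance"], h) for h in ex["hypotheses"]))
    return correct / len(examples)


# Illustrative data only: the second utterance expresses a categorical slot
# (serves alcohol = True) in free wording, which exact matching cannot check.
EXAMPLES = [
    {"utterance": "I found a table at Nopa for 7 pm.",
     "non_categorical": {"restaurant_name": "Nopa", "time": "7 pm"},
     "hypotheses": ["The restaurant is Nopa."]},
    {"utterance": "Yes, they serve alcohol.",
     "non_categorical": {"restaurant_name": "Nopa"},
     "hypotheses": ["The restaurant serves alcohol."]},
]

print(slot_error_rate(EXAMPLES))  # 0.5: the second utterance omits "Nopa"
```

The entailment check is passed in as a function so that the scoring logic stays independent of whichever entailment model is chosen.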
Taxonomy
Topics: Topic Modeling · Speech and dialogue systems · Natural Language Processing Techniques
Methods: Gated Linear Unit · Attention Is All You Need · Linear Layer · Byte Pair Encoding · Inverse Square Root Schedule · Multi-Head Attention · Residual Connection · Dense Connections · Dropout
