Knowing What You Know: Calibrating Dialogue Belief State Distributions via Ensembles

Carel van Niekerk; Michael Heck; Christian Geishauser; Hsien-Chin Lin; Nurul Lubis; Marco Moresi; Milica Gašić

arXiv:2010.02586 · cs.CL · November 24, 2020

TL;DR

This paper introduces a calibrated ensemble of models for multi-domain dialogue belief tracking. It reports state-of-the-art calibration among belief trackers and higher accuracy than previous belief tracking models.

Contribution

It presents a calibrated ensemble of models for multi-domain dialogue belief tracking that achieves state-of-the-art calibration and outperforms previous belief trackers in accuracy.

Findings

01

Achieves state-of-the-art calibration in belief distributions.

02

Outperforms previous belief trackers in accuracy.

03

Provides more reliable dialogue state estimates.

Abstract

The ability to accurately track what happens during a conversation is essential for the performance of a dialogue system. Current state-of-the-art multi-domain dialogue state trackers achieve just over 55% accuracy on the current go-to benchmark, which means that in almost every second dialogue turn they place full confidence in an incorrect dialogue state. Belief trackers, on the other hand, maintain a distribution over possible dialogue states. However, they lack in performance compared to dialogue state trackers, and do not produce well calibrated distributions. In this work we present state-of-the-art performance in calibration for multi-domain dialogue belief trackers using a calibrated ensemble of models. Our resulting dialogue belief tracker also outperforms previous dialogue belief tracking models in terms of accuracy.
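To make the calibration argument concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names `average_ensemble` and `expected_calibration_error`, the uniform averaging of ensemble members, and the equal-width binning are all assumptions chosen for illustration. The sketch shows two things: how averaging several members' distributions over a slot's values softens an overconfident prediction, and how expected calibration error (ECE) penalizes a tracker that is always fully confident but only right about 55% of the time.

```python
from typing import List, Sequence


def average_ensemble(member_probs: Sequence[Sequence[float]]) -> List[float]:
    """Uniformly average the members' distributions over the same slot values.

    Assumption: every member outputs a normalized distribution of equal length.
    """
    n_members = len(member_probs)
    n_values = len(member_probs[0])
    return [
        sum(member[i] for member in member_probs) / n_members
        for i in range(n_values)
    ]


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """ECE with equal-width confidence bins.

    For each bin, compare the mean confidence with the empirical accuracy,
    then weight the absolute gap by the fraction of predictions in that bin.
    """
    n = len(confidences)
    bins: List[List[tuple]] = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 belongs to the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece


# Hypothetical ensemble of three members scoring three candidate slot values.
members = [
    [0.90, 0.05, 0.05],
    [0.20, 0.70, 0.10],
    [0.40, 0.50, 0.10],
]
ensemble_distribution = average_ensemble(members)

# A tracker that is always fully confident but correct on 55 of 100 turns.
overconfident_ece = expected_calibration_error(
    [1.0] * 100, [True] * 55 + [False] * 45
)
```

In this toy setup the first member puts 0.90 on one value, while the averaged distribution spreads its mass more evenly because the members disagree. The always-confident tracker ends up with an ECE of roughly 0.45, which is the gap between its stated confidence and its accuracy.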
