Improving Open-Domain Dialogue Evaluation with a Causal Inference Model
Cat P. Le, Luke Dai, Michael Johnston, Yang Liu, Marilyn Walker, Reza Ghanadan

TL;DR
This paper introduces a novel causal inference model called CF-LSTM for predicting user and expert ratings of open-domain dialogues, improving evaluation accuracy in conversational systems.
Contribution
The paper proposes CF-LSTM, a new causal inference approach that predicts dialogue ratings by modeling underlying causes from turn-level features.
Findings
CF-LSTM outperforms baseline models in rating prediction accuracy.
The model effectively captures causal factors influencing user satisfaction.
Experimental results demonstrate improved classification performance.
Abstract
Effective evaluation methods remain a significant challenge for research on open-domain conversational dialogue systems. Explicit satisfaction ratings can be elicited from users, but users often do not provide ratings when asked, and those they give can be highly subjective. Post-hoc ratings by experts are an alternative, but these can be both expensive and complex to collect. Here, we explore the creation of automated methods for predicting both expert and user ratings of open-domain dialogues. We compare four different approaches. First, we train a baseline model using an end-to-end transformer to predict ratings directly from the raw dialogue text. The other three methods are variants of a two-stage approach in which we first extract interpretable features at the turn level that capture, among other aspects, user dialogue behaviors indicating contradiction, repetition, disinterest,…
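The second stage of the two-stage approach described above consumes turn-level features. As a rough illustration only, the sketch below runs a plain LSTM (not the paper's CF-LSTM, whose details are not given here) over a sequence of per-turn feature vectors and maps the final hidden state to a score in (0, 1). The feature layout `[contradiction, repetition, disinterest]`, the function names, and the untrained random weights are all assumptions made for the example.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def init_params(n_in, n_hidden, seed=0):
    # Hypothetical, untrained weights: one (W, b) pair per gate.
    rng = random.Random(seed)

    def mat():
        return [[rng.uniform(-0.1, 0.1) for _ in range(n_in + n_hidden)]
                for _ in range(n_hidden)]

    return {gate: (mat(), [0.0] * n_hidden) for gate in ("i", "f", "o", "g")}


def lstm_step(x, h, c, params):
    # One standard LSTM step; each W acts on the concatenation [x; h].
    z = x + h
    pre = {gate: [a + b for a, b in zip(matvec(W, z), bias)]
           for gate, (W, bias) in params.items()}
    i = [sigmoid(v) for v in pre["i"]]    # input gate
    f = [sigmoid(v) for v in pre["f"]]    # forget gate
    o = [sigmoid(v) for v in pre["o"]]    # output gate
    g = [math.tanh(v) for v in pre["g"]]  # candidate cell update
    c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
    h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
    return h_new, c_new


def predict_rating_score(turn_features, params, w_out, b_out):
    # Run the LSTM over all turns, then squash the final hidden state
    # through a linear layer and a sigmoid to get a score in (0, 1).
    n_hidden = len(w_out)
    h = [0.0] * n_hidden
    c = [0.0] * n_hidden
    for x in turn_features:
        h, c = lstm_step(x, h, c, params)
    return sigmoid(sum(w * v for w, v in zip(w_out, h)) + b_out)


# Assumed per-turn features: [contradiction, repetition, disinterest].
dialogue = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
params = init_params(n_in=3, n_hidden=4)
score = predict_rating_score(dialogue, params, w_out=[0.5] * 4, b_out=0.0)
```

With trained weights, `score` could be thresholded to classify a dialogue as high or low rated; here the weights are random, so the value carries no meaning beyond showing the data flow.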
Taxonomy
Topics: Speech and dialogue systems · Topic Modeling · Expert finding and Q&A systems
Methods: Tanh Activation · Sigmoid Activation · Long Short-Term Memory
