On the Robustness of Dialogue History Representation in Conversational Question Answering: A Comprehensive Study and a New Prompt-based Method
Zorik Gekhman, Nadav Oved, Orgad Keller, Idan Szpektor, Roi Reichart

TL;DR
This paper conducts a large-scale robustness study of dialogue history modeling in Conversational Question Answering, revealing that high benchmark scores do not ensure robustness, and introduces a simple prompt-based method that improves robustness across various settings.
Contribution
It provides the first comprehensive robustness analysis of history modeling in CQA and proposes a novel prompt-based approach that enhances robustness.
Findings
Benchmark scores do not guarantee robustness.
History modeling methods can perform extremely differently under different settings.
The proposed prompt-based method shows strong robustness.
Abstract
Most works on modeling the conversation history in Conversational Question Answering (CQA) report a single main result on a common CQA benchmark. While existing models show impressive results on CQA leaderboards, it remains unclear whether they are robust to shifts in setting (sometimes to more realistic ones), training data size (e.g. from large to small sets) and domain. In this work, we design and conduct the first large-scale robustness study of history modeling approaches for CQA. We find that high benchmark scores do not necessarily translate to strong robustness, and that various methods can perform extremely differently under different settings. Equipped with the insights from our study, we design a novel prompt-based history modeling approach, and demonstrate its strong robustness across various settings. Our approach is inspired by existing methods that highlight historic…
Taxonomy
Topics: Topic Modeling · Speech and dialogue systems · Natural Language Processing Techniques
