Ask the Right Questions: Active Question Reformulation with Reinforcement Learning
Christian Buck, Jannis Bulian, Massimiliano Ciaramita, Wojciech Gajewski, Andrea Gesmundo, Neil Houlsby, Wei Wang

TL;DR
This paper introduces a reinforcement learning approach to actively reformulate questions in a QA system to improve answer quality, outperforming existing models and discovering novel reformulation strategies.
Contribution
It presents a new active question reformulation method using reinforcement learning, trained end-to-end to optimize answer quality in a QA setting.
Findings
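On SearchQA, the agent outperforms the state-of-the-art base QA model that serves as its environment, as well as other benchmarks.
Successful reformulations differ from natural-language paraphrases and instead resemble information retrieval (IR) techniques.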
The learned strategies include term re-weighting and stemming.
Abstract
We frame Question Answering (QA) as a Reinforcement Learning task, an approach that we call Active Question Answering. We propose an agent that sits between the user and a black box QA system and learns to reformulate questions to elicit the best possible answers. The agent probes the system with, potentially many, natural language reformulations of an initial question and aggregates the returned evidence to yield the best answer. The reformulation system is trained end-to-end to maximize answer quality using policy gradient. We evaluate on SearchQA, a dataset of complex questions extracted from Jeopardy!. The agent outperforms a state-of-the-art base model, playing the role of the environment, and other benchmarks. We also analyze the language that the agent has learned while interacting with the question answering system. We find that successful question reformulations look quite…
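The training loop described in the abstract can be illustrated with a deliberately small sketch. It is not the authors' system: the paper trains a reformulation model end-to-end, whereas this toy reduces the problem to a three-armed bandit trained with REINFORCE. The rewrite actions, the `mock_qa` black box, the stopword list, and all hyperparameters are hypothetical stand-ins chosen only to show how a policy-gradient update can push probability toward the reformulation that elicits the best answer.

```python
import math
import random

# Hypothetical stopword list and rewrite actions; none of these come from the paper.
STOPWORDS = {"what", "is", "the", "of"}

def keep(question):
    return question

def drop_stopwords(question):
    return " ".join(w for w in question.split() if w not in STOPWORDS)

def repeat_last_term(question):
    # Crude stand-in for term re-weighting: duplicate the final term.
    return question + " " + question.split()[-1]

ACTIONS = [keep, drop_stopwords, repeat_last_term]

def mock_qa(query):
    """Toy black-box QA system: answers correctly only for keyword-style queries."""
    return "unknown" if set(query.split()) & STOPWORDS else "paris"

def token_f1(prediction, gold):
    """Token-level F1 between a predicted and a gold answer, used as the reward."""
    pred, ref = prediction.split(), gold.split()
    common = sum(min(pred.count(t), ref.count(t)) for t in set(pred))
    if common == 0:
        return 0.0
    precision, recall = common / len(pred), common / len(ref)
    return 2 * precision * recall / (precision + recall)

def softmax(logits):
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def train(question, gold, steps=500, lr=0.5, seed=0):
    """REINFORCE over a softmax policy with a moving-average reward baseline."""
    rng = random.Random(seed)
    logits = [0.0] * len(ACTIONS)
    baseline = 0.0
    for _ in range(steps):
        probs = softmax(logits)
        action = rng.choices(range(len(ACTIONS)), weights=probs)[0]
        answer = mock_qa(ACTIONS[action](question))  # probe the black box
        reward = token_f1(answer, gold)
        advantage = reward - baseline
        for i in range(len(logits)):
            # Gradient of log pi(action) with respect to logit i.
            grad = (1.0 if i == action else 0.0) - probs[i]
            logits[i] += lr * advantage * grad
        baseline += 0.1 * (reward - baseline)
    return softmax(logits)

if __name__ == "__main__":
    final_probs = train("what is the capital of france", "paris")
    for fn, p in zip(ACTIONS, final_probs):
        print(f"{fn.__name__}: {p:.3f}")
```

In this toy setting only `drop_stopwords` earns a non-zero reward, so the policy should concentrate on it. The sketch omits the aggregation step that the abstract mentions, in which evidence from many reformulations is combined into a single answer.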
Taxonomy
Topics: Topic Modeling · Expert finding and Q&A systems · Domain Adaptation and Few-Shot Learning
