# Learning When Not to Answer: A Ternary Reward Structure for Reinforcement Learning based Question Answering

**Authors:** Fréderic Godin, Anjishnu Kumar, Arpit Mittal

arXiv: 1902.10236 · 2019-04-05

## TL;DR

This paper proposes a ternary reward structure for reinforcement learning agents that answer questions over knowledge graphs. By rewarding an agent for abstaining rather than giving an incorrect answer, it improves confidence modeling and the precision of answered questions.

## Contribution

The paper introduces a ternary reward scheme and a simple performance metric that is more representative of practical usage conditions, in particular of situations where no answer is available.
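To make the contrast between the two reward schemes concrete, here is a minimal sketch. The reward values (`1.0`, `0.0`, `-1.0`), the function names, and the use of `None` to represent "no answer" are all assumptions made for illustration; this summary does not state the values or the interface used in the paper.

```python
from typing import Optional

# Hypothetical reward values, chosen only for illustration.
REWARD_CORRECT = 1.0
REWARD_NO_ANSWER = 0.0
REWARD_INCORRECT = -1.0


def binary_reward(predicted: Optional[str], gold: str) -> float:
    """Binary scheme (assumed form): only a correct answer earns reward.

    Abstaining (predicted is None) and answering incorrectly both score 0.0,
    so the agent gains nothing by declining to answer.
    """
    return 1.0 if predicted == gold else 0.0


def ternary_reward(predicted: Optional[str], gold: str) -> float:
    """Ternary scheme (assumed form): abstaining scores above a wrong answer."""
    if predicted is None:
        return REWARD_NO_ANSWER
    return REWARD_CORRECT if predicted == gold else REWARD_INCORRECT


if __name__ == "__main__":
    for prediction in ("Paris", "Lyon", None):
        print(prediction, binary_reward(prediction, "Paris"),
              ternary_reward(prediction, "Paris"))
```

Under the binary sketch, a wrong answer and no answer are indistinguishable to the agent. Under the ternary sketch, the ordering incorrect < no answer < correct gives the agent a reason to abstain when it is unlikely to be right.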

## Key findings

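- Drastically improved precision of answered questions, while leaving unanswered only a limited number of questions that were previously answered correctly
- Better confidence modeling in agents
- Further performance gains from bootstrapping the reinforcement learning algorithm with supervised learning on depth-first-search paths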
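The phrase "precision of answered questions" can be illustrated with a short calculation. The sketch below is not the paper's new metric, which this summary does not define; it only computes precision over the questions an agent chose to answer, alongside the fraction of questions answered. The function name and the `None`-means-abstain convention are assumptions.

```python
from typing import Optional, Sequence, Tuple


def precision_and_answer_rate(
    predictions: Sequence[Optional[str]],
    golds: Sequence[str],
) -> Tuple[float, float]:
    """Return (precision over answered questions, fraction of questions answered).

    A prediction of None means the agent abstained (an assumed convention).
    """
    if len(predictions) != len(golds):
        raise ValueError("predictions and golds must have the same length")

    # Keep only the questions the agent actually answered.
    answered = [(p, g) for p, g in zip(predictions, golds) if p is not None]
    correct = sum(1 for p, g in answered if p == g)

    precision = correct / len(answered) if answered else 0.0
    answer_rate = len(answered) / len(golds) if golds else 0.0
    return precision, answer_rate


if __name__ == "__main__":
    preds = ["a", None, "c", "x"]
    golds = ["a", "b", "c", "d"]
    print(precision_and_answer_rate(preds, golds))
```

In this toy input the agent answers three of four questions and gets two right, so precision is 2/3 and the answer rate is 0.75. Abstaining on a question it would have answered incorrectly raises precision at the cost of a lower answer rate.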

## Abstract

In this paper, we investigate the challenges of using reinforcement learning agents for question-answering over knowledge graphs for real-world applications. We examine the performance metrics used by state-of-the-art systems and determine that they are inadequate for such settings. More specifically, they do not evaluate the systems correctly for situations when there is no answer available and thus agents optimized for these metrics are poor at modeling confidence. We introduce a simple new performance metric for evaluating question-answering agents that is more representative of practical usage conditions, and optimize for this metric by extending the binary reward structure used in prior work to a ternary reward structure which also rewards an agent for not answering a question rather than giving an incorrect answer. We show that this can drastically improve the precision of answered questions while only not answering a limited number of previously correctly answered questions. Employing a supervised learning strategy using depth-first-search paths to bootstrap the reinforcement learning algorithm further improves performance.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1902.10236/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/1902.10236/full.md

## References

20 references — full list in the complete paper: https://tomesphere.com/paper/1902.10236/full.md

---
Source: https://tomesphere.com/paper/1902.10236