# Sub-domain Modelling for Dialogue Management with Hierarchical Reinforcement Learning

**Authors:** Paweł Budzianowski, Stefan Ultes, Pei-Hao Su, Nikola Mrkšić, Tsung-Hsien Wen, Iñigo Casanueva, Lina Rojas-Barahona, Milica Gašić

arXiv: 1706.06210 · 2017-07-18

## TL;DR

This paper introduces a hierarchical reinforcement learning approach using the option framework to improve multi-domain dialogue management, enabling faster learning and better policies for complex conversational systems.

## Contribution

The paper proposes a novel hierarchical reinforcement learning architecture for multi-domain dialogue management, demonstrating improved learning speed and policy quality over flat methods.

## Key findings

- The hierarchical policy learns faster than flat reinforcement learning baselines.
- It also arrives at a better policy than the flat baselines.
- Pretrained policies can be adapted to more complex systems that add a new set of actions.

## Abstract

Human conversation is inherently complex, often spanning many different topics/domains. This makes policy learning for dialogue systems very challenging. Standard flat reinforcement learning methods do not provide an efficient framework for modelling such dialogues. In this paper, we focus on the under-explored problem of multi-domain dialogue management. First, we propose a new method for hierarchical reinforcement learning using the option framework. Next, we show that the proposed architecture learns faster and arrives at a better policy than the existing flat ones do. Moreover, we show how pretrained policies can be adapted to more complex systems with an additional set of new actions. In doing that, we show that our approach has the potential to facilitate policy optimisation for more sophisticated multi-domain dialogue systems.
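To make the option framework mentioned in the abstract more concrete, here is a minimal sketch of a master policy that picks a sub-domain option and learns with an SMDP-style Q-learning update. It is a toy illustration of the general technique, not the authors' architecture: the sub-domains (`hotel`, `restaurant`), the slot count, the reward values, the fixed sub-policy, and all function names are assumptions introduced for this example.

```python
import random
from collections import defaultdict

# All constants below are hypothetical and chosen only for illustration.
DOMAINS = ("hotel", "restaurant")   # sub-domains, each handled by one option
SLOTS_PER_DOMAIN = 2                # slots to fill before a sub-domain is done
GAMMA, ALPHA, EPSILON = 0.95, 0.2, 0.1


def is_terminal(state):
    """The dialogue ends once every sub-domain has all its slots filled."""
    return all(n == SLOTS_PER_DOMAIN for n in state)


def run_option(state, domain):
    """Run one option (a sub-domain policy) until its termination condition.

    Returns (next_state, discounted_reward, steps_taken).
    """
    filled = dict(zip(DOMAINS, state))
    total, steps = 0.0, 0
    while True:
        # Fixed sub-policy (assumption): request the next missing slot.
        if filled[domain] < SLOTS_PER_DOMAIN:
            filled[domain] += 1
        reward = -1.0  # per-turn penalty
        if all(filled[d] == SLOTS_PER_DOMAIN for d in DOMAINS):
            reward += 20.0  # success bonus when the whole dialogue is complete
        total += (GAMMA ** steps) * reward
        steps += 1
        # Termination: this sub-domain has nothing left to ask.
        if filled[domain] == SLOTS_PER_DOMAIN:
            break
    return tuple(filled[d] for d in DOMAINS), total, steps


def train_master(episodes=500, seed=0):
    """Tabular SMDP Q-learning over options for the master policy."""
    rng = random.Random(seed)
    q = defaultdict(float)  # Q[(state, option)]
    for _ in range(episodes):
        state = (0,) * len(DOMAINS)
        while not is_terminal(state):
            # Epsilon-greedy choice among options.
            if rng.random() < EPSILON:
                option = rng.choice(DOMAINS)
            else:
                option = max(DOMAINS, key=lambda d: q[(state, d)])
            next_state, reward, k = run_option(state, option)
            future = 0.0
            if not is_terminal(next_state):
                future = max(q[(next_state, d)] for d in DOMAINS)
            # The option lasted k primitive steps, so discount by GAMMA ** k.
            target = reward + (GAMMA ** k) * future
            q[(state, option)] += ALPHA * (target - q[(state, option)])
            state = next_state
    return q


if __name__ == "__main__":
    q = train_master()
    for domain in DOMAINS:
        print(domain, round(q[((2, 0), domain)], 2))
```

In this sketch the master policy only decides which sub-domain to enter; a flat learner would instead have to choose among every primitive action of every domain at each turn, which is the contrast the abstract draws. In a fuller system the sub-policies would themselves be learned rather than fixed.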

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1706.06210/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/1706.06210/full.md

## References

32 references — full list in the complete paper: https://tomesphere.com/paper/1706.06210/full.md

---
Source: https://tomesphere.com/paper/1706.06210