Improving Search through A3C Reinforcement Learning based Conversational Agent

Milan Aggarwal; Aarushi Arora; Shagun Sodhani; Balaji Krishnamurthy

arXiv:1709.05638·cs.AI·August 21, 2018

TL;DR

This paper presents a reinforcement learning approach using A3C to develop a conversational search agent that assists users in subjective digital asset searches, improving interaction efficiency and effectiveness.

Contribution

It introduces a virtual user model for training, an A3C-based architecture for context preservation, and demonstrates superior performance over Q-learning in subjective search tasks.

Findings

01. The A3C agent achieves higher rewards than Q-learning.

02. The virtual user accelerates training by efficiently sampling user behavior.

03. The agent provides contextually relevant assistance in subjective search.

Abstract

We develop a reinforcement learning based search assistant that helps users realize their intent through a set of actions and a sequence of interactions. Our approach caters to subjective search, where the user is seeking digital assets such as images; this is fundamentally different from tasks that have objective and limited search modalities. Labeled conversational data is generally not available for such search tasks, and training the agent through human interactions can be time-consuming. We propose a stochastic virtual user that impersonates a real user and can be used to sample user behavior efficiently, which accelerates the bootstrapping of the agent. We develop an A3C-based context-preserving architecture that enables the agent to provide contextual assistance to the user. We compare the A3C agent with Q-learning and evaluate its…
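To make the idea of a stochastic virtual user concrete, here is a minimal Python sketch, assuming the user's reply is sampled from a fixed probability table conditioned on the agent's last action. The action names, response names, probability values, and the `VirtualUser`, `sample_episode`, and `random_policy` helpers are all hypothetical illustrations; the abstract does not specify the authors' actual user model, action set, or reward design.

```python
import random

# Hypothetical agent actions and user responses (not taken from the paper).
AGENT_ACTIONS = ["show_results", "ask_refine_query", "suggest_category"]
USER_RESPONSES = ["new_query", "refine_query", "click_result", "end_session"]

# Assumed conditional probabilities P(user response | agent action).
# The values are illustrative only; each row sums to 1.
RESPONSE_PROBS = {
    "show_results": [0.2, 0.3, 0.4, 0.1],
    "ask_refine_query": [0.1, 0.6, 0.2, 0.1],
    "suggest_category": [0.3, 0.3, 0.3, 0.1],
}


class VirtualUser:
    """Samples a user response given the agent's last action."""

    def __init__(self, seed=None):
        # A private RNG keeps sampled episodes reproducible for a given seed.
        self.rng = random.Random(seed)

    def respond(self, agent_action):
        weights = RESPONSE_PROBS[agent_action]
        return self.rng.choices(USER_RESPONSES, weights=weights, k=1)[0]


def sample_episode(user, policy, max_turns=10):
    """Roll out one conversation as a list of (agent_action, user_response)."""
    trajectory = []
    for _ in range(max_turns):
        action = policy()
        response = user.respond(action)
        trajectory.append((action, response))
        if response == "end_session":
            break  # the simulated user has left the conversation
    return trajectory


def random_policy():
    """Placeholder policy; a trained agent would condition on the dialogue state."""
    return random.choice(AGENT_ACTIONS)
```

Calling `sample_episode(VirtualUser(seed=0), random_policy)` yields one simulated conversation. Many such rollouts can be generated cheaply, which is the sense in which a simulated user can stand in for human interactions while the agent is being bootstrapped.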

Taxonomy

Topics: Reinforcement Learning in Robotics · Recommender Systems and Techniques · Data Stream Mining Techniques

Methods: Entropy Regularization · Dense Connections · Softmax · Convolution · A3C · Q-Learning