Adversarial Conversational Shaping for Intelligent Agents
Piotr Tarasiewicz, Sultan Kenjeyev, Ilana Sebag, Shehab Alshehabi

TL;DR
This paper explores adversarial conversational shaping techniques, using GAN-based models to improve the stability and accuracy of chatbots and dialogue systems in natural language processing.
Contribution
It studies two GAN-based models, GANPG and REGS (the latter based on the REGS model of Li et al. [18]), for enhancing conversational agents through adversarial training and reinforcement learning.
Findings
GANPG and REGS improve dialogue quality and stability
Transformers outperform seq2seq in reinforcement learning setup
Reward mechanisms enhance partial and full sequence generation
Abstract
The recent emergence of deep learning methods has enabled the research community to achieve state-of-the-art results in several domains, including natural language processing. However, the current robocall system remains unstable and inaccurate: text generators and chatbots can be tedious and misunderstand human-like dialogue. In this work, we study the performance of two models able to enhance an intelligent conversational agent through adversarial conversational shaping: a generative adversarial network with policy gradient (GANPG) and a generative adversarial network with reward for every generation step (REGS), based on the REGS model presented in Li et al. [18]. The REGS model is able to assign rewards to both partially and fully generated text sequences. We discuss performance under different training setups: seq2seq [36] and transformers [37] in a reinforcement learning framework.
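The difference between a single sequence-level reward and a reward for every generation step can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `ganpg_loss` and `regs_loss`, the `baseline` parameter, and the numeric values are assumptions chosen for illustration, and the rewards stand in for scores that a discriminator would produce.

```python
from typing import List


def ganpg_loss(log_probs: List[float], sequence_reward: float,
               baseline: float = 0.0) -> float:
    """REINFORCE-style surrogate loss with ONE reward for the full sequence.

    log_probs: log-probability of each generated token under the generator.
    sequence_reward: hypothetical discriminator score for the whole reply.
    """
    advantage = sequence_reward - baseline
    # Every token shares the same advantage, whatever its own contribution.
    return -sum(lp * advantage for lp in log_probs)


def regs_loss(log_probs: List[float], step_rewards: List[float],
              baseline: float = 0.0) -> float:
    """Surrogate loss with a separate reward for every generation step.

    step_rewards[t]: hypothetical discriminator score for the partial
    sequence ending at token t, so each token gets its own credit.
    """
    if len(log_probs) != len(step_rewards):
        raise ValueError("need exactly one reward per generated token")
    return -sum(lp * (r - baseline) for lp, r in zip(log_probs, step_rewards))


# Illustrative values only (not results from the paper).
token_log_probs = [-0.5, -1.0, -0.25]
full_reward = 0.8
partial_rewards = [0.9, 0.2, 0.8]

loss_full = ganpg_loss(token_log_probs, full_reward)
loss_steps = regs_loss(token_log_probs, partial_rewards)
```

In the per-step variant, the second token, whose partial sequence scored poorly, is weighted less than its neighbours, whereas the sequence-level variant weights all three tokens equally.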
Taxonomy
Topics: Topic Modeling · Speech Recognition and Synthesis · Generative Adversarial Networks and Image Synthesis
Methods: Tanh Activation · Sigmoid Activation · Long Short-Term Memory · Sequence to Sequence
