Deep Learning Based Chatbot Models
Richard Csaky

TL;DR
This paper surveys recent chatbot research, critiques current models for neglecting external priors, and proposes adapting Transformer architectures with additional features like persona and mood to improve response relevance.
Contribution
It introduces a novel approach to enhance chatbot responses by integrating external priors into Transformer-based models, addressing limitations of existing architectures.
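One possible way to picture "integrating external priors" is to add learned persona and mood vectors to the token embeddings before they enter the Transformer. The sketch below is only an illustration under that assumption, not the paper's implementation: the table sizes, the random initialization, and the function name `embed_with_priors` are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
VOCAB_SIZE, NUM_PERSONAS, NUM_MOODS, D_MODEL = 100, 4, 3, 16

# Randomly initialized lookup tables standing in for learned embeddings.
token_table = rng.normal(size=(VOCAB_SIZE, D_MODEL))
persona_table = rng.normal(size=(NUM_PERSONAS, D_MODEL))
mood_table = rng.normal(size=(NUM_MOODS, D_MODEL))


def embed_with_priors(token_ids, persona_id, mood_id):
    """Return token embeddings shifted by a persona vector and a mood vector."""
    tokens = token_table[np.asarray(token_ids)]               # (seq_len, d_model)
    prior = persona_table[persona_id] + mood_table[mood_id]   # (d_model,)
    return tokens + prior                                     # broadcast over seq_len


x = embed_with_priors([5, 17, 42], persona_id=1, mood_id=2)
print(x.shape)
```

Summing the prior into every position keeps the input shape unchanged, so in principle the rest of a standard Transformer would not need to be modified. Concatenating the feature vectors instead would be another plausible design, at the cost of a wider input.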
Findings
Transformer models with added features improve response relevance
Current models often neglect external priors like persona or mood
Augmented models outperform vanilla Transformer on conversational data
Abstract
A conversational agent (chatbot) is a piece of software that is able to communicate with humans using natural language. Modeling conversation is an important task in natural language processing and artificial intelligence. While chatbots can be used for various tasks, in general they have to understand users' utterances and provide responses that are relevant to the problem at hand. In my work, I conduct an in-depth survey of recent literature, examining over 70 publications related to chatbots published in the last 3 years. Then, I proceed to make the argument that the very nature of the general conversation domain demands approaches that are different from current state-of-the-art architectures. Based on several examples from the literature, I show why current chatbot models fail to take into account enough priors when generating responses and how this affects the quality of the…
Taxonomy
Topics: Topic Modeling · Speech and dialogue systems · Sentiment Analysis and Opinion Mining
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Byte Pair Encoding · Dense Connections · Label Smoothing · Adam · Softmax
