Grounding in social media: An approach to building a chit-chat dialogue model
Ritvik Choudhary, Daisuke Kawahara

TL;DR
This paper proposes a social-media-grounded approach to open-domain dialogue: the system mimics human response behavior using casual interactions from Reddit, with the aim of richer and more relevant conversation.
Contribution
It introduces a joint retriever-generator model that leverages Reddit comments to provide external context, broadening beyond traditional structured knowledge sources.
Findings
Improved response relevance in automatic evaluations.
Enhanced human-like conversational quality.
Effective use of social media data for dialogue modeling.
Abstract
Building open-domain dialogue systems capable of rich human-like conversational ability is one of the fundamental challenges in language generation. However, even with recent advancements in the field, existing open-domain generative models fail to capture and utilize external knowledge, leading to repetitive or generic responses to unseen utterances. Current work on knowledge-grounded dialogue generation primarily focuses on persona incorporation or searching a fact-based structured knowledge source such as Wikipedia. Our method takes a broader and simpler approach, which aims to improve the raw conversation ability of the system by mimicking the human response behavior through casual interactions found on social media. Utilizing a joint retriever-generator setup, the model queries a large set of filtered comment data from Reddit to act as additional context for the seq2seq generator.…
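To make the retriever-generator idea concrete, here is a minimal sketch of the data flow only: retrieve a few comments related to the incoming utterance, then prepend them to the utterance as extra context for a seq2seq generator. Everything in it is an assumption made for illustration. The bag-of-words cosine scorer stands in for whatever retriever the paper actually trains, and the function names, the `[SEP]` separator, and the `k` parameter are hypothetical. The generator itself is omitted.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real system would use a subword tokenizer.
    return text.lower().split()


def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(utterance, comments, k=2):
    # Toy stand-in for the retriever: rank comments by lexical overlap
    # with the utterance and keep the top k that overlap at all.
    query = Counter(tokenize(utterance))
    scored = [(cosine(query, Counter(tokenize(c))), c) for c in comments]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for score, c in scored[:k] if score > 0]


def build_generator_input(utterance, retrieved, sep=" [SEP] "):
    # Retrieved comments are placed before the utterance as additional
    # context; the separator token is a hypothetical choice.
    return sep.join(retrieved + [utterance])
```

With a small comment pool, `retrieve("do you like hiking", pool, k=1)` returns the one comment that mentions hiking, and `build_generator_input` joins it with the utterance into a single string that a seq2seq model could take as input. In a joint setup the retriever would be learned alongside the generator rather than fixed as it is here.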
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · AI in Service Interactions
Methods: Tanh Activation · Sigmoid Activation · Long Short-Term Memory · Sequence to Sequence
