Sequence-to-Sequence Learning on Keywords for Efficient FAQ Retrieval
Sourav Dutta, Haytham Assem, and Edward Burgin

TL;DR
This paper introduces TI-S2S, a novel framework combining keyword extraction and embeddings within a Seq2Seq model to improve FAQ retrieval accuracy, achieving around 92% precision-at-rank-5.
Contribution
It presents a new learning framework that enhances FAQ retrieval by integrating TF-IDF keywords and Word2Vec embeddings into a Seq2Seq architecture, plus a variant that adds a neural network module to guide retrieval.
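To make the keyword step concrete, here is a minimal sketch of TF-IDF keyword extraction over a small set of FAQ questions. It is an illustration under stated assumptions, not the authors' implementation: the tokenizer, the smoothed IDF formula, the `top_k` cutoff, and the function names `tokenize` and `tfidf_keywords` are all hypothetical choices, and the Word2Vec embedding and Seq2Seq training stages are not shown.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters (an assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_keywords(questions, top_k=3):
    """Return the top_k highest-scoring TF-IDF terms for each question.

    Assumption: smoothed IDF, log((1 + N) / (1 + df)) + 1, which is one
    common variant; the paper's exact weighting is not given here.
    """
    docs = [tokenize(q) for q in questions]
    n_docs = len(docs)

    # Document frequency: in how many questions each term appears.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    keywords = []
    for tokens in docs:
        tf = Counter(tokens)
        scores = {
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        }
        # Sort by descending score; break ties alphabetically for determinism.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        keywords.append(ranked[:top_k])
    return keywords


faq_questions = [
    "how do I reset my password",
    "how do I change my email address",
    "how do I delete my account",
]
faq_keywords = tfidf_keywords(faq_questions, top_k=2)
```

Terms shared by every question ("how", "do", "I", "my") receive the lowest IDF, so the surviving keywords are the intent-bearing ones, which is the kind of signal a downstream sequence model could be trained on.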
Findings
Achieves 92% precision-at-rank-5 in FAQ retrieval
Provides 13% improvement over existing methods
Demonstrates effectiveness on publicly available datasets
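The rank-5 metric above can be sketched as follows. This page does not give the paper's exact definition, so the sketch assumes one common reading for FAQ retrieval with a single correct answer per query: the fraction of queries whose correct FAQ entry appears among the top 5 retrieved candidates. The function names `success_at_k` and `mean_success_at_k` are hypothetical.

```python
def success_at_k(ranked_ids, correct_id, k=5):
    """Return 1.0 if the correct FAQ id is among the top-k ranked ids, else 0.0."""
    return 1.0 if correct_id in ranked_ids[:k] else 0.0


def mean_success_at_k(results, k=5):
    """Average success_at_k over (ranked_ids, correct_id) pairs.

    Assumption: each query has exactly one correct FAQ entry.
    """
    if not results:
        return 0.0
    return sum(success_at_k(ranked, correct, k) for ranked, correct in results) / len(results)


example_results = [
    (["a", "b", "c", "d", "e", "f"], "e"),  # correct answer at rank 5: counted
    (["a", "b", "c", "d", "e", "f"], "f"),  # correct answer at rank 6: missed
]
example_score = mean_success_at_k(example_results, k=5)
```

Under this reading, a score of 0.92 would mean that 92 of every 100 queries have their correct answer somewhere in the first five results.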
Abstract
Frequently-Asked-Question (FAQ) retrieval provides an effective procedure for responding to users' natural-language queries. Such platforms are becoming common in enterprise chatbots, product question answering, and preliminary technical support for customers. However, the challenge in such scenarios lies in bridging the lexical and semantic gap between varied query formulations and the corresponding answers, both of which typically have a very short span. This paper proposes TI-S2S, a novel learning framework combining TF-IDF based keyword extraction and Word2Vec embeddings for training a Sequence-to-Sequence (Seq2Seq) architecture. It achieves high precision for FAQ retrieval by better understanding the underlying intent of a user question captured via the representative keywords. We further propose a variant with an additional neural network module for guiding retrieval via…
Taxonomy
Topics: Topic Modeling · Advanced Text Analysis Techniques · Natural Language Processing Techniques
