Retrieving and Ranking Similar Questions from Question-Answer Archives Using Topic Modelling and Topic Distribution Regression
Pedro Chahuara, Thomas Lampert, Pierre Gancarski

TL;DR
This paper introduces a model for ranking similar questions on collaborative Q&A platforms. It combines topic modelling with a regression stage to handle two mismatches: questions and answers use different vocabulary, and questions tend to be shorter than answers.
Contribution
It adds a regression stage that relates topics derived from questions to topics derived from question-answer pairs, and reports better similar-question ranking than translation methods and topic modelling without regression.
Findings
Outperforms translation methods and topic modelling without regression on several real-world datasets
Helps avoid problems caused by vocabulary differences between questions and answers
Helps compensate for the tendency of questions to be shorter than answers
Abstract
Presented herein is a novel model for similar question ranking within collaborative question-answer platforms. The presented approach integrates a regression stage to relate topics derived from questions to those derived from question-answer pairs. This helps to avoid problems caused by the differences in vocabulary used within questions and answers, and the tendency for questions to be shorter than answers. The model is shown to outperform translation methods and topic modelling (without regression) on several real-world datasets.
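The regression idea in the abstract can be illustrated with a small sketch. The abstract does not specify the regression model, the topic model, or the similarity measure, so everything concrete here is an assumption made for illustration: a linear map fitted by gradient descent, cosine similarity for ranking, and hand-made three-topic distributions standing in for topic-model output. All function and variable names are hypothetical and are not taken from the paper.

```python
import math

def fit_linear_map(X, Y, lr=0.5, epochs=2000):
    """Least-squares fit of W so that x @ W approximates y (plain gradient descent)."""
    n, k_in, k_out = len(X), len(X[0]), len(Y[0])
    W = [[0.0] * k_out for _ in range(k_in)]
    for _ in range(epochs):
        grad = [[0.0] * k_out for _ in range(k_in)]
        for x, y in zip(X, Y):
            pred = [sum(x[i] * W[i][j] for i in range(k_in)) for j in range(k_out)]
            for i in range(k_in):
                for j in range(k_out):
                    grad[i][j] += (pred[j] - y[j]) * x[i] / n
        for i in range(k_in):
            for j in range(k_out):
                W[i][j] -= lr * grad[i][j]
    return W

def predict(W, x):
    """Map question topics into the question-answer topic space, then renormalise."""
    raw = [max(sum(x[i] * W[i][j] for i in range(len(x))), 0.0)
           for j in range(len(W[0]))]
    total = sum(raw) or 1.0
    return [v / total for v in raw]

def cosine(a, b):
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank_similar(W, query_topics, archive_qa_topics):
    """Return archive indices, most similar first, plus the projected query."""
    projected = predict(W, query_topics)
    scored = [(cosine(projected, qa), idx) for idx, qa in enumerate(archive_qa_topics)]
    scored.sort(key=lambda pair: -pair[0])
    return [idx for _, idx in scored], projected

# Toy data (assumed): topic distributions of archived questions on their own...
question_topics = [
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
    [0.6, 0.3, 0.1],
]
# ...and of the matching question-answer pairs, whose topic space is indexed differently.
qa_topics = [
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
    [0.8, 0.1, 0.1],
    [0.1, 0.6, 0.3],
]

W = fit_linear_map(question_topics, qa_topics)
ranking, projected = rank_similar(W, [0.7, 0.2, 0.1], qa_topics)
print(ranking)
```

The point of the sketch is the direction of the mapping: a new query arrives as a short question only, so its question-side topic distribution is projected into the richer question-answer topic space before it is compared with the archive. A real system would obtain both sets of distributions from a fitted topic model rather than from hand-written lists.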
