Efficient and Interpretable Information Retrieval for Product Question Answering with Heterogeneous Data
Biplob Biswas, Rajiv Ramnath

TL;DR
This paper introduces a hybrid IR model combining lexical and semantic representations for product QA, achieving improved accuracy, interpretability, and efficiency over existing methods.
Contribution
The paper presents a dual hybrid encoder architecture that jointly learns and combines lexical and dense semantic representations with term expansion, enhancing IR performance and interpretability.
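To make the idea of combining the two representations concrete, here is a minimal sketch in Python. It is not the authors' implementation: the `HybridRepr` container, the function names, the weighted-sum combination, and the `alpha` parameter are all assumptions made for illustration, and the vectors are hand-written rather than produced by a learned encoder.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class HybridRepr:
    """Hypothetical container for one encoded text."""
    dense: List[float]        # dense semantic vector
    sparse: Dict[str, float]  # term -> weight, may include expansion terms


def dense_score(q: List[float], d: List[float]) -> float:
    # Dot product of the two dense vectors.
    return sum(a * b for a, b in zip(q, d))


def sparse_score(q: Dict[str, float], d: Dict[str, float]) -> float:
    # Dot product over the terms the two sparse vectors share.
    return sum(w * d[t] for t, w in q.items() if t in d)


def hybrid_score(query: HybridRepr, item: HybridRepr, alpha: float = 1.0) -> float:
    # Assumption: the two scores are simply added, with alpha weighting
    # the lexical part. The source does not specify the combination rule.
    return dense_score(query.dense, item.dense) + alpha * sparse_score(query.sparse, item.sparse)


def explain(query: HybridRepr, item: HybridRepr) -> List[Tuple[str, float]]:
    # Per-term lexical contributions, largest first. This is the kind of
    # term-level evidence that makes a sparse representation inspectable.
    shared = [(t, w * item.sparse[t]) for t, w in query.sparse.items() if t in item.sparse]
    return sorted(shared, key=lambda pair: pair[1], reverse=True)


def rank(query: HybridRepr, items: List[HybridRepr]) -> List[int]:
    # Indices of items ordered by descending hybrid score.
    return sorted(range(len(items)), key=lambda i: hybrid_score(query, items[i]), reverse=True)


query = HybridRepr(dense=[1.0, 0.0], sparse={"battery": 1.0, "charge": 0.5})
item_a = HybridRepr(dense=[0.5, 0.5], sparse={"battery": 2.0})
item_b = HybridRepr(dense=[1.0, 0.0], sparse={"screen": 1.0})
order = rank(query, [item_a, item_b])
```

In this toy example `item_a` ranks first because the shared term "battery" adds a lexical contribution that `item_b` lacks, and `explain(query, item_a)` exposes that term as the reason.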
Findings
Outperforms independent retrievers by 10.95% in MRR@5
Reduces response time by 30%
Cuts computational load by 38%
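For readers unfamiliar with the metric, MRR@5 averages, over all queries, the reciprocal rank of the first relevant item found in the top five results, counting zero when none appears there. A minimal Python sketch follows; the function name and the toy relevance flags are illustrative assumptions, not data from the paper.

```python
from typing import List


def mrr_at_k(ranked_relevance: List[List[bool]], k: int = 5) -> float:
    """Mean reciprocal rank, truncated at k.

    ranked_relevance holds one list per query; each inner list gives the
    relevance flags of that query's results in ranked order.
    """
    if not ranked_relevance:
        return 0.0
    total = 0.0
    for flags in ranked_relevance:
        for position, is_relevant in enumerate(flags[:k], start=1):
            if is_relevant:
                total += 1.0 / position
                break  # only the first relevant hit counts
    return total / len(ranked_relevance)


# Three toy queries: hit at rank 1, hit at rank 2, and a hit only at rank 6.
toy = [
    [True, False, False],
    [False, True, False],
    [False, False, False, False, False, True],
]
score = mrr_at_k(toy, k=5)
```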
Abstract
Expansion-enhanced sparse lexical representation improves information retrieval (IR) by minimizing vocabulary mismatch problems during lexical matching. In this paper, we explore the potential of jointly learning dense semantic representation and combining it with the lexical one for ranking candidate information. We present a hybrid information retrieval mechanism that maximizes lexical and semantic matching while minimizing their shortcomings. Our architecture consists of dual hybrid encoders that independently encode queries and information elements. Each encoder jointly learns a dense semantic representation and a sparse lexical representation augmented by a learnable term expansion of the corresponding text through contrastive learning. We demonstrate the efficacy of our model in single-stage ranking of a benchmark product question-answering dataset containing the typical…
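The abstract states that the encoders are trained through contrastive learning but, as excerpted here, does not name the loss. As an illustration only, the sketch below assumes a common choice: a softmax cross-entropy over in-batch negatives, where each query's matching item sits on the diagonal of a score matrix. The function name and the `temperature` parameter are hypothetical.

```python
import math
from typing import List


def in_batch_contrastive_loss(scores: List[List[float]], temperature: float = 1.0) -> float:
    """Average cross-entropy with in-batch negatives.

    scores[i][j] is the similarity of query i to item j. Item i is assumed
    to be the positive for query i; every other item in the row is a negative.
    """
    losses = []
    for i, row in enumerate(scores):
        logits = [s / temperature for s in row]
        peak = max(logits)  # subtract the max for numerical stability
        log_partition = peak + math.log(sum(math.exp(x - peak) for x in logits))
        losses.append(log_partition - logits[i])
    return sum(losses) / len(losses)


# Uninformative scores give log(batch size); well-separated scores give less.
uniform_loss = in_batch_contrastive_loss([[0.0, 0.0], [0.0, 0.0]])
separated_loss = in_batch_contrastive_loss([[5.0, 0.0], [0.0, 5.0]])
```

Minimizing a loss of this shape pushes each query's score for its own item above its scores for the other items in the batch, which is the general behavior contrastive training is meant to produce.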
Taxonomy
Topics: Topic Modeling · Advanced Text Analysis Techniques
