REFINE on Scarce Data: Retrieval Enhancement through Fine-Tuning via Model Fusion of Embedding Models
Ambuje Gupta, Mrinal Rawat, Andreas Stolcke, Roberto Pieraccini

TL;DR
REFINE enhances retrieval in question-answering systems by fine-tuning embedding models with synthetic data and model fusion, significantly improving recall, especially in scarce-data scenarios.
Contribution
The paper introduces REFINE, a novel method combining synthetic data generation and model fusion to fine-tune embeddings for better domain adaptation in retrieval tasks.
Findings
Standard fine-tuning on the synthetic data already outperforms vanilla pretrained models.
Achieves a 5.76% recall improvement on the TOURISM dataset.
Shows 6.58% and 0.32% recall improvements on the SQUAD and RAG-12000 datasets, respectively.
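The findings above are reported as recall improvements. As a reminder of what that metric measures, here is a minimal sketch of recall@k for a retriever. The function names, the document IDs, and the choice of k are illustrative assumptions; the summary does not state which k the paper uses or whether the reported gains are absolute or relative.

```python
from typing import List, Sequence, Set, Tuple


def recall_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top-k retrieved list."""
    if not relevant:
        raise ValueError("relevant must contain at least one document ID")
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = set(retrieved[:k])
    return len(top_k & relevant) / len(relevant)


def mean_recall_at_k(runs: List[Tuple[Sequence[str], Set[str]]], k: int) -> float:
    """Average recall@k over (retrieved, relevant) pairs, one pair per query."""
    if not runs:
        raise ValueError("runs must be non-empty")
    return sum(recall_at_k(ret, rel, k) for ret, rel in runs) / len(runs)


# Hypothetical example: two queries, ranked document IDs from a retriever.
example_runs = [
    (["d1", "d2", "d3"], {"d1", "d4"}),  # one of two relevant docs in the top 2
    (["d7", "d8", "d9"], {"d8"}),        # the only relevant doc is in the top 2
]
score = mean_recall_at_k(example_runs, k=2)
```

Comparing this score for a vanilla pretrained embedding model and a fine-tuned one on the same queries is one way such an improvement figure could be obtained.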
Abstract
Retrieval augmented generation (RAG) pipelines are commonly used in tasks such as question-answering (QA), relying on retrieving relevant documents from a vector store computed using a pretrained embedding model. However, if the retrieved context is inaccurate, the answers generated using the large language model (LLM) may contain errors or hallucinations. Although pretrained embedding models have advanced, adapting them to new domains remains challenging. Fine-tuning is a potential solution, but industry settings often lack the necessary fine-tuning data. To address these challenges, we propose REFINE, a novel technique that generates synthetic data from available documents and then uses a model fusion approach to fine-tune embeddings for improved retrieval performance in new domains, while preserving out-of-domain capability. We conducted experiments on the two public datasets: SQUAD…
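The abstract says model fusion is used to adapt embeddings to a new domain while preserving out-of-domain capability, but this excerpt does not describe how the fusion is done. Purely as an illustration of the general idea, the sketch below assumes one simple form of fusion: a linear interpolation between the embedding from a pretrained model and the embedding from its fine-tuned counterpart. The function names and the mixing weight `alpha` are hypothetical and are not taken from the paper.

```python
import math
from typing import List


def l2_normalize(vec: List[float]) -> List[float]:
    """Scale a vector to unit length so cosine similarity is a plain dot product."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def fuse_embeddings(pretrained: List[float], finetuned: List[float], alpha: float) -> List[float]:
    """Assumed fusion rule: (1 - alpha) * pretrained + alpha * finetuned, re-normalized.

    alpha = 0.0 keeps the pretrained embedding (general-purpose behaviour);
    alpha = 1.0 keeps the fine-tuned embedding (domain-adapted behaviour).
    """
    if len(pretrained) != len(finetuned):
        raise ValueError("embeddings must have the same dimension")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    p = l2_normalize(pretrained)
    f = l2_normalize(finetuned)
    mixed = [(1.0 - alpha) * a + alpha * b for a, b in zip(p, f)]
    return l2_normalize(mixed)


def cosine_similarity(u: List[float], v: List[float]) -> float:
    """Cosine similarity between two vectors of equal dimension."""
    return sum(a * b for a, b in zip(l2_normalize(u), l2_normalize(v)))


# Toy 2-dimensional vectors standing in for real model outputs.
fused = fuse_embeddings([1.0, 0.0], [0.0, 1.0], alpha=0.5)
```

In this toy setup the fused vector sits halfway between the two inputs, so it stays similar to both. Any real use would need `alpha` tuned on held-out in-domain and out-of-domain queries.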
Taxonomy
Topics: Neural Networks and Applications
