NUDGE: Lightweight Non-Parametric Fine-Tuning of Embeddings for Retrieval
Sepanta Zeighami, Zac Wellmer, Aditya Parameswaran

TL;DR
NUDGE introduces a non-parametric, efficient method for fine-tuning embeddings directly to improve k-NN retrieval accuracy, outperforming existing fine-tuning and adaptor approaches in speed and effectiveness.
Contribution
The paper presents NUDGE, a novel non-parametric approach for embedding fine-tuning that is more accurate and efficient than traditional methods, with theoretical and experimental validation.
Findings
NUDGE improves NDCG@10 by over 10% on average.
NUDGE is 200x faster than fine-tuning pre-trained models.
NUDGE yields larger accuracy gains than both fine-tuning the pre-trained model and training adaptor models.
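For readers unfamiliar with the metric named in the findings, the following is a minimal sketch of NDCG@k. It assumes the common linear-gain formulation with a log2 rank discount, and it assumes `ranked_relevances` holds the relevance grade of every candidate in ranked order; the paper's exact evaluation setup is not given in this summary, and the function names here are illustrative.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k relevance grades."""
    # rank is 0-based, so the discount for the top result is log2(2) = 1.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(ranked_relevances, k=10):
    """NDCG@k: DCG of the given ranking divided by DCG of the ideal ordering."""
    ideal = sorted(ranked_relevances, reverse=True)
    ideal_dcg = dcg_at_k(ideal, k)
    if ideal_dcg == 0:
        return 0.0  # no relevant items at all: define the score as 0
    return dcg_at_k(ranked_relevances, k) / ideal_dcg
```

A ranking that places the only relevant item first scores 1.0; placing it second scores 1 / log2(3), roughly 0.63.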
Abstract
k-Nearest Neighbor search on dense vector embeddings (k-NN retrieval) from pre-trained embedding models is the predominant retrieval method for text and images, as well as Retrieval-Augmented Generation (RAG) pipelines. In practice, application developers often fine-tune the embeddings to improve their accuracy on the dataset and query workload in hand. Existing approaches either fine-tune the pre-trained model itself or, more efficiently, but at the cost of accuracy, train adaptor models to transform the output of the pre-trained model. We present NUDGE, a family of novel non-parametric embedding fine-tuning approaches that are significantly more accurate and efficient than both sets of existing approaches. NUDGE directly modifies the embeddings of data records to maximize the accuracy of k-NN retrieval. We present a thorough theoretical and experimental study of NUDGE's…
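To make the idea of "directly modifying the embeddings of data records" concrete, here is a toy sketch of k-NN retrieval followed by a single adjustment to one data embedding. This is not the NUDGE algorithm: the abstract does not describe the actual optimization, so the update rule, the step size, the use of inner product on unit vectors as the similarity, and all function names are assumptions chosen for illustration only.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def knn(query, data, k):
    """Return indices of the k data embeddings with the highest inner product."""
    scores = [dot(query, d) for d in data]
    return sorted(range(len(data)), key=lambda i: -scores[i])[:k]


def move_toward(embedding, query, step):
    """Toy update (an assumption, not the paper's method): shift a data
    embedding toward a query by `step`, then renormalize."""
    moved = [e + step * q for e, q in zip(embedding, query)]
    return normalize(moved)


# Three data embeddings; suppose record 1 is the labeled answer for the query.
data = [normalize(v) for v in ([1.0, 0.0], [0.6, 0.8], [0.0, 1.0])]
query = normalize([1.0, 0.4])

before = knn(query, data, k=1)   # record 0 is retrieved, not the labeled answer
data[1] = move_toward(data[1], query, step=1.0)
after = knn(query, data, k=1)    # record 1 is now the nearest neighbor
```

No model weights are touched: only the stored vector for the data record changes, and the query is still embedded by the unmodified pre-trained model. That is the sense in which such an approach is non-parametric.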
Taxonomy
Topics: Speech Recognition and Synthesis · Topic Modeling · Advanced Image and Video Retrieval Techniques
