Domain-Adaptive and Scalable Dense Retrieval for Content-Based Recommendation

Mritunjay Pandey (Aditya Birla Group)

arXiv:2602.00899 · cs.LG · February 3, 2026

Open Access

TL;DR

This paper introduces a scalable, domain-adapted dense retrieval system for content-based recommendation. It raises Recall@10 from 0.26 to 0.66 over BM25 while keeping median CPU inference latency at 6.1 ms, which keeps it practical for large catalogs.

Contribution

The author develops a dense retrieval approach for large-scale e-commerce recommendation: a two-tower bi-encoder fine-tuned with contrastive learning (Multiple Negatives Ranking Loss).
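
To make the retrieval step concrete, here is a minimal sketch of the serving side of a two-tower setup: item embeddings are computed offline, the query is embedded at request time, and the top-K items are ranked by cosine similarity. The vectors, item names, and the helper names `cosine` and `top_k` are hypothetical and only for illustration. The exhaustive scan is a simple stand-in for an approximate index and is not the paper's serving code.

```python
import heapq
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def top_k(query_vec, item_vecs, k):
    """Return the ids of the k items most similar to the query (exact scan)."""
    scored = ((cosine(query_vec, vec), item_id) for item_id, vec in item_vecs.items())
    return [item_id for _, item_id in heapq.nlargest(k, scored)]


# Hypothetical precomputed item-tower outputs (3-dimensional for readability).
item_vecs = {
    "dress": [0.9, 0.1, 0.0],
    "boots": [0.1, 0.9, 0.1],
    "scarf": [0.0, 0.2, 0.9],
}
# Hypothetical query-tower output for a user's intent text.
query_vec = [0.8, 0.3, 0.0]

ranked = top_k(query_vec, item_vecs, k=2)
print(ranked)
```

The exact scan costs time linear in catalog size per query, which is why a large catalog would normally swap it for an approximate nearest-neighbor index.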

Findings

01

Recall@10 improved from 0.26 to 0.66 over BM25

02

Achieves 6.1 ms median CPU inference latency

03

Reduces model size by 4x
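
For readers unfamiliar with the metric in the first finding, the sketch below shows one common way to compute Recall@K. It assumes each query comes with a known set of relevant item ids (a single positive per query would make this equivalent to a hit rate). The paper's exact evaluation protocol is not given here, so the function names and toy data are illustrative assumptions.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant items that appear in the top-k of a ranking."""
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("relevant_ids must contain at least one item")
    hits = relevant & set(ranked_ids[:k])
    return len(hits) / len(relevant)


def mean_recall_at_k(runs, k):
    """Average Recall@K over (ranked_ids, relevant_ids) pairs, one per query."""
    return sum(recall_at_k(ranked, relevant, k) for ranked, relevant in runs) / len(runs)


# Toy data: two queries, each with one relevant item (hypothetical ids).
runs = [
    (["a", "b", "c"], {"b"}),  # relevant item retrieved at rank 2
    (["a", "b", "c"], {"c"}),  # relevant item falls outside the top 2
]
score = mean_recall_at_k(runs, k=2)
print(score)
```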

Abstract

E-commerce recommendation and search commonly rely on sparse keyword matching (e.g., BM25), which breaks down under vocabulary mismatch when user intent has limited lexical overlap with product metadata. We cast content-based recommendation as recommendation-as-retrieval: given a natural-language intent signal (a query or review), retrieve the top-K most relevant items from a large catalog via semantic similarity. We present a scalable dense retrieval system based on a two-tower bi-encoder, fine-tuned on the Amazon Reviews 2023 (Fashion) subset using supervised contrastive learning with Multiple Negatives Ranking Loss. We construct training pairs from review text (as a query proxy) and item metadata (as the positive document) and fine-tune on 50,000 sampled interactions with a maximum sequence length of 500 tokens. For efficient serving, we combine FAISS HNSW indexing with an ONNX…
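
The training objective named in the abstract can be illustrated with a small sketch. Multiple Negatives Ranking Loss treats each (review, item) pair in a batch as a positive and every other item in the same batch as a negative, then applies cross-entropy over scaled cosine similarities. The version below is written in plain Python for readability. The `scale` value of 20.0 and the toy vectors are assumptions rather than settings taken from the paper, and a real training run would use an autodiff framework instead.

```python
import math


def _normalize(vec):
    """Scale a non-zero vector to unit length so dot products are cosines."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def mnr_loss(query_vecs, doc_vecs, scale=20.0):
    """In-batch softmax loss: doc_vecs[i] is the positive for query_vecs[i],
    and every other document in the batch acts as a negative."""
    queries = [_normalize(v) for v in query_vecs]
    docs = [_normalize(v) for v in doc_vecs]
    total = 0.0
    for i, q in enumerate(queries):
        # Scaled cosine similarity of query i against every document in the batch.
        logits = [scale * sum(a * b for a, b in zip(q, d)) for d in docs]
        # Numerically stable log-sum-exp for the softmax normalizer.
        peak = max(logits)
        log_z = peak + math.log(sum(math.exp(x - peak) for x in logits))
        # Cross-entropy with the matching document (index i) as the target.
        total += log_z - logits[i]
    return total / len(queries)


# Toy batch: embeddings that line up with their positives versus a shuffled batch.
query_vecs = [[1.0, 0.0], [0.0, 1.0]]
doc_vecs = [[1.0, 0.0], [0.0, 1.0]]

aligned_loss = mnr_loss(query_vecs, doc_vecs)
shuffled_loss = mnr_loss(query_vecs, doc_vecs[::-1])
print(aligned_loss < shuffled_loss)
```

Because negatives come from the batch itself, larger batches supply more negatives per positive at no extra labeling cost, which is one reason this loss suits review-to-item pairs that have only positive labels.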

Taxonomy

Topics: Recommender Systems and Techniques · Sentiment Analysis and Opinion Mining · Text and Document Classification Technologies