Diagnosing LLM-based Rerankers in Cold-Start Recommender Systems: Coverage, Exposure and Practical Mitigations

Ekaterina Lemdiasova; Nikita Zmanovskii

arXiv:2604.16318 · cs.IR · April 21, 2026

TL;DR

This study systematically diagnoses the limitations of LLM-based rerankers in cold-start recommender systems, revealing issues in coverage, exposure bias, and score discrimination, and offers practical mitigation strategies.

Contribution

The paper identifies key failure modes of LLM rerankers in cold-start scenarios and proposes practical mitigations for their deployment.

Findings

01  Candidate generation feeding the reranker has low retrieval coverage (recall@200 = 0.109 vs. 0.609 for baselines).

02  Popularity-based ranking outperforms LLM reranking in accuracy.

03  Retrieval stage limitations are the main cause of performance gaps.

Abstract

Large language models (LLMs) and cross-encoder rerankers have gained attention for improving recommender systems, particularly in cold-start scenarios where user interaction history is limited. However, practical deployment reveals significant performance gaps between LLM-based approaches and simple baselines. This paper presents a systematic diagnostic study of cross-encoder rerankers in cold-start movie recommendation using the Serendipity-2018 dataset. Through controlled experiments with 500 users across multiple random seeds, we identify three critical failure modes: (1) low retrieval coverage in candidate generation (recall@200 = 0.109 vs. 0.609 for baselines), (2) severe exposure bias, with rerankers concentrating recommendations on 3 unique items versus 497 for a random baseline, and (3) minimal score discrimination between relevant and irrelevant items (mean difference = 0.098,…
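
The three diagnostics named in the abstract (recall@k of the candidate set, number of unique items exposed across users, and the mean score gap between relevant and irrelevant items) can be illustrated with a minimal Python sketch. The function names, data layout, and exact metric definitions below are assumptions made for illustration; the paper's own definitions and evaluation code may differ.

```python
from statistics import mean


def recall_at_k(candidates, relevant, k=200):
    """Fraction of one user's relevant items that appear in the top-k candidates.

    Assumed definition: per-user recall, to be averaged over users by the caller.
    """
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(set(candidates[:k]) & relevant) / len(relevant)


def unique_items_exposed(recommendations):
    """Number of distinct items across all users' final recommendation lists.

    `recommendations` maps a user id to that user's list of recommended items.
    A very small value signals exposure concentrated on a few items.
    """
    return len({item for recs in recommendations.values() for item in recs})


def mean_score_gap(scores, relevant):
    """Mean reranker score of relevant items minus mean score of irrelevant ones.

    `scores` maps an item id to its reranker score for one user.
    Returns NaN when either group is empty.
    """
    relevant = set(relevant)
    rel = [s for item, s in scores.items() if item in relevant]
    irr = [s for item, s in scores.items() if item not in relevant]
    if not rel or not irr:
        return float("nan")
    return mean(rel) - mean(irr)


# Toy data (hypothetical item and user ids, not taken from the paper).
toy_candidates = ["m1", "m2", "m3", "m4"]
toy_relevant = {"m2", "m9"}
toy_recs = {"u1": ["m1", "m2"], "u2": ["m1", "m2"], "u3": ["m1", "m3"]}
toy_scores = {"m1": 0.5, "m2": 0.75, "m3": 0.25}

print(recall_at_k(toy_candidates, toy_relevant, k=3))   # 0.5
print(unique_items_exposed(toy_recs))                   # 3
print(mean_score_gap(toy_scores, {"m2"}))               # 0.375
```

Read against the abstract's numbers, a low recall@200 means most relevant items never reach the reranker, so no reranking quality can recover them; a unique-item count of 3 means nearly every user sees the same items; and a small score gap means the reranker barely separates relevant from irrelevant candidates.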
