Can Small Language Models Use What They Retrieve? An Empirical Study of Retrieval Utilization Across Model Scale
Sanchit Pandey (BITS Pilani, Hyderabad, India)

TL;DR
This study investigates whether small language models (7B parameters or fewer) can effectively use retrieved information in retrieval-augmented generation. It finds that these models largely fail to extract answers from retrieved passages, and that added context can displace answers they already knew.
Contribution
It provides an empirical analysis of retrieval utilization across model sizes and introduces a knowledge split to isolate utilization failures from retrieval quality issues.
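A minimal sketch of how such a knowledge split could be computed, assuming closed-book (no retrieval) predictions are already available for every question. The function names, the `answers` field, and the normalized substring-match criterion are illustrative assumptions, not the paper's stated procedure.

```python
import re
import string


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def is_correct(prediction: str, gold_answers: list) -> bool:
    """Assumed criterion: some normalized gold answer occurs in the prediction."""
    pred = normalize(prediction)
    golds = [normalize(g) for g in gold_answers]
    return any(g and g in pred for g in golds)


def knowledge_split(examples: list, closed_book_predictions: list) -> tuple:
    """Partition examples by whether the model answers correctly without retrieval.

    Returns (known, unknown): 'known' holds questions answered correctly from
    parametric knowledge alone; 'unknown' holds questions needing external knowledge.
    """
    known, unknown = [], []
    for example, prediction in zip(examples, closed_book_predictions):
        if is_correct(prediction, example["answers"]):
            known.append(example)
        else:
            unknown.append(example)
    return known, unknown
```

Evaluating retrieval conditions separately on the `unknown` partition is what lets a utilization failure (the answer is in the passage but the model does not produce it) be distinguished from a retrieval failure (the passage never contained the answer).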
Findings
Even with oracle retrieval, models of 7B parameters or fewer fail to extract the correct answer 85-100% of the time on questions they cannot answer on their own.
Adding retrieval context often destroys previously known answers, indicating distraction effects.
The main failure mode is irrelevant generation, where models ignore provided context.
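The first two findings can be read as rates over the two halves of the knowledge split. The sketch below shows one way to compute them from per-question outcomes. The `Outcome` record and the label "destruction rate" are hypothetical names chosen for illustration, and the example data is made up rather than taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Outcome:
    known_closed_book: bool     # answered correctly with no retrieval
    correct_with_context: bool  # answered correctly when given a passage


def utilization_failure_rate(outcomes: List[Outcome]) -> Optional[float]:
    """Share of closed-book-unknown questions still wrong with context."""
    unknown = [o for o in outcomes if not o.known_closed_book]
    if not unknown:
        return None
    return sum(not o.correct_with_context for o in unknown) / len(unknown)


def destruction_rate(outcomes: List[Outcome]) -> Optional[float]:
    """Share of closed-book-known questions that become wrong with context."""
    known = [o for o in outcomes if o.known_closed_book]
    if not known:
        return None
    return sum(not o.correct_with_context for o in known) / len(known)


# Made-up outcomes for illustration only.
outcomes = [
    Outcome(False, False),
    Outcome(False, False),
    Outcome(False, True),
    Outcome(False, False),
    Outcome(True, True),
    Outcome(True, False),
]
print(utilization_failure_rate(outcomes))  # 0.75
print(destruction_rate(outcomes))          # 0.5
```

Under oracle retrieval the passage is guaranteed to contain the answer, so a high utilization failure rate in that condition cannot be attributed to the retriever.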
Abstract
Retrieval-augmented generation (RAG) is widely deployed to improve factual accuracy in language models, yet it remains unclear whether smaller models (7B parameters or fewer) can effectively utilize retrieved information. To investigate this question, we evaluate five model sizes from 360M to 8B across three architecture families (SmolLM2, Qwen2.5, and Llama 3.1) under four retrieval conditions: no retrieval, BM25, dense retrieval using E5-large-v2, and oracle retrieval, where the retrieved passage is guaranteed to contain the answer. We introduce a parametric knowledge split that separates questions a model can already answer from those that require external knowledge, which allows us to isolate utilization failure from retrieval quality failure. We find three main results. First, even with oracle retrieval, models of size 7B or smaller fail to extract the correct answer 85 to 100…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Information Retrieval and Search Behavior
