Semantic Search for Information Retrieval

Kayla Farivar

arXiv:2508.17694·cs.IR·August 26, 2025

TL;DR

This paper surveys the evolution of information retrieval systems from traditional lexical methods to advanced semantic models, highlighting architectures like DPR, ColBERT, SPLADE, and MonoT5, and discussing evaluation and future challenges.

Contribution

It provides a comprehensive overview of modern semantic retrieval architectures and discusses evaluation methods, challenges, and future research directions.

Findings

01

Semantic retrievers outperform lexical methods in accuracy.

02

The survey introduces dense bi-encoders (DPR), late-interaction models (ColBERT), and neural sparse retrieval (SPLADE), and examines MonoT5, a cross-encoder model.

03

The survey closes with common evaluation tactics, pressing challenges, and proposed future directions for semantic search.
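
The difference between a dense bi-encoder and a late-interaction model comes down to how the relevance score is computed from embeddings. The sketch below is illustrative only: the function names and the tiny 2-dimensional vectors are made up for this example, and real systems obtain the embeddings from a trained encoder such as BERT. It assumes a bi-encoder scores one query vector against one document vector with a dot product, while late interaction keeps one vector per token and sums, over query tokens, the best match among document tokens (the MaxSim operation used by ColBERT).

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def bi_encoder_score(query_vec, doc_vec):
    """Dense bi-encoder style: one vector per query, one per document."""
    return dot(query_vec, doc_vec)


def late_interaction_score(query_token_vecs, doc_token_vecs):
    """Late-interaction style (MaxSim): for each query token, take its best
    match among the document tokens, then sum over query tokens."""
    return sum(
        max(dot(q, d) for d in doc_token_vecs)
        for q in query_token_vecs
    )


# Hypothetical toy embeddings, chosen by hand for illustration.
query_tokens = [[1.0, 0.0], [0.0, 1.0]]
doc_tokens = [[0.5, 0.5], [1.0, 0.0], [0.0, 0.2]]

print(bi_encoder_score([1.0, 2.0], [3.0, 4.0]))
print(late_interaction_score(query_tokens, doc_tokens))
```

Because the bi-encoder compresses each document into a single vector, document vectors can be precomputed and searched with a nearest-neighbor index. Late interaction also allows precomputing the document side, but it stores one vector per token, trading index size for finer-grained matching.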

Abstract

Information retrieval systems have progressed notably from lexical techniques such as BM25 and TF-IDF to modern semantic retrievers. This survey provides a brief overview of the BM25 baseline, then discusses the architecture of modern state-of-the-art semantic retrievers. Advancing from BERT, we introduce dense bi-encoders (DPR), late-interaction models (ColBERT), and neural sparse retrieval (SPLADE). Finally, we examine MonoT5, a cross-encoder model. We conclude with common evaluation tactics, pressing challenges, and propositions for future directions.
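
For readers unfamiliar with the lexical baseline the abstract mentions, here is a minimal sketch of BM25 scoring. It is not taken from the paper: the function name `bm25_scores`, the whitespace tokenizer, the parameter values `k1=1.5` and `b=0.75`, and the smoothed IDF variant are all assumptions made for this example, and the document list is assumed to be non-empty.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with a common BM25 variant.

    Assumptions: lowercase whitespace tokenization, a non-empty docs list,
    and the smoothed IDF log(1 + (N - n + 0.5) / (n + 0.5)).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs
    # Number of documents that contain each term at least once.
    doc_freq = Counter(term for d in tokenized for term in set(d))

    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue  # a term absent from the document contributes nothing
            n = doc_freq[term]
            idf = math.log(1 + (n_docs - n + 0.5) / (n + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical toy corpus for illustration.
docs = [
    "semantic search with dense retrievers",
    "lexical search with bm25",
    "cooking pasta at home",
]
print(bm25_scores("dense retrievers", docs))
print(bm25_scores("neural retrieval", docs))
```

The second query shares no exact terms with any document, so every score is zero even though the first document is topically related. This vocabulary mismatch is the kind of limitation that motivates the semantic retrievers the survey covers.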
