RankEvolve: Automating the Discovery of Retrieval Algorithms via LLM-Driven Evolution
Jinming Nian, Fangchen Li, Dae Hoon Park, Yi Fang

TL;DR
This paper presents RankEvolve, a method that combines a large language model with evolutionary search to automatically discover improved lexical retrieval algorithms. Starting from BM25 and query likelihood with Dirichlet smoothing, it evolves ranking programs that show promising results on BEIR, BRIGHT, and TREC DL 19 and 20.
Contribution
Introduces RankEvolve, a framework based on AlphaEvolve that automates the discovery of retrieval algorithms through LLM-guided evolutionary search, starting from two seed programs: BM25 and query likelihood with Dirichlet smoothing.
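To make the idea of evaluator-guided evolution concrete, the following is a minimal sketch of a mutate, evaluate, and select loop. It is not the RankEvolve implementation: in the paper the candidates are executable programs and the mutation step is performed by an LLM, whereas here a candidate is a plain parameter dictionary and `toy_mutate` and `toy_evaluate` are hypothetical stand-ins so that the loop can run on its own. The names `evolve`, `generations`, and `population_size` are likewise assumptions made for illustration.

```python
import random

def evolve(seed, mutate, evaluate, generations=20, population_size=4, rng=None):
    """Keep a small population of (score, candidate) pairs and improve it.

    seed:     starting candidate (stands in for a seed program such as BM25)
    mutate:   callable(parent, rng) -> child (stands in for the LLM edit step)
    evaluate: callable(candidate) -> float, higher is better (the evaluator)
    """
    rng = rng or random.Random(0)
    population = [(evaluate(seed), seed)]
    for _ in range(generations):
        # Pick a parent from the surviving candidates and propose a variant.
        parent = rng.choice(population)[1]
        child = mutate(parent, rng)
        population.append((evaluate(child), child))
        # Selection: keep only the highest-scoring candidates.
        population.sort(key=lambda pair: pair[0], reverse=True)
        population = population[:population_size]
    return population[0]

# Hypothetical stand-ins, not part of the paper: a candidate is {"k1": value},
# and the "evaluator" rewards values of k1 close to 1.5.
def toy_mutate(parent, rng):
    child = dict(parent)
    child["k1"] = max(0.0, parent["k1"] + rng.uniform(-0.3, 0.3))
    return child

def toy_evaluate(candidate):
    return -(candidate["k1"] - 1.5) ** 2

best_score, best = evolve({"k1": 0.5}, toy_mutate, toy_evaluate)
```

Because the seed stays in the population until better candidates displace it, the best score returned by this sketch can never fall below the seed's score. A real setup would replace `toy_evaluate` with a retrieval metric computed over the evaluation datasets.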
Findings
Evolved algorithms outperform baseline methods on multiple datasets.
Evolved algorithms show promising transfer to the full BEIR and BRIGHT benchmarks and to TREC DL 19 and 20.
Evaluator-guided LLM evolution is effective for algorithm discovery.
Abstract
Retrieval algorithms like BM25 and query likelihood with Dirichlet smoothing remain strong and efficient first-stage rankers, yet improvements have mostly relied on parameter tuning and human intuition. We investigate whether a large language model, guided by an evaluator and evolutionary search, can automatically discover improved lexical retrieval algorithms. We introduce RankEvolve, a program evolution setup based on AlphaEvolve, in which candidate ranking algorithms are represented as executable code and iteratively mutated, recombined, and selected based on retrieval performance across 12 IR datasets from BEIR and BRIGHT. RankEvolve starts from two seed programs: BM25 and query likelihood with Dirichlet smoothing. The evolved algorithms are novel, effective, and show promising transfer to the full BEIR and BRIGHT benchmarks as well as TREC DL 19 and 20. Our results suggest that…
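For readers unfamiliar with the two seed rankers named in the abstract, here is a minimal sketch of both in their common textbook forms. It is an illustration under stated assumptions, not the seed code used in the paper: the `Index` class, the whitespace tokenizer, the Lucene-style IDF variant, and the default values `k1=1.2`, `b=0.75`, and `mu=2000.0` are all choices made for this example.

```python
import math
from collections import Counter

class Index:
    """Tiny in-memory index (hypothetical helper, whitespace tokenization)."""
    def __init__(self, docs):
        self.docs = [Counter(d.lower().split()) for d in docs]
        self.doc_len = [sum(c.values()) for c in self.docs]
        self.n_docs = len(docs)
        self.total_len = sum(self.doc_len)
        self.avg_len = self.total_len / self.n_docs
        # Document frequency: number of documents containing each term.
        self.df = Counter(t for c in self.docs for t in c)
        # Collection frequency: total occurrences of each term.
        self.cf = Counter()
        for c in self.docs:
            self.cf.update(c)

def bm25(index, query, doc_id, k1=1.2, b=0.75):
    """BM25 with the Lucene-style IDF, log(1 + (N - df + 0.5) / (df + 0.5))."""
    tf = index.docs[doc_id]
    norm = k1 * (1 - b + b * index.doc_len[doc_id] / index.avg_len)
    score = 0.0
    for t in query.lower().split():
        if tf[t] == 0:
            continue
        idf = math.log(1 + (index.n_docs - index.df[t] + 0.5) / (index.df[t] + 0.5))
        score += idf * tf[t] * (k1 + 1) / (tf[t] + norm)
    return score

def ql_dirichlet(index, query, doc_id, mu=2000.0):
    """Query likelihood: sum of log((tf + mu * p(t|C)) / (|d| + mu))."""
    tf = index.docs[doc_id]
    score = 0.0
    for t in query.lower().split():
        p_coll = index.cf[t] / index.total_len
        if p_coll == 0:
            continue  # skip terms that never occur in the collection
        score += math.log((tf[t] + mu * p_coll) / (index.doc_len[doc_id] + mu))
    return score

def rank(index, query, scorer):
    """Return document ids sorted from highest to lowest score."""
    return sorted(range(index.n_docs),
                  key=lambda i: scorer(index, query, i), reverse=True)

index = Index(["the cat sat on the mat",
               "dogs chase the cat",
               "stock markets fell sharply"])
bm25_ranking = rank(index, "cat mat", bm25)
ql_ranking = rank(index, "cat mat", ql_dirichlet)
```

Both functions score one query against one document, which is the kind of self-contained, executable unit that an evolutionary search could mutate and re-evaluate.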
Taxonomy
Topics: Information Retrieval and Search Behavior · Topic Modeling · Natural Language Processing Techniques
