CoRank: LLM-Based Compact Reranking with Document Features for Scientific Retrieval
Runchu Tian, Xueqiang Xu, Bowen Jin, SeongKu Kang, Jiawei Han

TL;DR
CoRank is a training-free, model-agnostic reranking framework for scientific retrieval. It represents candidate documents with compact semantic features (e.g., categories, sections, and keywords) so that an LLM reranker can consider more candidates within its context window.
Contribution
CoRank introduces a three-stage reranking process built on semantic features, addressing the limitations of first-stage retrieval and improving retrieval accuracy across multiple datasets.
Findings
nDCG@10 improves from 50.6 to 55.5 across datasets.
Effective integration of semantic features with LLMs for scientific retrieval.
Model-agnostic and training-free approach enhances reranking performance.
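For readers unfamiliar with the metric, the following is a minimal sketch of how nDCG@10 is commonly computed for a single query. It assumes the linear-gain variant (gain equal to the relevance label) and a log2 rank discount; the paper may use a different variant or evaluation toolkit, and the function names here are illustrative only. Scores such as 50.6 and 55.5 are presumably this value scaled by 100 and aggregated over queries.

```python
import math
from typing import List, Optional


def dcg_at_k(relevances: List[float], k: int = 10) -> float:
    """Discounted cumulative gain over the top-k positions (linear gain)."""
    # Position 0 is rank 1, so the discount is log2(rank + 1) = log2(index + 2).
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))


def ndcg_at_k(
    ranked_rels: List[float],
    all_rels: Optional[List[float]] = None,
    k: int = 10,
) -> float:
    """nDCG@k for one query.

    ranked_rels: relevance labels in the order the system ranked the documents.
    all_rels: labels of every judged document for the query, used to build the
        ideal ranking. Defaults to ranked_rels, which is an assumption that only
        holds when the ranked list contains every relevant document.
    """
    pool = ranked_rels if all_rels is None else all_rels
    ideal = dcg_at_k(sorted(pool, reverse=True), k)
    return dcg_at_k(ranked_rels, k) / ideal if ideal > 0 else 0.0
```

Passing `all_rels` matters for reranking: a relevant document that never reaches the reranker still counts toward the ideal ranking, so it lowers the score.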
Abstract
Scientific retrieval is essential for advancing scientific knowledge discovery. Within this process, document reranking plays a critical role in refining first-stage retrieval results. However, standard LLM listwise reranking faces challenges in the scientific domain. First-stage retrieval is often suboptimal in the scientific domain, so relevant documents are ranked lower. Meanwhile, conventional listwise reranking places the full text of candidates into the context window, limiting the number of candidates that can be considered. As a result, many relevant documents are excluded before reranking, constraining overall retrieval performance. To address these challenges, we explore semantic-feature-based compact document representations (e.g., categories, sections, and keywords) and propose CoRank, a training-free, model-agnostic reranking framework for scientific retrieval. It presents…
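The context-window argument in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the `Doc` fields, the `compact_repr` format, the word-count budget (a crude stand-in for tokens), and the greedy packing rule are all assumptions made for illustration. It only shows why a compact feature-based representation lets more first-stage candidates fit into one listwise prompt than full text does.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Doc:
    # Hypothetical candidate record; field names are illustrative.
    doc_id: str
    full_text: str
    categories: List[str] = field(default_factory=list)
    sections: List[str] = field(default_factory=list)
    keywords: List[str] = field(default_factory=list)


def full_repr(doc: Doc) -> str:
    """Conventional listwise input: the candidate's full text."""
    return f"[{doc.doc_id}] {doc.full_text}"


def compact_repr(doc: Doc) -> str:
    """Compact input built only from semantic features."""
    return (
        f"[{doc.doc_id}] categories: {', '.join(doc.categories)}; "
        f"sections: {', '.join(doc.sections)}; "
        f"keywords: {', '.join(doc.keywords)}"
    )


def pack_candidates(
    docs: List[Doc], render: Callable[[Doc], str], budget_words: int
) -> List[str]:
    """Add candidates in first-stage order until the word budget is used up."""
    packed: List[str] = []
    used = 0
    for doc in docs:
        text = render(doc)
        cost = len(text.split())  # whitespace word count as a token proxy
        if used + cost > budget_words:
            break
        packed.append(text)
        used += cost
    return packed


# Toy data: 20 candidates, each with a 200-word body.
docs = [
    Doc(
        doc_id=f"d{i}",
        full_text=" ".join(["word"] * 200),
        categories=["cs.IR"],
        sections=["Introduction", "Method"],
        keywords=["reranking", "retrieval"],
    )
    for i in range(20)
]

full_packed = pack_candidates(docs, full_repr, budget_words=500)
compact_packed = pack_candidates(docs, compact_repr, budget_words=500)
print(len(full_packed), len(compact_packed))
```

With the same budget, the full-text rendering admits only the first few candidates, while the compact rendering admits all of them, so relevant documents ranked low by the first stage remain visible to the reranker.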
Taxonomy
Topics: Semantic Web and Ontologies · Mathematics, Computing, and Information Processing · Library Science and Information Systems
