Language Ranker: A Lightweight Ranking Framework for LLM Decoding
Chenheng Zhang, Tianqi Du, Jizhe Zhang, Mingqing Xiao, Yifei Wang, Yisen Wang, Zhouchen Lin

TL;DR
This paper introduces Language Ranker, a lightweight reranking framework for LLM decoding that performs comparably to large reward models while adding fewer than 0.5 million parameters, which substantially reduces computational cost.
Contribution
It presents an efficient reranking method, inspired by recommender systems, that requires fewer than 0.5 million extra parameters.
Findings
Achieves comparable performance to large reward models
Requires fewer than 0.5 million additional parameters
Reduces computational overhead during training and inference
Abstract
Conventional research on large language models (LLMs) has primarily focused on refining output distributions, while paying less attention to the decoding process that transforms these distributions into final responses. Recent advances, such as scaling inference-time computation with reward models, have underscored the importance of decoding, but these methods often suffer from high computational costs and limited applicability. In this paper, we revisit LLM generation through the lens of recommender systems, conceptualizing the decoding process as analogous to the ranking stage in recommendation pipelines. From this perspective, we observe that both traditional decoding methods and reward models exhibit clear limitations such as redundancy. Motivated by this insight, we propose Language Ranker, a novel framework that introduces a lightweight module to rerank candidate responses…
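The abstract above is truncated and does not say what features the ranker consumes or how it is built, so the following is only a minimal sketch of the general rerank-then-select idea. It assumes each candidate response already comes with a fixed-length feature vector and swaps in a plain linear scorer for illustration. The names score, rerank, and select_best are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def score(weights: Sequence[float], features: Sequence[float]) -> float:
    """Linear score for one candidate (assumed scorer, not the paper's module)."""
    return sum(w * x for w, x in zip(weights, features))


def rerank(
    weights: Sequence[float],
    candidates: Sequence[str],
    features: Sequence[Sequence[float]],
) -> List[str]:
    """Return the candidates sorted from highest to lowest score."""
    if len(candidates) != len(features):
        raise ValueError("need exactly one feature vector per candidate")
    order = sorted(
        range(len(candidates)),
        key=lambda i: score(weights, features[i]),
        reverse=True,
    )
    return [candidates[i] for i in order]


def select_best(
    weights: Sequence[float],
    candidates: Sequence[str],
    features: Sequence[Sequence[float]],
) -> str:
    """Pick the top-ranked candidate as the final response."""
    return rerank(weights, candidates, features)[0]


if __name__ == "__main__":
    # Toy inputs: three sampled responses with made-up 2-d feature vectors.
    weights = [1.0, 0.0]
    candidates = ["a", "b", "c"]
    features = [[0.2, 5.0], [0.9, 1.0], [0.5, 0.0]]
    print(rerank(weights, candidates, features))
    print(select_best(weights, candidates, features))
```

In this toy setup the only learned state is the weight vector, so the scorer has as many parameters as the feature dimension. That is the sense in which a ranker of this kind can stay far smaller than a separate reward model.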
