K-order Ranking Preference Optimization for Large Language Models

Shihao Cai; Chongming Gao; Yang Zhang; Wentao Shi; Jizhi Zhang; Keqin Bao; Qifan Wang; Fuli Feng

arXiv:2506.00441·cs.IR·June 3, 2025

TL;DR

This paper introduces K-order Ranking Preference Optimization (KPO), a novel method focusing on optimizing top-K ranking accuracy for large language models, improving relevance and robustness in real-world applications.

Contribution

The paper extends the Plackett-Luce model underlying list-wise DPO so that training optimizes top-K ranking consistency, determines K dynamically for each query, and uses curriculum learning to make training more efficient.

Findings

01. KPO outperforms existing methods in ranking accuracy.

02. KPO demonstrates high sample efficiency and robustness to noise.

03. Dynamic K adjustment improves ranking relevance.

Abstract

To adapt large language models (LLMs) to ranking tasks, existing list-wise methods, represented by list-wise Direct Preference Optimization (DPO), focus on optimizing partial-order or full-order list ranking consistency for LLMs to enhance their ranking abilities. However, we argue that optimizing top-K ranking consistency could be more appropriate for real-world applications. There are two main reasons: (1) users are typically concerned with only the top-K results, making top-K ranking more important, and (2) tail items often lack precise feedback, making top-K ranking more reliable. Based on this, we propose K-order Ranking Preference Optimization (KPO) by extending DPO's Plackett-Luce model to accommodate top-K rankings. Additionally, recognizing that the number of important items can vary across queries, we extend KPO to dynamically determine an appropriate K for different samples…
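To make the idea of a top-K Plackett-Luce objective concrete, here is a minimal sketch of the standard top-K Plackett-Luce negative log-likelihood: the first K items must appear in the given order ahead of everything else, while the order of the remaining items is left unconstrained. This is an illustration under stated assumptions, not the paper's exact loss. The function name `top_k_plackett_luce_nll` is hypothetical, and the scores are assumed to be generic real-valued item scores (in a DPO-style setup they might be scaled log-ratios between the policy and a reference model, but that choice is an assumption here).

```python
import math


def top_k_plackett_luce_nll(scores, k):
    """Negative log-likelihood of a top-k ordering under a Plackett-Luce model.

    scores: real-valued item scores, already sorted by the preferred ranking
            (index 0 is the most preferred item). Hypothetical input format.
    k:      number of leading positions whose order is enforced.
    """
    n = len(scores)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(scores)")

    nll = 0.0
    for i in range(k):
        # Items not yet placed: position i and everything ranked after it.
        remaining = scores[i:]
        # Numerically stable log-sum-exp over the remaining items.
        m = max(remaining)
        log_denominator = m + math.log(sum(math.exp(s - m) for s in remaining))
        # Log-probability that item i is chosen first among the remaining items.
        nll -= scores[i] - log_denominator
    return nll


# Scores that agree with the preferred order give a lower loss than
# scores that contradict it.
aligned = top_k_plackett_luce_nll([2.0, 1.0, 0.0], k=2)
reversed_ = top_k_plackett_luce_nll([0.0, 1.0, 2.0], k=2)
print(aligned < reversed_)
```

Setting `k` to the list length recovers the full-order Plackett-Luce likelihood, so in this sketch the top-K variant simply stops accumulating terms after the first K positions and ignores the relative order of tail items.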

Taxonomy

Topics: Rough Sets and Fuzzy Logic · Data Management and Algorithms