PLOT: Enhancing Preference Learning via Optimal Transport

Liang Zhu; Yuelin Bai; Xiankun Ren; Jiaxi Yang; Lei Zhang; Feiteng Fang; Hamid Alinejad-Rokny; Minghuan Tan; Min Yang

arXiv:2604.01837 · cs.CL · April 3, 2026

TL;DR

PLOT introduces a novel token-level preference learning method for LLMs using Optimal Transport, improving alignment with human preferences while maintaining stability and semantic coherence.

Contribution

It formulates preference learning as an Optimal Transport problem, enabling globally informed, stable, and robust fine-tuning of language models.

Findings

01. PLOT outperforms existing methods in preference alignment tasks.

02. It maintains fluency and coherence in generated text.

03. Experiments cover diverse preference categories and subpreferences.

Abstract

Preference learning in Large Language Models (LLMs) has advanced significantly, yet existing methods remain limited by modest performance gains, high computational costs, hyperparameter sensitivity, and insufficient modeling of global token-level relationships. We introduce PLOT, which enhances Preference Learning in fine-tuning-based alignment through a token-level loss derived from Optimal Transport. By formulating preference learning as an Optimal Transport Problem, PLOT aligns model outputs with human preferences while preserving the original distribution of LLMs, ensuring stability and robustness. Furthermore, PLOT leverages token embeddings to capture semantic relationships, enabling globally informed optimization. Experiments across two preference categories - Human Values and Logic & Problem Solving - spanning seven subpreferences demonstrate that PLOT consistently improves…
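
To make the Optimal Transport idea concrete, here is a minimal sketch of an entropy-regularized transport cost between two token distributions, with the ground cost taken from distances between token embeddings. This is not the PLOT loss itself, which the abstract does not spell out. The Sinkhorn solver, the squared-Euclidean cost, the function names `embedding_cost` and `sinkhorn_cost`, the regularization value, and the toy four-token vocabulary are all assumptions chosen for illustration.

```python
import numpy as np


def embedding_cost(emb):
    """Pairwise squared Euclidean distance between token embeddings,
    scaled to [0, 1]. The choice of distance is an assumption."""
    diff = emb[:, None, :] - emb[None, :, :]
    cost = np.sum(diff ** 2, axis=-1)
    return cost / cost.max()


def sinkhorn_cost(p, q, cost, reg=0.5, n_iters=200):
    """Entropy-regularized OT between histograms p and q (Sinkhorn iterations).

    Returns the transport cost <plan, cost> and the transport plan.
    """
    kernel = np.exp(-cost / reg)
    u = np.ones_like(p)
    for _ in range(n_iters):
        v = q / (kernel.T @ u)   # scale columns toward marginal q
        u = p / (kernel @ v)     # scale rows toward marginal p
    plan = u[:, None] * kernel * v[None, :]
    return float(np.sum(plan * cost)), plan


# Hypothetical toy vocabulary of 4 tokens with 2-d embeddings.
# Tokens 0-2 are close to each other; token 3 is far away.
emb = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [3.0, 3.0]])
cost = embedding_cost(emb)

# Hypothetical token distributions: p stands in for the model output,
# q for a preferred target.
p = np.array([0.7, 0.1, 0.1, 0.1])
q = np.array([0.1, 0.1, 0.1, 0.7])

cross_cost, plan = sinkhorn_cost(p, q, cost)
self_cost, _ = sinkhorn_cost(p, p, cost)
print("cost(p, q):", cross_cost)
print("cost(p, p):", self_cost)
```

Because the ground cost comes from embeddings, shifting probability mass to a semantically distant token is penalized more than shifting it to a nearby one, which is one way a loss can take global token relationships into account. In a training setting the same computation would be written in a differentiable framework so that gradients flow through the transport cost.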
