On Symmetric Losses for Robust Policy Optimization with Noisy Preferences

Soichiro Nishimori, Yu-Jie Zhang, Thanawat Lodkaew, and Masashi Sugiyama

arXiv:2505.24709 · cs.LG · June 2, 2025

TL;DR

This paper introduces Symmetric Preference Optimization (SymPO), a framework that uses symmetric losses to make policy optimization robust to noisy human preference labels, covering both reward modeling in RLHF and offline preference optimization.

Contribution

The paper treats reward modeling as a classification problem, which makes symmetric losses applicable, and proves that the learned reward remains rank-preserving under noisy preference labels, so policy optimization can still succeed.

Findings

01. SymPO outperforms traditional methods on noisy preference data.

02. Symmetric losses preserve reward ranking under label noise.

03. Theoretical analysis confirms robustness of SymPO.

Abstract

Optimizing policies based on human preferences is key to aligning language models with human intent. This work focuses on reward modeling, a core component in reinforcement learning from human feedback (RLHF), and offline preference optimization, such as direct preference optimization. Conventional approaches typically assume accurate annotations. However, real-world preference data often contains noise due to human errors or biases. We propose a principled framework for robust policy optimization under noisy preferences, viewing reward modeling as a classification problem. This allows us to leverage symmetric losses, known for their robustness to label noise in classification, leading to our Symmetric Preference Optimization (SymPO) method. We prove that symmetric losses enable successful policy optimization even under noisy labels, as the resulting reward remains rank-preserving -- a…
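
The robustness property the abstract relies on can be illustrated with a minimal sketch. It assumes the sigmoid loss as the symmetric loss and a single flip rate for preference labels; the function names (`sigmoid_loss`, `logistic_loss`, `noisy_expected_loss`) and the toy reward margins are hypothetical and are not taken from the paper or its repository. A loss is symmetric when l(z) + l(-z) is constant, where z stands for the reward margin between the preferred and the rejected response. Under that condition, the expected loss with flipped labels is an affine function of the clean loss, so the ordering of margins by loss is unchanged as long as the flip rate is below 0.5.

```python
import math


def sigmoid_loss(margin: float) -> float:
    """Sigmoid loss l(z) = 1 / (1 + exp(z)); symmetric since l(z) + l(-z) = 1."""
    return 1.0 / (1.0 + math.exp(margin))


def logistic_loss(margin: float) -> float:
    """Logistic loss l(z) = log(1 + exp(-z)); l(z) + l(-z) depends on z."""
    return math.log1p(math.exp(-margin))


def noisy_expected_loss(loss, margin: float, flip_rate: float) -> float:
    """Expected loss when the preference label is flipped with probability flip_rate."""
    return (1.0 - flip_rate) * loss(margin) + flip_rate * loss(-margin)


# Hypothetical reward margins r(x, y_preferred) - r(x, y_rejected).
margins = [-2.0, -0.5, 0.0, 1.0, 3.0]
flip_rate = 0.3  # assumed noise level, must be below 0.5

# Symmetry check: constant for the sigmoid loss, varying for the logistic loss.
sigmoid_sums = [sigmoid_loss(z) + sigmoid_loss(-z) for z in margins]
logistic_sums = [logistic_loss(z) + logistic_loss(-z) for z in margins]

# For a symmetric loss with constant 1:
#   noisy(z) = (1 - 2 * flip_rate) * clean(z) + flip_rate
affine_gap = max(
    abs(
        noisy_expected_loss(sigmoid_loss, z, flip_rate)
        - ((1.0 - 2.0 * flip_rate) * sigmoid_loss(z) + flip_rate)
    )
    for z in margins
)

# Ranking margins by clean loss and by noisy expected loss gives the same order.
clean_order = sorted(margins, key=sigmoid_loss)
noisy_order = sorted(
    margins, key=lambda z: noisy_expected_loss(sigmoid_loss, z, flip_rate)
)

print("sigmoid l(z) + l(-z):", sigmoid_sums)
print("logistic l(z) + l(-z):", logistic_sums)
print("max deviation from affine form:", affine_gap)
print("same ordering under noise:", clean_order == noisy_order)
```

The logistic loss is included only as a contrast: its sum l(z) + l(-z) changes with the margin, so the affine relationship above does not hold for it. The sketch checks the loss-level identity only and does not reproduce the SymPO objective or the paper's analysis.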

Code & Models

Repositories

nissymori/sympo (JAX, official)
