Temporal Self-Rewarding Language Models: Decoupling Chosen-Rejected via Past-Future

Yidong Wang; Xin Wang; Cunxiang Wang; Junfeng Fang; Qiufeng Wang; Jianing Chu; Xuran Meng; Shuxun Yang; Libo Qin; Yue Zhang; Wei Ye; Shikun Zhang

arXiv:2508.06026·cs.CL·August 11, 2025

TL;DR

This paper introduces Temporal Self-Rewarding Language Models that decouple chosen and rejected responses over time, leading to improved preference learning and better performance across multiple tasks.

Contribution

It proposes a novel dual-phase framework that coordinates past, present, and future model generations to enhance self-rewarding training methods.

Findings

01 Significant performance improvements on benchmark tasks.

02 Better out-of-distribution generalization across diverse tasks.

03 Outperforms baseline self-rewarding methods with the same resources.

Abstract

Self-Rewarding Language Models propose an architecture in which a Large Language Model (LLM) both generates responses and evaluates its own outputs via LLM-as-a-Judge prompting, dynamically improving its generative capabilities through iterative Direct Preference Optimization (DPO). However, our analysis reveals a critical limitation in existing Self-Rewarding paradigms: the synchronized improvement of chosen and rejected responses progressively narrows the representational difference between contrasting samples, undermining effective preference learning. We propose Temporal Self-Rewarding Language Models that strategically coordinate past, present, and future model generations to sustain learning signals. Our dual-phase framework introduces: (1) Anchored Rejection: fixing rejected responses using the past initial model's outputs and (2) Future-Guided…
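As a rough illustration of the Anchored Rejection idea named in the abstract, the sketch below builds preference pairs whose rejected side comes from the initial model's outputs and scores a pair with the standard DPO objective. It is a minimal sketch, not the authors' implementation. The function names, the `judge` callable, and the rule of taking the highest-scored current response as chosen and the lowest-scored initial response as rejected are all assumptions made for illustration; the abstract only states that rejected responses are fixed to the past initial model's outputs.

```python
import math
from typing import Callable, Dict, List


def build_anchored_pairs(
    prompts: List[str],
    current_responses: Dict[str, List[str]],
    initial_responses: Dict[str, List[str]],
    judge: Callable[[str, str], float],
) -> List[Dict[str, str]]:
    """Pair each prompt's chosen response with a rejected response from the initial model.

    Assumption (hypothetical selection rule): chosen = best-scored response of the
    current model, rejected = worst-scored response of the initial model.
    """
    pairs = []
    for prompt in prompts:
        chosen = max(current_responses[prompt], key=lambda r: judge(prompt, r))
        rejected = min(initial_responses[prompt], key=lambda r: judge(prompt, r))
        pairs.append({"prompt": prompt, "chosen": chosen, "rejected": rejected})
    return pairs


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,
) -> float:
    """Standard DPO loss for one pair: -log(sigmoid(beta * log-ratio margin))."""
    margin = beta * (
        (policy_chosen_logp - ref_chosen_logp)
        - (policy_rejected_logp - ref_rejected_logp)
    )
    # -log(sigmoid(m)) == softplus(-m), computed in a numerically stable form.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


def toy_judge(prompt: str, response: str) -> float:
    """Hypothetical stand-in for an LLM-as-a-Judge score: longer is better."""
    return float(len(response))


pairs = build_anchored_pairs(
    prompts=["q"],
    current_responses={"q": ["a", "abcd"]},
    initial_responses={"q": ["xy", "x"]},
    judge=toy_judge,
)
```

Because the rejected side is drawn from a fixed earlier model in this sketch, it does not improve from one iteration to the next, which is how the abstract describes keeping the chosen and rejected samples from converging.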

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques