A Survey on Progress in LLM Alignment from the Perspective of Reward Design

Miaomiao Ji; Yanqiu Wu; Zhibin Wu; Shoujin Wang; Jian Yang; Mark Dras; Usman Naseem

arXiv:2505.02666·cs.CL·September 3, 2025

TL;DR

This survey reviews recent advances in large language model (LLM) alignment through the lens of reward design. It highlights two shifts, from RL-based to RL-free optimization and from single-task to multi-objective settings, and provides a structured taxonomy along with practical insights.

Contribution

It offers a comprehensive taxonomy of reward mechanisms and clarifies the evolution of reward design strategies in LLM alignment research.

Findings

01. Shift from RL-based to RL-free optimization methods
02. Transition from single-task to multi-objective alignment
03. Development of a macro-level reward mechanism taxonomy

Abstract

Reward design plays a pivotal role in aligning large language models (LLMs) with human values, serving as the bridge between feedback signals and model optimization. This survey provides a structured organization of reward modeling and addresses three key aspects: mathematical formulation, construction practices, and interaction with optimization paradigms. Building on this, it develops a macro-level taxonomy that characterizes reward mechanisms along complementary dimensions, thereby offering both conceptual clarity and practical guidance for alignment research. The progression of LLM alignment can be understood as a continuous refinement of reward design strategies, with recent developments highlighting paradigm shifts from reinforcement learning (RL)-based to RL-free optimization and from single-task to multi-objective and complex settings.
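The contrast between RL-based and RL-free optimization mentioned in the abstract can be made concrete with a small sketch. The abstract itself does not name specific objectives, so the two losses below are assumptions chosen for illustration: a Bradley-Terry pairwise loss, commonly used to train an explicit reward model, and a DPO-style loss, where the reward is implicit in the policy's log-probability ratio against a reference model. All function names, parameter names, and the value of `beta` are hypothetical.

```python
import math


def bradley_terry_loss(r_chosen: float, r_rejected: float) -> float:
    """Pairwise loss -log(sigmoid(r_chosen - r_rejected)).

    Illustrative assumption: an explicit reward model scores both responses,
    and the loss is small when the preferred response scores higher.
    """
    margin = r_chosen - r_rejected
    # Numerically stable form of log(1 + exp(-margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def dpo_style_loss(
    logp_chosen: float,
    logp_rejected: float,
    ref_logp_chosen: float,
    ref_logp_rejected: float,
    beta: float = 0.1,  # hypothetical scaling factor
) -> float:
    """RL-free variant: the reward is implicit in policy/reference log-ratios.

    Implicit reward (assumed form): beta * (log pi(y|x) - log pi_ref(y|x)).
    The same pairwise loss is then applied to these implicit rewards.
    """
    implicit_chosen = beta * (logp_chosen - ref_logp_chosen)
    implicit_rejected = beta * (logp_rejected - ref_logp_rejected)
    return bradley_terry_loss(implicit_chosen, implicit_rejected)


if __name__ == "__main__":
    # Explicit reward model: chosen response scored higher than rejected.
    print(bradley_terry_loss(2.0, 0.0))
    # Implicit reward: policy raised the chosen log-prob relative to the reference.
    print(dpo_style_loss(-1.0, -3.0, -2.0, -2.0))
```

In the first function a separate reward model supplies the scores, which an RL algorithm would then optimize against. In the second, no separate reward model or RL loop is needed, since the preference signal is applied directly to the policy's log-probabilities.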

Taxonomy

Topics: Educational Reforms and Innovations