Self-Steering Optimization: Autonomous Preference Optimization for Large Language Models

Hao Xiang; Bowen Yu; Hongyu Lin; Keming Lu; Yaojie Lu; Xianpei Han; Ben He; Le Sun; Jingren Zhou; Junyang Lin

arXiv:2410.17131·cs.CL·June 12, 2025

TL;DR

This paper introduces Self-Steering Optimization ($SSO$), an autonomous method for generating high-quality preference data to improve large language model alignment without manual labeling.

Contribution

The paper presents $SSO$, a novel algorithm that autonomously produces on-policy preference data, enhancing alignment and reward optimization for large language models.

Findings

01. $SSO$ outperforms baselines in human preference alignment.

02. $SSO$ improves reward optimization across models.

03. The framework is scalable and effective for automated alignment.

Abstract

The key to effective alignment lies in high-quality preference data. Recent research has focused on automated alignment, which involves developing alignment systems with minimal human intervention. However, prior research has predominantly focused on developing data generation methods, while insufficient attention has been paid to quality control mechanisms. As a result, these methods often produce inaccurate and unhelpful data, leading to unpredictable benefits during iterative optimization. In this paper, we present Self-Steering Optimization ($SSO$), an algorithm that autonomously generates high-quality preference data, eliminating manual annotation requirements. $SSO$ employs a specialized optimization objective to build a data generator from the policy model itself, which is used to produce accurate and on-policy data. We demonstrate $SSO$'s effectiveness through comprehensive experiments on two series…

Code & Models

Repositories

icip-cas/sso (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques