Self-Generative Adversarial Fine-Tuning for Large Language Models

Shiguang Wu; Yaqing Wang; Quanming Yao

arXiv:2602.01137 · cs.LG · February 3, 2026

Open Access

TL;DR

This paper introduces SGALM, a fine-tuning framework that casts alignment as a generative adversarial game within a single large language model, needs no external reward model, and reports state-of-the-art results.

Contribution

SGALM is the first unified adversarial fine-tuning method that jointly evolves generation and discrimination in a single LLM for alignment tasks.

Findings

01. Achieves state-of-the-art alignment performance.

02. Operates without external reward models.

03. Serves as a robust synthetic data generator.

Abstract

Fine-tuning large language models (LLMs) for alignment typically relies on supervised fine-tuning or reinforcement learning from human feedback, both limited by the cost and scarcity of high-quality annotations. Recent self-play and synthetic data approaches reduce this dependence but often rely on heuristic assumptions or ungrounded self-evaluation, which can cause bias accumulation and performance drift. In this paper, we propose Self-Generative Adversarial LLM (SGALM), a unified fine-tuning framework that formulates alignment as a generative adversarial game within a single LLM. SGALM jointly evolves generation and discrimination capabilities without external reward models. Theoretical and empirical results demonstrate that SGALM achieves state-of-the-art performance and serves as both an effective alignment algorithm and a robust synthetic data engine.
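
The abstract does not state SGALM's objective, so the following is background only: a minimal sketch of the classic GAN losses that an adversarial game of this kind is commonly built on. The function names and the example probabilities are hypothetical and are not taken from the paper. In SGALM both roles are played by the same LLM, which this toy example does not model.

```python
import math

EPS = 1e-12  # guards against log(0); an illustrative choice


def discriminator_loss(p_real, p_fake):
    """Classic GAN discriminator loss (binary cross-entropy).

    p_real: probabilities the discriminator assigns to "real" for reference samples.
    p_fake: probabilities it assigns to "real" for generated samples.
    Hypothetical helper, not SGALM's actual objective.
    """
    real_term = -sum(math.log(p + EPS) for p in p_real) / len(p_real)
    fake_term = -sum(math.log(1.0 - p + EPS) for p in p_fake) / len(p_fake)
    return real_term + fake_term


def generator_loss(p_fake):
    """Non-saturating generator loss: reward samples the discriminator rates as real."""
    return -sum(math.log(p + EPS) for p in p_fake) / len(p_fake)


if __name__ == "__main__":
    # An undecided discriminator (all 0.5) versus a confident, correct one.
    print(discriminator_loss([0.5, 0.5], [0.5, 0.5]))
    print(discriminator_loss([0.9, 0.8], [0.1, 0.2]))
    # The generator's loss falls as its samples fool the discriminator more often.
    print(generator_loss([0.1, 0.2]))
    print(generator_loss([0.9, 0.8]))
```

The two losses pull in opposite directions on the generated samples: the discriminator is rewarded for scoring them low, the generator for getting them scored high. That opposition is what makes the setup a game rather than a single optimization problem.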

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Artificial Intelligence in Healthcare and Education