Learning Robust Reasoning through Guided Adversarial Self-Play

Shuozhe Li; Vaishnav Tadiparthi; Kwonjoon Lee; Nakul Agarwal; Hossein Nourkhiz Mahjoub; Ehsan Moradi Pari; Lizhang Chen; Amy Zhang; Liu Leqi

arXiv:2602.00173·cs.LG·February 3, 2026

Open Access

TL;DR

GASP is a training method that makes reasoning models more robust by teaching them to detect, diagnose, and repair corrupted reasoning through adversarial self-play, improving reliability when the conditioning context is fallible.

Contribution

The paper introduces GASP, an adversarial self-play training approach that improves the robustness of reasoning models using only outcome verification, without human labels or external teachers.

Findings

01  GASP improves robustness of models against corrupted contexts.

02  Adversarial corruptions create an effective curriculum for training.

03  In-distribution repair guidance accelerates recovery learning.

Abstract

Reinforcement learning from verifiable rewards (RLVR) produces strong reasoning models, yet they can fail catastrophically when the conditioning context is fallible (e.g., corrupted chain-of-thought, misleading partial solutions, or mild input perturbations), since standard RLVR optimizes final-answer correctness only under clean conditioning. We introduce GASP (Guided Adversarial Self-Play), a robustification method that explicitly trains detect-and-repair capabilities using only outcome verification. Without human labels or external teachers, GASP forms an adversarial self-play game within a single model: a polluter learns to induce failure via locally coherent corruptions, while an agent learns to diagnose and recover under the same corrupted conditioning. To address the scarcity of successful recoveries early in training, we propose in-distribution repair guidance, an imitation term…
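The polluter/agent game described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the names `verify`, `self_play_rewards`, and `play_round`, the exact-match verifier, and the zero-sum 0/1 reward split are all assumptions made for illustration, and in GASP both roles are played by a single model rather than by separate callables. The sketch only shows how a single outcome check on the final answer can score both roles without any step-level labels.

```python
from typing import Callable, Tuple


def verify(final_answer: str, reference: str) -> bool:
    """Outcome verification (assumed exact match): only the final answer is checked."""
    return final_answer.strip() == reference.strip()


def self_play_rewards(final_answer: str, reference: str) -> Tuple[float, float]:
    """Return (agent_reward, polluter_reward).

    Assumption: a zero-sum 0/1 split. The agent is rewarded when it recovers
    the correct answer; the polluter is rewarded when the agent fails.
    """
    agent_reward = 1.0 if verify(final_answer, reference) else 0.0
    return agent_reward, 1.0 - agent_reward


def play_round(
    question: str,
    clean_prefix: str,
    reference: str,
    polluter: Callable[[str, str], str],
    agent: Callable[[str, str], str],
) -> Tuple[float, float]:
    """One hypothetical round: corrupt the context, then answer under it."""
    corrupted_prefix = polluter(question, clean_prefix)
    final_answer = agent(question, corrupted_prefix)
    return self_play_rewards(final_answer, reference)


# Toy stand-ins for the two roles (hypothetical, hard-coded behaviour).
def toy_polluter(question: str, prefix: str) -> str:
    # Inject a locally plausible arithmetic error into the partial solution.
    return prefix.replace("2 + 2 = 4", "2 + 2 = 5")


def trusting_agent(question: str, prefix: str) -> str:
    # Copies whatever result the (possibly corrupted) context states.
    return prefix.rsplit("= ", 1)[-1]


def repairing_agent(question: str, prefix: str) -> str:
    # Ignores the context and recomputes the answer from the question.
    return str(2 + 2)
```

With these stand-ins, `play_round("What is 2 + 2?", "2 + 2 = 4", "4", toy_polluter, trusting_agent)` rewards the polluter, while the same round with `repairing_agent` rewards the agent. That contrast is the detect-and-repair behaviour the abstract says training is meant to encourage.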

Taxonomy

Topics: Reinforcement Learning in Robotics · Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning