MARS: Margin-Aware Reward-Modeling with Self-Refinement

Payel Bhattacharjee; Osvaldo Simeone; Ravi Tandon

arXiv:2602.17658·cs.LG·February 20, 2026

Open Access

TL;DR

MARS introduces an adaptive augmentation method that focuses on uncertain preference pairs to improve reward model robustness, backed by theoretical guarantees and empirical gains.

Contribution

The paper proposes a margin-aware augmentation strategy that targets ambiguous preference pairs, improving the efficiency and robustness of reward model training.

Findings

01 Consistent performance improvements over uniform augmentation.

02 Theoretical guarantees on increased loss function curvature.

03 Enhanced reward model robustness through targeted data augmentation.
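
One way to read finding 02, as a sketch only: assume the reward model is trained with the standard Bradley-Terry logistic loss on the margin between the preferred and rejected responses (this summary does not state the loss, and the symbols below are illustrative, not taken from the paper). The per-pair curvature in the margin then peaks at zero margin, which is consistent with the intuition that low-margin pairs contribute the most curvature.

```latex
% Assumed setup: \Delta is the reward margin of one preference pair,
% \sigma is the logistic sigmoid, y_w / y_l are the preferred / rejected responses.
\[
  \Delta = r_\theta(x, y_w) - r_\theta(x, y_l), \qquad
  \ell(\Delta) = -\log \sigma(\Delta)
\]
\[
  \ell'(\Delta) = \sigma(\Delta) - 1, \qquad
  \ell''(\Delta) = \sigma(\Delta)\bigl(1 - \sigma(\Delta)\bigr) \le \tfrac{1}{4},
  \quad \text{with equality iff } \Delta = 0 .
\]
```

This is the scalar curvature with respect to the margin only; the paper's actual guarantee and its assumptions should be taken from the full text.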

Abstract

Reward modeling is a core component of modern alignment pipelines such as RLHF and RLAIF, underpinning policy optimization methods such as PPO and TRPO. However, training reliable reward models relies heavily on human-labeled preference data, which is costly and limited, motivating the use of data augmentation. Existing augmentation approaches typically operate at the representation or semantic level and remain agnostic to the reward model's estimation difficulty. In this paper, we propose MARS, an adaptive, margin-aware augmentation and sampling strategy that explicitly targets ambiguous cases and failure modes of the reward model. MARS concentrates augmentation on low-margin (ambiguous) preference pairs where the reward model is most uncertain, and iteratively refines the training distribution via hard-sample augmentation. We provide theoretical guarantees…
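
The idea of concentrating augmentation on low-margin pairs can be illustrated with a minimal sketch. Everything here is an assumption for illustration, not the paper's procedure: the function names, the softmax-over-negative-absolute-margin weighting, the `temperature` parameter, and the multinomial budget split are all hypothetical choices, since the abstract does not specify the sampling rule.

```python
import numpy as np


def low_margin_sampling_probs(r_chosen, r_rejected, temperature=1.0):
    """Turn reward margins into sampling probabilities that favor ambiguous pairs.

    Hypothetical weighting: softmax over -|margin| / temperature, so pairs whose
    margin is close to zero receive the most probability mass. A signed variant
    (using -margin) would also upweight pairs the model ranks the wrong way.
    """
    margins = np.asarray(r_chosen, dtype=float) - np.asarray(r_rejected, dtype=float)
    logits = -np.abs(margins) / temperature
    logits -= logits.max()  # shift for numerical stability before exponentiating
    weights = np.exp(logits)
    return margins, weights / weights.sum()


def allocate_augmentations(probs, budget, seed=0):
    """Split a total augmentation budget across pairs by drawing from `probs`."""
    rng = np.random.default_rng(seed)
    return rng.multinomial(budget, probs)


if __name__ == "__main__":
    # Toy reward scores for three preference pairs (made-up numbers).
    margins, probs = low_margin_sampling_probs([2.0, 0.1, -0.5], [0.0, 0.0, 0.5])
    counts = allocate_augmentations(probs, budget=100)
    print("margins:", margins)
    print("sampling probabilities:", probs)
    print("augmentations per pair:", counts)
```

In an iterative setup, one would presumably retrain the reward model on the augmented set, recompute the margins, and repeat; how the paper schedules these rounds is not stated in this summary.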

Taxonomy

Topics: Machine Learning and Data Classification · Reinforcement Learning in Robotics · Explainable Artificial Intelligence (XAI)