TL;DR
RewardHarness is a self-evolving, agentic reward framework for evaluating instruction-guided image edits with minimal human data. It reaches 47.4% accuracy using only 0.05% of the preference data and surpasses GPT-5 by 5.3 points.
Contribution
RewardHarness reframes reward modeling as context evolution rather than weight optimization: it iteratively evolves a library of tools and skills from a small set of preference demonstrations instead of relying on large-scale annotation.
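As a rough illustration of what "context evolution" could look like, the sketch below grows a small skill library by keeping a proposed skill only when it improves agreement with a handful of preference demonstrations. Everything in it is an assumption made for illustration: the `Skill` and `Demo` types, the feature names, the additive scoring, and the greedy acceptance rule are hypothetical and are not taken from the paper, whose actual Orchestrator, tools, and evolution procedure are not described in this summary.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical types: a candidate edit is a dict of hand-made features,
# a demonstration is (candidate_a, candidate_b, index of the human-preferred one),
# and a "skill" is a named scoring function over a candidate.
Candidate = Dict[str, float]
Demo = Tuple[Candidate, Candidate, int]
Skill = Tuple[str, Callable[[Candidate], float]]


def predict(library: List[Skill], a: Candidate, b: Candidate) -> int:
    """Return 0 if the library scores a at least as high as b, else 1."""
    score_a = sum(fn(a) for _, fn in library)
    score_b = sum(fn(b) for _, fn in library)
    return 0 if score_a >= score_b else 1


def accuracy(library: List[Skill], demos: List[Demo]) -> float:
    """Fraction of demonstrations where the prediction matches the human label."""
    hits = sum(predict(library, a, b) == label for a, b, label in demos)
    return hits / len(demos)


def evolve(library: List[Skill], proposals: List[Skill], demos: List[Demo]):
    """Greedy stand-in for evolution: accept a skill only if accuracy strictly improves."""
    best = accuracy(library, demos)
    for skill in proposals:
        trial = library + [skill]
        acc = accuracy(trial, demos)
        if acc > best:
            library, best = trial, acc
    return library, best


# Toy demonstrations with made-up features (not real benchmark data).
demos: List[Demo] = [
    ({"follows_instruction": 1.0, "artifacts": 0.2},
     {"follows_instruction": 0.0, "artifacts": 0.0}, 0),
    ({"follows_instruction": 1.0, "artifacts": 0.9},
     {"follows_instruction": 1.0, "artifacts": 0.1}, 1),
    ({"follows_instruction": 0.0, "artifacts": 0.0},
     {"follows_instruction": 1.0, "artifacts": 0.3}, 1),
]

proposals: List[Skill] = [
    ("instruction_following", lambda c: c["follows_instruction"]),
    ("artifact_penalty", lambda c: -c["artifacts"]),
    ("constant", lambda c: 1.0),  # uninformative; should be rejected
]

library, acc = evolve([], proposals, demos)
print([name for name, _ in library], acc)
```

On this toy data the two informative skills are accepted and the constant skill is rejected, because it does not change any pairwise comparison. The model weights never change; only the library (the context) does, which is the contrast with weight optimization that the framing draws.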
Findings
Achieves 47.4% accuracy on image-editing benchmarks using only 0.05% of preference data.
Surpasses GPT-5 by 5.3 points in evaluation accuracy.
Enhances reward modeling efficiency through self-evolving tool and skill libraries.
Abstract
Evaluating instruction-guided image edits requires rewards that reflect subtle human preferences, yet current reward models typically depend on large-scale preference annotation and additional model training. This creates a data-efficiency gap: humans can often infer the target evaluation criteria from only a few examples, while models are usually trained on hundreds of thousands of comparisons. We present RewardHarness, a self-evolving agentic reward framework that reframes reward modeling as context evolution rather than weight optimization. Instead of learning from large-scale annotations, RewardHarness aligns with human preferences by iteratively evolving a library of tools and skills from as few as 100 preference demonstrations. Given a source image, candidate edited images, and an editing instruction, an Orchestrator selects the most relevant subset of tools and skills from the…
