Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision
Zhiqing Sun, Longhui Yu, Yikang Shen, Weiyang Liu, Yiming Yang, Sean Welleck, Chuang Gan

TL;DR
This paper introduces a scalable alignment method in which reward models trained on easy tasks are used to evaluate and improve performance on harder tasks, with the goal of continuing to improve AI systems on complex reasoning beyond the level covered by human supervision.
Contribution
The paper proposes an easy-to-hard generalization approach: an evaluator (reward model) is trained on human annotations for easier problems and then used to score candidate solutions to harder problems, for which no human supervision is used.
Findings
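Reward models trained on easy tasks can effectively score candidate solutions to harder tasks.
The approach achieves 34.0% accuracy on MATH500 while using human supervision only on easier problems.
The results suggest a way to keep improving AI systems on complex reasoning tasks beyond the level of the available human supervision.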
Abstract
Current AI alignment methodologies rely on human-provided demonstrations or judgments, and the learned capabilities of AI systems would be upper-bounded by human capabilities as a result. This raises a challenging research question: How can we keep improving the systems when their capabilities have surpassed the levels of humans? This paper answers this question in the context of tackling hard reasoning tasks (e.g., level 4-5 MATH problems) via learning from human annotations on easier tasks (e.g., level 1-3 MATH problems), which we term as easy-to-hard generalization. Our key insight is that an evaluator (reward model) trained on supervisions for easier tasks can be effectively used for scoring candidate solutions of harder tasks and hence facilitating easy-to-hard generalization over different levels of tasks. Based on this insight, we propose a novel approach to scalable alignment,…
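The key insight above, that an evaluator trained on easier problems can score candidate solutions to harder ones, can be illustrated with a minimal Python sketch. This is not the authors' implementation: `toy_score` is a hypothetical stand-in for a trained reward model, the candidate solutions are made up, and the two selection rules (best-of-n and score-weighted voting) are common generic ways to use such an evaluator, assumed here for illustration only.

```python
from collections import defaultdict
from typing import Callable, Sequence, Tuple

# A candidate is (solution_text, final_answer). Both fields are illustrative.
Candidate = Tuple[str, str]


def best_of_n(candidates: Sequence[Candidate], score: Callable[[str], float]) -> str:
    """Return the final answer of the single highest-scoring candidate."""
    best = max(candidates, key=lambda c: score(c[0]))
    return best[1]


def weighted_vote(candidates: Sequence[Candidate], score: Callable[[str], float]) -> str:
    """Sum evaluator scores per distinct final answer and return the top answer."""
    totals = defaultdict(float)
    for text, answer in candidates:
        totals[answer] += score(text)
    return max(totals, key=totals.get)


def toy_score(solution_text: str) -> float:
    # Hypothetical stand-in for a reward model trained only on easy problems.
    # It simply prefers solutions that mention a verification step.
    return 1.0 if "check" in solution_text else 0.4


# Made-up candidate solutions for one hard problem.
candidates = [
    ("guess 12", "12"),
    ("compute, then check: 15", "15"),
    ("guess 12 again", "12"),
]

print(best_of_n(candidates, toy_score))
print(weighted_vote(candidates, toy_score))
```

With a uniform scorer, `weighted_vote` reduces to plain majority voting over final answers; the evaluator changes the outcome only when it assigns enough extra weight to a minority answer.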
Taxonomy
Topics: Image Retrieval and Classification Techniques
