The Capabilities and Limitations of Weak-to-Strong Generalization: Generalization and Calibration
Wei Yao, Wenkai Yang, Gengze Xu, Ziqiao Wang, Yankai Lin, Yong Liu

TL;DR
This paper provides a theoretical analysis of weak-to-strong generalization, highlighting the importance of the weak model's quality and calibration, and identifies conditions under which strong models can outperform their weak teachers in classification and regression tasks.
Contribution
It offers new theoretical bounds on generalization and calibration errors, and extends existing work to KL divergence-based loss functions, with experimental validation.
Findings
The weak model should generalize well and produce well-calibrated predictions.
Over-optimizing the strong model on weak supervision can harm its performance.
Strong models can outperform weak teachers when disagreement is bounded.
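The two quantities these findings revolve around, a KL divergence between weak and strong predictions and a calibration error, can be made concrete with a minimal sketch. Everything here is an illustrative assumption rather than the paper's formulation: the helper names are hypothetical, the calibration measure is the common binned expected calibration error (ECE), and the direction of the KL term used in the paper is not stated in this summary.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists of probabilities.

    Hypothetical helper: p could be the weak model's soft label and q the
    strong model's prediction, or the reverse, depending on the objective.
    """
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0  # terms with p_i = 0 contribute nothing
    )


def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: bin-weighted mean of |accuracy - average confidence|.

    confidences: predicted probability of the chosen class, each in [0, 1].
    correct: booleans, True where the prediction matched the true label.
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece
```

A well-calibrated weak model in this sense has an ECE close to zero: when it reports 95% confidence, it is right about 95% of the time. A weak teacher that is confidently wrong would pass those errors on through whatever KL-based objective the strong model minimizes.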
Abstract
Weak-to-strong generalization, where weakly supervised strong models outperform their weaker teachers, offers a promising approach to aligning superhuman models with human values. To deepen the understanding of this approach, we provide theoretical insights into its capabilities and limitations. First, in the classification setting, we establish upper and lower generalization error bounds for the strong model, identifying the primary limitations as stemming from the weak model's generalization error and the optimization objective itself. Additionally, we derive lower and upper bounds on the calibration error of the strong model. These theoretical bounds reveal two critical insights: (1) the weak model should demonstrate strong generalization performance and maintain well-calibrated predictions, and (2) the strong model's training process must strike a careful balance, as excessive…
Taxonomy
Topics: Medical Image Segmentation Techniques · Neural Networks and Applications · Image Processing Techniques and Applications
