Teaching by Failure: Counter-Example-Driven Curricula for Transformer Self-Improvement
Harshil Vejendla

TL;DR
This paper presents Counter-Example-Driven Curricula (CEDC), an automated curriculum learning framework that makes Transformer models more robust by iteratively training them on their own failure cases. It reports up to 30x greater length extrapolation and 3.75x better computational efficiency than uniform augmentation.
Contribution
The paper introduces CEDC, a new automated, verifier-guided curriculum learning method that improves Transformer robustness without manual difficulty heuristics.
Findings
Up to 30x greater length extrapolation than static training and standard curriculum learning baselines
3.75x more computationally efficient than uniform augmentation
No manual difficulty heuristics needed
Abstract
Transformer models often exhibit brittle extrapolation, failing on inputs that are longer or structurally more complex than those seen during training. We introduce Counter-Example-Driven Curricula (CEDC), an automated framework that improves model robustness by iteratively focusing on its own failures. At each step, CEDC uses the current model to generate a diverse set of candidate problems, employs a fast, executable verifier to identify incorrect predictions (counter-examples), and then fine-tunes the model on a dataset enriched with these discovered failures. We evaluate CEDC on a suite of algorithmic and natural language tasks, including integer addition, sorting, Dyck-2 language recognition, and three text classification benchmarks. Compared to static training and standard curriculum learning baselines, CEDC achieves up to 30x greater length extrapolation, is 3.75x more…
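The loop the abstract describes (propose candidates, verify, fine-tune on the failures) can be sketched in a few lines. This is a toy illustration under stated assumptions, not the paper's implementation: `ToyModel` is a hypothetical stand-in for a Transformer that only "knows" operand lengths it has been trained on, candidates are drawn by a hypothetical random sampler `sample_problem` rather than generated by the model itself, and the names `cedc_round` and `run_cedc` are invented for this example. Only the verifier mirrors the abstract directly, since integer addition can be checked exactly by execution.

```python
import random
from typing import List, Set, Tuple

Problem = Tuple[int, int]  # a pair of operands for integer addition


def verifier(problem: Problem, prediction: int) -> bool:
    """Executable verifier: an addition answer is either exactly right or wrong."""
    a, b = problem
    return prediction == a + b


class ToyModel:
    """Hypothetical stand-in for a Transformer (assumption, not the paper's model).

    It answers correctly only for operand lengths seen during fine-tuning,
    which mimics brittle length extrapolation.
    """

    def __init__(self) -> None:
        self.known_lengths: Set[int] = set()

    def fine_tune(self, data: List[Problem]) -> None:
        for a, b in data:
            self.known_lengths.add(max(len(str(a)), len(str(b))))

    def predict(self, problem: Problem) -> int:
        a, b = problem
        if max(len(str(a)), len(str(b))) in self.known_lengths:
            return a + b
        return -1  # deliberately wrong on unseen lengths


def sample_problem(rng: random.Random, max_digits: int) -> Problem:
    """Hypothetical candidate generator: two operands with the same digit count."""
    n = rng.randint(1, max_digits)
    lo, hi = 10 ** (n - 1), 10 ** n - 1
    return rng.randint(lo, hi), rng.randint(lo, hi)


def cedc_round(model: ToyModel, base_data: List[Problem], rng: random.Random,
               max_digits: int, n_candidates: int = 200) -> List[Problem]:
    """One round: propose candidates, keep verified failures, fine-tune on them."""
    candidates = [sample_problem(rng, max_digits) for _ in range(n_candidates)]
    counter_examples = [p for p in candidates
                        if not verifier(p, model.predict(p))]
    # Train on the original data enriched with the discovered failures.
    model.fine_tune(base_data + counter_examples)
    return counter_examples


def run_cedc(rounds: int = 3, seed: int = 0) -> Tuple[ToyModel, List[int]]:
    rng = random.Random(seed)
    model = ToyModel()
    base_data = [sample_problem(rng, 2) for _ in range(50)]  # short inputs only
    model.fine_tune(base_data)
    history = []
    for _ in range(rounds):
        found = cedc_round(model, base_data, rng, max_digits=6)
        history.append(len(found))  # counter-examples found this round
    return model, history


if __name__ == "__main__":
    _, history = run_cedc()
    print("counter-examples per round:", history)
```

In this toy setting the number of counter-examples found per round should shrink as the failures are folded back into training, which is the behaviour the curriculum is meant to exploit. A real setup would replace `ToyModel.fine_tune` with gradient updates and would likely need to balance counter-examples against the base data.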
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Topic Modeling · Explainable Artificial Intelligence (XAI)
