Generating Sequences by Learning to Self-Correct
Sean Welleck, Ximing Lu, Peter West, Faeze Brahman, Tianxiao Shen, Daniel Khashabi, Yejin Choi

TL;DR
Self-Correction is a method that enhances sequence generation by training a separate corrector to iteratively fix outputs from an imperfect generator, improving adherence to constraints across diverse tasks without updating the base model.
Contribution
The paper introduces Self-Correction, an approach that decouples generation from correction: a separate corrector iteratively refines the base generator's outputs using feedback, so the method also applies when the base model is too large or inaccessible to update.
Findings
Improves sequence generation quality across multiple tasks
Effective even with a smaller corrector model
Uses online training with scalar or natural language feedback
Abstract
Sequence generation applications require satisfying semantic constraints, such as ensuring that programs are correct, using certain keywords, or avoiding undesirable content. Language models, whether fine-tuned or prompted with few-shot demonstrations, frequently violate these constraints, and lack a mechanism to iteratively revise their outputs. Moreover, some powerful language models are of extreme scale or inaccessible, making it inefficient, if not infeasible, to update their parameters for task-specific adaptation. We present Self-Correction, an approach that decouples an imperfect base generator (an off-the-shelf language model or supervised sequence-to-sequence model) from a separate corrector that learns to iteratively correct imperfect generations. To train the corrector, we propose an online training procedure that can use either scalar or natural language feedback on…
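The decoupling described in the abstract can be pictured as a loop: a frozen generator proposes a sequence, and a separate corrector revises it until a constraint is met. The following is a minimal sketch under stated assumptions, not the paper's implementation. All names (`toy_generator`, `toy_corrector`, `keyword_coverage`, `self_correct`) are hypothetical, the corrector is a hand-written rule rather than a learned model, and the keyword constraint stands in for the kinds of constraints the abstract mentions.

```python
from typing import Callable, List


def keyword_coverage(text: str, keywords: List[str]) -> float:
    """Scalar feedback (assumed for illustration): fraction of required keywords present."""
    if not keywords:
        return 1.0
    words = text.lower().split()
    return sum(k.lower() in words for k in keywords) / len(keywords)


def toy_generator(prompt: str) -> str:
    """Stand-in for a frozen base generator; it ignores the constraints entirely."""
    return f"{prompt} is nice"


def toy_corrector(text: str, keywords: List[str]) -> str:
    """Stand-in for a learned corrector: appends one missing keyword per call."""
    words = text.lower().split()
    for k in keywords:
        if k.lower() not in words:
            return f"{text} {k}"
    return text


def self_correct(
    prompt: str,
    keywords: List[str],
    generator: Callable[[str], str],
    corrector: Callable[[str, List[str]], str],
    max_steps: int = 3,
) -> str:
    """Generate once, then revise up to max_steps times; the generator is never updated."""
    hypothesis = generator(prompt)
    for _ in range(max_steps):
        if keyword_coverage(hypothesis, keywords) >= 1.0:
            break  # constraint satisfied, stop correcting
        hypothesis = corrector(hypothesis, keywords)
    return hypothesis


if __name__ == "__main__":
    result = self_correct("the weather", ["sun", "rain"], toy_generator, toy_corrector)
    print(result, keyword_coverage(result, ["sun", "rain"]))
```

Only the corrector changes the text here; the generator is called exactly once and treated as a black box, which mirrors the motivation of working with models whose parameters cannot be updated.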
Taxonomy
Topics: Software Engineering Research · Software Testing and Debugging Techniques · Software Reliability and Analysis Research
Methods: Balanced Selection
