Enhancing Generalization in Chain of Thought Reasoning for Smaller Models
Maxwell J. Yin, Dingyi Jiang, Yongbing Chen, Boyu Wang, Charles Ling

TL;DR
This paper introduces PRADA, a novel fine-tuning framework that enhances the generalization of chain-of-thought reasoning in smaller language models by integrating domain-adversarial techniques, leading to improved performance and explainability.
Contribution
PRADA is the first to combine domain-adversarial fine-tuning with prompt engineering to improve CoT reasoning in smaller models, addressing knowledge distillation limitations.
Findings
PRADA significantly outperforms existing methods across various tasks.
Smaller LLMs with PRADA better align with domain knowledge.
PRADA improves the explainability of reasoning processes.
Abstract
Chain-of-Thought (CoT) reasoning in smaller language models is a challenging natural language processing problem, yet highly desirable in many real-life applications. Existing CoT knowledge distillation methods often suffer from overly conservative memorization in smaller LLMs, leading to low generalization confidence. As fully preserving the CoT ability of the teacher model is impossible, we hypothesize that adversarial CoT fine-tuning is crucial for developing smaller LLMs with robust CoT generalization. To this end, we propose PRompt-Assisted Domain-Adversarial fine-tuning (PRADA), a principled fine-tuning framework that integrates diverse CoT domains. Specifically, PRADA pioneers two CoT improvements in smaller LLMs: (1) Recovering, through domain-adversarial fine-tuning, the domain-invariant feature insight that is typically lost during distillation; (2) Enhancing the domain adaptability…
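The abstract does not say how PRADA implements its domain-adversarial objective. For readers new to the idea, the following is a minimal sketch of gradient reversal, a common way to train domain-invariant features; it is not the authors' method. The scalar model, the function name `domain_adversarial_grads`, and the parameters `w`, `v`, and `lam` are all hypothetical and chosen only for illustration.

```python
import math


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def domain_adversarial_grads(w, v, x, d, lam):
    """Gradients for one example under gradient reversal (illustrative only).

    Assumed toy model (not from the paper):
      feature            f = w * x
      domain classifier  p = sigmoid(v * f)
      domain loss        binary cross-entropy between p and domain label d
    """
    f = w * x                    # feature produced by the extractor
    p = sigmoid(v * f)           # predicted probability of domain 1
    err = p - d                  # derivative of the loss w.r.t. the logit

    # The domain classifier follows the ordinary gradient of its own loss.
    grad_v = err * f

    # The feature extractor receives the same gradient with its sign
    # flipped and scaled by lam, so a descent step on grad_w makes the
    # domains harder to tell apart.
    grad_w_plain = err * v * x
    grad_w = -lam * grad_w_plain
    return grad_w, grad_v
```

In this sketch, `lam` trades off how strongly the extractor is pushed toward domain-invariant features. A full setup would also add a task loss on the extractor, which is omitted here.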
Taxonomy
Topics: Advanced Text Analysis Techniques
Methods: Knowledge Distillation
