TPD: Enhancing Student Language Model Reasoning via Principle Discovery and Guidance
Haorui Wang (1), Rongzhi Zhang (1), Yinghao Li (1), Lingkai Kong (1), Yuchen Zhuang (1), Xiusi Chen (2), Chao Zhang (1) ((1) College of Computing, Georgia Institute of Technology, (2) Department of Computer Science, University of California, Los Angeles)

TL;DR
The paper introduces TPD, a principle-based teacher-student framework that enhances smaller language models' reasoning by mimicking human learning, leading to significant performance improvements without ongoing teacher intervention.
Contribution
The paper proposes a novel principle discovery-based teaching framework that improves student LLM reasoning without continuous teacher guidance or extensive fine-tuning.
Findings
TPD achieves a 6.2% average performance boost over chain-of-thought prompting.
The framework effectively guides student models using error-based principles.
Extensive experiments across eight reasoning tasks validate TPD's effectiveness.
Abstract
Large Language Models (LLMs) have recently showcased remarkable reasoning abilities. However, larger models often surpass their smaller counterparts in reasoning tasks, posing the challenge of effectively transferring these capabilities from larger models. Existing approaches heavily rely on extensive fine-tuning data or continuous interactions with a superior teacher LLM during inference. We introduce a principle-based teacher-student framework called "Teaching via Principle Discovery" (TPD) to address these limitations. Inspired by human learning mechanisms, TPD mimics the interaction between a teacher and a student using a principle-based approach. The teacher LLM generates problem-solving instructions and corrective principles based on the student LLM's errors. These principles guide the refinement of instructions and the selection of instructive examples from a validation set.…
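The loop described in the abstract (the student attempts problems, the teacher turns the errors into corrective principles, and the principles are folded back into the student's instructions) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `discover_principles` and `build_student_prompt`, the toy arithmetic task, and the stub `toy_student` and `toy_teacher` callables are all hypothetical stand-ins for real LLM calls, and the selection of instructive examples from the validation set is left out.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (question, gold answer)


def discover_principles(
    student: Callable[[str, str], str],
    teacher: Callable[[str, str, str], str],
    instruction: str,
    validation_set: List[Example],
) -> List[str]:
    """Collect one corrective principle per distinct student error (hypothetical helper)."""
    principles: List[str] = []
    for question, gold in validation_set:
        prediction = student(instruction, question)
        if prediction != gold:
            # The teacher is consulted only here, on the student's mistakes.
            principle = teacher(question, prediction, gold)
            if principle not in principles:
                principles.append(principle)
    return principles


def build_student_prompt(instruction: str, principles: List[str]) -> str:
    """Append the discovered principles to the original instruction."""
    lines = [instruction, "Principles to follow:"]
    lines += [f"- {p}" for p in principles]
    return "\n".join(lines)


# Toy stand-ins for the two models (assumptions, not real LLMs).
def toy_student(prompt: str, question: str) -> str:
    a, op, b = question.split()
    if op == "-" and "subtract" not in prompt:
        return str(int(a) + int(b))  # systematic error: treats '-' as '+'
    return str(int(a) + int(b) if op == "+" else int(a) - int(b))


def toy_teacher(question: str, prediction: str, gold: str) -> str:
    return "When you see '-', subtract the second number from the first."


instruction = "Solve the arithmetic problem."
validation_set = [("2 + 3", "5"), ("7 - 4", "3"), ("9 - 1", "8")]

principles = discover_principles(toy_student, toy_teacher, instruction, validation_set)
prompt = build_student_prompt(instruction, principles)

# After discovery, the student runs on the refined prompt without the teacher.
answer = toy_student(prompt, "7 - 4")
```

In this sketch the teacher is called only during the discovery phase, which mirrors the summary's point that the student later works without ongoing teacher intervention.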
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Software Engineering Research
