Concept Distillation from Strong to Weak Models via Hypotheses-to-Theories Prompting
Emmanuel Aboah Boateng, Cassiano O. Becker, Nabiha Asghar, Kabir Walia, Ashwin Srinivasan, Ehi Nosakhare, Soundar Srinivasan, Victor Dibia

TL;DR
Concept Distillation (CD) is an automatic prompt optimization method that improves weaker language models: a strong model generates rules from the weak model's mistakes, and the rules are filtered by validation set performance before being added to the prompt. The method significantly boosts performance on complex tasks like code generation and mathematical reasoning.
Contribution
This paper introduces Concept Distillation, a novel automated prompt enhancement technique that effectively transfers knowledge from strong to weak models for complex tasks.
Findings
Mistral-7B's accuracy on Multi-Arith increased by 20%.
Phi-3-mini-3.8B's accuracy on HumanEval rose by 34%.
CD outperforms other automated prompt optimization methods.
Abstract
Hand-crafting high-quality prompts to optimize the performance of language models is a complicated and labor-intensive process. Furthermore, when migrating to newer, smaller, or weaker models (for example, to reduce latency or cost), prompts need to be updated to re-optimize task performance. We propose Concept Distillation (CD), an automatic prompt optimization technique for enhancing weaker models on complex tasks. CD involves: (1) collecting mistakes made by weak models with a base prompt (initialization), (2) using a strong model to generate reasons for these mistakes and create rules/concepts for weak models (induction), and (3) filtering these rules based on validation set performance and integrating them into the base prompt (deduction/verification). We evaluated CD on NL2Code and mathematical reasoning tasks, observing significant performance boosts for small and weaker…
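The three CD stages named in the abstract can be sketched as a short loop. This is a minimal illustration, not the authors' implementation: the `weak` and `strong` callables, their signatures, the exact-match scoring, the "Rule:" prompt format, and the greedy keep-if-accuracy-improves filter are all assumptions made here for the example. The abstract only states that rules are filtered by validation set performance.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, str]                     # (question, expected answer)
WeakModel = Callable[[str, str], str]         # hypothetical: (prompt, question) -> answer
StrongModel = Callable[[str, str, str], str]  # hypothetical: (question, wrong, expected) -> rule


def accuracy(weak: WeakModel, prompt: str, data: List[Example]) -> float:
    """Exact-match accuracy of the weak model under a given prompt."""
    if not data:
        return 0.0
    return sum(weak(prompt, q) == a for q, a in data) / len(data)


def concept_distillation(
    weak: WeakModel,
    strong: StrongModel,
    base_prompt: str,
    train: List[Example],
    val: List[Example],
) -> Tuple[str, List[str]]:
    # (1) Initialization: collect the weak model's mistakes under the base prompt.
    mistakes = []
    for question, expected in train:
        got = weak(base_prompt, question)
        if got != expected:
            mistakes.append((question, got, expected))

    # (2) Induction: ask the strong model for one rule per mistake, deduplicated.
    candidates: List[str] = []
    for question, got, expected in mistakes:
        rule = strong(question, got, expected)
        if rule not in candidates:
            candidates.append(rule)

    # (3) Deduction/verification: keep a rule only if it raises validation
    # accuracy (assumed greedy filter; the paper's criterion may differ).
    prompt = base_prompt
    best = accuracy(weak, prompt, val)
    kept: List[str] = []
    for rule in candidates:
        trial = prompt + "\nRule: " + rule
        score = accuracy(weak, trial, val)
        if score > best:
            prompt, best = trial, score
            kept.append(rule)
    return prompt, kept
```

In practice `weak` and `strong` would wrap calls to the two language models. Keeping them as plain callables makes the loop easy to exercise with stubs before spending any model budget.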
Taxonomy
Topics: Machine Learning and Data Classification · Data Stream Mining Techniques · Machine Learning and Algorithms
Methods: Sparse Evolutionary Training · Balanced Selection
