Didactic to Constructive: Turning Expert Solutions into Learnable Reasoning
Ethan Mendes, Jungsoo Park, Alan Ritter

TL;DR
This paper introduces Distribution Aligned Imitation Learning (DAIL), a method that converts expert solutions into detailed, in-distribution reasoning traces. It improves the reasoning of large language models while requiring only a small amount of expert data.
Contribution
The paper proposes DAIL, a two-step approach: it first converts expert solutions into in-distribution reasoning traces, then applies contrastive learning to them. This lets a model learn better reasoning from fewer expert samples.
Findings
Achieves 10-25% pass@k improvements on Qwen models.
Improves reasoning efficiency by 2x to 4x.
Enables out-of-domain generalization.
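For readers unfamiliar with the pass@k metric cited in the findings, the sketch below shows one common way to estimate it: the unbiased combinatorial estimator widely used in code-generation and reasoning benchmarks. This is an assumption about how such a metric is typically computed, not a description of the paper's own evaluation code, and the function name `pass_at_k` and its parameters are illustrative.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k from n sampled solutions, c of which are correct.

    Assumed formula (the standard unbiased estimator):
        pass@k = 1 - C(n - c, k) / C(n, k)
    i.e. one minus the probability that a random size-k subset of the
    n samples contains no correct solution.
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # math.comb returns 0 when k > n - c, so the estimate becomes 1.0
    # whenever every size-k subset must contain a correct sample.
    return 1.0 - comb(n - c, k) / comb(n, k)


if __name__ == "__main__":
    # Hypothetical numbers for illustration only: 16 samples, 2 correct.
    for k in (1, 4, 16):
        print(f"pass@{k} = {pass_at_k(16, 2, k):.3f}")
```

Per-problem estimates are usually averaged over the benchmark. Whether the reported 10-25% gain is an absolute or a relative change in this average is not stated in the summary above.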
Abstract
Improving the reasoning capabilities of large language models (LLMs) typically relies either on the model's ability to sample a correct solution to be reinforced or on the existence of a stronger model able to solve the problem. However, many difficult problems remain intractable for even current frontier models, preventing the extraction of valid training signals. A promising alternative is to leverage high-quality expert human solutions, yet naive imitation of this data fails because it is fundamentally out of distribution: expert solutions are typically didactic, containing implicit reasoning gaps intended for human readers rather than computational models. Furthermore, high-quality expert solutions are expensive, necessitating generalizable sample-efficient training methods. We propose Distribution Aligned Imitation Learning (DAIL), a two-step method that bridges the distributional…
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Artificial Intelligence in Healthcare and Education
