Socratic Students: Teaching Language Models to Learn by Asking Questions
Rajeev Bhatt Ambati, Tianyi Niu, Aashu Singh, Shlok Mishra, Snigdha Chaturvedi, Shashank Srivastava

TL;DR
This paper introduces a method for training language models to ask questions interactively, improving their reasoning skills and efficiency in solving complex tasks without human labels.
Contribution
It proposes ODQS, a training framework that optimizes question asking based on task outcomes, enhancing model performance in reasoning-heavy domains.
Findings
Significant performance improvements on GSM8K, HumanEval, and OpenCoder datasets.
Boosted Pass@5 by up to 54.7% on math and 22.9% on coding tasks.
Reduced the number of interaction turns needed to reach baseline performance.
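For readers unfamiliar with the metric, Pass@k is the probability that at least one of k sampled solutions is correct. The summary does not say how the paper computes it, so the sketch below assumes the common unbiased estimator for code-generation benchmarks: draw n samples, count c correct ones, and compute 1 - C(n-c, k) / C(n, k). The function name and arguments are illustrative and do not come from the paper.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate Pass@k from n sampled solutions, c of which are correct.

    Assumption: this is the standard unbiased estimator
    1 - C(n - c, k) / C(n, k); the paper's exact procedure is not
    given in this summary.
    """
    if not (0 <= c <= n):
        raise ValueError("c must satisfy 0 <= c <= n")
    if not (1 <= k <= n):
        raise ValueError("k must satisfy 1 <= k <= n")
    # With fewer than k incorrect samples, every size-k subset
    # necessarily contains at least one correct solution.
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```

Under this estimator, a problem with 10 samples and 1 correct solution has a Pass@5 of 0.5, since exactly half of the 5-element subsets contain the correct sample.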
Abstract
Large Language Models (LLMs) are usually used to answer questions, but many high-stakes applications (e.g., tutoring, clinical support) require the complementary skill of asking questions: detecting missing information, requesting clarifications, and using them to solve tasks. We study this skill in reasoning-heavy domains where progress depends on inquiry rather than factual recall. We define an interactive protocol where a student model engages a stronger teacher under a small turn budget. After each teacher reply, we evaluate the student on the original task with Pass@k. We propose Outcome-Driven Question Optimization Strategy (ODQS), a training framework that learns a questioning policy from downstream task outcomes. At each turn, we sample multiple candidate questions, query the teacher with each, and then score the student's resulting performance. Using these scores, we train the…
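The per-turn scoring step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the three callables (`sample_questions`, `ask_teacher`, `evaluate_student`) are hypothetical stand-ins for the student sampler, the teacher model, and a Pass@k-style evaluator, and the sketch stops at producing scored candidates because the abstract is truncated before it explains how the scores are used for training.

```python
from typing import Callable, List, Tuple

# A dialogue history is a list of (question, teacher_reply) pairs.
History = List[Tuple[str, str]]


def score_candidate_questions(
    task: str,
    history: History,
    sample_questions: Callable[[str, History, int], List[str]],  # hypothetical
    ask_teacher: Callable[[str, History, str], str],              # hypothetical
    evaluate_student: Callable[[str, History], float],            # hypothetical
    num_candidates: int = 4,
) -> List[Tuple[str, str, float]]:
    """Score each candidate question by the student's downstream performance.

    Returns (question, teacher_reply, score) triples, best score first.
    """
    scored = []
    for question in sample_questions(task, history, num_candidates):
        # Query the teacher with this candidate question.
        reply = ask_teacher(task, history, question)
        # Evaluate the student on the original task, given the extended dialogue.
        score = evaluate_student(task, history + [(question, reply)])
        scored.append((question, reply, score))
    # Rank candidates so that the most helpful question comes first.
    scored.sort(key=lambda item: item[2], reverse=True)
    return scored
```

Passing the models in as plain callables keeps the sketch independent of any particular model API; the caller's `history` list is never mutated, so each candidate is scored from the same starting point.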
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques
