Designing and Evaluating Chain-of-Hints for Scientific Question Answering

Anubhav Jangra; Smaranda Muresan

arXiv:2510.21087·cs.HC·February 23, 2026

TL;DR

This paper evaluates 18 open-source large language models for generating chain-of-hints in scientific question answering, comparing static and dynamic hinting strategies to improve educational engagement and understanding.

Contribution

It introduces and compares static and dynamic hinting strategies using open-source LLMs, providing insights into their effectiveness and user preferences in educational contexts.

Findings

01. Dynamic hints adapt to learner progress, enhancing engagement.

02. Automatic metrics have limitations in capturing user preferences.

03. User preferences vary across hinting strategies.

Abstract

LLMs are reshaping education, with students increasingly relying on them for learning. Implemented using general-purpose models, these systems are likely to give away the answers, potentially undermining conceptual understanding and critical thinking. Prior work shows that hints can effectively promote cognitive engagement. Building on this insight, we evaluate 18 open-source LLMs on generating chains of hints that scaffold users toward the correct answer. We compare two distinct hinting strategies: static hints, pre-generated for each problem, and dynamic hints, adapted to a learner's progress. We evaluate these systems on five pedagogically grounded automatic metrics for hint quality. Using the best-performing LLM as the backbone of a quantitative study with 41 participants, we uncover distinct user preferences across hinting strategies, and identify the limitations of automatic…
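
The distinction between the two strategies can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `generate` callback, and the toy generator standing in for an LLM call are all assumptions made for illustration. A static chain is fixed in advance and served in order, while a dynamic hint is produced from the learner's attempts so far.

```python
from typing import Callable, List, Optional


def static_hint(hints: List[str], step: int) -> Optional[str]:
    """Return the pre-generated hint for this step, or None once the chain is exhausted."""
    return hints[step] if 0 <= step < len(hints) else None


def dynamic_hint(
    question: str,
    attempts: List[str],
    generate: Callable[[str, List[str]], str],
) -> str:
    """Request a hint conditioned on the learner's attempts so far."""
    return generate(question, attempts)


def toy_generator(question: str, attempts: List[str]) -> str:
    # Hypothetical stand-in for an LLM call; a real system would prompt a model here.
    if not attempts:
        return "Start by identifying what the question asks for."
    return f"Your last answer was '{attempts[-1]}'. Re-check the step that led to it."


if __name__ == "__main__":
    chain = ["Recall the relevant law.", "Write the equation.", "Solve for the unknown."]
    print(static_hint(chain, 0))
    print(dynamic_hint("What is the net force?", ["12 N"], toy_generator))
```

In this sketch the static strategy ignores what the learner has tried, whereas the dynamic strategy receives the attempt history on every call, which is the property the abstract attributes to dynamic hints.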
