Language Bottleneck Models for Qualitative Knowledge State Modeling
Antonin Berthon, Mihaela van der Schaar

TL;DR
This paper introduces Language Bottleneck Models that use large language models to generate interpretable textual summaries of student knowledge states, capturing nuanced insights and misconceptions while maintaining competitive predictive accuracy.
Contribution
The paper presents a novel LLM-based framework for qualitative knowledge state modeling that is more interpretable and yields richer insights than traditional Cognitive Diagnosis (CD) and Knowledge Tracing (KT) models.
Findings
LBMs produce interpretable knowledge summaries.
LBMs achieve competitive accuracy with less data.
Fine-tuning improves summary quality and prediction performance.
Abstract
Accurately assessing student knowledge is central to education. Cognitive Diagnosis (CD) models estimate student proficiency at a fixed point in time, while Knowledge Tracing (KT) methods model evolving knowledge states to predict future performance. However, existing approaches either provide quantitative concept mastery estimates with limited expressivity (CD, probabilistic KT) or prioritize predictive accuracy at the cost of interpretability (deep learning KT). We propose Language Bottleneck Models (LBMs), where an encoder LLM produces textual knowledge state summaries, which a decoder LLM uses to predict future performance. This produces interpretable summaries that can express nuanced insights--such as misconceptions--that CD and KT models cannot capture. Extensive validation across synthetic and real-world datasets shows LBMs reveal qualitative insights beyond what CD and KT…
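The encoder/decoder bottleneck described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: `Interaction`, `toy_encoder`, `toy_decoder`, and `predict` are hypothetical names, and both "LLMs" are replaced by simple rule-based stand-ins so the example runs without any model. The point it shows is structural: the decoder sees only the textual summary, never the raw interaction history.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Interaction:
    skill: str      # hypothetical skill tag attached to a question
    correct: bool   # whether the student answered correctly


def toy_encoder(history: List[Interaction]) -> str:
    """Stand-in for the encoder LLM: compress a history into a text summary."""
    strong, weak = [], []
    for skill in sorted({i.skill for i in history}):
        outcomes = [i.correct for i in history if i.skill == skill]
        # A skill counts as "comfortable" if more than half the answers were correct.
        (strong if sum(outcomes) * 2 > len(outcomes) else weak).append(skill)
    return (f"Comfortable with: {', '.join(strong) or 'none'}. "
            f"Struggles with: {', '.join(weak) or 'none'}.")


def toy_decoder(summary: str, skill: str) -> bool:
    """Stand-in for the decoder LLM: predict correctness from the summary only."""
    comfortable_part = summary.split("Struggles with:")[0]
    return skill in comfortable_part


def predict(history: List[Interaction], skill: str):
    """Run the bottleneck: history -> text summary -> prediction."""
    summary = toy_encoder(history)          # the only information passed on
    return summary, toy_decoder(summary, skill)


history = [
    Interaction("fractions", False),
    Interaction("fractions", False),
    Interaction("addition", True),
]
summary, will_be_correct = predict(history, "addition")
print(summary)          # Comfortable with: addition. Struggles with: fractions.
print(will_be_correct)  # True
```

Because the summary is the sole channel between the two stages, a human can read it to see exactly what the prediction was based on. In a real system the two stand-in functions would be replaced by LLM calls.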
Taxonomy
Topics: Intelligent Tutoring Systems and Adaptive Learning · Explainable Artificial Intelligence (XAI) · Multimodal Machine Learning Applications
