# Text embedding models yield detailed conceptual knowledge maps derived from short multiple-choice quizzes

**Authors:** Paxton C. Fitzpatrick, Andrew C. Heusser, Jeremy R. Manning

PMC · DOI: 10.1038/s41467-026-69746-w · 2026-03-24

## TL;DR

This paper shows how text embedding models and short quizzes can create detailed maps of learners' conceptual knowledge and track how it changes over time.

## Contribution

The novel use of text embeddings to track conceptual knowledge acquisition from quiz responses and video content.

## Key findings

- A mathematical framework using text embeddings captures detailed conceptual knowledge.
- Quiz responses combined with video transcripts reveal how learners' knowledge evolves.
- The method predicts quiz success and identifies gaps in learners' understanding.

## Abstract

Real-world conceptual knowledge is complex, multifaceted, and substantially over-simplified in most laboratory studies. Here we develop a mathematical framework, based on natural language processing models, for tracking and characterizing the acquisition of real-world conceptual knowledge. Our approach embeds each concept in a high-dimensional representation space where nearby coordinates reflect similar or related concepts. We test our approach using behavioral data from participants who answered small sets of multiple-choice quiz questions interleaved between watching two Khan Academy course videos. We apply our framework to the videos’ transcripts and the text of the quiz questions to quantify the content of each moment of video and each quiz question. We use these embeddings, along with participants’ quiz responses, to track how the learners’ knowledge changed after watching each video and predict their success on individual quiz questions. Our findings show how a small set of quiz questions may be used to obtain rich and meaningful insights into what each learner knows, and how their knowledge changes over time as they learn.

Traditional quiz scores provide limited insight into what students know. Here, the authors show that text embedding models combined with short multiple-choice quizzes yield detailed maps of conceptual knowledge and learning.

## Full-text entities

- **Diseases:** hearing impairment (MESH:D034381), color blind (MESH:D003117)
- **Chemicals:** alcohol (MESH:D000438)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13013618/full.md

---
Source: https://tomesphere.com/paper/PMC13013618