Problem-Oriented Segmentation and Retrieval: Case Study on Tutoring Conversations
Rose E. Wang, Pawan Wirawarn, Kenny Lam, Omar Khattab, Dorottya Demszky

TL;DR
This paper introduces Problem-Oriented Segmentation & Retrieval (POSR), a new framework for analyzing conversations around reference materials, demonstrated through a novel dataset of tutoring lessons and evaluated with various modeling approaches.
Contribution
The paper presents the first dataset of tutoring lessons linked to reference problems and demonstrates the effectiveness of joint POSR modeling over separate segmentation and retrieval methods.
Findings
Joint POSR methods outperform independent segmentation and retrieval approaches by up to 76% on joint metrics.
POSR methods surpass traditional segmentation methods by up to 78% on segmentation metrics.
Practical applications include insights into lesson structure and language use.
Abstract
Many open-ended conversations (e.g., tutoring lessons or business meetings) revolve around pre-defined reference materials, like worksheets or meeting bullets. To provide a framework for studying such conversation structure, we introduce Problem-Oriented Segmentation & Retrieval (POSR), the task of jointly breaking down conversations into segments and linking each segment to the relevant reference item. As a case study, we apply POSR to education where effectively structuring lessons around problems is critical yet difficult. We present LessonLink, the first dataset of real-world tutoring lessons, featuring 3,500 segments, spanning 24,300 minutes of instruction and linked to 116 SAT math problems. We define and evaluate several joint and independent approaches for POSR, including segmentation (e.g., TextTiling), retrieval (e.g., ColBERT), and large language model (LLM) methods. Our…
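To make the task definition concrete, the following is a minimal sketch of what a POSR output could look like: a list of contiguous utterance spans, each linked to one reference problem or to none. The `Segment` class, the `toy_posr` function, and the word-overlap scoring are all illustrative assumptions; they are not the paper's models (which include TextTiling, ColBERT, and LLM-based methods) and are not part of the LessonLink release.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Segment:
    """Hypothetical POSR output unit: a span of utterances plus a linked problem."""
    start: int                 # index of the first utterance (inclusive)
    end: int                   # index of the last utterance (inclusive)
    problem_id: Optional[str]  # linked reference item, or None if no problem matches


def tokens(text: str) -> set:
    """Lowercase alphanumeric tokens; a deliberately crude stand-in for a retriever."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def toy_posr(utterances: List[str], problems: Dict[str, str],
             min_overlap: int = 2) -> List[Segment]:
    """Label each utterance with its best-overlapping problem, then merge
    consecutive utterances that share a label into one segment.

    This is a toy baseline for illustration only (assumption: word overlap
    is a usable relevance signal), not a method evaluated in the paper.
    """
    vocab = {pid: tokens(text) for pid, text in problems.items()}

    labels: List[Optional[str]] = []
    for utterance in utterances:
        toks = tokens(utterance)
        best_id, best_score = None, 0
        for pid, problem_toks in vocab.items():
            score = len(toks & problem_toks)
            if score > best_score:
                best_id, best_score = pid, score
        # Utterances with too little overlap are left unlinked (None).
        labels.append(best_id if best_score >= min_overlap else None)

    segments: List[Segment] = []
    for i, label in enumerate(labels):
        if segments and segments[-1].problem_id == label:
            segments[-1].end = i  # extend the current segment
        else:
            segments.append(Segment(i, i, label))  # start a new segment
    return segments
```

Calling `toy_posr(utterances, problems)` with a list of transcript lines and a dictionary mapping problem identifiers to problem text returns the segments in order. Segment boundaries and problem links come out of the same pass here, which is the sense in which POSR treats segmentation and retrieval as one joint task.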
Taxonomy
Topics: Intelligent Tutoring Systems and Adaptive Learning
