Pedagogy-R1: Pedagogically-Aligned Reasoning Model with Balanced Educational Benchmark

Unggi Lee; Jaeyong Lee; Jiyeong Bae; Yeil Jeong; Junbo Koh; Gyeonggeon Lee; Gunho Lee; Taekyung Ahn; Hyeoncheol Kim

arXiv:2505.18467·cs.AI·May 27, 2025

TL;DR

This paper introduces Pedagogy-R1, a reasoning model adapted for educational settings, featuring a new training pipeline, an educational benchmark, and a prompting strategy to enhance pedagogical coherence and evaluate teaching-related skills.

Contribution

The paper presents Pedagogy-R1, a framework that aligns large reasoning models with pedagogical tasks through a distillation-based instruction-tuning pipeline, the Well-balanced Educational Benchmark (WBEB), and Chain-of-Pedagogy (CoP) prompting.

Findings

1. Pedagogy-R1 demonstrates improved pedagogical reasoning over baseline models.

2. The Well-balanced Educational Benchmark effectively evaluates multiple teaching-related skills.

3. The Chain-of-Pedagogy prompting enhances the model's ability to generate teacher-like reasoning.

Abstract

Recent advances in large reasoning models (LRMs) show strong performance in structured domains such as mathematics and programming; however, they often lack pedagogical coherence and realistic teaching behaviors. To bridge this gap, we introduce Pedagogy-R1, a framework that adapts LRMs for classroom use through three innovations: (1) a distillation-based pipeline that filters and refines model outputs for instruction-tuning, (2) the Well-balanced Educational Benchmark (WBEB), which evaluates performance across subject knowledge, pedagogical knowledge, tracing, essay scoring, and teacher decision-making, and (3) a Chain-of-Pedagogy (CoP) prompting strategy for generating and eliciting teacher-style reasoning. Our mixed-method evaluation combines quantitative metrics with qualitative analysis, providing the first systematic assessment of LRMs' pedagogical strengths and limitations.
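
As a rough illustration of the first innovation, a distillation pipeline that filters model outputs before instruction-tuning might look like the sketch below. The abstract does not state the filtering criterion or the data format, so everything here is an assumption: the `Sample` fields, the answer-matching filter in `keep`, and the record layout produced by `to_sft_record` are hypothetical names chosen for illustration, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One teacher-model output paired with a reference answer (hypothetical schema)."""
    question: str
    gold_answer: str
    teacher_reasoning: str
    teacher_answer: str


def keep(sample: Sample) -> bool:
    """Assumed filter: keep an output only if its final answer matches the
    reference (case- and whitespace-insensitive) and its reasoning is non-empty."""
    answers_match = (
        sample.teacher_answer.strip().lower() == sample.gold_answer.strip().lower()
    )
    has_reasoning = bool(sample.teacher_reasoning.strip())
    return answers_match and has_reasoning


def to_sft_record(sample: Sample) -> dict:
    """Format a kept sample as a prompt/response pair for instruction-tuning."""
    response = f"{sample.teacher_reasoning.strip()}\n\nAnswer: {sample.teacher_answer.strip()}"
    return {"prompt": sample.question, "response": response}


def build_sft_set(samples: list[Sample]) -> list[dict]:
    """Filter the teacher outputs, then convert the survivors to training records."""
    return [to_sft_record(s) for s in samples if keep(s)]
```

Filtering on answer agreement is only one plausible choice; a pipeline of this kind could equally use a judge model or rubric-based scoring, and the paper itself should be consulted for the actual criteria.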
