CODE-GEN: A Human-in-the-Loop RAG-Based Agentic AI System for Multiple-Choice Question Generation
Xiaojing Duan, Frederick Nwanganga, Chaoli Wang

TL;DR
CODE-GEN is a human-in-the-loop, retrieval-augmented AI system that generates and validates multiple-choice coding questions aligned with educational objectives, demonstrating high agreement with human experts.
Contribution
Introduces a novel agentic AI architecture with specialized tools for generating and validating educational questions, emphasizing human-AI collaboration in content creation.
Findings
Human experts agreed with the AI's assessments of the generated questions at rates ranging from 79.9% to 98.6%.
AI validation was effective on dimensions such as question clarity, code validity, and concept alignment.
Human expertise remains crucial for complex pedagogical judgments.
Abstract
We present CODE-GEN, a human-in-the-loop, retrieval-augmented generation (RAG)-based agentic AI system for generating context-aligned multiple-choice questions to develop student code reasoning and comprehension abilities. CODE-GEN employs an agentic AI architecture in which a Generator agent produces multiple-choice coding comprehension questions aligned with course-specific learning objectives, while a Validator agent independently assesses content quality across seven pedagogical dimensions. Both agents are augmented with specialized tools that enhance computational accuracy and verify code outputs. To evaluate the effectiveness of CODE-GEN, we conducted a study involving six human subject-matter experts (SMEs) who judged 288 AI-generated questions. The SMEs produced a total of 2,016 human-AI rating pairs, indicating agreement or disagreement with the assessments of…
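As a rough illustration of the evaluation arithmetic, the sketch below shows one way per-dimension agreement rates could be computed from human-AI rating pairs. It is not the authors' code: the function name `agreement_by_dimension`, the tuple layout, and the toy data are all hypothetical. The count of 2,016 pairs is consistent with each of the 288 questions being rated once on each of the seven dimensions (288 × 7 = 2,016), which is an assumption rather than something the abstract states outright.

```python
from collections import defaultdict


def agreement_by_dimension(pairs):
    """Return {dimension: percent agreement} for (dimension, agrees) pairs.

    Hypothetical helper: `agrees` is True when the human expert agreed
    with the AI's assessment on that dimension.
    """
    counts = defaultdict(lambda: [0, 0])  # dimension -> [agreements, total]
    for dimension, agrees in pairs:
        counts[dimension][0] += int(agrees)
        counts[dimension][1] += 1
    return {d: 100.0 * a / n for d, (a, n) in counts.items()}


# Toy rating pairs for illustration only; these are NOT the study's data.
toy_pairs = [
    ("question clarity", True),
    ("question clarity", True),
    ("question clarity", False),
    ("code validity", True),
    ("code validity", True),
]
rates = agreement_by_dimension(toy_pairs)

# Assumed design: one rating pair per question per dimension.
total_pairs = 288 * 7
```

Under this reading, a reported range such as 79.9% to 98.6% would correspond to the lowest and highest values in a dictionary like `rates`, computed over the real rating pairs.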
