Self-hosted Lecture-to-Quiz: Local LLM MCQ Generation with Deterministic Quality Control
Seine A. Shintani

TL;DR
This paper introduces a self-hosted, privacy-preserving pipeline that converts lecture PDFs into multiple-choice questions using local LLMs and deterministic quality control, ensuring high-quality, deployable question banks without external API calls.
Contribution
The work presents a fully self-hosted MCQ generation pipeline with explicit quality control, enabling private, accountable, and Green AI educational workflows.
Findings
120 accepted MCQ candidates with QC conformance
8 flagged items with residual quality risks
Final 24-question set released as JSONL/CSV
Abstract
We present an end-to-end, self-hosted (API-free) pipeline that converts lecture PDFs into multiple-choice questions (MCQs) using a local LLM plus deterministic quality control (QC). Here, API-free means that lecture content is never sent to any external LLM service. The pipeline is designed for black-box minimization: LLMs may assist drafting, but the final released artifacts are plain-text question banks with an explicit QC trace, and no LLM call is needed at deployment time. We run a seed sweep on three short "dummy lectures" (information theory, thermodynamics, and statistical mechanics), collecting 15 runs × 8 questions = 120 accepted candidates (122 attempts in total under bounded retries). All 120 accepted candidates satisfy the hard QC checks (JSON schema conformance, a single marked correct option, and numeric/constant equivalence tests); however, the warning layer flags 8/120…
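To make the hard QC layer concrete, the following is a minimal sketch of what such deterministic checks might look like. It is not the paper's implementation: the item schema (`question`, `options` with `text`/`correct`/`value`, and an optional `expected_value`), the function name `hard_qc`, and the relative tolerance are all assumptions chosen for illustration.

```python
import json
import math


def _is_number(x):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(x, (int, float)) and not isinstance(x, bool)


def hard_qc(raw):
    """Return a list of hard-QC failures; an empty list means the item passes.

    Assumed (hypothetical) schema:
      {"question": str,
       "options": [{"text": str, "correct": bool, "value": number?}, ...],
       "expected_value": number?}
    """
    # Check 1: the candidate must parse as a JSON object.
    try:
        item = json.loads(raw)
    except json.JSONDecodeError as exc:
        return ["invalid JSON: " + exc.msg]
    if not isinstance(item, dict):
        return ["top-level value must be an object"]

    # Check 2: schema conformance for the fields assumed above.
    question = item.get("question")
    options = item.get("options")
    if not isinstance(question, str) or not question.strip():
        return ["'question' must be a non-empty string"]
    if not isinstance(options, list) or len(options) < 2:
        return ["'options' must be a list with at least two entries"]
    for opt in options:
        if not (isinstance(opt, dict)
                and isinstance(opt.get("text"), str)
                and isinstance(opt.get("correct"), bool)):
            return ["each option needs a string 'text' and a boolean 'correct'"]

    # Check 3: exactly one option is marked correct.
    errors = []
    marked = [opt for opt in options if opt["correct"]]
    if len(marked) != 1:
        errors.append("expected exactly one correct option, found %d" % len(marked))
    # Check 4: numeric equivalence, applied only when a reference value is given.
    elif "expected_value" in item:
        expected = item["expected_value"]
        value = marked[0].get("value")
        if not (_is_number(expected) and _is_number(value)
                and math.isclose(value, expected, rel_tol=1e-6)):
            errors.append("correct option does not match expected_value")
    return errors
```

Because every check is a pure function of the candidate text, the same input always yields the same verdict, which is what allows a QC trace to be replayed without calling an LLM.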
Taxonomy
Topics: Academic integrity and plagiarism · Topic Modeling · Teaching and Learning Programming
