Automated Generation of Curriculum-Aligned Multiple-Choice Questions for Malaysian Secondary Mathematics Using Generative AI

Rohaizah Abdul Wahid; Muhamad Said Nizamuddin Nadim; Suliana Sulaiman; Syahmi Akmal Shaharudin; Muhammad Danial Jupikil; Iqqwan Jasman Su Azlan Su

arXiv:2508.04442 · cs.CL · August 7, 2025

TL;DR

This paper presents a RAG-based pipeline for generating curriculum-aligned multiple-choice questions in Bahasa Melayu with GPT-4o; under automated evaluation, the generated questions show improved factual accuracy and curriculum relevance.

Contribution

The paper introduces and compares four incremental pipelines, including RAG approaches, for generating curriculum-specific MCQs in a low-resource language, together with a new evaluation framework.

Findings

01. RAG pipelines outperform non-grounded prompting in curriculum alignment

02. Automated evaluation effectively measures factual validity and curriculum relevance

03. Manual pipeline offers fine-grained control with comparable quality

Abstract

This paper addresses the critical need for scalable and high-quality educational assessment tools within the Malaysian education system. It highlights the potential of Generative AI (GenAI) while acknowledging the significant challenges of ensuring factual accuracy and curriculum alignment, especially for low-resource languages like Bahasa Melayu. This research introduces and compares four incremental pipelines for generating Form 1 Mathematics multiple-choice questions (MCQs) in Bahasa Melayu using OpenAI's GPT-4o. The methods range from non-grounded prompting (structured and basic) to Retrieval-Augmented Generation (RAG) approaches (one using the LangChain framework, one implemented manually). The system is grounded in official curriculum documents, including teacher-prepared notes and the yearly teaching plan (RPT). A dual-pronged automated evaluation framework is employed to assess…
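To make the idea of a manually implemented RAG step concrete, the following is a minimal sketch and not the authors' implementation. It swaps in a simple bag-of-words cosine similarity for whatever retriever the paper actually uses, and it stops at building the grounded prompt rather than calling GPT-4o. The note chunks in `NOTES`, the helper names (`tokenize`, `cosine`, `retrieve`, `build_prompt`), and the prompt wording are all hypothetical, chosen only for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical stand-ins for chunks of teacher-prepared notes or the RPT.
NOTES = [
    "Nombor nisbah: integer, pecahan dan perpuluhan positif dan negatif.",
    "Faktor dan gandaan: faktor sepunya terbesar (FSTB) dan gandaan sepunya terkecil (GSTK).",
    "Kuasa dua, punca kuasa dua, kuasa tiga dan punca kuasa tiga.",
]


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=1):
    """Return the k chunks most similar to the query (assumed retriever)."""
    q = Counter(tokenize(query))
    ranked = sorted(
        chunks,
        key=lambda c: cosine(q, Counter(tokenize(c))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(topic, chunks, k=1):
    """Assemble a grounded prompt: retrieved context plus the MCQ instruction."""
    context = "\n".join(retrieve(topic, chunks, k))
    return (
        "Use ONLY the curriculum context below.\n"
        f"Context:\n{context}\n\n"
        f"Write one Form 1 Mathematics multiple-choice question in Bahasa Melayu "
        f"on the topic '{topic}', with four options and the correct answer marked."
    )


if __name__ == "__main__":
    # The resulting string would be sent to the language model in a real pipeline.
    print(build_prompt("faktor sepunya terbesar", NOTES))
```

A non-grounded baseline would send only the final instruction without the `Context` section; the difference between the two prompts is what curriculum grounding adds in this sketch.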
