Multiple-Choice Question Generation Using Large Language Models: Methodology and Educator Insights
Giorgio Biancini, Alessio Ferrato, Carla Limongelli

TL;DR
This study compares three large language models for generating multiple-choice questions, highlighting GPT-3.5's superior performance and exploring educator perspectives on AI adoption in education.
Contribution
It provides a novel comparative analysis of Llama 2, Mistral, and GPT-3.5 for MCQ generation, emphasizing prompt-based knowledge injection and educator insights.
Findings
GPT-3.5 produces the most effective MCQs
Educators show reluctance to adopt AI tools
Prompt-based knowledge injection enhances control over question quality
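As a rough illustration of what prompt-based knowledge injection can look like, the sketch below places a source passage directly in the prompt and instructs the model to rely only on it. The function name `build_mcq_prompt`, its parameters, and the instruction wording are hypothetical and are not taken from the paper, whose actual prompts are not shown here. The sketch only builds the prompt string; it does not call any model.

```python
def build_mcq_prompt(source_text: str, num_questions: int = 3, num_options: int = 4) -> str:
    """Build a prompt that injects a source passage as the only allowed knowledge.

    Hypothetical helper for illustration; the wording is an assumption,
    not the prompt used by the authors.
    """
    passage = source_text.strip()
    if not passage:
        raise ValueError("source_text must not be empty")

    instructions = (
        f"Write {num_questions} multiple-choice questions based ONLY on the passage below. "
        f"Each question must have {num_options} options and exactly one correct answer. "
        "Do not use facts that are absent from the passage. "
        "Mark the correct option for each question."
    )
    # The passage is appended verbatim so the educator controls the content
    # the questions can draw on.
    return f"{instructions}\n\nPASSAGE:\n{passage}\n"


if __name__ == "__main__":
    print(build_mcq_prompt("Photosynthesis converts light into chemical energy.", num_questions=2))
```

The resulting string could be sent to any of the compared models; keeping the passage in the prompt is what gives the educator control over the material being assessed.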
Abstract
Integrating Artificial Intelligence (AI) in educational settings has brought new learning approaches, transforming the practices of both students and educators. Among the various technologies driving this transformation, Large Language Models (LLMs) have emerged as powerful tools for creating educational materials and question answering, but there is still room for new applications. Educators commonly use Multiple-Choice Questions (MCQs) to assess student knowledge, but manually generating these questions is resource-intensive and requires significant time and cognitive effort. In our opinion, LLMs offer a promising solution to these challenges. This paper presents a novel comparative analysis of three widely known LLMs - Llama 2, Mistral, and GPT-3.5 - to explore their potential for creating informative and challenging MCQs. In our approach, we do not rely on the knowledge of the…
Taxonomy
Methods: Cosine Annealing · Layer Normalization · Linear Warmup With Cosine Annealing · Attention Dropout · Byte Pair Encoding · Softmax · Dropout · Dense Connections
