Comparison of Large Language Models for Generating Contextually Relevant Questions
Ivo Lodovico Molina, Valdemar Švábenský, Tsubasa Minematsu, Li Chen, Fumiya Okubo, Atsushi Shimada

TL;DR
This paper compares three large language models for automatic question generation from educational slide text, evaluating their effectiveness in producing relevant, clear, and well-aligned questions without fine-tuning.
Contribution
It provides an analysis of LLMs' capabilities for automatic question generation in educational contexts, highlighting their relative performance and strengths.
Findings
GPT-3.5 and Llama 2-Chat 13B outperform Flan T5 XXL by a small margin, particularly in clarity and question-answer alignment.
GPT-3.5 excels at tailoring questions to answers.
Questions generated are suitable for educational use.
Abstract
This study explores the effectiveness of Large Language Models (LLMs) for Automatic Question Generation in educational settings. Three LLMs are compared in their ability to create questions from university slide text without fine-tuning. Questions were obtained in a two-step pipeline: first, answer phrases were extracted from slides using Llama 2-Chat 13B; then, the three models generated questions for each answer. To analyze whether the questions would be suitable in educational applications for students, a survey was conducted with 46 students who evaluated a total of 246 questions across five metrics: clarity, relevance, difficulty, slide relation, and question-answer alignment. Results indicate that GPT-3.5 and Llama 2-Chat 13B outperform Flan T5 XXL by a small margin, particularly in terms of clarity and question-answer alignment. GPT-3.5 especially excels at tailoring questions to…
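The two-step pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `TextModel` type, the function names, and both prompt wordings are assumptions made for this example, and each model is represented by a plain callable so that any backend (a hosted API or a local checkpoint) could be plugged in.

```python
from typing import Callable, Dict, List

# Hypothetical abstraction: any function mapping a prompt string to generated text.
TextModel = Callable[[str], str]


def extract_answers(slide_text: str, extractor: TextModel) -> List[str]:
    """Step 1: ask one model for candidate answer phrases, one per line."""
    # The prompt wording is an assumption; the paper's actual prompt is not given here.
    prompt = (
        "List the key phrases in the following slide text that could serve "
        "as answers to quiz questions, one per line.\n\n" + slide_text
    )
    raw = extractor(prompt)
    # Keep non-empty lines only, with surrounding whitespace removed.
    return [line.strip() for line in raw.splitlines() if line.strip()]


def generate_questions(
    slide_text: str,
    answers: List[str],
    generators: Dict[str, TextModel],
) -> Dict[str, Dict[str, str]]:
    """Step 2: every generator model writes one question per extracted answer."""
    questions: Dict[str, Dict[str, str]] = {}
    for name, model in generators.items():
        questions[name] = {}
        for answer in answers:
            # Again an assumed prompt, conditioning on both the slide and the answer.
            prompt = (
                f"Slide text:\n{slide_text}\n\n"
                f"Answer: {answer}\n\n"
                "Write a question whose answer is the phrase above."
            )
            questions[name][answer] = model(prompt).strip()
    return questions
```

Under these assumptions, `extract_answers` would be called once per slide with the extraction model, and `generate_questions` would receive a dictionary with one entry per compared model, so that all models are prompted with the same answers and their questions can be rated side by side.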
Taxonomy
Topics: Advanced Text Analysis Techniques · Topic Modeling · Expert finding and Q&A systems
Methods: Attention Is All You Need · Inverse Square Root Schedule · Dropout · Cosine Annealing · Adafactor · Attention Dropout · SentencePiece · Adam · Linear Layer
