(WhyPHI) Fine-Tuning PHI-3 for Multiple-Choice Question Answering: Methodology, Results, and Challenges
Mohamed Hisham Abdellatif

TL;DR
This paper demonstrates how fine-tuning the compact PHI-3 model significantly improves its ability to answer multiple-choice questions accurately, addressing challenges like hallucinations and prompt clarity in educational AI applications.
Contribution
The study introduces a fine-tuning methodology for PHI-3 on MCQ tasks, with optimized prompts and evaluation metrics, advancing the application of efficient LLMs in education.
Findings
Perplexity decreased from 4.68 to 2.27 after fine-tuning.
Model accuracy increased from 62% to 90.8%.
The results highlight the potential of compact LLMs in educational assessments.
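To make the perplexity figures above concrete, here is a minimal Python sketch. It assumes the standard definition of perplexity (the exponential of the mean negative log-likelihood per token); this page does not state how the paper computed it, and the function names are illustrative, not taken from the paper.

```python
import math


def perplexity(token_log_probs):
    """Perplexity as exp(mean negative natural-log likelihood per token).

    Assumption: this is the textbook definition; the paper's exact
    evaluation procedure is not given on this page.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


def nll_drop(ppl_before, ppl_after):
    """Reduction in mean per-token loss (nats) implied by two perplexities."""
    return math.log(ppl_before) - math.log(ppl_after)


# Sanity check: a model that gives every token probability 0.5
# has perplexity 2 (it is "choosing between two options" per token).
uniform_two = perplexity([math.log(0.5)] * 4)

# Reported values from the findings: 4.68 before, 2.27 after fine-tuning.
drop = nll_drop(4.68, 2.27)
print(f"perplexity of the 0.5-per-token model: {uniform_two:.2f}")
print(f"implied loss reduction: {drop:.2f} nats per token")
```

Under this definition, the reported drop from 4.68 to 2.27 corresponds to a reduction of roughly 0.72 nats in mean per-token loss.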
Abstract
Large Language Models (LLMs) have become essential tools across various domains due to their impressive capabilities in understanding and generating human-like text. The ability to accurately answer multiple-choice questions (MCQs) holds significant value in education, particularly in automated tutoring systems and assessment platforms. However, adapting LLMs to handle MCQ tasks effectively remains challenging due to hallucinations and unclear prompts. This work explores the potential of Microsoft's PHI-3 (Abdin et al., 2024), a compact yet efficient LLM, for MCQ answering. Our contributions include fine-tuning the model on the TruthfulQA dataset, designing optimized prompts to enhance model performance, and evaluating with perplexity and traditional metrics such as accuracy and F1 score. Results show a remarkable improvement in PHI-3.5's MCQ handling post-fine-tuning, with perplexity…
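The accuracy and F1 metrics mentioned in the abstract can be sketched for MCQ answer letters as follows. This is an illustration only: the page does not say whether the paper averages F1 per class (macro) or over all examples, so macro averaging is an assumption here, and the gold and predicted answers are made-up toy data.

```python
def accuracy(gold, pred):
    """Fraction of questions where the predicted option matches the gold option."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and the same length")
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred):
    """Unweighted mean of per-option F1 scores (macro averaging is assumed)."""
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data (hypothetical, not from the paper): five questions, one error.
gold = ["A", "B", "C", "A", "D"]
pred = ["A", "B", "A", "A", "D"]

print(f"accuracy: {accuracy(gold, pred):.2f}")
print(f"macro F1: {macro_f1(gold, pred):.2f}")
```

In the toy data, a single wrong answer on the only "C" question leaves accuracy at 0.80 but pulls macro F1 down to 0.70, which shows why reporting both metrics is informative when answer options are unevenly distributed.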
Taxonomy
Topics: Topic Modeling · Expert finding and Q&A systems · Seismology and Earthquake Studies
