\llinstruct: An Instruction-tuned model for English Language Proficiency Assessments
Debanjan Ghosh, Sophia Chan

TL;DR
This paper introduces \llinstruct, an instruction-tuned 8B model for English Language Proficiency Assessments (ELPA), trained on a new 70K-instruction dataset. Results are promising, but many outputs still require human review before they are ready for real-world use.
Contribution
The creation of a new 70K instruction dataset for ELPA and the fine-tuning of Llama-3 8B models to improve assessment content generation.
Findings
The SFT-70K model produces the most valid assessment outputs
All SFT models outperform larger models like GPT-3.5 in explanation quality
Outputs often need human intervention for real-world assessment readiness
Abstract
We present \llinstruct, an 8B instruction-tuned model designed to generate content for English Language Proficiency Assessments (ELPA) and related applications. Our work involves creating a new dataset of 70K instructions and explanations in the ELPA domain and using these to fine-tune Llama-3 8B models (SFT) on instruction sets of different sizes (e.g., SFT-17K, SFT-50K, and SFT-70K). Human evaluations are conducted over unseen instructions to compare these SFT models against SOTA models (e.g., Dolly-2, Mistral, Llama-3 base version, and GPT-3.5). The findings show that, although all three SFT models perform comparably, the model trained on the largest instruction dataset, SFT-70K, produces the most valid outputs ready for assessments. However, although the SFT models perform better than larger models (e.g., GPT-3.5) on explanations of outputs, many outputs still need human intervention to…
Taxonomy
Topics: Educational Technology and Assessment
Methods: Attention Is All You Need · Cosine Annealing · Residual Connection · Dropout · Layer Normalization · Linear Warmup With Cosine Annealing · Adam · Byte Pair Encoding
