\llinstruct: An Instruction-tuned model for English Language Proficiency Assessments

Debanjan Ghosh; Sophia Chan

arXiv:2410.09314·cs.CL·October 15, 2024

TL;DR

This paper introduces \llinstruct, an instruction-tuned 8B model for English Language Proficiency Assessments trained on a new dataset. Results are promising, but outputs still require human review before real-world use.

Contribution

The creation of a new 70K instruction dataset for ELPA and the fine-tuning of Llama-3 8B models to improve assessment content generation.

Findings

01

SFT-70K model produces the most valid assessment outputs

02

All SFT models outperform larger models like GPT-3.5 in explanation quality

03

Outputs often need human intervention for real-world assessment readiness

Abstract

We present \llinstruct, an 8B instruction-tuned model designed to generate content for English Language Proficiency Assessments (ELPA) and related applications. Our work involves creating a new dataset of 70K instructions and explanations in the ELPA domain and using these to fine-tune Llama-3 8B models (SFT) on instruction sets of different sizes (e.g., SFT-17K, SFT-50K, and SFT-70K). Human evaluations are conducted over unseen instructions to compare these SFT models against SOTA models (e.g., Dolly-2, Mistral, Llama-3 base version, and GPT-3.5). The findings show that although all three SFT models perform comparably, the model trained on the largest instruction dataset, SFT-70K, leads to the most valid outputs ready for assessments. However, although the SFT models perform better than larger models, e.g., GPT-3.5, on the aspect of explanations of outputs, many outputs still need human interventions to…

Taxonomy

Topics: Educational Technology and Assessment

Methods: Attention Is All You Need · Cosine Annealing · Residual Connection · Dropout · Layer Normalization · Linear Warmup With Cosine Annealing · Adam · Byte Pair Encoding