Leveraging LLM-Respondents for Item Evaluation: a Psychometric Analysis
Yunting Liu, Shreya Bhandari, Zachary A. Pardos

TL;DR
This study investigates using multiple large language models to simulate human respondents for item calibration in educational measurement, showing ensemble approaches can effectively replicate human psychometric properties and improve calibration accuracy.
Contribution
It introduces a novel ensemble LLM approach for item calibration, demonstrating comparable psychometric properties to human respondents and enhanced calibration accuracy through augmentation strategies.
Findings
Some LLMs outperform college students in algebra proficiency.
Ensemble of LLMs better mimics human ability distribution.
Resampling augmentation improves calibration correlation from 0.89 to 0.93.
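The ensemble idea in the findings above can be illustrated with a minimal sketch. It assumes each LLM has already produced a set of scored response vectors (1 = correct, 0 = incorrect) and simply draws simulated respondents with replacement from the pooled vectors. The function names, the data layout, and the toy data are hypothetical, and this uniform resampling is only one possible scheme, not necessarily the sampling method used in the paper.

```python
import random


def resample_ensemble(responses_by_model, n_respondents, seed=0):
    """Draw simulated respondents with replacement from pooled LLM responses.

    responses_by_model maps a model name to a list of response vectors,
    each a list of 0/1 item scores (hypothetical layout).
    Returns a list of (model_name, response_vector) pairs.
    """
    rng = random.Random(seed)
    pool = [
        (model, vector)
        for model, vectors in responses_by_model.items()
        for vector in vectors
    ]
    return [rng.choice(pool) for _ in range(n_respondents)]


def total_scores(sample):
    """Sum score per sampled respondent, a crude proxy for ability."""
    return [sum(vector) for _, vector in sample]


# Toy data (invented for illustration): three models of differing strength.
toy_responses = {
    "strong_model": [[1, 1, 1, 1], [1, 1, 1, 0]],
    "mid_model": [[1, 1, 0, 0], [1, 0, 1, 0]],
    "weak_model": [[0, 0, 0, 1], [0, 0, 0, 0]],
}

ensemble = resample_ensemble(toy_responses, n_respondents=200, seed=42)
scores = total_scores(ensemble)
print("score range in ensemble:", min(scores), "to", max(scores))
```

In this toy setup each single model covers only a narrow band of total scores, while the pooled sample spans the full range, which mirrors the finding that an ensemble resembles a broader ability distribution better than any one model.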
Abstract
Effective educational measurement relies heavily on the curation of well-designed item pools (i.e., possessing the right psychometric properties). However, item calibration is time-consuming and costly, requiring a sufficient number of respondents for the response process. We explore using six different LLMs (GPT-3.5, GPT-4, Llama 2, Llama 3, Gemini-Pro, and Cohere Command R Plus) and various combinations of them using sampling methods to produce responses with psychometric properties similar to human answers. Results show that some LLMs have comparable or higher proficiency in College Algebra than college students. No single LLM mimics human respondents due to narrow proficiency distributions, but an ensemble of LLMs can better resemble college students' ability distribution. The item parameters calibrated by LLM-Respondents have high correlations (e.g. > 0.8 for GPT-3.5) compared to…
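As a rough illustration of how item parameters from two respondent groups can be compared, the sketch below estimates a difficulty for each item as the negative logit of its proportion correct, then computes the Pearson correlation between the two sets of difficulties. This is a simplified classical proxy chosen for brevity, not the calibration model used in the paper, and the response matrices, function names, and smoothing constant are assumptions made for the example.

```python
import math


def item_difficulties(response_matrix):
    """Logit-based difficulty per item from a respondents x items 0/1 matrix."""
    n_respondents = len(response_matrix)
    n_items = len(response_matrix[0])
    difficulties = []
    for j in range(n_items):
        correct = sum(row[j] for row in response_matrix)
        # Smooth the proportion so all-correct or all-wrong items stay finite.
        p = (correct + 0.5) / (n_respondents + 1.0)
        difficulties.append(-math.log(p / (1.0 - p)))
    return difficulties


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Toy response matrices (invented): rows are respondents, columns are items.
human_responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
]
llm_responses = [
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]

human_difficulty = item_difficulties(human_responses)
llm_difficulty = item_difficulties(llm_responses)
r = pearson(human_difficulty, llm_difficulty)
print("correlation of item difficulties:", round(r, 3))
```

A correlation near 1 would indicate that the two groups rank and space the items similarly. A full analysis would typically fit an item response model to each response matrix instead of using raw proportions.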
Taxonomy
Topics: AI and HR Technologies
Methods: Attention Is All You Need · Cosine Annealing · Label Smoothing · Linear Layer · Adam · Dropout · Weight Decay
