Quantifying and Mitigating Socially Desirable Responding in LLMs: A Desirability-Matched Graded Forced-Choice Psychometric Study

Kensuke Okada; Yui Furukawa; Kyosuke Bunji

arXiv:2602.17262·cs.CL·April 29, 2026

TL;DR

This paper introduces a psychometric framework to measure and reduce socially desirable responding in LLM evaluations, improving the accuracy of questionnaire-based assessments.

Contribution

It proposes a method to quantify socially desirable responding (SDR) using item response theory (IRT) and develops a desirability-matched graded forced-choice (GFC) approach to mitigate SDR in LLM assessments.

Findings

01 Likert-style questionnaires exhibit large SDR in LLMs.

02 Desirability-matched GFC significantly reduces SDR.

03 A trade-off is observed between SDR mitigation and persona profile recovery.

Abstract

Human self-report questionnaires are increasingly used in NLP to benchmark and audit large language models (LLMs), from persona consistency to safety and bias assessments. Yet these instruments presume honest responding; in evaluative contexts, LLMs can instead gravitate toward socially preferred answers, a form of socially desirable responding (SDR) that biases questionnaire-derived scores and downstream conclusions. We propose a psychometric framework to quantify and mitigate SDR in questionnaire-based evaluation of LLMs. To quantify SDR, the same inventory is administered under HONEST versus FAKE-GOOD instructions, and SDR is computed as a direction-corrected standardized effect size from item response theory (IRT)-estimated latent scores. This enables comparisons across constructs and response formats, as well as against human instructed-faking benchmarks. For mitigation, we construct a…
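The quantification step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so this assumes a Cohen's d style standardized mean difference with a pooled standard deviation, multiplied by a sign that encodes which direction of the trait is socially desirable. The function name `sdr_effect_size`, its parameters, and the pooled-SD choice are all hypothetical, and the latent scores are assumed to have been estimated beforehand by some IRT model.

```python
import math
from statistics import mean, variance


def sdr_effect_size(theta_honest, theta_fake_good, desirable_direction=1):
    """Direction-corrected standardized difference between two conditions.

    theta_honest / theta_fake_good: latent score estimates (assumed to come
    from an IRT model fitted elsewhere) under HONEST and FAKE-GOOD prompts.
    desirable_direction: +1 if higher scores are socially desirable,
    -1 if lower scores are (hypothetical convention, not from the paper).
    """
    if desirable_direction not in (1, -1):
        raise ValueError("desirable_direction must be +1 or -1")
    n_h, n_f = len(theta_honest), len(theta_fake_good)
    if n_h < 2 or n_f < 2:
        raise ValueError("each condition needs at least two scores")

    # Pooled sample variance across the two conditions (assumed choice).
    pooled_var = (
        (n_h - 1) * variance(theta_honest)
        + (n_f - 1) * variance(theta_fake_good)
    ) / (n_h + n_f - 2)
    if pooled_var == 0:
        raise ValueError("pooled variance is zero; effect size undefined")

    # Raw standardized shift from HONEST to FAKE-GOOD.
    d = (mean(theta_fake_good) - mean(theta_honest)) / math.sqrt(pooled_var)

    # Flip the sign so that a positive value always means a shift toward
    # the socially desirable pole, whatever the scale's keying.
    return desirable_direction * d


# Toy numbers for illustration only, not data from the paper.
honest = [0.0, 1.0, 2.0]
fake_good = [1.0, 2.0, 3.0]
sdr_up = sdr_effect_size(honest, fake_good, desirable_direction=1)
sdr_down = sdr_effect_size(honest, fake_good, desirable_direction=-1)
```

Under this convention, a positive value indicates movement toward the desirable pole regardless of how a construct is keyed, which is what would make values comparable across constructs and response formats. The paper's actual estimator may differ, for example by using a paired design or a different standardizer.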
