To Generate or Discriminate? Methodological Considerations for Measuring Cultural Alignment in LLMs

Saurabh Kumar Pandey; Sougata Saha; Monojit Choudhury

arXiv:2601.02858·cs.CL·January 7, 2026

TL;DR

This paper compares socio-demographic prompting (SDP) with inverse socio-demographic prompting (ISDP) as ways to assess cultural alignment in LLMs. It finds that models perform better with actual user behaviors than with simulated ones, and that personalization at the individual level is limited.

Contribution

It introduces inverse socio-demographic prompting (ISDP) as a method for evaluating LLMs' cultural understanding: instead of generating output for a given demographic proxy, the model is asked to predict the demographic proxy from user behavior, which is a discrimination task.

Findings

01. Models perform better with actual behaviors than with simulated ones.

02. Performance diminishes and converges at the individual level, indicating limits to personalization.

03. ISDP provides clearer insights into LLMs' cultural competency than SDP.

Abstract

Socio-demographic prompting (SDP) - prompting Large Language Models (LLMs) using demographic proxies to generate culturally aligned outputs - often shows LLM responses as stereotypical and biased. While effective in assessing LLMs' cultural competency, SDP is prone to confounding factors such as prompt sensitivity, decoding parameters, and the inherent difficulty of generation over discrimination tasks due to larger output spaces. These factors complicate interpretation, making it difficult to determine if the poor performance is due to bias or the task design. To address this, we use inverse socio-demographic prompting (ISDP), where we prompt LLMs to discriminate and predict the demographic proxy from actual and simulated user behavior from different users. We use the Goodreads-CSI dataset (Saha et al., 2025), which captures difficulty in understanding English book reviews for users…
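
The contrast between the two setups can be made concrete with a minimal sketch. Everything here is an assumption for illustration: the prompt wording, the function names (`build_sdp_prompt`, `build_isdp_prompt`, `accuracy`), and the lettered-option format are hypothetical and are not taken from the paper. The sketch only shows why ISDP is easier to score: SDP asks for open-ended generation, while ISDP restricts the answer to a closed set of candidate proxies.

```python
# Hypothetical prompt templates: illustrative only, not the authors' prompts.

def build_sdp_prompt(demographic_proxy: str, task: str) -> str:
    """SDP: condition on a demographic proxy and ask the model to generate."""
    return (
        f"You are a reader from {demographic_proxy}.\n"
        f"{task}\n"
        "Answer:"
    )


def build_isdp_prompt(behavior: str, candidate_proxies: list[str]) -> str:
    """ISDP: show a user behavior and ask the model to pick the proxy."""
    # Label the candidates (A), (B), ... so the output space is closed.
    options = "\n".join(
        f"({chr(ord('A') + i)}) {proxy}"
        for i, proxy in enumerate(candidate_proxies)
    )
    return (
        "Here is a record of a user's behavior:\n"
        f"{behavior}\n"
        "Which group is this user most likely from?\n"
        f"{options}\n"
        "Answer with a single letter:"
    )


def accuracy(predicted: list[str], gold: list[str]) -> float:
    """Fraction of exact label matches between predictions and gold labels."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)
```

Because the ISDP answer is a single label, exact-match accuracy is enough to score it, whereas scoring SDP output would need some judgment of free-form text. The group names and behavior text passed to these helpers are placeholders to be supplied by the reader.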

Taxonomy

Topics: Authorship Attribution and Profiling · Computational and Text Analysis Methods · Topic Modeling