Evaluating LLM Behavior in Hiring: Implicit Weights, Fairness Across Groups, and Alignment with Human Preferences

Morgane Hoffmann; Emma Jouffroy; Warren Jouanneau; Marc Palyart; Charles Pebereau

arXiv:2601.11379 · cs.CL · January 19, 2026

TL;DR

This paper introduces a framework for evaluating how large language models prioritize different attributes in recruitment tasks, comparing their decision logic to human norms and societal expectations.

Contribution

The paper develops an economics-based evaluation framework for LLMs in hiring, analyzing attribute importance, demographic bias, and alignment with human decision-making.

Findings

01. LLMs prioritize skills and experience in evaluations.

02. Certain features are interpreted beyond their explicit matching value.

03. Average discrimination against minorities is minimal, but intersectional effects exist.

Abstract

General-purpose Large Language Models (LLMs) show significant potential in recruitment applications, where decisions require reasoning over unstructured text, balancing multiple criteria, and inferring fit and competence from indirect productivity signals. Yet, it is still uncertain how LLMs assign importance to each attribute and whether such assignments are in line with economic principles, recruiter preferences or broader societal norms. We propose a framework to evaluate an LLM's decision logic in recruitment, by drawing on established economic methodologies for analyzing human hiring behavior. We build synthetic datasets from real freelancer profiles and project descriptions from a major European online freelance marketplace and apply a full factorial design to estimate how an LLM weighs different match-relevant criteria when evaluating freelancer-project fit. We identify which…
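
The full factorial design mentioned in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the three binary attribute names, the `stub_score` function standing in for an LLM's fit score, and the weights inside it. In a balanced full factorial design, the main effect of an attribute is the mean score when it is present minus the mean score when it is absent.

```python
import itertools
from statistics import mean

# Hypothetical binary match attributes (illustrative names, not from the paper).
ATTRIBUTES = ["skills_match", "experience_match", "location_match"]


def full_factorial(attributes):
    """Return every 0/1 combination of the attributes, one dict per synthetic pair."""
    return [
        dict(zip(attributes, levels))
        for levels in itertools.product([0, 1], repeat=len(attributes))
    ]


def stub_score(profile):
    """Placeholder for an LLM fit score; the weights are invented for this demo."""
    return (
        1.0
        + 3.0 * profile["skills_match"]
        + 2.0 * profile["experience_match"]
        + 0.5 * profile["location_match"]
    )


def main_effects(design, scores, attributes):
    """Main effect per attribute: mean score with it present minus mean with it absent."""
    effects = {}
    for attr in attributes:
        present = [s for p, s in zip(design, scores) if p[attr] == 1]
        absent = [s for p, s in zip(design, scores) if p[attr] == 0]
        effects[attr] = mean(present) - mean(absent)
    return effects


design = full_factorial(ATTRIBUTES)          # 2**3 = 8 synthetic pairs
scores = [stub_score(p) for p in design]     # would be LLM outputs in practice
effects = main_effects(design, scores, ATTRIBUTES)

for attr, effect in effects.items():
    print(f"{attr}: {effect:.2f}")
```

Because every combination of levels appears exactly once, the attributes are uncorrelated by construction, so each difference in means isolates one attribute's contribution to the score.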

Taxonomy

Topics: Ethics and Social Impacts of AI · Expert finding and Q&A systems · Mobile Crowdsensing and Crowdsourcing