Evaluating LLM Behavior in Hiring: Implicit Weights, Fairness Across Groups, and Alignment with Human Preferences
Morgane Hoffmann, Emma Jouffroy, Warren Jouanneau, Marc Palyart, Charles Pebereau

TL;DR
This paper introduces a framework for evaluating how large language models prioritize different attributes in recruitment tasks, comparing their decision logic to human norms and societal expectations.
Contribution
It develops a novel economics-based evaluation framework for LLMs in hiring, analyzing attribute importance, demographic biases, and alignment with human decision-making.
Findings
LLMs prioritize skills and experience in evaluations.
Certain features are interpreted beyond explicit matching value.
Average discrimination against minorities is minimal, but intersectional effects exist.
Abstract
General-purpose Large Language Models (LLMs) show significant potential in recruitment applications, where decisions require reasoning over unstructured text, balancing multiple criteria, and inferring fit and competence from indirect productivity signals. Yet, it is still uncertain how LLMs assign importance to each attribute and whether such assignments are in line with economic principles, recruiter preferences or broader societal norms. We propose a framework to evaluate an LLM's decision logic in recruitment, by drawing on established economic methodologies for analyzing human hiring behavior. We build synthetic datasets from real freelancer profiles and project descriptions from a major European online freelance marketplace and apply a full factorial design to estimate how an LLM weighs different match-relevant criteria when evaluating freelancer-project fit. We identify which…
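The full factorial design mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the attribute names, the binary match/mismatch levels, and the `mock_score` function (a stand-in for a real LLM call) are all hypothetical. The sketch enumerates every combination of attribute levels and estimates each attribute's implicit weight as the difference in mean score between its two levels, which in a balanced design isolates that attribute's main effect.

```python
from itertools import product
from statistics import mean

# Hypothetical binary attributes; the paper's actual criteria are not listed here.
ATTRIBUTES = ["skills_match", "experience_match", "rate_within_budget"]


def full_factorial(attributes):
    """Return every combination of attribute levels (0 = mismatch, 1 = match)."""
    return [
        dict(zip(attributes, levels))
        for levels in product([0, 1], repeat=len(attributes))
    ]


def mock_score(profile):
    """Stand-in for an LLM fit score; these weights are invented for the demo."""
    return (
        1.0
        + 3.0 * profile["skills_match"]
        + 2.0 * profile["experience_match"]
        + 0.5 * profile["rate_within_budget"]
    )


def main_effects(profiles, scores, attributes):
    """Mean score with the attribute at 1 minus mean score with it at 0."""
    effects = {}
    for attr in attributes:
        on = [s for p, s in zip(profiles, scores) if p[attr] == 1]
        off = [s for p, s in zip(profiles, scores) if p[attr] == 0]
        effects[attr] = mean(on) - mean(off)
    return effects


profiles = full_factorial(ATTRIBUTES)      # 2^3 = 8 synthetic profiles
scores = [mock_score(p) for p in profiles]
effects = main_effects(profiles, scores, ATTRIBUTES)
```

Because every level combination appears exactly once, the other attributes are balanced across the two groups being compared, so the recovered effects equal the weights built into `mock_score`. With a real model the scores would be noisy and a regression with repeated queries would be the more natural estimator.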
Taxonomy
Topics: Ethics and Social Impacts of AI · Expert finding and Q&A systems · Mobile Crowdsensing and Crowdsourcing
