JobResQA: A Benchmark for LLM Machine Reading Comprehension on Multilingual Résumés and JDs

Casimiro Pio Carrino; Paula Estrella; Rabih Zbib; Carlos Escolano; José A. R. Fonollosa

arXiv:2601.23183·cs.CL·February 2, 2026

TL;DR

JobResQA is a multilingual benchmark for evaluating large language models' reading comprehension on HR-related tasks involving resumes and job descriptions across five languages, highlighting performance gaps and enabling fairness studies.

Contribution

The paper introduces JobResQA, a multilingual machine reading comprehension (MRC) benchmark for HR applications, built from synthetic data with controlled demographic and professional attributes and translated through a cost-effective pipeline.

Findings

01. Higher LLM performance on English and Spanish

02. Significant performance gaps in other languages

03. Benchmark facilitates fairness and bias analysis

Abstract

We introduce JobResQA, a multilingual Question Answering benchmark for evaluating Machine Reading Comprehension (MRC) capabilities of LLMs on HR-specific tasks involving résumés and job descriptions. The dataset comprises 581 QA pairs across 105 synthetic résumé-job description pairs in five languages (English, Spanish, Italian, German, and Chinese), with questions spanning three complexity levels from basic factual extraction to complex cross-document reasoning. We propose a data generation pipeline derived from real-world sources through de-identification and data synthesis to ensure both realism and privacy, while controlled demographic and professional attributes (implemented via placeholders) enable systematic bias and fairness studies. We also present a cost-effective, human-in-the-loop translation pipeline based on the TEaR methodology, incorporating MQM error annotations…
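To illustrate how placeholder-controlled attributes can support bias studies, here is a minimal Python sketch. It is not the JobResQA pipeline: the placeholder syntax, the template text, and the attribute values are all hypothetical. It only shows the general idea of filling the same document with different demographic values so that everything else stays fixed.

```python
from itertools import product

# Hypothetical résumé template. The placeholder names and the {NAME} syntax
# are assumptions for illustration, not the format used by JobResQA.
TEMPLATE = "{NAME} is a {NATIONALITY} software engineer with 5 years of experience."

# Hypothetical attribute values used to build counterfactual variants.
ATTRIBUTES = {
    "NAME": ["Alex Moreno", "Mei Chen"],
    "NATIONALITY": ["Spanish", "Chinese"],
}


def fill_placeholders(template, attributes):
    """Return one filled text per combination of attribute values."""
    keys = sorted(attributes)
    variants = []
    for values in product(*(attributes[k] for k in keys)):
        # Only the placeholder values change; the rest of the text is fixed.
        variants.append(template.format(**dict(zip(keys, values))))
    return variants


variants = fill_placeholders(TEMPLATE, ATTRIBUTES)
for text in variants:
    print(text)
```

Because the variants differ only in the substituted attributes, any difference in a model's answers across them can be attributed to those attributes rather than to the professional content.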

Taxonomy

Topics: Topic Modeling · Artificial Intelligence in Healthcare and Education · Text Readability and Simplification