WorkRB: A Community-Driven Evaluation Framework for AI in the Work Domain

Matthias De Lange; Warre Veys; Federico Retyk; Daniel Deniz; Warren Jouanneau; Mike Zhang; Aleksander Bielinski; Emma Jouffroy; Nicole Clobes; Nina Baranowska; David Graus; Marc Palyart; Rabih Zbib; Dimitra Gkatzia; Thomas Demeester; Tijl De Bie; Toine Bogers; Jens-Joris Decorte; Jeroen Van Hautte

arXiv:2604.13055·cs.CL·April 16, 2026

TL;DR

WorkRB is an open-source, community-driven benchmark designed to evaluate AI models across diverse work-related NLP and recommendation tasks, addressing fragmentation and reproducibility issues in employment AI research.

Contribution

WorkRB introduces the first comprehensive, modular benchmark tailored to the work domain, enabling standardized evaluation across multiple tasks and multilingual settings.

Findings

1. Organizes 13 diverse work-related tasks into a unified benchmark.
2. Supports monolingual and cross-lingual evaluation with multilingual ontologies.
3. Facilitates contributions and integration of proprietary data through modular design.

Abstract

Today's evolving labor markets rely increasingly on recommender systems for hiring, talent management, and workforce analytics, with natural language processing (NLP) capabilities at the core. Yet, research in this area remains highly fragmented. Studies employ divergent ontologies (ESCO, O*NET, national taxonomies), heterogeneous task formulations, and diverse model families, making cross-study comparison and reproducibility exceedingly difficult. General-purpose benchmarks lack coverage of work-specific tasks, and the inherent sensitivity of employment data further limits open evaluation. We present WorkRB (Work Research Benchmark), the first open-source, community-driven benchmark tailored to work-domain AI. WorkRB organizes 13 diverse tasks from 7 task groups as unified recommendation and NLP tasks, including job/skill recommendation, candidate recommendation, similar item…

Code & Models

Repositories

techwolf-ai/WorkRB (GitHub)

Models

Aleksandruz/skillmatch-mpnet-curriculum-retriever (Hugging Face model, 1.2k downloads)
