KARRIEREWEGE: A Large Scale Career Path Prediction Dataset
Elena Senger, Yuri Campbell, Rob van der Goot, Barbara Plank

TL;DR
KARRIEREWEGE is a large, publicly available dataset of over 500,000 career paths linked to the ESCO taxonomy. Together with KARRIEREWEGE+, a variant extended with synthesized job titles and descriptions, it supports career trajectory prediction, including from free-text resume data, and is used to benchmark existing models.
Contribution
The paper introduces KARRIEREWEGE, a publicly available career path dataset that is significantly larger than previously available ones, and KARRIEREWEGE+, which adds synthesized job titles and descriptions, advancing research on career prediction from unstructured resume information.
Findings
Enhanced prediction accuracy with synthesized data.
Improved robustness of models on free-text inputs.
Benchmarking shows state-of-the-art models perform better on the new dataset.
Abstract
Accurate career path prediction can support many stakeholders, like job seekers, recruiters, HR, and project managers. However, publicly available data and tools for career path prediction are scarce. In this work, we introduce KARRIEREWEGE, a comprehensive, publicly available dataset containing over 500k career paths, significantly surpassing the size of previously available datasets. We link the dataset to the ESCO taxonomy to offer a valuable resource for predicting career trajectories. To tackle the problem of free-text inputs typically found in resumes, we enhance it by synthesizing job titles and descriptions resulting in KARRIEREWEGE+. This allows for accurate predictions from unstructured data, closely aligning with real-world application challenges. We benchmark existing state-of-the-art (SOTA) models on our dataset and a prior benchmark and observe improved performance and…
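To make the task framing concrete, here is a minimal sketch of next-occupation prediction over career paths. It is not the authors' method and not one of the benchmarked models: it is a plain frequency baseline, and the toy paths, occupation labels, and function names (`to_examples`, `fit_transitions`, `predict_next`) are all hypothetical illustrations rather than records or code from KARRIEREWEGE.

```python
from collections import Counter, defaultdict

# Hypothetical toy career paths. The labels are illustrative occupation
# names, not records taken from KARRIEREWEGE.
paths = [
    ["software developer", "software architect", "ICT project manager"],
    ["software developer", "software architect"],
    ["data analyst", "data scientist"],
    ["software developer", "ICT project manager"],
]


def to_examples(path):
    """Split one career path into (history, next_occupation) pairs."""
    return [(tuple(path[:i]), path[i]) for i in range(1, len(path))]


def fit_transitions(paths):
    """Count how often each occupation directly follows another."""
    counts = defaultdict(Counter)
    for path in paths:
        for history, nxt in to_examples(path):
            counts[history[-1]][nxt] += 1
    return counts


def predict_next(counts, history, k=1):
    """Return up to k most frequent successors of the last occupation."""
    return [occ for occ, _ in counts[history[-1]].most_common(k)]


counts = fit_transitions(paths)
print(predict_next(counts, ["software developer"]))  # ['software architect']
```

The baseline only looks at the most recent occupation and assumes titles are already normalized to a fixed label set. Free-text resume input, which the abstract highlights as the harder real-world case, would first need to be mapped to such labels.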
Taxonomy
Topics: Multidisciplinary Science and Engineering Research · Software Testing and Debugging Techniques · Railway Systems and Energy Efficiency
