LLM4Jobs: Unsupervised occupation extraction and standardization leveraging Large Language Models

Nan Li; Bo Kang; Tijl De Bie

arXiv:2309.09708 · cs.CL · September 20, 2023 · 5 cites

TL;DR

This paper presents LLM4Jobs, an unsupervised approach that uses large language models to extract and standardize occupations from free-text job data. It outperforms unsupervised state-of-the-art baselines on both synthetic and real-world datasets.

Contribution

Introduces LLM4Jobs, a novel unsupervised method that leverages LLMs for occupation coding, releases new synthetic and real-world datasets, and reports better performance than unsupervised state-of-the-art methods.

Findings

1. LLM4Jobs outperforms existing unsupervised state-of-the-art baselines.
2. The approach is versatile across diverse datasets and granularities.
3. New synthetic and real-world datasets are provided.

Abstract

Automated occupation extraction and standardization from free-text job postings and resumes are crucial for applications like job recommendation and labor market policy formation. This paper introduces LLM4Jobs, a novel unsupervised methodology that taps into the capabilities of large language models (LLMs) for occupation coding. LLM4Jobs uniquely harnesses both the natural language understanding and generation capacities of LLMs. Through rigorous experimentation on synthetic and real-world datasets, we demonstrate that LLM4Jobs consistently surpasses unsupervised state-of-the-art benchmarks and is versatile across diverse datasets and granularities. As a side result of our work, we present both synthetic and real-world datasets, which may be instrumental for subsequent research in this domain. Overall, this investigation highlights the promise of contemporary LLMs…

Code & Models

Repositories

aida-ugent/skillgpt (Official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques