Large Language Models as Batteries-Included Zero-Shot ESCO Skills Matchers

Benjamin Clavié; Guillaume Soulié

arXiv:2307.03539 · cs.CL · August 31, 2023 · 6 cites

Open Access

TL;DR

This paper presents a zero-shot skills extraction system using large language models that generates synthetic data, employs retrieval and re-ranking techniques, and outperforms previous methods without requiring human annotations.

Contribution

The work introduces an end-to-end LLM-based zero-shot skills extraction framework that leverages synthetic data and re-ranking, significantly improving accuracy over prior approaches.

Findings

01

Synthetic data improves skills extraction accuracy.

02

Adding GPT-4 re-ranking improves RP@10 by over 22 points.

03

Framing the task as mock programming in the prompt yields better results, especially with weaker LLMs.
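
The findings above are reported in RP@10. As a rough illustration, the sketch below computes R-Precision@K under the common definition (relevant labels in the top K, divided by the smaller of K and the number of gold labels, then averaged over examples). This page does not confirm that the paper uses exactly this formulation, and the function names are hypothetical.

```python
from typing import Sequence, Set


def rp_at_k(ranked: Sequence[str], gold: Set[str], k: int = 10) -> float:
    """R-Precision@K for a single example.

    Assumes `ranked` holds predicted skill labels, best first, with no
    duplicates, and `gold` is the non-empty set of annotated skills.
    """
    if not gold:
        raise ValueError("gold must contain at least one label")
    # Count how many of the top-k predictions are gold labels.
    hits = sum(1 for label in ranked[:k] if label in gold)
    # Normalise by min(k, |gold|) so a perfect ranking always scores 1.0.
    return hits / min(k, len(gold))


def mean_rp_at_k(
    all_ranked: Sequence[Sequence[str]],
    all_gold: Sequence[Set[str]],
    k: int = 10,
) -> float:
    """Average RP@K over a collection of examples."""
    scores = [rp_at_k(r, g, k) for r, g in zip(all_ranked, all_gold)]
    return sum(scores) / len(scores)
```

Because the denominator is capped at the number of gold labels, an example with a single gold skill ranked anywhere in the top 10 scores 1.0, which is why the metric suits tasks where each mention has only a few correct labels.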

Abstract

Understanding labour market dynamics requires accurately identifying the skills required for and possessed by the workforce. Automation techniques are increasingly being developed to support this effort. However, automatically extracting skills from job postings is challenging due to the vast number of existing skills. The ESCO (European Skills, Competences, Qualifications and Occupations) framework provides a useful reference, listing over 13,000 individual skills. However, skills extraction remains difficult and accurately matching job posts to the ESCO taxonomy is an open problem. In this work, we propose an end-to-end zero-shot system for skills extraction from job descriptions based on large language models (LLMs). We generate synthetic training data for the entirety of ESCO skills and train a classifier to extract skill mentions from job posts. We also employ a similarity…

Taxonomy

Topics: Topic Modeling · Text Readability and Simplification · Natural Language Processing Techniques

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Adam · Dense Connections · Softmax · Position-Wise Feed-Forward Layer · Label Smoothing · Residual Connection · Dropout