Standard Occupation Classifier -- A Natural Language Processing Approach
Sidharth Rony, Jack Patman

TL;DR
This paper develops an NLP-based classifier for occupational codes using ensemble models combining BERT and neural networks, achieving up to 72% accuracy in classifying jobs from advertisements.
Contribution
It introduces a novel ensemble NLP model for classifying UK and US occupational codes from job ads, improving prediction accuracy over previous methods.
Findings
An ensemble model combining BERT and a neural network classifier achieved up to 72% accuracy.
The model classifies jobs into multiple SOC tiers.
The approach enables real-time labour market analysis from job ads.
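To make "multiple SOC tiers" concrete, here is a minimal sketch that assumes the four-digit UK ONS SOC layout, in which each leading prefix of a unit-group code names a coarser tier (major, sub-major, minor, unit). The helper names `soc_tiers` and `tier_accuracy` and the example codes are hypothetical illustrations, not taken from the paper, and US O*NET SOC codes use a different format that this sketch does not handle.

```python
def soc_tiers(unit_code: str) -> dict[str, str]:
    """Split a 4-digit UK SOC unit-group code into its four hierarchical tiers."""
    if len(unit_code) != 4 or not unit_code.isdecimal():
        raise ValueError(f"expected a 4-digit code, got {unit_code!r}")
    return {
        "major_group": unit_code[:1],      # 1 digit
        "sub_major_group": unit_code[:2],  # 2 digits
        "minor_group": unit_code[:3],      # 3 digits
        "unit_group": unit_code,           # 4 digits
    }


def tier_accuracy(predicted: list[str], actual: list[str]) -> dict[str, float]:
    """Share of predictions that match the true code at each tier."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    hits = {tier: 0 for tier in soc_tiers("0000")}
    for pred, true in zip(predicted, actual):
        pred_tiers, true_tiers = soc_tiers(pred), soc_tiers(true)
        for tier in hits:
            hits[tier] += pred_tiers[tier] == true_tiers[tier]
    return {tier: count / len(actual) for tier, count in hits.items()}


# Hypothetical predictions: the second one is wrong only at the finest tier.
scores = tier_accuracy(["2136", "2135"], ["2136", "2136"])
```

Scoring per tier shows why accuracy typically rises at coarser levels: a prediction that misses the unit group can still be correct at the minor, sub-major, and major group levels.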
Abstract
Standard Occupational Classifiers (SOC) are systems used to categorize and classify different types of jobs and occupations based on their similarities in terms of job duties, skills, and qualifications. Integrating these facets with Big Data from job advertisements offers the prospect of investigating labour demand that is specific to various occupations. This project investigates the use of recent developments in natural language processing to construct a classifier capable of assigning an occupation code to a given job advertisement. We develop various classifiers for both UK ONS SOC and US O*NET SOC, using different language models. We find that an ensemble model, which combines Google BERT and a neural network classifier while considering job title, description, and skills, achieved the highest prediction accuracy. Specifically, the ensemble model exhibited a classification accuracy…
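The abstract does not say how the two models' outputs are combined. One common option is soft voting, sketched below under that assumption: each model emits a probability per occupation code, the probabilities are averaged (optionally weighted), and the highest-scoring code wins. The function `soft_vote`, the variable names, and the probability values are hypothetical and are not the authors' implementation.

```python
from collections import defaultdict


def soft_vote(model_probs, weights=None):
    """Combine per-model {code: probability} dicts by weighted averaging.

    Returns the winning code and the combined distribution.
    """
    if not model_probs:
        raise ValueError("need at least one model's probabilities")
    if weights is None:
        weights = [1.0] * len(model_probs)
    if len(weights) != len(model_probs):
        raise ValueError("one weight per model is required")

    total_weight = sum(weights)
    combined = defaultdict(float)
    for probs, weight in zip(model_probs, weights):
        for code, p in probs.items():
            combined[code] += weight * p / total_weight

    best_code = max(combined, key=combined.get)
    return best_code, dict(combined)


# Hypothetical outputs for one job ad from two classifiers.
bert_probs = {"2136": 0.6, "2135": 0.4}
nn_probs = {"2136": 0.3, "2135": 0.7}

equal_choice, equal_dist = soft_vote([bert_probs, nn_probs])
weighted_choice, _ = soft_vote([bert_probs, nn_probs], weights=[3.0, 1.0])
```

With equal weights the second classifier's stronger preference decides the outcome; weighting the first model three times as heavily flips it, which is why the weights are usually tuned on held-out data.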
Taxonomy
Topics: Labor market dynamics and wage inequality · Occupational and Professional Licensing Regulation · Information Systems Education and Curriculum Development
