LABOR-LLM: Language-Based Occupational Representations with Large Language Models
Susan Athey, Herman Brunborg, Tianyu Du, Ayush Kanodia, Keyon Vafa

TL;DR
This paper introduces LABOR-LLM, a large language model fine-tuned on occupational data to accurately predict workers' next jobs, outperforming previous models in labor transition predictions.
Contribution
The paper fine-tunes large language models on career data converted into text, improving occupational transition predictions over existing methods.
Findings
Fine-tuned LLMs outperform prior models in predicting next occupations.
Adding more diverse career data improves model performance.
Replacing occupational titles with codes reduces predictive accuracy.
Abstract
This paper builds an empirical model that predicts a worker's next occupation as a function of the worker's occupational history. Because histories are sequences of occupations, the covariate space is high-dimensional, and further, the outcome (the next occupation) is a discrete choice that can take on many values. To estimate the parameters of the model, we leverage an approach from generative artificial intelligence. Estimation begins from a "foundation model" trained on non-representative data and then "fine-tunes" the estimation using data about careers from a representative survey. We convert tabular data from the survey into text files that resemble resumes and fine-tune the parameters of the foundation model, a large language model (LLM), using these text files with the objective of predicting the next token (word). The resulting fine-tuned LLM is used to calculate estimates…
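The conversion of tabular survey records into resume-like text can be illustrated with a minimal sketch. This is not the authors' template: the field names (`education`, `year`, `occupation`), the layout of the text, and the helper names `career_to_text` and `make_training_example` are all assumptions made for illustration. The sketch renders a worker's history as text and splits it into a prompt and the next-occupation title that a language model would be trained to continue with.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]


def career_to_text(worker: Dict[str, str], history: List[Row]) -> str:
    """Render one worker's tabular career history as resume-like text.

    The field names and layout are hypothetical, not the paper's template.
    """
    lines = [f"Education: {worker['education']}", "Career history:"]
    # Sort rows chronologically so the text reads like a resume timeline.
    for row in sorted(history, key=lambda r: r["year"]):
        lines.append(f"{row['year']}: {row['occupation']}")
    return "\n".join(lines)


def make_training_example(
    worker: Dict[str, str], history: List[Row]
) -> Tuple[str, str]:
    """Split a history into (prompt, completion) for next-token training.

    The prompt holds every job except the last and ends with the final
    year; the completion is the title of the last occupation.
    """
    if not history:
        raise ValueError("history must contain at least one row")
    ordered = sorted(history, key=lambda r: r["year"])
    last = ordered[-1]
    prompt = career_to_text(worker, ordered[:-1]) + f"\n{last['year']}:"
    completion = f" {last['occupation']}"
    return prompt, completion


# Hypothetical example record; the values are made up for illustration.
worker = {"education": "college graduate"}
history = [
    {"year": 2001, "occupation": "Cashiers"},
    {"year": 1999, "occupation": "Waiters and waitresses"},
    {"year": 2003, "occupation": "Retail salespersons"},
]
prompt, completion = make_training_example(worker, history)
```

Writing occupations as natural-language titles rather than numeric codes in such a template is in line with the finding that replacing titles with codes reduces predictive accuracy.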
Taxonomy
Topics: Occupational Health and Safety Research
Methods: ALIGN
