A Machine Learning Approach to Improving Occupational Income Scores
Martin Saavedra, Tate Twinam

TL;DR
This paper introduces a machine learning method to improve occupational income scores, reducing bias in earnings estimates and better capturing true income disparities across race, gender, and generations.
Contribution
It develops a novel machine learning-based adjusted income score that enhances the accuracy of labor market outcome measurements from historical and modern census data.
Findings
Adjusted scores reduce bias in earnings gap estimates
New scores produce estimates closer to actual earnings regressions
Adjusted scores improve the measurement of intergenerational mobility
Abstract
Historical studies of labor markets frequently lack data on individual income. The occupational income score (OCCSCORE) is often used as an alternative measure of labor market outcomes. We consider the consequences of using OCCSCORE when researchers are interested in earnings regressions. We estimate race and gender earnings gaps in modern decennial Censuses as well as the 1915 Iowa State Census. Using OCCSCORE biases results towards zero and can result in estimated gaps of the wrong sign. We use a machine learning approach to construct a new adjusted score based on industry, occupation, and demographics. The new income score provides estimates closer to earnings regressions. Lastly, we consider the consequences for estimates of intergenerational mobility elasticities.
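The attenuation described in the abstract can be illustrated with a toy calculation. The sketch below is not the authors' method: the data are synthetic, and a plain occupation-by-group cell mean stands in for the paper's machine learning model, whose details are not given here. The names `occ_score` and `adj_score` are hypothetical. It shows only why a score that is constant within an occupation hides within-occupation pay differences, and why conditioning on demographics can recover them.

```python
from collections import defaultdict


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Synthetic records: (occupation, group, earnings). All numbers are made up
# for illustration and do not come from the paper or any census.
records = (
    [("A", 0, 100.0)] * 60 + [("A", 1, 80.0)] * 40
    + [("B", 0, 50.0)] * 40 + [("B", 1, 40.0)] * 60
)


def cell_means(records, key):
    """Average earnings within each cell defined by key(record)."""
    cells = defaultdict(list)
    for rec in records:
        cells[key(rec)].append(rec[2])
    return {k: mean(v) for k, v in cells.items()}


def group_gap(records, outcome):
    """Difference in the mean outcome between group 0 and group 1."""
    g0 = mean(outcome(r) for r in records if r[1] == 0)
    g1 = mean(outcome(r) for r in records if r[1] == 1)
    return g0 - g1


# OCCSCORE-style score: one value per occupation, ignoring demographics.
occ_score = cell_means(records, key=lambda r: r[0])
# Stand-in for an adjusted score: one value per occupation-by-group cell.
adj_score = cell_means(records, key=lambda r: (r[0], r[1]))

gap_true = group_gap(records, lambda r: r[2])
gap_occ = group_gap(records, lambda r: occ_score[r[0]])
gap_adj = group_gap(records, lambda r: adj_score[(r[0], r[1])])

print(round(gap_true, 1), round(gap_occ, 1), round(gap_adj, 1))
```

In this toy data the true earnings gap is 24, while the occupation-only score yields a gap of 9.6, because it captures only the difference in occupational sorting and none of the pay difference within occupations. The cell-mean score recovers the full gap here only because earnings are constant within each cell; real data would not be this clean.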
Taxonomy
Topics: Intergenerational and Educational Inequality Studies · Labor market dynamics and wage inequality · Urban, Neighborhood, and Segregation Studies
