Measuring Corporate Human Capital Disclosures: Lexicon, Data, Code, and Research Opportunities
Elizabeth Demers, Victor Xiaoqi Wang, Kean Wu

TL;DR
This paper develops a machine learning-based lexicon to measure and analyze corporate human capital disclosures, providing tools and data for researchers to improve understanding and reporting of human capital management.
Contribution
It introduces a comprehensive HC-related lexicon, shares the data and code, and demonstrates how to use these tools for research and analysis of corporate disclosures.
Findings
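Created a multidimensional HC lexicon with five subcategories: DEI; health and safety; labor relations and culture; compensation and benefits; and demographics and other.
Shared the lexicon, the corporate HC disclosures, the Python code, and worked examples for applying them.
Enabled future research on HC disclosure and management: researchers can apply the lexicon to their own samples of corporate communications or modify the code to capture another construct.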
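To illustrate what applying a lexicon to a disclosure can look like, here is a minimal sketch that counts keyword hits per subcategory. It is not the authors' code: the category keys mirror the five subcategories named in the abstract, but the keywords in TOY_LEXICON, the function count_lexicon_hits, and the sample text are all hypothetical placeholders.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon. Category names follow the paper's five
# subcategories; the keywords are illustrative, not the published lexicon.
TOY_LEXICON = {
    "dei": ["diversity", "inclusion", "pay equity"],
    "health_and_safety": ["workplace safety", "injury rate"],
    "labor_relations_and_culture": ["collective bargaining", "employee engagement"],
    "compensation_and_benefits": ["retirement plan", "bonus"],
    "demographics_and_other": ["full-time employees", "headcount"],
}


def count_lexicon_hits(text, lexicon):
    """Count whole-word (or whole-phrase) matches per category."""
    # Lowercase and collapse whitespace so multi-word terms match across line breaks.
    normalized = re.sub(r"\s+", " ", text.lower())
    hits = Counter()
    for category, terms in lexicon.items():
        for term in terms:
            pattern = r"\b" + re.escape(term) + r"\b"
            hits[category] += len(re.findall(pattern, normalized))
    return dict(hits)


# Made-up disclosure text for demonstration only.
sample = (
    "We employ 1,200 full-time employees. Diversity and inclusion are "
    "core values, and workplace safety is reviewed quarterly."
)
scores = count_lexicon_hits(sample, TOY_LEXICON)
for category, n in scores.items():
    print(category, n)
```

Raw counts like these are usually scaled by document length before firms or years are compared; that normalization is a design choice left to the researcher.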
Abstract
Human capital (HC) is increasingly important to corporate value creation. Unlike other assets, however, HC is not currently subject to well-defined measurement or disclosure rules. We use a machine learning algorithm (word2vec) trained on a confirmed set of HC disclosures to develop a comprehensive list of HC-related keywords classified into five subcategories (DEI; health and safety; labor relations and culture; compensation and benefits; and demographics and other) that capture the multidimensional nature of HC management. We share our lexicon, corporate HC disclosures, and the Python code used to develop the lexicon, and we provide detailed examples of using our data and code, including for fine-tuning a BERT model. Researchers can use our HC lexicon (or modify the code to capture another construct of interest) with their samples of corporate communications to address pertinent HC…
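The core idea behind embedding-based lexicon building, ranking candidate words by their similarity to a seed keyword, can be sketched as follows. This is an assumption-laden toy: the paper trains word2vec on HC disclosures, whereas the three-dimensional vectors in TOY_VECTORS are made-up stand-ins, and expand_seed is a hypothetical helper, not a function from the authors' code.

```python
import math

# Made-up 3-d vectors standing in for learned word2vec embeddings.
# Real embeddings would be trained on a corpus and have many more dimensions.
TOY_VECTORS = {
    "diversity": [0.9, 0.1, 0.0],
    "inclusion": [0.8, 0.2, 0.1],
    "equity": [0.85, 0.15, 0.05],
    "safety": [0.1, 0.9, 0.0],
    "injury": [0.05, 0.95, 0.1],
    "revenue": [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def expand_seed(seed, vectors, top_n=2):
    """Return the top_n words most similar to the seed, excluding the seed itself."""
    seed_vec = vectors[seed]
    scored = [(word, cosine(seed_vec, vec)) for word, vec in vectors.items() if word != seed]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [word for word, _ in scored[:top_n]]


dei_candidates = expand_seed("diversity", TOY_VECTORS)
safety_candidates = expand_seed("safety", TOY_VECTORS)
print(dei_candidates, safety_candidates)
```

In a workflow like this, the ranked candidates would still need manual review before being assigned to a subcategory, since nearest neighbors in embedding space are not guaranteed to be on-topic.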
Taxonomy
Methods: Layer Normalization · Linear Warmup With Linear Decay · Attention Dropout · Softmax · Linear Layer · Dropout · Dense Connections · Attention Is All You Need · WordPiece
