Automatically assembling a full census of an academic field
Allison C. Morgan, Samuel F. Way, Aaron Clauset

TL;DR
This paper presents an automated web crawler that constructs a comprehensive, accurate census of computer science faculty in North America, enabling detailed workforce analysis and studies of policy impact.
Contribution
The authors introduce a topical web crawler that automates the collection of faculty data from online departmental directories, achieving high precision and recall while greatly reducing the manual effort of building an academic census.
Findings
Achieved over 99% precision and recall in census data collection.
Enabled analysis of faculty turnover and retention over six years.
Provided insights into gender disparities in computer science faculty.
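To make the metrics in these findings concrete, here is a minimal sketch of how precision, recall, and retention could be computed when a crawled census is compared with a hand-curated one. It is not the authors' evaluation code: the function names, the representation of faculty as plain name strings, and the toy rosters are all assumptions made for illustration.

```python
def precision_recall(collected, reference):
    """Compare a collected roster against a reference roster.

    precision = fraction of collected entries that appear in the reference
    recall    = fraction of reference entries that were collected
    """
    collected, reference = set(collected), set(reference)
    true_positives = len(collected & reference)
    precision = true_positives / len(collected) if collected else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


def retention_rate(earlier, later):
    """Fraction of people in the earlier census who still appear in the later one."""
    earlier, later = set(earlier), set(later)
    return len(earlier & later) / len(earlier) if earlier else 0.0


# Hypothetical toy rosters (not data from the paper).
hand_curated = {"A. Ada", "B. Babbage", "C. Church", "D. Dijkstra"}
crawled = {"A. Ada", "B. Babbage", "C. Church", "E. Euler"}

precision, recall = precision_recall(crawled, hand_curated)
print(precision, recall)  # 0.75 0.75

# Retention between an earlier and a later census of the same department.
census_earlier = {"A. Ada", "B. Babbage", "C. Church", "D. Dijkstra"}
census_later = {"A. Ada", "B. Babbage", "E. Euler"}
print(retention_rate(census_earlier, census_later))  # 0.5
```

In practice, matching records across censuses would likely need name normalization and disambiguation, which this sketch leaves out.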
Abstract
The composition of the scientific workforce shapes the direction of scientific research, directly through the selection of questions to investigate, and indirectly through its influence on the training of future scientists. In most fields, however, complete census information is difficult to obtain, complicating efforts to study workforce dynamics and the effects of policy. This is particularly true in computer science, which lacks a single, all-encompassing directory or professional organization. A full census of computer science would serve many purposes, not the least of which is a better understanding of the trends and causes of unequal representation in computing. Previous academic census efforts have relied on narrow or biased samples, or on professional society membership rolls. A full census can be constructed directly from online departmental faculty directories, but doing so…
