Lessons From the Processing and Sharing of Public Data Sets for the Study of Structural Racism
Erik Westlund, Boeun Kim, Sierra Grey-Coker, Karen Bandeen-Roche, Sarah Szanton

TL;DR
This paper discusses challenges in using public data sets to study structural racism and introduces tools to standardize and share these data.
Contribution
The paper introduces standardized metadata and data transformation pipelines to facilitate the study of structural racism using public data.
Findings
Public data sets for studying structural racism face issues like recency bias and missingness.
Standardized metadata and transformation pipelines improve data usability for researchers.
Creating a public repository helps address challenges in open science practices.
Abstract
As part of a larger study of structural racism, we collected 50 publicly available data sets containing geographic measures. These data sets covered over 100 years of history, six geographic units, and nine domains of inquiry (civics, credit/income/wealth, education, employment, environment, healthcare, media/marketing, neighborhoods, and policing). Structured metadata about each data set were compiled and used to standardize data files with shared conventions, allowing analysts to combine data files to study structural racism. To allow researchers to assess the potential value of these data in relation to their areas of inquiry, we created dashboards that summarize key measures in each data set, including the geographic level of measurement, the years covered by the data, and the extent of missingness. This process made clear several problems researchers seeking to use public data…
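The metadata-driven standardization and the dashboard summary described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the `DatasetMetadata` fields, the shared column names (`geo_id`, `year`), the missing-value codes, and the sample rows are all hypothetical, chosen only to show how structured metadata might drive column renaming and a per-data-set report of geographic level, years covered, and missingness.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetMetadata:
    """Hypothetical structured metadata for one public data set."""
    name: str
    domain: str        # e.g. "education", "policing"
    geo_level: str     # e.g. "county", "tract"
    column_map: dict   # source column name -> shared column name
    missing_codes: set = field(default_factory=lambda: {"", "NA", "-999"})


def standardize(rows, meta):
    """Rename columns to the shared convention; map missing codes to None."""
    out = []
    for row in rows:
        clean = {}
        for src, shared in meta.column_map.items():
            value = row.get(src)
            if value is None or str(value).strip() in meta.missing_codes:
                clean[shared] = None
            else:
                clean[shared] = value
        out.append(clean)
    return out


def summarize(rows, meta, measure):
    """Report geographic level, year range, and share of missing values."""
    years = sorted({int(r["year"]) for r in rows if r["year"] is not None})
    n_missing = sum(1 for r in rows if r[measure] is None)
    return {
        "dataset": meta.name,
        "geo_level": meta.geo_level,
        "first_year": years[0] if years else None,
        "last_year": years[-1] if years else None,
        "share_missing": n_missing / len(rows) if rows else None,
    }


# Made-up example data set and values, for illustration only.
meta = DatasetMetadata(
    name="example_school_funding",
    domain="education",
    geo_level="county",
    column_map={"FIPS": "geo_id", "YR": "year", "SPEND_PP": "spending_per_pupil"},
)
raw = [
    {"FIPS": "24005", "YR": "2010", "SPEND_PP": "11250"},
    {"FIPS": "24005", "YR": "2015", "SPEND_PP": "-999"},
    {"FIPS": "24510", "YR": "2015", "SPEND_PP": "13400"},
    {"FIPS": "24510", "YR": "2020", "SPEND_PP": "NA"},
]

standardized = standardize(raw, meta)
summary = summarize(standardized, meta, "spending_per_pupil")
print(summary)
```

Keeping the column mapping and missing-value codes in metadata rather than in per-file scripts means a new data set needs only a new metadata record, and files standardized this way share key columns on which they could be combined.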
Taxonomy
Topics: Racial and Ethnic Identity Research · Urban, Neighborhood, and Segregation Studies · Race, Genetics, and Society
