An Investigation of Experiences Engaging the Margins in Data-Centric Innovation
Gabriella Thompson, Ebtesam Al Haque, Paulette Blanc, Meme Styles, Denae Ford, Angela D.R. Smith, and Brittany Johnson

TL;DR
This paper explores barriers to equitable data-centric innovation, highlighting how systemic inequities influence dataset representation and affect diverse demographics in technological research and development.
Contribution
It provides initial insights from a survey on how age and identity impact the pursuit of representative datasets in data-centric fields.
Findings
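Age appears to play a significant role in how respondents seek and select representative datasets.
Identity likewise appears to play a significant role in seeking and selecting representative datasets.
Further research is needed on these aspects of data-centric research and development.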
Abstract
Data-centric technologies provide exciting opportunities, but recent research has shown how lack of representation in datasets, often as a result of systemic inequities and socioeconomic disparities, can produce inequitable outcomes that can exclude or harm certain demographics. In this paper, we discuss preliminary insights from an ongoing effort aimed at better understanding barriers to equitable data-centric innovation. We report findings from a survey of 261 technologists and researchers who use data in their work regarding their experiences seeking adequate, representative datasets. Our findings suggest that age and identity play a significant role in the seeking and selection of representative datasets, warranting further investigation into these aspects of data-centric research and development.
Taxonomy
Topics: Big Data and Business Intelligence
