Data Representativeness in Accessibility Datasets: A Meta-Analysis
Rie Kamikubo, Lining Wang, Crystal Marte, Amnah Mahmood, Hernisa Kacorri

TL;DR
This paper analyzes the demographic representation in accessibility datasets sourced from people with disabilities, highlighting gaps in gender and race representation and discussing challenges in demographic classification to promote more inclusive AI systems.
Contribution
It provides a comprehensive review of 190 accessibility datasets, identifying representation gaps and discussing the complexities of demographic labeling for marginalized communities.
Findings
Accessibility datasets represent diverse ages.
Gender and race representation gaps exist.
Demographic classification is complex and inconsistent.
Abstract
As data-driven systems are increasingly deployed at scale, ethical concerns have arisen around unfair and discriminatory outcomes for historically marginalized groups that are underrepresented in training data. In response, work around AI fairness and inclusion has called for datasets that are representative of various demographic groups. In this paper, we contribute an analysis of the representativeness of age, gender, and race & ethnicity in accessibility datasets - datasets sourced from people with disabilities and older adults - that can potentially play an important role in mitigating bias for inclusive AI-infused applications. We examine the current state of representation within datasets sourced by people with disabilities by reviewing publicly available information on 190 datasets, which we call accessibility datasets. We find that accessibility datasets represent diverse ages,…
