From Reflection to Repair: A Scoping Review of Dataset Documentation Tools
Pedro Reynolds-Cuéllar (Robotics, AI Institute), Marisol Wong-Villacres (Escuela Superior Politécnica del Litoral), Adriana Alvarado Garcia (IBM Research), Heila Precel (Robotics, AI Institute)

TL;DR
This paper systematically reviews dataset documentation tools, revealing persistent conceptualization issues that hinder adoption, and advocates for institutional solutions to improve responsible AI practices.
Contribution
It provides a comprehensive analysis of motivations, conceptualizations, and barriers in dataset documentation tool design, proposing a shift towards institutional approaches.
Findings
Four patterns potentially impede documentation adoption and standardization: unclear operationalization of documentation's value, decontextualized design, unaddressed labor demands, and future integration.
The review highlights the need for institutional solutions over individual efforts.
It offers actionable recommendations for the HCI community to support sustainable documentation.
Abstract
Dataset documentation is widely recognized as essential for the responsible development of automated systems. Despite growing efforts to support documentation through different kinds of artifacts, little is known about the motivations shaping documentation tool design or the factors hindering their adoption. We present a systematic review supported by mixed-methods analysis of 59 dataset documentation publications to examine the motivations behind building documentation tools, how authors conceptualize documentation practices, and how these tools connect to existing systems, regulations, and cultural norms. Our analysis shows four persistent patterns in dataset documentation conceptualization that potentially impede adoption and standardization: unclear operationalizations of documentation's value, decontextualized designs, unaddressed labor demands, and a tendency to treat integration…
Taxonomy
Topics: Ethics and Social Impacts of AI · Software Engineering Research · Data Visualization and Analytics
