AI data transparency: an exploration through the lens of AI incidents
Sophia Worth, Ben Snaith, Arunav Das, Gefion Thuermer, Elena Simperl

TL;DR
This paper investigates the current state of data transparency in AI systems, revealing persistent gaps and barriers that hinder responsible deployment and make it harder to address public concerns. It emphasizes the need for systematic monitoring approaches.
Contribution
It provides an empirical analysis of data transparency practices in AI, highlighting the gaps and proposing the development of systematic monitoring methods.
Findings
Low data transparency is widespread across AI systems.
Transparency issues hinder investigation and accountability.
Systematic monitoring of data transparency is needed.
Abstract
Knowing more about the data used to build AI systems is critical for allowing different stakeholders to play their part in ensuring responsible and appropriate deployment and use. Meanwhile, a 2023 report shows that data transparency lags significantly behind other areas of AI transparency in popular foundation models. In this research, we sought to build on these findings, exploring the status of public documentation about data practices within AI systems generating public concern. Our findings demonstrate that low data transparency persists across a wide range of systems, and further that issues of transparency and explainability at model and system level create barriers for investigating data transparency information to address public concerns about AI systems. We highlight a need to develop systematic ways of monitoring AI data transparency that account for the diversity of AI…
Taxonomy
Topics: Ethics and Social Impacts of AI · Anomaly Detection Techniques and Applications
