Data-Centric Safety and Ethical Measures for Data and AI Governance

Srija Chakraborty

arXiv:2506.10217 · cs.CY · July 2, 2025

Open Access

TL;DR

This paper proposes a comprehensive, domain-agnostic framework for responsible dataset design aimed at enhancing safety, reducing risks, and promoting ethical practices in AI data management throughout the AI lifecycle.

Contribution

It introduces a novel, multi-stage dataset design framework focused on safety and ethics, addressing a gap in responsible AI data practices.

Findings

01 Framework promotes safer AI model development

02 Reduces risks associated with unsafe or unethical data

03 Applicable across various AI domains

Abstract

Datasets play a key role in imparting advanced capabilities to artificial intelligence (AI) foundation models that can be adapted to various downstream tasks. These downstream applications can introduce both beneficial and harmful capabilities, resulting in dual-use AI foundation models, with various technical and regulatory approaches to monitor and manage these risks. However, despite the crucial role of datasets, responsible dataset design and data-centric safety and ethical practices have received less attention. In this study, we propose a responsible dataset design framework that encompasses various stages in the AI and dataset lifecycle to enhance safety measures and reduce the risk of AI misuse due to low-quality, unsafe, and unethical data content. This framework is domain-agnostic, suitable for adoption in various applications, and can promote responsible practices in…

Taxonomy

Topics: Ethics and Social Impacts of AI · Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)