Dataset Safety in Autonomous Driving: Requirements, Risks, and Assurance

Alireza Abbaspour; Tejaskumar Balgonda Patil; B Ravi Kiran; Russel Mohr; Senthil Yogamani

arXiv:2511.08439·cs.AI·April 15, 2026

TL;DR

This paper introduces a structured framework for ensuring dataset safety in autonomous driving AI systems, emphasizing safety standards, hazard mitigation, and verification processes.

Contribution

The paper presents a dataset safety framework aligned with ISO/PAS 8800, covering safety analyses, dataset lifecycle management, and verification and validation strategies for autonomous driving datasets.

Findings

01

Framework aligns dataset safety with ISO/PAS 8800 guidelines.

02

Introduces the AI Data Flywheel and the dataset lifecycle for safety management.

03

Reviews recent research and trends in dataset safety for autonomous vehicles.

Abstract

Dataset integrity is fundamental to the safety and reliability of AI systems, especially in autonomous driving. This paper presents a structured framework for developing safe datasets aligned with ISO/PAS 8800 guidelines. Using AI-based perception systems as the primary use case, it introduces the AI Data Flywheel and the dataset lifecycle, covering data collection, annotation, curation, and maintenance. The framework incorporates rigorous safety analyses to identify hazards and mitigate risks caused by dataset insufficiencies. It also defines processes for establishing dataset safety requirements and proposes verification and validation strategies to ensure compliance with safety standards. In addition to outlining best practices, the paper reviews recent research and emerging trends in dataset safety and autonomous vehicle development, providing insights into current challenges and…
