Is your AI Model Accurate Enough? The Difficult Choices Behind Rigorous AI Development and the EU AI Act

Lucas G. Uberti-Bona Marin; Bram Rijsbosch; Kristof Meding; Gerasimos Spanakis; Gijs van Dijck; Konrad Kollnig

arXiv:2604.03254 · cs.CY · April 29, 2026

TL;DR

This paper argues that evaluating AI accuracy is inherently normative and context-dependent, shaped by legal and ethical choices, and uses the EU AI Act as its primary case study.

Contribution

It provides a legal-technical analysis of how accuracy is defined and measured, highlighting four key choices affecting AI performance assessment.

Findings

01

Identifies four central choices in accuracy evaluation: selecting metrics, balancing multiple metrics, data representation, and thresholds.

02

Analyzes how these choices relate to the EU AI Act's requirements and documentation obligations.

03

Discusses implications for regulators, auditors, and developers in implementing AI safety standards.
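
To make the choices listed under Findings concrete, here is a minimal sketch of how the reported performance of one and the same classifier shifts with the metric and the decision threshold. The labels, scores, and function names are invented for illustration and do not come from the paper.

```python
# Toy data, invented for illustration only (not taken from the paper).
# 3 positive cases and 7 negative cases, each with a model score in [0, 1].
labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2, 0.2, 0.1, 0.1, 0.05]


def confusion(labels, scores, threshold):
    """Count TP, FP, FN, TN when predicting positive for score >= threshold."""
    tp = fp = fn = tn = 0
    for y, s in zip(labels, scores):
        predicted_positive = s >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif not predicted_positive and y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def metrics(labels, scores, threshold):
    """Return accuracy, precision, and recall at the given threshold."""
    tp, fp, fn, tn = confusion(labels, scores, threshold)
    return {
        "accuracy": (tp + tn) / len(labels),
        # Fall back to 0.0 when the denominator is zero (no positive predictions
        # or no positive cases); this convention is itself a choice.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


if __name__ == "__main__":
    # A threshold of 1.1 can never be reached, so everything is predicted negative.
    for threshold in (0.35, 0.5, 0.8, 1.1):
        m = metrics(labels, scores, threshold)
        print(
            f"threshold={threshold:<4} "
            f"accuracy={m['accuracy']:.2f} "
            f"precision={m['precision']:.2f} "
            f"recall={m['recall']:.2f}"
        )
```

On this toy data, thresholds 0.5 and 0.8 give the same accuracy (0.80) but trade precision against recall (0.67 and 0.67 versus 1.00 and 0.33), and a model that never predicts the positive class still reaches 0.70 accuracy because the data is imbalanced. Which of these operating points counts as "accurate enough" depends on which errors matter more in the deployment context.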

Abstract

Technical and legal debates frequently suggest that "accuracy" is an objective, measurable, and purely technical property. We challenge this view, showing that evaluating AI performance fundamentally depends on context-dependent normative decisions. These techno-normative choices are crucial for rigorous AI deployment, as they determine which errors are prioritised, how risks are distributed, and how trade-offs between competing objectives are resolved. This paper provides a legal-technical analysis of the choices that shape how accuracy is defined, measured, and assessed, using the 2024 European Union AI Act -- which mandates an "appropriate level of accuracy" for high-risk systems -- as a primary case study. We identify and analyse four choices central to any robust performance evaluation: (1) selecting metrics, (2) balancing multiple metrics, (3) measuring metrics against…
