Specification Overfitting in Artificial Intelligence
Benjamin Roth, Pedro Henrique Luz de Araujo, Yuxi Xia, Saskia Kaltenbrunner, Christoph Korab

TL;DR
This paper introduces the concept of specification overfitting in AI, highlighting how excessive focus on specific metrics can undermine overall system performance and high-level requirements.
Contribution
It provides a comprehensive literature survey categorizing how AI research addresses, measures, and optimizes specification metrics, revealing gaps in understanding and application.
Findings
Most papers implicitly address specification overfitting
Few papers explicitly define the scope and assumptions of their metrics
Research often reports multiple metrics without clearly defining the role each one plays
Abstract
Machine learning (ML) and artificial intelligence (AI) approaches are often criticized for their inherent bias and for their lack of control, accountability, and transparency. Consequently, regulatory bodies struggle with containing this technology's potential negative side effects. High-level requirements such as fairness and robustness need to be formalized into concrete specification metrics, imperfect proxies that capture isolated aspects of the underlying requirements. Given possible trade-offs between different metrics and their vulnerability to over-optimization, integrating specification metrics in system development processes is not trivial. This paper defines specification overfitting, a scenario where systems focus excessively on specified metrics to the detriment of high-level requirements and task performance. We present an extensive literature survey to categorize how…
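As a rough illustration of the definition above, the following Python sketch shows how selecting a system by a single specification metric can hurt task performance. Everything in it is an assumption made for illustration and none of it comes from the paper: the toy dataset, the choice of demographic parity gap as the specification metric, accuracy as a stand-in for task performance, the candidate thresholds, and the helper names `predict`, `accuracy`, `parity_gap`, `pick_by_spec_only`, and `pick_by_task`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    score: float  # hypothetical model score in [0, 1]
    label: int    # ground-truth label, used for task performance
    group: str    # protected attribute, used by the specification metric


# Hypothetical toy data: group "a" has more positives than group "b".
DATA = [
    Example(0.9, 1, "a"), Example(0.8, 1, "a"),
    Example(0.7, 1, "a"), Example(0.3, 0, "a"),
    Example(0.6, 1, "b"), Example(0.4, 0, "b"),
    Example(0.2, 0, "b"), Example(0.1, 0, "b"),
]

# Candidate decision thresholds; 1.01 rejects every example.
THRESHOLDS = [0.0, 0.25, 0.5, 0.75, 1.01]


def predict(data, threshold):
    """Predict 1 for every example whose score reaches the threshold."""
    return [int(ex.score >= threshold) for ex in data]


def accuracy(data, preds):
    """Fraction of predictions that match the ground-truth labels."""
    return sum(p == ex.label for ex, p in zip(data, preds)) / len(data)


def parity_gap(data, preds):
    """Absolute difference in positive-prediction rate between groups a and b."""
    rates = {}
    for group in ("a", "b"):
        group_preds = [p for ex, p in zip(data, preds) if ex.group == group]
        rates[group] = sum(group_preds) / len(group_preds)
    return abs(rates["a"] - rates["b"])


def pick_by_spec_only(data):
    """Select the threshold that minimizes the specification metric alone."""
    return min(THRESHOLDS, key=lambda t: parity_gap(data, predict(data, t)))


def pick_by_task(data):
    """Select the threshold that maximizes task accuracy alone."""
    return max(THRESHOLDS, key=lambda t: accuracy(data, predict(data, t)))


if __name__ == "__main__":
    for name, pick in (("spec only", pick_by_spec_only), ("task only", pick_by_task)):
        t = pick(DATA)
        preds = predict(DATA, t)
        print(name, "threshold:", t,
              "accuracy:", accuracy(DATA, preds),
              "parity gap:", parity_gap(DATA, preds))
```

On this toy data, minimizing the parity gap alone selects the threshold 0.0, which accepts every example: the gap is 0.0 but accuracy falls to 0.5. Maximizing accuracy alone selects the threshold 0.5, which reaches accuracy 1.0 with a parity gap of 0.5. The degenerate "accept everyone" solution satisfies the metric perfectly while arguably serving neither the task nor the fairness requirement the metric was meant to approximate.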
Taxonomy
Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Formal Methods in Verification
