Making Intelligence: Ethical Values in IQ and ML Benchmarks
Borhane Blili-Hamelin, Leif Hancox-Li

TL;DR
This paper explores the ethical implications of designing ML benchmarks, highlighting their similarities with human intelligence tests and emphasizing the importance of documenting values in benchmark creation.
Contribution
It reveals the entanglement of ethics with technical decisions in ML benchmark design and advocates for transparency and ethical considerations in benchmark development.
Findings
ML benchmarks share structural similarities with IQ tests.
Values influence the design and interpretation of ML benchmarks.
The paper offers recommendations for ethical practices in ML benchmark research.
Abstract
In recent years, ML researchers have wrestled with defining and improving machine learning (ML) benchmarks and datasets. In parallel, some have trained a critical lens on the ethics of dataset creation and ML research. In this position paper, we highlight the entanglement of ethics with seemingly "technical" or "scientific" decisions about the design of ML benchmarks. Our starting point is the existence of multiple overlooked structural similarities between human intelligence benchmarks and ML benchmarks. Both types of benchmarks set standards for describing, evaluating, and comparing performance on tasks relevant to intelligence -- standards that many scholars of human intelligence have long recognized as value-laden. We use perspectives from feminist philosophy of science on IQ benchmarks and thick concepts in social science to argue that values need to be considered and…
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Ethics and Social Impacts of AI
