Is General-Purpose AI Reasoning Sensitive to Data-Induced Cognitive Biases? Dynamic Benchmarking on Typical Software Engineering Dilemmas
Francesco Sovrano, Gabriele Dominici, Rita Sevastjanova, Alessandra Stramiglio, Alberto Bacchelli

TL;DR
This paper introduces a dynamic benchmarking framework to evaluate whether general-purpose AI systems exhibit data-induced cognitive biases in software engineering tasks, revealing a tendency to rely on superficial cues over complex reasoning.
Contribution
The study presents the first dynamic benchmark for assessing cognitive biases in GPAI within software engineering, including an augmentation pipeline for realistic, bias-preserving task variants.
Findings
All evaluated GPAI systems show bias sensitivity (6-35%).
Bias sensitivity increases with task complexity (up to 49%).
Systems tend to rely on shallow heuristics rather than deep reasoning.
Abstract
Human cognitive biases in software engineering can lead to costly errors. While general-purpose AI (GPAI) systems may help mitigate these biases due to their non-human nature, their training on human-generated data raises a critical question: Do GPAI systems themselves exhibit cognitive biases? To investigate this, we present the first dynamic benchmarking framework to evaluate data-induced cognitive biases in GPAI within software engineering workflows. Starting with a seed set of 16 hand-crafted realistic tasks, each featuring one of 8 cognitive biases (e.g., anchoring, framing) and corresponding unbiased variants, we test whether bias-inducing linguistic cues unrelated to task logic can lead GPAI systems from correct to incorrect conclusions. To scale the benchmark and ensure realism, we develop an on-demand augmentation pipeline relying on GPAI systems to generate task variants that…
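The paired design described in the abstract (a biased task and its unbiased counterpart) lends itself to a simple flip-rate calculation. The following is a minimal sketch of one plausible way to score it, not the paper's actual metric, whose exact definition is not given in this summary. The `PairedOutcome` structure, the `bias_sensitivity` function, the task identifiers, and the sample outcomes are all hypothetical and serve only as illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PairedOutcome:
    """Result of one task pair (hypothetical structure, not from the paper)."""
    task_id: str
    unbiased_correct: bool  # answer to the variant without the bias cue
    biased_correct: bool    # answer to the variant with the bias cue


def bias_sensitivity(outcomes: list[PairedOutcome]) -> float:
    """Share of pairs solved without the cue but failed with it.

    Only pairs answered correctly in the unbiased variant are counted,
    so a failure can be attributed to the cue rather than to the task.
    Returns 0.0 when no pair qualifies.
    """
    eligible = [o for o in outcomes if o.unbiased_correct]
    if not eligible:
        return 0.0
    flipped = sum(1 for o in eligible if not o.biased_correct)
    return flipped / len(eligible)


# Made-up outcomes for illustration only; these are not results from the paper.
outcomes = [
    PairedOutcome("anchoring-1", True, False),   # cue flipped a correct answer
    PairedOutcome("anchoring-2", True, True),    # robust to the cue
    PairedOutcome("framing-1", True, True),      # robust to the cue
    PairedOutcome("framing-2", False, False),    # excluded: wrong even without the cue
]
rate = bias_sensitivity(outcomes)
```

Restricting the denominator to pairs answered correctly without the cue mirrors the abstract's question of whether cues lead systems "from correct to incorrect conclusions"; other denominators are possible and would yield different rates.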
Taxonomy
Topics: Software Engineering Research · Software Engineering Techniques and Practices · Data Visualization and Analytics
