Same Performance, Hidden Bias: Evaluating Hypothesis- and Recommendation-Driven AI
Michaela Benk, Tim Miller

TL;DR
This study reveals that recommendation-driven AI systems can subtly bias user decision processes without affecting overall performance, emphasizing the importance of examining underlying decision strategies rather than just outcomes.
Contribution
The paper uses Signal Detection Theory as a framework for analyzing decision-process biases in AI-assisted decision making and demonstrates systemic shifts in users' judgment thresholds across expertise levels.
Findings
Recommendation-driven designs lower evidence thresholds.
Experts are just as susceptible to these shifts as novices.
Performance metrics alone can mask underlying decision biases.
Abstract
The HCI community commonly evaluates decision support systems based on whether they improve task performance or promote appropriate user reliance. In this work, we look beyond decision outcomes to examine the process through which users develop decision-making strategies. Through a web-based experiment (N = 290) comparing recommendation-driven and hypothesis-driven interaction designs, and using Signal Detection Theory as a theoretical framework, we show that even when performance remains identical, recommendation-driven designs lower participants' thresholds for sufficient evidence and introduce a "hidden bias" in their judgments, resulting in a shifted distribution of errors. Furthermore, we find that experts are just as susceptible to these systemic shifts as novices. We conclude by advocating for a shift in focus: prioritizing decision processes and the preservation of stable…
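The distinction the abstract draws between performance and evidence threshold can be illustrated with the standard equal-variance Gaussian model from Signal Detection Theory, where sensitivity is d' = z(H) - z(FA) and the criterion is c = -(z(H) + z(FA)) / 2, with H the hit rate and FA the false-alarm rate. The sketch below is only an illustration under stated assumptions: it treats "performance" as d', the function names `sdt_measures` and `rates_from` are hypothetical, and the numbers are made up for the example rather than taken from the study.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def sdt_measures(hit_rate, false_alarm_rate):
    """Return (d_prime, criterion) under the equal-variance Gaussian SDT model."""
    if not (0.0 < hit_rate < 1.0 and 0.0 < false_alarm_rate < 1.0):
        raise ValueError("rates must lie strictly between 0 and 1")
    z_hit = _STD_NORMAL.inv_cdf(hit_rate)
    z_fa = _STD_NORMAL.inv_cdf(false_alarm_rate)
    d_prime = z_hit - z_fa             # sensitivity
    criterion = -0.5 * (z_hit + z_fa)  # negative values mean a more liberal threshold
    return d_prime, criterion


def rates_from(d_prime, criterion):
    """Inverse mapping: the hit and false-alarm rates implied by d' and c."""
    hit_rate = _STD_NORMAL.cdf(d_prime / 2 - criterion)
    false_alarm_rate = _STD_NORMAL.cdf(-d_prime / 2 - criterion)
    return hit_rate, false_alarm_rate


# Illustrative values only (not results from the paper): two hypothetical
# conditions with the same sensitivity but different criteria.
for label, c in [("neutral threshold", 0.0), ("lowered threshold", -0.4)]:
    hit, fa = rates_from(1.5, c)
    d, crit = sdt_measures(hit, fa)
    print(f"{label}: d'={d:.2f}, c={crit:.2f}, "
          f"miss rate={1 - hit:.3f}, false-alarm rate={fa:.3f}")
```

In this toy setting both conditions have the same d', yet the lowered threshold trades misses for false alarms, which is one way a shifted distribution of errors can sit behind identical sensitivity.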
Taxonomy
Topics: Innovative Human-Technology Interaction · Ethics and Social Impacts of AI · Explainable Artificial Intelligence (XAI)
