Assessing AI Utility: The Random Guesser Test for Sequential Decision-Making Systems
Shun Ide, Allison Blunt, Djallel Bouneffouf

TL;DR
This paper introduces the "random guesser test", a simple check of an AI system's decision-making quality: the system must outperform a baseline that chooses at random. In a roulette-based sequential decision-making scenario, sophisticated AI approaches often underperform that baseline.
Contribution
The paper proposes a novel quantitative assessment approach for AI decision-making by comparing AI performance to a random guesser, highlighting its potential to identify biases and improve exploration.
Findings
Sophisticated AI systems can underperform random guessers in sequential decision tasks.
Recommender systems may overly favor low-risk options, reducing utility.
The random guesser test can serve as a benchmark for AI utility evaluation.
Abstract
We propose a general approach to quantitatively assessing the risk and vulnerability of artificial intelligence (AI) systems to biased decisions. The guiding principle of the proposed approach is that any AI algorithm must outperform a random guesser. This may appear trivial, but empirical results from a simplistic sequential decision-making scenario involving roulette games show that sophisticated AI-based approaches often underperform the random guesser by a significant margin. We highlight that modern recommender systems may exhibit a similar tendency to favor overly low-risk options. We argue that this "random guesser test" can serve as a useful tool for evaluating the utility of AI actions, and also points towards increasing exploration as a potential improvement to such systems.
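The comparison described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the three options in `OPTIONS`, their win probabilities and payouts, the `cautious_policy`, and the use of average reward as the utility measure. The toy setup deliberately gives riskier options a higher expected payoff, so that a policy which always picks the safest option loses to uniform random choice.

```python
import random

# Hypothetical toy options (not from the paper): (win probability, payout on win).
# Expected payoffs are 0.9, 1.5 and 2.0, so risk is rewarded in this toy setup.
OPTIONS = {
    "safe": (0.9, 1.0),
    "medium": (0.5, 3.0),
    "risky": (0.1, 20.0),
}


def play(option, rng):
    """Return the payout of one round for the chosen option."""
    win_probability, payout = OPTIONS[option]
    return payout if rng.random() < win_probability else 0.0


def cautious_policy(rng):
    """Stand-in for a system that always favors the lowest-risk option."""
    return "safe"


def random_guesser(rng):
    """Baseline: pick one of the options uniformly at random."""
    return rng.choice(list(OPTIONS))


def average_reward(policy, rounds, seed):
    """Average payout of a policy over a fixed number of simulated rounds."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(rounds):
        total += play(policy(rng), rng)
    return total / rounds


def random_guesser_test(policy, rounds=20000, seed=0):
    """Return (policy average, baseline average, whether the policy beats the baseline)."""
    policy_avg = average_reward(policy, rounds, seed)
    baseline_avg = average_reward(random_guesser, rounds, seed)
    return policy_avg, baseline_avg, policy_avg > baseline_avg


if __name__ == "__main__":
    policy_avg, baseline_avg, passed = random_guesser_test(cautious_policy)
    print(f"policy={policy_avg:.3f} baseline={baseline_avg:.3f} passed={passed}")
```

Under these assumed payoffs the cautious policy fails the test, which mirrors the abstract's remark about favoring overly low-risk options. A real evaluation would substitute the actual system, the actual environment, and whatever utility measure the application calls for.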
Taxonomy
Topics: Explainable Artificial Intelligence (XAI)
