Heuristics and Biases in AI Decision-Making: Implications for Responsible AGI
Payam Saeedi, Mahsa Goodarzi, M Abdullah Canbaz

TL;DR
This study evaluates cognitive biases in three large language models, revealing varying levels of bias and inconsistency, and highlights the need for improved reasoning and bias mitigation in developing responsible AGI.
Contribution
It provides a comprehensive experimental analysis of biases in LLMs, highlighting their strengths and weaknesses, and underscores the importance of addressing biases for responsible AGI development.
Findings
GPT-4o performed best overall
Gemma 2 effectively addressed certain biases
Llama 3.1 showed frequent inconsistencies
Abstract
We investigate the presence of cognitive biases in three large language models (LLMs): GPT-4o, Gemma 2, and Llama 3.1. The study uses 1,500 experiments across nine established cognitive biases to evaluate the models' responses and consistency. GPT-4o demonstrated the strongest overall performance. Gemma 2 showed strengths in addressing the sunk cost fallacy and prospect theory, but its performance varied across different biases. Llama 3.1 consistently underperformed, relying on heuristics and exhibiting frequent inconsistencies and contradictions. The findings highlight the challenges of achieving robust and generalizable reasoning in LLMs and underscore the need for further development to mitigate biases in artificial general intelligence (AGI). The study emphasizes the importance of integrating statistical reasoning and ethical considerations in future AI development.
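The abstract does not say how consistency was scored, so the following is only a minimal sketch of one plausible measure: the share of repeated runs of a prompt that agree with the most common answer. The function name `consistency`, the bias labels, and the answer data are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def consistency(responses):
    """Fraction of responses that match the most common answer (assumed metric)."""
    if not responses:
        raise ValueError("need at least one response")
    counts = Counter(responses)
    # most_common(1) returns [(answer, count)] for the modal answer
    return counts.most_common(1)[0][1] / len(responses)


# Hypothetical answers from repeated runs of the same prompt, keyed by bias
trials = {
    "sunk_cost": ["B", "B", "B", "B"],
    "framing": ["A", "B", "A", "A"],
}
scores = {bias: consistency(answers) for bias, answers in trials.items()}
```

A score of 1.0 means the model gave the same answer on every run. Lower values indicate the kind of run-to-run inconsistency the abstract attributes to Llama 3.1, though the paper may define it differently.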
Taxonomy
Topics: Numerical Methods and Algorithms · Probability and Statistical Research · Advanced Bandit Algorithms Research
Methods: LLaMA
