Large Language Models Outperform Humans in Fraud Detection and Resistance to Motivated Investor Pressure
Nattavudh Powdthavee

TL;DR
This study finds that large language models give more consistent fraud warnings than humans and resist motivated investor pressure better, across a range of investment scenarios.
Contribution
The paper demonstrates that current large language models outperform lay human advisors at warning about fraud and at resisting motivated investor pressure, with fewer endorsement reversals.
Findings
LLMs rarely reverse fraud warnings under motivated framing.
Humans endorse fraudulent investments at higher baseline rates (13-14%, versus 0% across all LLMs).
AI fraud warnings are more consistent than those of lay human advisors.
Abstract
Large language models trained on human feedback may suppress fraud warnings when investors arrive already persuaded of a fraudulent opportunity. We tested this in a preregistered experiment across seven leading LLMs and twelve investment scenarios covering legitimate, high-risk, and objectively fraudulent opportunities, combining 3,360 AI advisory conversations with a 1,201-participant human benchmark. Contrary to predictions, motivated investor framing did not suppress AI fraud warnings; if anything, it marginally increased them. Endorsement reversal occurred in fewer than 3 in 1,000 observations. Human advisors endorsed fraudulent investments at baseline rates of 13-14%, versus 0% across all LLMs, and suppressed warnings under pressure at two to four times the AI rate. AI systems currently provide more consistent fraud warnings than lay humans in an identical advisory role.
