How Random is Random? Evaluating the Randomness and Humaness of LLMs' Coin Flips

Katherine Van Koevering; Jon Kleinberg

arXiv:2406.00092·cs.AI·June 4, 2024

TL;DR

This paper investigates how large language models generate binary sequences, revealing that GPT 4 and Llama 3 show human-like biases while GPT 3.5 behaves more randomly, raising questions about the nature of randomness and humanness in AI.

Contribution

The study provides a comparative analysis of LLMs' ability to produce random sequences, highlighting differences in bias and randomness among GPT models and Llama 3.

Findings

01. GPT 4 and Llama 3 exhibit human biases in randomness tasks

02. GPT 3.5 demonstrates more random, less biased behavior

03. The dichotomy raises questions about the utility of human-like versus random outputs

Abstract

One uniquely human trait is our inability to be random. We see and produce patterns where there should not be any, and we do so in a predictable way. LLMs are supplied with human data and are prone to human biases. In this work, we explore how LLMs approach randomness, and where and how they fail, through the lens of a well-studied phenomenon: generating binary random sequences. We find that GPT 4 and Llama 3 exhibit and exacerbate nearly every human bias we test in this context, but GPT 3.5 exhibits more random behavior. We propose this dichotomy of randomness versus humanness as a fundamental question for LLMs, and suggest that either behavior may be useful in different circumstances.
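The abstract does not list the specific biases tested, so the following is only an illustrative sketch, not the authors' method. It computes three simple statistics often used when judging whether a sequence of coin flips looks random: the share of heads, the rate at which consecutive flips alternate, and the longest run of identical flips. The function name `sequence_stats` and the `'H'`/`'T'` encoding are assumptions made for this example.

```python
def sequence_stats(flips):
    """Summarize a string of 'H'/'T' flips (hypothetical helper, not from the paper)."""
    if not flips or set(flips) - {"H", "T"}:
        raise ValueError("expected a non-empty string of 'H' and 'T'")

    n = len(flips)
    heads_rate = flips.count("H") / n

    # Fraction of adjacent pairs that differ; a fair coin gives about 0.5.
    switches = sum(a != b for a, b in zip(flips, flips[1:]))
    alternation_rate = switches / (n - 1) if n > 1 else 0.0

    # Length of the longest streak of identical outcomes.
    longest = current = 1
    for a, b in zip(flips, flips[1:]):
        current = current + 1 if a == b else 1
        longest = max(longest, current)

    return {
        "heads_rate": heads_rate,
        "alternation_rate": alternation_rate,
        "longest_run": longest,
    }


if __name__ == "__main__":
    # A perfectly alternating sequence: balanced, but far from random-looking.
    print(sequence_stats("HTHTHTHT"))
```

For a fair coin, the heads rate and the alternation rate should both sit near 0.5 over a long sequence. An alternation rate well above 0.5, or unusually short longest runs, would be the kind of pattern typically associated with human-generated sequences.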

Taxonomy

Topics: Art History and Market Analysis