Advancements and limitations of LLMs in replicating human color-word associations
Makoto Fukushima, Shusuke Eshita, Hiroshige Fukuhara

TL;DR
This study evaluates the progress and limitations of large language models in replicating human color-word associations, revealing improvements over generations but persistent gaps compared to human performance, especially in complex categories.
Contribution
It provides a comprehensive comparison of LLMs' ability to mimic human color-word associations across generations and categories, highlighting both advancements and systematic differences.
Findings
GPT-4o achieves the highest accuracy among the LLMs tested, but its median accuracy is only about 50% (chance level: 10%).
Performance varies significantly across word categories, with better results in Rhythm and Landscape.
Color discrimination patterns in LLMs correlate with human data, yet semantic associations differ systematically.
Abstract
Color-word associations play a fundamental role in human cognition and design applications. Large Language Models (LLMs) have become widely available and have demonstrated intelligent behaviors in various benchmarks with natural conversation skills. However, their ability to replicate human color-word associations remains understudied. We compared multiple generations of LLMs (from GPT-3 to GPT-4o) against human color-word associations using data collected from over 10,000 Japanese participants, involving 17 colors and 80 words (10 words from each of eight categories) in Japanese. Our findings reveal a clear progression in LLM performance across generations, with GPT-4o achieving the highest accuracy in predicting the best voted word for each color and category. However, the highest median performance was approximately 50% even for GPT-4o with visual inputs (chance level of 10%). Moreover, we…
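The accuracy measure described in the abstract (whether the model picks the top-voted word for each color within a category) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `top_word_accuracy`, the data layout, and the toy words and vote counts are all hypothetical, and taking the median across categories is an assumption about how the reported median was aggregated. The 10% chance level follows from choosing one of 10 candidate words per category.

```python
from statistics import median

def top_word_accuracy(human_votes, model_choices):
    """Per-category fraction of colors where the model's word equals
    the human top-voted word.

    human_votes:   {category: {color: {word: vote_count}}}  (hypothetical layout)
    model_choices: {category: {color: word}}                (hypothetical layout)
    """
    accuracy = {}
    for category, by_color in human_votes.items():
        hits = 0
        for color, votes in by_color.items():
            # Word with the most human votes; ties resolve to the first word seen.
            best_word = max(votes, key=votes.get)
            if model_choices[category][color] == best_word:
                hits += 1
        accuracy[category] = hits / len(by_color)
    return accuracy

# Toy data, invented for illustration only (not from the study).
human_votes = {
    "Landscape": {
        "red": {"sunset": 7, "forest": 1},
        "green": {"sunset": 1, "forest": 9},
    },
    "Rhythm": {
        "red": {"fast": 6, "slow": 2},
        "green": {"fast": 1, "slow": 5},
    },
}
model_choices = {
    "Landscape": {"red": "sunset", "green": "forest"},
    "Rhythm": {"red": "fast", "green": "fast"},
}

per_category = top_word_accuracy(human_votes, model_choices)

# Assumption: the summary statistic is the median over categories.
median_accuracy = median(per_category.values())

# Chance level when guessing uniformly among 10 words per category.
chance_level = 1 / 10
```

In the study's setting each category would hold 17 colors and 10 candidate words, so a per-category score near 0.1 would be indistinguishable from random guessing.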
Taxonomy
Topics: Categorization, perception, and language
Methods: Attention Is All You Need · Linear Layer · Cosine Annealing · Layer Normalization · Adam · Attention Dropout · Multi-Head Attention · Residual Connection
