Medal Matters: Probing LLMs' Failure Cases Through Olympic Rankings

Juhwan Choi; Seunguk Yu; JungMin Yun; YoungBin Kim

arXiv:2409.06518 · cs.CL · January 23, 2026

TL;DR

This paper investigates how large language models understand and organize historical Olympic medal data. The models are strong at recalling medal counts but weak at ranking teams, which points to limitations in how their internal knowledge is organized.

Contribution

It introduces a novel evaluation framework using Olympic data to analyze LLMs' internal knowledge structures and highlights their difficulty in ranking tasks compared to fact retrieval.

Findings

01. LLMs excel at recalling medal counts
02. LLMs struggle with ranking teams
03. This gap reveals limitations in LLMs' internal knowledge organization

Abstract

Large language models (LLMs) have achieved remarkable success in natural language processing tasks, yet their internal knowledge structures remain poorly understood. This study examines these structures through the lens of historical Olympic medal tallies, evaluating LLMs on two tasks: (1) retrieving medal counts for specific teams and (2) identifying rankings of each team. While state-of-the-art LLMs excel in recalling medal counts, they struggle with providing rankings, highlighting a key difference between their knowledge organization and human reasoning. These findings shed light on the limitations of LLMs' internal knowledge integration and suggest directions for improvement. To facilitate further research, we release our code, dataset, and model outputs.
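
The contrast between the two tasks can be made concrete with a minimal sketch. It assumes a toy tally (the team names and counts below are invented for illustration and are not taken from the paper's dataset) and a gold-first ordering, which is an assumption here because the abstract does not state the ranking convention. The helper names `medal_count` and `ranking` are hypothetical. Task 1 is a single lookup, whereas task 2 requires comparing a team against every other team in the same Games.

```python
from typing import Dict, Tuple

Medals = Tuple[int, int, int]  # (gold, silver, bronze)

# Hypothetical toy tally for one Games; NOT real Olympic data.
TALLY: Dict[str, Medals] = {
    "Team A": (10, 5, 2),
    "Team B": (10, 7, 1),
    "Team C": (3, 12, 9),
}


def medal_count(tally: Dict[str, Medals], team: str) -> Medals:
    """Task 1: retrieve one team's medal counts (a direct lookup)."""
    return tally[team]


def ranking(tally: Dict[str, Medals]) -> Dict[str, int]:
    """Task 2: derive every team's rank from the full tally.

    Assumes gold-first ordering: compare gold, then silver, then bronze.
    Exact ties are not given a shared rank in this sketch.
    """
    # Tuples compare element by element, so sorting on the (g, s, b)
    # tuple in descending order implements the gold-first convention.
    ordered = sorted(tally, key=lambda team: tally[team], reverse=True)
    return {team: position + 1 for position, team in enumerate(ordered)}


if __name__ == "__main__":
    print(medal_count(TALLY, "Team A"))
    print(ranking(TALLY))
```

Under these assumptions the ranking is fully determined by the counts, so a system that recalls every count correctly has, in principle, all the information needed for the ranking task.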

Code & Models

Repositories

c-juhwan/olympics_analysis
Official

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling