SportQA: A Benchmark for Sports Understanding in Large Language Models
Haotian Xia, Zhengbang Yang, Yuqing Wang, Rhys Tracy, Yun Zhao, Dongdong Huang, Zezhi Chen, Yan Zhu, Yuan-fang Wang, Weining Shen

TL;DR
SportQA is a new benchmark with over 70,000 questions designed to evaluate and improve large language models' understanding of sports, from basic facts to complex reasoning, highlighting current limitations in LLMs' sports comprehension.
Contribution
We introduce SportQA, the first comprehensive sports knowledge benchmark for LLMs, enabling detailed assessment of their reasoning and understanding in sports contexts.
Findings
LLMs perform well on basic sports facts
LLMs struggle with complex, scenario-based sports reasoning
SportQA reveals gaps in current LLM capabilities
Abstract
A deep understanding of sports, a field rich in strategic and dynamic content, is crucial for advancing Natural Language Processing (NLP). This holds particular significance in the context of evaluating and advancing Large Language Models (LLMs), given the existing gap in specialized benchmarks. To bridge this gap, we introduce SportQA, a novel benchmark specifically designed for evaluating LLMs in the context of sports understanding. SportQA encompasses over 70,000 multiple-choice questions across three distinct difficulty levels, each targeting different aspects of sports knowledge from basic historical facts to intricate, scenario-based reasoning tasks. We conducted a thorough evaluation of prevalent LLMs, mainly utilizing few-shot learning paradigms supplemented by chain-of-thought (CoT) prompting. Our results reveal that while LLMs exhibit competent performance in basic sports…
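The evaluation setup the abstract describes (multiple-choice questions answered under few-shot, chain-of-thought prompting) can be illustrated with a minimal sketch. It is not the authors' code: the `MCQuestion` class, the prompt layout, the "Answer: X" convention, and the single-letter A-D answer format are all assumptions made for illustration, and the model call itself is left out.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class MCQuestion:
    """Hypothetical container for one multiple-choice item (not the SportQA schema)."""
    question: str
    options: Dict[str, str]   # option letter -> option text
    answer: str               # gold option letter
    rationale: str = ""       # worked reasoning, used only for few-shot exemplars


def format_question(q: MCQuestion) -> str:
    """Render the question stem followed by its lettered options."""
    opts = "\n".join(f"{k}. {v}" for k, v in sorted(q.options.items()))
    return f"Question: {q.question}\n{opts}"


def build_prompt(exemplars: List[MCQuestion], target: MCQuestion) -> str:
    """Few-shot CoT prompt: solved exemplars with reasoning, then the open target."""
    parts = [
        f"{format_question(ex)}\nReasoning: {ex.rationale}\nAnswer: {ex.answer}"
        for ex in exemplars
    ]
    parts.append(f"{format_question(target)}\nReasoning:")
    return "\n\n".join(parts)


def extract_answer(completion: str) -> Optional[str]:
    """Return the last 'Answer: X' letter in a completion (assumes options A-D)."""
    matches = re.findall(r"Answer:\s*\(?([A-D])\)?", completion)
    return matches[-1] if matches else None


def accuracy(questions: List[MCQuestion], completions: List[str]) -> float:
    """Fraction of completions whose extracted letter matches the gold answer."""
    if not questions:
        return 0.0
    correct = sum(extract_answer(c) == q.answer for q, c in zip(questions, completions))
    return correct / len(questions)
```

In use, each prompt from `build_prompt` would be sent to the model under test, and the returned completions passed to `accuracy`. Taking the last "Answer:" match is a design choice: a chain-of-thought completion may mention candidate letters while reasoning before committing to a final one.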
Taxonomy
Topics: Natural Language Processing Techniques · Video Analysis and Summarization
