LLLMs: A Data-Driven Survey of Evolving Research on Limitations of Large Language Models
Aida Kostikova, Zhipin Wang, Deidamea Bajri, Ole Pütz, Benjamin Paaßen, Steffen Eger

TL;DR
This paper presents a data-driven survey of research on limitations of large language models from 2022 to early 2025, identifying 14,648 relevant papers in a corpus of 250,000 ACL and arXiv papers and analyzing their key trends and topics.
Contribution
It introduces a novel semi-automated methodology for large-scale literature review and provides a dataset of annotated abstracts on LLM limitations.
Findings
Research on limitations of LLMs (LLLMs) has grown rapidly, reaching over 30% of LLM papers by 2025.
Reasoning, generalization, hallucination, bias, and security are the most studied limitations.
The distribution of topics in ACL papers remains stable, while arXiv papers shift towards security, alignment, and multimodality.
Abstract
Large language model (LLM) research has grown rapidly, along with increasing concern about their limitations. In this survey, we conduct a data-driven, semi-automated review of research on limitations of LLMs (LLLMs) from 2022 to early 2025 using a bottom-up approach. From a corpus of 250,000 ACL and arXiv papers, we identify 14,648 relevant papers using keyword filtering, LLM-based classification, validated against expert labels, and topic clustering (via two approaches, HDBSCAN+BERTopic and LlooM). We find that the share of LLM-related papers increases over fivefold in ACL and nearly eightfold in arXiv between 2022 and 2025. Since 2022, LLLMs research grows even faster, reaching over 30% of LLM papers by 2025. Reasoning remains the most studied limitation, followed by generalization, hallucination, bias, and security. The distribution of topics in the ACL dataset stays relatively…
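The first stage of the pipeline described in the abstract (keyword filtering) and the per-year share it reports can be illustrated with a minimal sketch. This is not the authors' implementation: the keyword list, the function names `mentions_limitation` and `share_by_year`, and the toy records are all hypothetical, and the LLM-based classification and topic-clustering stages are omitted.

```python
from collections import Counter

# Hypothetical keyword list; the survey's actual filter terms are not given here.
LIMITATION_KEYWORDS = ("limitation", "hallucination", "bias", "fail")


def mentions_limitation(abstract: str) -> bool:
    """Return True if the abstract contains any of the (assumed) keywords."""
    text = abstract.lower()
    return any(keyword in text for keyword in LIMITATION_KEYWORDS)


def share_by_year(papers):
    """Map each year to the fraction of its papers flagged by the keyword filter.

    `papers` is an iterable of (year, abstract) pairs.
    """
    totals = Counter()
    flagged = Counter()
    for year, abstract in papers:
        totals[year] += 1
        if mentions_limitation(abstract):
            flagged[year] += 1
    return {year: flagged[year] / totals[year] for year in sorted(totals)}


# Toy records invented purely for illustration; they are not from the survey's corpus.
toy_papers = [
    (2022, "We scale a language model to more parameters."),
    (2022, "We study hallucination in summarization models."),
    (2023, "LLMs fail at multi-step arithmetic reasoning."),
    (2023, "A benchmark exposing social bias in LLM outputs."),
]

shares = share_by_year(toy_papers)
print(shares)
```

A plain substring match like this over-selects (for example, "bias" also matches "unbiased"), which is one reason a pipeline of this kind would follow it with a classification step checked against expert labels, as the abstract describes.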
