Large Language Models: A Survey

Shervin Minaee; Tomas Mikolov; Narjes Nikzad; Meysam Chenaghlu; Richard Socher; Xavier Amatriain; Jianfeng Gao

arXiv:2402.06196 · cs.CL · March 25, 2025 · 213 citations

TL;DR

This survey reviews the development, characteristics, datasets, evaluation metrics, and benchmarks of large language models, highlighting recent advances, limitations, and future research directions in the rapidly evolving field.

Contribution

It provides a comprehensive overview of prominent LLMs, their techniques, datasets, evaluation methods, and performance comparisons, offering insights into current challenges and future directions.

Findings

01. Comparison of LLM performance on benchmarks

02. Analysis of datasets used for training and evaluation

03. Discussion of open challenges and future research areas

Abstract

Large Language Models (LLMs) have drawn a lot of attention since the release of ChatGPT in November 2022, owing to their strong performance on a wide range of natural language tasks. LLMs acquire their general-purpose language understanding and generation abilities by training billions of model parameters on massive amounts of text data, as predicted by scaling laws \cite{kaplan2020scaling,hoffmann2022training}. The research area of LLMs, while very recent, is evolving rapidly in many different ways. In this paper, we review some of the most prominent LLMs, including three popular LLM families (GPT, LLaMA, PaLM), and discuss their characteristics, contributions, and limitations. We also give an overview of techniques developed to build and augment LLMs. We then survey popular datasets prepared for LLM training, fine-tuning, and evaluation, review widely used LLM evaluation metrics, and…

Taxonomy

Topics: Topic Modeling

Methods: Sparse Evolutionary Training