A Comprehensive Overview of Large Language Models
Humza Naveed, Asad Ullah Khan, Shi Qiu, Muhammad Saqib, Saeed Anwar, Muhammad Usman, Naveed Akhtar, Nick Barnes, Ajmal Mian

TL;DR
This paper offers a comprehensive overview of recent developments in Large Language Models, covering architectural, training, and application advancements to help researchers grasp the field's current landscape.
Contribution
It provides a systematic survey and a quick reference of diverse LLM topics, consolidating recent research to facilitate understanding and future progress.
Findings
Summarizes key advancements in LLM architectures and training strategies.
Highlights recent progress in multi-modal LLMs and efficiency improvements.
Provides a structured overview to aid researchers and practitioners.
Abstract
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks and beyond. This success of LLMs has led to a large influx of research contributions in this direction. These works encompass diverse topics such as architectural innovations, better training strategies, context length improvements, fine-tuning, multi-modal LLMs, robotics, datasets, benchmarking, efficiency, and more. With the rapid development of techniques and regular breakthroughs in LLM research, it has become considerably challenging to perceive the bigger picture of the advances in this direction. Considering the rapidly emerging plethora of literature on LLMs, it is imperative that the research community is able to benefit from a concise yet comprehensive overview of the recent developments in this field. This article provides an overview of the existing literature…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
