A Survey on Large Language Models from Concept to Implementation
Chen Wang, Jin Zhao, Jiaqi Gong

TL;DR
This survey comprehensively reviews the development, applications, and impact of large language models based on Transformer architectures, emphasizing their versatility across NLP and other domains.
Contribution
It provides an in-depth overview of recent research on Transformer-based LLMs, highlighting their applications, advancements, and potential future directions.
Findings
Transformers have revolutionized NLP applications.
LLMs like GPT enable diverse tasks beyond traditional NLP.
Transformer models are key to future AI innovations.
Abstract
Recent advancements in Large Language Models (LLMs), particularly those built on Transformer architectures, have significantly broadened the scope of natural language processing (NLP) applications, transcending their initial use in chatbot technology. This paper investigates the multifaceted applications of these models, with an emphasis on the GPT series. This exploration focuses on the transformative impact of artificial intelligence (AI) driven tools in revolutionizing traditional tasks like coding and problem-solving, while also paving new paths in research and development across diverse industries. From code interpretation and image captioning to facilitating the construction of interactive systems and advancing computational domains, Transformer models exemplify a synergy of deep learning, data analysis, and neural network design. This survey provides an in-depth look at the…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
Methods: Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Cosine Annealing · Linear Warmup With Cosine Annealing · Softmax · Dropout
