Generative Pre-trained Transformer: A Comprehensive Review on Enabling Technologies, Potential Applications, Emerging Challenges, and Future Directions

Gokul Yenduri; Ramalingam M; Chemmalar Selvi G; Supriya Y; Gautam Srivastava; Praveen Kumar Reddy Maddikunta; Deepti Raj G; Rutvij H Jhaveri; Prabadevi B; Weizheng Wang; Athanasios V. Vasilakos; and Thippa Reddy Gadekallu

arXiv:2305.10435·cs.CL·May 23, 2023·61 cites

Open Access

TL;DR

This comprehensive review explores the architecture, applications, challenges, and future directions of Generative Pre-trained Transformers (GPT), highlighting their transformative impact on natural language processing and related fields.

Contribution

It provides an in-depth analysis of GPT's architecture, training, and enabling technologies, and discusses emerging challenges and future research directions.

Findings

01

GPT has revolutionized natural language processing tasks.

02

Significant challenges include model bias and computational requirements.

03

Future directions involve improving model efficiency and addressing ethical concerns.

Abstract

The Generative Pre-trained Transformer (GPT) represents a notable breakthrough in the domain of natural language processing, which is propelling us toward the development of machines that can understand and communicate using language in a manner that closely resembles that of humans. GPT is based on the transformer architecture, a deep neural network designed for natural language processing tasks. Due to their impressive performance on natural language processing tasks and their ability to converse effectively, GPT models have gained significant popularity among researchers and industrial communities, making them among the most widely used and effective models in natural language processing and related fields, which motivated us to conduct this review. This review provides a detailed overview of GPT, including its architecture, working process, training procedures, enabling technologies, and its…

Taxonomy

Topics: Advanced Neural Network Applications · Topic Modeling · COVID-19 diagnosis using AI

Methods: Attention Is All You Need · Cosine Annealing · Weight Decay · Linear Warmup With Cosine Annealing · Discriminative Fine-Tuning · Attention Dropout · GPT · Linear Layer · Position-Wise Feed-Forward Layer