AMMUS : A Survey of Transformer-based Pretrained Models in Natural Language Processing

Katikapalli Subramanyam Kalyan; Ajit Rajasekharan; Sivanesan Sangeetha

arXiv:2108.05542·cs.CL·August 31, 2021·38 cites

TL;DR

This survey comprehensively reviews transformer-based pretrained language models in NLP, covering their evolution, core concepts, taxonomy, benchmarks, libraries, and future research directions.

Contribution

It introduces a new taxonomy of T-PTLMs and provides an extensive overview of concepts, benchmarks, and tools, serving as a valuable reference for researchers.

Findings

01. Summarizes various core concepts and pretraining methods.
02. Provides a taxonomy of T-PTLMs.
03. Highlights future research directions.

Abstract

Transformer-based pretrained language models (T-PTLMs) have achieved great success in almost every NLP task. The evolution of these models started with GPT and BERT. These models are built on top of transformers, self-supervised learning, and transfer learning. T-PTLMs learn universal language representations from large volumes of text data using self-supervised learning and transfer this knowledge to downstream tasks. These models provide good background knowledge to downstream tasks, which avoids training downstream models from scratch. In this comprehensive survey paper, we first give a brief overview of self-supervised learning. Next, we explain various core concepts such as pretraining, pretraining methods, pretraining tasks, embeddings, and downstream adaptation methods. We then present a new taxonomy of T-PTLMs and give a brief overview of various…

Code & Models

Repositories

tunib-ai/parallelformers
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Dropout · Softmax · Discriminative Fine-Tuning · Dense Connections · WordPiece · Byte Pair Encoding