Is neural language acquisition similar to natural? A chronological probing study

Ekaterina Voloshina; Oleg Serikov; Tatiana Shavrina

arXiv:2207.00560·cs.CL·July 4, 2022

TL;DR

This study uses chronological probing to analyze how transformer language models acquire linguistic knowledge during training, revealing early acquisition of various language features and inconsistencies in task performance.

Contribution

It introduces a novel chronological probing methodology and an open-source framework to analyze linguistic knowledge acquisition in transformer models over training time.

Findings

01 Linguistic information is acquired early in training.

02 Models capture features from morphology to discourse.

03 Models sometimes fail on easy tasks.

Abstract

The probing methodology allows one to obtain a partial representation of the linguistic phenomena stored in the inner layers of a neural network, using external classifiers and statistical analysis. Pre-trained transformer-based language models are widely used for both natural language understanding (NLU) and natural language generation (NLG) tasks, making them the most common choice for downstream applications. However, little analysis has been carried out on whether the models are pre-trained enough or contain knowledge that correlates with linguistic theory. We present a chronological probing study of English transformer models such as MultiBERT and T5. We sequentially compare the information about the language that the models learn in the course of training on corpora. The results show that 1) linguistic information is acquired in the early stages of training, 2) both language models…
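
As a rough illustration of the idea described in the abstract, the sketch below trains an external classifier (a probe) on frozen representations from several training checkpoints and compares its accuracy over time. It is not the authors' framework: the function names (`train_probe`, `fake_checkpoint_embeddings`, `chronological_probe`), the checkpoint steps, and the "signal" values are all hypothetical, and the embeddings are synthetic stand-ins. In a real study they would presumably be extracted from saved checkpoints of the models under analysis.

```python
import math
import random


def train_probe(features, labels, epochs=200, lr=0.5):
    """Fit a logistic-regression probe with plain batch gradient descent."""
    dim = len(features[0])
    w, b, n = [0.0] * dim, 0.0, len(features)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(features, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
            err = 1.0 / (1.0 + math.exp(-z)) - y
            for i in range(dim):
                grad_w[i] += err * x[i]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def probe_accuracy(w, b, features, labels):
    """Share of examples whose label the probe predicts correctly."""
    correct = 0
    for x, y in zip(features, labels):
        z = sum(wi * xi for wi, xi in zip(w, x)) + b
        correct += int((z > 0) == (y == 1))
    return correct / len(labels)


def fake_checkpoint_embeddings(labels, signal, rng, dim=8):
    """Synthetic stand-in for frozen representations at one checkpoint.

    `signal` is an assumption of this sketch: it sets how strongly the
    label is linearly encoded in the first dimension.
    """
    vectors = []
    for y in labels:
        vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        vec[0] += signal if y == 1 else -signal
        vectors.append(vec)
    return vectors


def chronological_probe(checkpoints, seed=0):
    """Train one probe per checkpoint and return {step: test accuracy}."""
    rng = random.Random(seed)
    labels = [rng.randint(0, 1) for _ in range(400)]
    train_y, test_y = labels[:300], labels[300:]
    results = {}
    for step, signal in checkpoints:
        feats = fake_checkpoint_embeddings(labels, signal, rng)
        w, b = train_probe(feats[:300], train_y)
        results[step] = probe_accuracy(w, b, feats[300:], test_y)
    return results


if __name__ == "__main__":
    # (training step, synthetic signal strength): hypothetical values
    scores = chronological_probe([(0, 0.0), (20_000, 1.0), (200_000, 3.0)])
    for step, acc in scores.items():
        print(step, round(acc, 3))
```

The probe is kept deliberately simple (a linear classifier on frozen features), so that a rise in accuracy across checkpoints can be attributed to the representations rather than to the probe itself.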

Code & Models

Repositories

ekaterinavoloshina/chronological_probing
PyTorch · Official

Taxonomy

Methods

Gated Linear Unit · Attention Is All You Need · Linear Layer · Byte Pair Encoding · Softmax · Multi-Head Attention · Residual Connection · SentencePiece · Attention Dropout