What do Transformers Know about Government?
Jue Hou, Anisia Katinskaia, Lari Kotilainen, Sathianpong Trangcasanchai, Anh-Duc Vu, Roman Yangarber

TL;DR
This study explores how transformer models like BERT encode grammatical government relations, revealing that early layers and specific attention heads contain significant information, and introduces a new dataset for linguistic research.
Contribution
The paper demonstrates that transformer models encode government relations across layers, identifies key attention heads, and provides a new dataset for grammatical construction analysis.
Findings
Government information is encoded mainly in early layers.
A small number of attention heads suffice to identify government relations, including types never seen in training.
Introduces the Government Bank dataset for linguistic research.
Abstract
This paper investigates what insights about linguistic features and what knowledge about the structure of natural language can be obtained from the encodings in transformer language models. In particular, we explore how BERT encodes the government relation between constituents in a sentence. We use several probing classifiers, and data from two morphologically rich languages. Our experiments show that information about government is encoded across all transformer layers, but predominantly in the early layers of the model. We find that, for both languages, a small number of attention heads encode enough information about the government relations to enable us to train a classifier capable of discovering new, previously unknown types of government, never seen in the training data. Currently, data is lacking for the research community working on grammatical constructions, and government in…
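To make the idea of a probing classifier over attention heads concrete, here is a minimal sketch in pure Python. It is not the authors' setup: the data is synthetic, the number of heads (`N_HEADS`), the choice of informative heads (`INFORMATIVE`), and all function names are assumptions made for illustration. Each feature vector stands in for the attention weights that every head assigns between a candidate governor and its dependent, and a logistic-regression probe is trained to predict whether the pair is in a government relation. Ranking heads by the magnitude of their learned weight then suggests which heads carry the signal.

```python
import math
import random


def make_synthetic_pairs(n, n_heads, informative, rng):
    """Generate toy (features, label) examples.

    Each feature vector stands in for per-head attention weights between
    two tokens. Only heads listed in `informative` depend on the label
    (1 = government relation, 0 = no relation). Purely synthetic data.
    """
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        feats = [rng.uniform(0.0, 0.3) for _ in range(n_heads)]
        if label == 1:
            for h in informative:
                feats[h] = rng.uniform(0.5, 1.0)
        data.append((feats, label))
    return data


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(feats, w, b):
    """Probability that the pair is in a government relation."""
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, feats)))


def train_probe(data, n_heads, epochs=200, lr=0.5):
    """Fit a logistic-regression probe with plain batch gradient descent."""
    w = [0.0] * n_heads
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_heads
        grad_b = 0.0
        for feats, label in data:
            err = predict(feats, w, b) - label
            for i, xi in enumerate(feats):
                grad_w[i] += err * xi
            grad_b += err
        w = [wi - lr * g / len(data) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(data)
    return w, b


def accuracy(data, w, b):
    correct = sum((predict(f, w, b) >= 0.5) == bool(y) for f, y in data)
    return correct / len(data)


# Hypothetical configuration: 12 heads, of which heads 2 and 7 are informative.
rng = random.Random(0)
N_HEADS = 12
INFORMATIVE = (2, 7)

train = make_synthetic_pairs(400, N_HEADS, INFORMATIVE, rng)
test = make_synthetic_pairs(200, N_HEADS, INFORMATIVE, rng)

w, b = train_probe(train, N_HEADS)
test_acc = accuracy(test, w, b)

# Rank heads by absolute probe weight; the largest weights mark the heads
# the probe relies on most.
top_heads = sorted(range(N_HEADS), key=lambda h: abs(w[h]), reverse=True)[:2]
print("held-out accuracy:", test_acc)
print("most informative heads:", sorted(top_heads))
```

In a real experiment the feature vectors would be extracted from a trained model's attention maps rather than sampled, and the probe would be evaluated on government types held out from training, as the abstract describes.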
Taxonomy
Topics: Corruption and Economic Development
Methods: Attention Is All You Need · Weight Decay · Dense Connections · Residual Connection · Softmax · Adam · Linear Warmup With Linear Decay · Layer Normalization · Attention Dropout
