Analysis of Argument Structure Constructions in the Large Language Model BERT
Pegah Ramezani, Achim Schilling, Patrick Krauss

TL;DR
This paper explores how BERT encodes Argument Structure Constructions by analyzing token embeddings across layers, revealing layered processing patterns and the importance of specific tokens for construction differentiation.
Contribution
It extends previous LSTM analyses by systematically examining BERT's internal representations of various argument structure types using visualization, clustering, and probe methods.
Findings
CLS embeddings cluster best in early layers (2-4)
Over 90% probe accuracy from layer 2 onward
OBJ tokens are crucial for differentiating argument structures
Abstract
This study investigates how BERT processes and represents Argument Structure Constructions (ASCs), extending previous LSTM analyses. Using a dataset of 2000 sentences across four ASC types (transitive, ditransitive, caused-motion, resultative), we analyzed BERT's token embeddings across 12 layers. We visualized the embeddings with MDS and t-SNE and quantified clustering with the Generalized Discrimination Value (GDV). Feedforward classifiers (probes) predicted construction categories from embeddings. CLS token embeddings clustered best in layers 2-4, decreased in intermediate layers, and slightly increased in final layers. DET and SUBJ embeddings showed consistent clustering in intermediate layers, VERB embeddings increased in clustering from layer 1 to 12, and OBJ embeddings peaked in layer 10. Probe accuracies indicated low construction information in layer 1, with over 90 percent accuracy from…
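The abstract names the Generalized Discrimination Value (GDV) as its clustering measure but does not spell it out. The sketch below follows the definition as it is commonly stated in the GDV literature: z-score each dimension and scale by 0.5, then take the mean intra-class distance minus the mean inter-class distance, divided by the square root of the dimensionality. This is an assumption about the formula, not the authors' code. The function names and the toy data are hypothetical. Under this definition, more negative values indicate better-separated classes.

```python
import math
from itertools import combinations


def zscore_scale(points):
    """Z-score every dimension over all points, then multiply by 0.5."""
    n, dim = len(points), len(points[0])
    means = [sum(p[d] for p in points) / n for d in range(dim)]
    stds = [
        math.sqrt(sum((p[d] - means[d]) ** 2 for p in points) / n)
        for d in range(dim)
    ]
    return [
        [0.5 * (p[d] - means[d]) / stds[d] if stds[d] > 0 else 0.0
         for d in range(dim)]
        for p in points
    ]


def mean_intra(cluster):
    """Mean Euclidean distance over all point pairs inside one class."""
    pairs = list(combinations(cluster, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def mean_inter(c1, c2):
    """Mean Euclidean distance over all cross-class point pairs."""
    return sum(math.dist(a, b) for a in c1 for b in c2) / (len(c1) * len(c2))


def gdv(points, labels):
    """GDV sketch; assumes >= 2 classes and >= 2 points per class."""
    scaled = zscore_scale(points)
    clusters = {}
    for p, y in zip(scaled, labels):
        clusters.setdefault(y, []).append(p)
    groups = list(clusters.values())
    intra = sum(mean_intra(g) for g in groups) / len(groups)
    group_pairs = list(combinations(groups, 2))
    inter = sum(mean_inter(a, b) for a, b in group_pairs) / len(group_pairs)
    return (intra - inter) / math.sqrt(len(points[0]))


# Toy 2-D "embeddings" (hypothetical data, not from the paper).
toy_points = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
separated = gdv(toy_points, ["a", "a", "b", "b"])  # labels match the clusters
mixed = gdv(toy_points, ["a", "b", "a", "b"])      # labels cut across clusters
```

In a layer-wise analysis, `points` would hold the embeddings of one token type (for example CLS) at one layer and `labels` the four ASC categories, so that one GDV per layer can be compared across the 12 layers.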
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
Methods: Attention Is All You Need · Softmax · Dense Connections · Dropout · Linear Layer · Attention Dropout · Residual Connection · Linear Warmup With Linear Decay · WordPiece
