TL;DR
This study evaluates how well transformer-based language models understand complex linguistic phenomena like relative clauses in American English, revealing strengths in grammatical knowledge but weaknesses in semantic understanding.
Contribution
It provides a detailed comparison of BERT, RoBERTa, and ALBERT in capturing linguistic knowledge, emphasizing the importance of diverse evaluation methods beyond probing.
Findings
Models capture grammaticality well
Semantic knowledge remains weak in models
Evaluation methods should include diagnostic and masked prediction tasks
Abstract
Transformer-based language models achieve high performance on various tasks, but we still lack understanding of the kind of linguistic knowledge they learn and rely on. We evaluate three models (BERT, RoBERTa, and ALBERT), testing their grammatical and semantic knowledge by sentence-level probing, diagnostic cases, and masked prediction tasks. We focus on relative clauses (in American English) as a complex phenomenon needing contextual information and antecedent identification to be resolved. Based on a naturalistic dataset, probing shows that all three models indeed capture linguistic knowledge about grammaticality, achieving high performance. Evaluation on diagnostic cases and masked prediction tasks considering fine-grained linguistic knowledge, however, shows pronounced model-specific weaknesses especially on semantic knowledge, strongly impacting models' performance. Our results…
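The masked prediction evaluation mentioned in the abstract can be illustrated with a small sketch. This is not the authors' code: the candidate list, the test items, and the `toy_scorer` stand-in for a masked language model are all hypothetical, chosen only to show how a relativizer slot might be masked and scored against the set of acceptable fillers.

```python
from typing import Callable, Dict, List, Set, Tuple

MASK = "[MASK]"

# Hypothetical candidate relativizers (an assumption, not taken from the paper).
RELATIVIZERS = ["who", "which", "that", "whose", "where"]

# A scorer maps a masked sentence to scores for candidate fillers.
Scorer = Callable[[str], Dict[str, float]]


def predict_relativizer(sentence: str, scorer: Scorer) -> str:
    """Return the candidate with the highest score for the masked slot."""
    if MASK not in sentence:
        raise ValueError("sentence must contain a mask token")
    scores = scorer(sentence)
    # Candidates the scorer does not mention are treated as least likely.
    return max(RELATIVIZERS, key=lambda w: scores.get(w, float("-inf")))


def masked_accuracy(items: List[Tuple[str, Set[str]]], scorer: Scorer) -> float:
    """Fraction of items whose top prediction is an acceptable filler."""
    if not items:
        return 0.0
    hits = sum(predict_relativizer(s, scorer) in gold for s, gold in items)
    return hits / len(items)


def toy_scorer(sentence: str) -> Dict[str, float]:
    """Stand-in for a masked LM: always prefers 'that', ignoring context."""
    return {"that": 0.5, "who": 0.3, "which": 0.2}


# Hypothetical test items: (masked sentence, set of acceptable relativizers).
ITEMS = [
    ("The teacher [MASK] helped me retired.", {"who", "that"}),
    ("The book [MASK] I bought was expensive.", {"which", "that"}),
    ("The girl [MASK] mother is a doctor smiled.", {"whose"}),
]

if __name__ == "__main__":
    print(f"accuracy: {masked_accuracy(ITEMS, toy_scorer):.2f}")
```

A context-blind scorer like this one gets the grammatical default right on two items but fails on the possessive case, which is the kind of fine-grained gap a masked prediction task is meant to expose. In practice `toy_scorer` would be replaced by a wrapper around a real masked language model.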
Taxonomy
Methods: Linear Layer · Dense Connections · WordPiece · Linear Warmup With Linear Decay · Attention Dropout · Weight Decay · Adam · Residual Connection
