Using Integrated Gradients and Constituency Parse Trees to explain Linguistic Acceptability learnt by BERT
Anmol Nayak, Hari Prasad Timmapathini

TL;DR
This paper investigates how BERT makes decisions on sentence grammaticality by using Layer Integrated Gradients and constituency parse trees, revealing insights into its interpretability and potential improvements.
Contribution
It introduces a method combining Layer Integrated Gradients with constituency parse trees to explain BERT's decisions on linguistic acceptability.
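To make the attribution step concrete, here is a minimal sketch of plain Integrated Gradients on a toy sigmoid scorer. It is not the authors' setup: the paper applies Layer Integrated Gradients to BERT, whereas the model, weights `w`, input `x`, and all-zero baseline below are hypothetical stand-ins chosen so the example runs with NumPy alone. The sketch also checks the completeness property, that attributions sum to the difference between the model output at the input and at the baseline.

```python
import numpy as np

def model(x, w):
    """Toy scorer (assumption): sigmoid of a weighted sum, standing in for a classifier output."""
    return 1.0 / (1.0 + np.exp(-np.dot(w, x)))

def model_grad(x, w):
    """Analytic gradient of the toy scorer with respect to the input x."""
    p = model(x, w)
    return p * (1.0 - p) * w

def integrated_gradients(x, baseline, w, steps=200):
    """Midpoint Riemann approximation of the gradient integral along the straight path."""
    alphas = (np.arange(steps) + 0.5) / steps
    grads = np.stack([model_grad(baseline + a * (x - baseline), w) for a in alphas])
    # Scale the average gradient by the input-baseline difference, feature by feature.
    return (x - baseline) * grads.mean(axis=0)

# Hypothetical weights and input; the all-zero baseline is also an assumption.
w = np.array([1.5, -2.0, 0.5])
x = np.array([1.0, 0.25, -1.0])
baseline = np.zeros_like(x)

attributions = integrated_gradients(x, baseline, w)
# Completeness: attributions should sum to model(x) - model(baseline).
completeness_gap = abs(attributions.sum() - (model(x, w) - model(baseline, w)))
print(attributions, completeness_gap)
```

With a real BERT classifier the gradients would come from automatic differentiation rather than a closed form, and the attributions would be computed with respect to a chosen layer's activations, typically the embedding layer.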
Findings
LIG scores for linguistically acceptable (LA) sentences are significantly smaller than for linguistically unacceptable (LUA) sentences.
Specific parse tree subtrees contribute more to LIG.
High correlation between positive LIG and correct classifications.
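One way to relate token-level attributions to parse tree subtrees is to sum the scores of the tokens each subtree spans. The sketch below does this for a small hand-written tree. The aggregation rule (summation), the nested-tuple tree encoding, the example sentence, and the scores are all assumptions for illustration; the summary does not state how the paper aggregates LIG over subtrees.

```python
def score_subtrees(tree, token_scores, start=0):
    """Return (end, rows); rows holds (label, phrase, summed score) for every subtree."""
    label, children = tree[0], tree[1:]
    if isinstance(children[0], str):
        # Pre-terminal node such as ("NNS", "dogs"): covers exactly one token.
        return start + 1, [(label, children[0], token_scores[start])]
    end, rows, words = start, [], []
    for child in children:
        child_end, child_rows = score_subtrees(child, token_scores, end)
        rows.extend(child_rows)
        # Rows are emitted in post-order, so a child's own row is its last one.
        words.append(child_rows[-1][1])
        end = child_end
    rows.append((label, " ".join(words), sum(token_scores[start:end])))
    return end, rows

# Hypothetical parse of an ungrammatical sentence and made-up per-token scores.
tree = ("S",
        ("NP", ("DT", "the"), ("NNS", "dogs")),
        ("VP", ("VBZ", "barks")))
token_scores = [0.125, 0.25, 0.5]

end, rows = score_subtrees(tree, token_scores)
by_label = {label: score for label, _, score in rows}
for label, phrase, score in rows:
    print(f"{label:4s} {phrase:15s} {score:.3f}")
```

Ranking the rows by score would then show which constituents carry most of the attribution under this summation rule.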
Abstract
Linguistic Acceptability is the task of determining whether a sentence is grammatical or ungrammatical. It has applications in several use cases like Question-Answering, Natural Language Generation, Neural Machine Translation, where grammatical correctness is crucial. In this paper we aim to understand the decision-making process of BERT (Devlin et al., 2019) in distinguishing between Linguistically Acceptable sentences (LA) and Linguistically Unacceptable sentences (LUA). We leverage Layer Integrated Gradients Attribution Scores (LIG) to explain the Linguistic Acceptability criteria that are learnt by BERT on the Corpus of Linguistic Acceptability (CoLA) (Warstadt et al., 2018) benchmark dataset. Our experiments on 5 categories of sentences lead to the following interesting findings: 1) LIG for LA are significantly smaller in comparison to LUA, 2) There are specific subtrees of the…
Taxonomy
Topics: Natural Language Processing Techniques · Text Readability and Simplification · Topic Modeling
Methods: Multi-Head Attention · Linear Layer · Attention Is All You Need · Adam · Linear Warmup With Linear Decay · Residual Connection · WordPiece · Attention Dropout · Dense Connections
