Augmenting BERT Carefully with Underrepresented Linguistic Features
Aparna Balagopalan, Jekaterina Novikova

TL;DR
This paper enhances BERT-based models for Alzheimer's detection by identifying underrepresented linguistic features and supplementing them with handcrafted features, leading to improved classification accuracy.
Contribution
It introduces a method to identify linguistic features poorly captured by BERT and demonstrates that augmenting BERT with these features improves AD detection performance.
Findings
Augmenting BERT with handcrafted features improves AD classification performance by up to 5% over fine-tuned BERT alone.
Probing tasks reveal linguistic features inadequately represented in BERT layers.
Joint fine-tuning with additional features enhances model performance.
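The probing idea in these findings can be illustrated with a small sketch. It assumes a linear regression probe and uses synthetic vectors in place of real BERT layer activations; the function names (`fit_linear_probe`, `probe_layer`, `make_synthetic_layers`) and all hyperparameters are hypothetical and not taken from the paper. A layer whose representation encodes a linguistic feature should give a low held-out probe error, while a layer that does not encode it should give a high one.

```python
import random


def fit_linear_probe(X, y, lr=0.1, epochs=300):
    """Fit y ~ w.x + b by batch gradient descent on half the mean squared error."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, target in zip(X, y):
            err = sum(wj * xj for wj, xj in zip(w, x)) + b - target
            for j in range(d):
                grad_w[j] += err * x[j] / n
            grad_b += err / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def probe_mse(X, y, w, b):
    """Mean squared error of the probe's predictions."""
    total = 0.0
    for x, target in zip(X, y):
        pred = sum(wj * xj for wj, xj in zip(w, x)) + b
        total += (pred - target) ** 2
    return total / len(X)


def probe_layer(X, y, n_train):
    """Train the probe on the first n_train examples, score it on the rest."""
    w, b = fit_linear_probe(X[:n_train], y[:n_train])
    return probe_mse(X[n_train:], y[n_train:], w, b)


def make_synthetic_layers(n=200, dim=4, seed=0):
    """Synthetic stand-ins for layer activations (assumption: not real BERT output)."""
    rng = random.Random(seed)
    feature = [rng.uniform(-1, 1) for _ in range(n)]
    # One dimension of this "layer" carries the feature plus a little noise.
    layer_encoding = [
        [f + rng.gauss(0, 0.1)] + [rng.gauss(0, 1) for _ in range(dim - 1)]
        for f in feature
    ]
    # This "layer" is pure noise, so the feature is not recoverable from it.
    layer_missing = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n)]
    return feature, layer_encoding, layer_missing


if __name__ == "__main__":
    feature, enc, miss = make_synthetic_layers()
    print("probe MSE, feature encoded:", probe_layer(enc, feature, 150))
    print("probe MSE, feature missing:", probe_layer(miss, feature, 150))
```

In this setup, a feature with a high probe error in every layer would be a candidate for supplying as an external handcrafted feature.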
Abstract
Fine-tuned Bidirectional Encoder Representations from Transformers (BERT)-based sequence classification models have proven to be effective for detecting Alzheimer's Disease (AD) from transcripts of human speech. However, previous research shows it is possible to improve BERT's performance on various tasks by augmenting the model with additional information. In this work, we use probing tasks as introspection techniques to identify linguistic information that is not well represented in various layers of BERT but is important for the AD detection task. We supplement the linguistic features for which BERT's representations are found to be insufficient with external hand-crafted features, and show that jointly fine-tuning BERT in combination with these features improves the performance of AD classification by up to 5% over fine-tuned BERT alone.
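One common way to combine a sentence representation with handcrafted features is to concatenate the two vectors before a final classification layer. The sketch below shows only that fusion step, under the assumption that this is how the features are combined; the abstract does not specify the architecture, and the names `fuse` and `predict_ad_probability`, along with the example values, are hypothetical. In a jointly fine-tuned model, the weights of this layer and of BERT itself would be updated together during training.

```python
import math


def fuse(cls_embedding, handcrafted):
    """Concatenate a sentence embedding with external handcrafted features."""
    return list(cls_embedding) + list(handcrafted)


def predict_ad_probability(cls_embedding, handcrafted, weights, bias):
    """Apply a single logistic layer to the fused vector (illustrative head only)."""
    fused = fuse(cls_embedding, handcrafted)
    if len(fused) != len(weights):
        raise ValueError("weights must match the length of the fused vector")
    logit = sum(w * x for w, x in zip(weights, fused)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


if __name__ == "__main__":
    # Toy values: a 3-dimensional "embedding" and 2 handcrafted features.
    embedding = [0.2, -0.1, 0.4]
    handcrafted = [1.5, 0.0]
    weights = [0.5, 0.5, 0.5, 1.0, 1.0]
    print(predict_ad_probability(embedding, handcrafted, weights, bias=-1.0))
```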
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis
Methods: Linear Layer · Residual Connection · Dense Connections · WordPiece · Layer Normalization · Attention Is All You Need · Adam · Linear Warmup With Linear Decay · Weight Decay · Dropout
