TL;DR
This paper presents a system for SemEval-2021 Task 4 (ReCAM) that fine-tunes BERT and ALBERT to predict the abstract word missing from a statement. The authors submitted an ensemble of the two models for Subtasks 1 and 2 and ALBERT alone for Subtask 3.
Contribution
It introduces an ensemble approach combining BERT and ALBERT for abstract word prediction in reading comprehension, demonstrating the effectiveness of MLM-based methods.
Findings
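An ensemble of fine-tuned BERT and ALBERT was the submitted system for Subtask 1 (ReCAM-Imperceptibility) and Subtask 2 (ReCAM-Nonspecificity).
Of the approaches the authors tried, the MLM-based approach worked best.
ALBERT alone gave the best results on Subtask 3 (ReCAM-Intersection).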
Abstract
This paper describes our system for Task 4 of SemEval-2021: Reading Comprehension of Abstract Meaning (ReCAM). We participated in all subtasks, where the main goal was to predict an abstract word missing from a statement. We fine-tuned the pre-trained masked language models BERT and ALBERT and used an ensemble of these as our submitted system on Subtask 1 (ReCAM-Imperceptibility) and Subtask 2 (ReCAM-Nonspecificity). For Subtask 3 (ReCAM-Intersection), we submitted the ALBERT model, as it gave the best results. We tried multiple approaches and found that the Masked Language Modeling (MLM) based approach works best.
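The abstract does not say how the BERT and ALBERT outputs were combined. As a minimal sketch, assuming each model yields one score per candidate word for the masked position, one common way to ensemble is to convert each model's scores to probabilities and average them. The function names, the five-candidate setup, and the score values below are all hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_predict(per_model_scores):
    """Average per-candidate probabilities across models.

    per_model_scores: one list of candidate scores per model, all the same length.
    Returns (index of the best candidate, averaged probabilities).
    Probability averaging is an assumption, not the paper's stated method.
    """
    probs = [softmax(scores) for scores in per_model_scores]
    n_candidates = len(probs[0])
    avg = [sum(p[i] for p in probs) / len(probs) for i in range(n_candidates)]
    best = max(range(n_candidates), key=avg.__getitem__)
    return best, avg


# Hypothetical scores for five candidate words at one masked position.
bert_scores = [2.0, 0.5, 0.1, -1.0, 0.3]
albert_scores = [1.0, 1.8, 0.0, -0.5, 0.2]

best, avg = ensemble_predict([bert_scores, albert_scores])
print("chosen candidate index:", best)
```

In this made-up example the two models disagree on their top candidate, and the averaged probabilities decide between them. Other combination rules (majority vote, weighted averaging) would fit the same interface.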
Taxonomy
Methods: Linear Layer · LAMB · Linear Warmup With Linear Decay · Residual Connection · Layer Normalization · Adam · Multi-Head Attention · Attention Dropout · Dense Connections · ALBERT
