TL;DR
This paper predicts lexical complexity with ELECTRA by aggregating several classification and regression models, and adds weak supervision signals from Gloss-BERT to lower the mean absolute error (MAE).
Contribution
It introduces a hybrid modeling approach for lexical complexity regression tasks and leverages weak supervision signals from Gloss-BERT to enhance performance.
Findings
Achieved MAE of 0.0654 on Sub-Task 1
Achieved MAE of 0.0811 on Sub-Task 2
Weak supervision from Gloss-BERT significantly improved the MAE on Sub-Task 1
Abstract
This paper describes our contribution to SemEval 2021 Task 1: Lexical Complexity Prediction. In our approach, we leverage the ELECTRA model and attempt to mirror the data annotation scheme. Although the task is a regression task, we show that we can treat it as an aggregation of several classification and regression models. This somewhat counter-intuitive approach achieved an MAE score of 0.0654 for Sub-Task 1 and MAE of 0.0811 on Sub-Task 2. Additionally, we used the concept of weak supervision signals from Gloss-BERT in our work, and it significantly improved the MAE score in Sub-Task 1.
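The idea of treating a regression target as an aggregation of classification and regression outputs can be sketched in a few lines. This is a minimal illustration, not the authors' architecture: it assumes the annotation scheme is a 5-point Likert scale mapped onto [0, 1], and the names `BIN_VALUES`, `expected_score`, `aggregate`, and the blending `weight` are hypothetical choices made for the example.

```python
from typing import Sequence

# Assumption: five Likert ratings mapped onto evenly spaced values in [0, 1].
BIN_VALUES = (0.0, 0.25, 0.5, 0.75, 1.0)


def expected_score(probs: Sequence[float]) -> float:
    """Collapse class probabilities over the bins into one continuous score."""
    if len(probs) != len(BIN_VALUES):
        raise ValueError("expected one probability per bin")
    total = sum(probs)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    # Probability-weighted mean of the bin values (normalized defensively).
    return sum(p * v for p, v in zip(probs, BIN_VALUES)) / total


def aggregate(class_probs: Sequence[float],
              regression_score: float,
              weight: float = 0.5) -> float:
    """Blend the classifier's expected score with a direct regression output.

    `weight` is a hypothetical mixing coefficient, not a value from the paper.
    """
    return weight * expected_score(class_probs) + (1.0 - weight) * regression_score


def mean_absolute_error(pred: Sequence[float], gold: Sequence[float]) -> float:
    """MAE, the metric reported for both sub-tasks."""
    if len(pred) != len(gold) or not gold:
        raise ValueError("pred and gold must be non-empty and equally long")
    return sum(abs(p - g) for p, g in zip(pred, gold)) / len(gold)
```

In this sketch the classifier mirrors the discrete choices annotators made, while the regression head predicts the averaged score directly; blending the two is one simple way to combine them, and other aggregation rules would fit the same interface.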
Taxonomy
Methods: Linear Layer · Linear Warmup With Linear Decay · Residual Connection · Layer Normalization · Adam · Multi-Head Attention · Attention Dropout · Dense Connections · Softmax · WordPiece
