IITK@LCP at SemEval 2021 Task 1: Classification for Lexical Complexity Regression Task

Neil Rajiv Shirude; Sagnik Mukherjee; Tushar Shandhilya; Ananta Mukherjee; Ashutosh Modi

arXiv:2104.01046·cs.CL·April 5, 2021

TL;DR

This paper presents an approach to lexical complexity prediction that uses ELECTRA with an aggregation of classification and regression models, and adds weak supervision signals from Gloss-BERT to lower the mean absolute error (MAE).

Contribution

The paper treats the lexical complexity regression task as an aggregation of several classification and regression models, and uses weak supervision signals from Gloss-BERT to improve performance.

Findings

01 Achieved an MAE of 0.0654 on Sub-Task 1

02 Achieved an MAE of 0.0811 on Sub-Task 2

03 Weak supervision from Gloss-BERT significantly improved the MAE on Sub-Task 1

Abstract

This paper describes our contribution to SemEval 2021 Task 1: Lexical Complexity Prediction. In our approach, we leverage the ELECTRA model and attempt to mirror the data annotation scheme. Although the task is a regression task, we show that we can treat it as an aggregation of several classification and regression models. This somewhat counter-intuitive approach achieved an MAE score of 0.0654 for Sub-Task 1 and MAE of 0.0811 on Sub-Task 2. Additionally, we used the concept of weak supervision signals from Gloss-BERT in our work, and it significantly improved the MAE score in Sub-Task 1.
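The idea of treating a regression target as an aggregation of classification and regression outputs can be illustrated with a small sketch. This is not the authors' implementation: the five-class value mapping, the equal weighting, and the function names `expected_score`, `aggregate`, and `mean_absolute_error` are all assumptions made for illustration. It shows one way a classifier's probability distribution over ordinal complexity classes could be turned into a score in [0, 1], blended with a direct regression output, and evaluated with MAE.

```python
from typing import Sequence

# Assumed mapping of five ordinal complexity classes onto [0, 1].
# Illustrative only; the paper's actual class design may differ.
CLASS_VALUES = (0.0, 0.25, 0.5, 0.75, 1.0)


def expected_score(class_probs: Sequence[float]) -> float:
    """Convert class probabilities into a continuous score (expected value)."""
    if len(class_probs) != len(CLASS_VALUES):
        raise ValueError("expected one probability per class")
    total = sum(class_probs)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    # Normalise so that unnormalised inputs are still handled sensibly.
    return sum(p * v for p, v in zip(class_probs, CLASS_VALUES)) / total


def aggregate(class_probs: Sequence[float],
              regression_score: float,
              weight: float = 0.5) -> float:
    """Blend the classifier-derived score with a direct regression output.

    `weight` is a hypothetical mixing coefficient, not a value from the paper.
    """
    return weight * expected_score(class_probs) + (1.0 - weight) * regression_score


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """MAE, the metric reported in the abstract."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    # Made-up numbers purely to exercise the functions.
    probs = [0.1, 0.6, 0.2, 0.1, 0.0]
    prediction = aggregate(probs, regression_score=0.30)
    print(prediction, mean_absolute_error([0.25], [prediction]))
```

Taking the expected value keeps the classifier's uncertainty in the final score instead of committing to a single class, which is one plausible reason an aggregation of this kind can work for a regression target.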

Code & Models

Repositories

neilrs123/Lexical-Complexity-Prediction (Official)

Taxonomy

Methods: Linear Layer · Linear Warmup With Linear Decay · Residual Connection · Layer Normalization · Adam · Multi-Head Attention · Attention Dropout · Dense Connections · Softmax · WordPiece