BigGreen at SemEval-2021 Task 1: Lexical Complexity Prediction with Assembly Models

Aadil Islam; Weicheng Ma; Soroush Vosoughi

arXiv:2104.09040 · cs.CL · July 29, 2021

TL;DR

This paper presents a system combining feature engineering and BERT-based neural networks for lexical complexity prediction, achieving competitive results and providing insights into model interpretability.

Contribution

The novel integration of handcrafted lexical, semantic, syntactic, and phonological features with BERT enhances prediction accuracy, especially in difficult cases.

Findings

01 Feature engineering improves extreme case predictions.

02 BERT attention maps reveal learned features.

03 Ensembled models perform well on multiple subtasks.

Abstract

This paper describes a system submitted by team BigGreen to LCP 2021 for predicting the lexical complexity of English words in a given context. We assemble a feature engineering-based model with a deep neural network model founded on BERT. While BERT itself performs competitively, our feature engineering-based model helps in extreme cases, e.g., separating instances of easy and neutral difficulty. Our handcrafted features comprise a breadth of lexical, semantic, syntactic, and novel phonological measures. Visualizations of BERT attention maps offer insight into potential features that Transformer models may learn when fine-tuned for lexical complexity prediction. Our ensembled predictions score reasonably well for the single-word subtask, and we demonstrate how they can be harnessed to perform well on the multi-word expression subtask too.
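
As a rough illustration of what assembling two predictors can look like, the sketch below blends per-instance complexity scores from a feature-based model and a BERT-based model using a fixed weighted average. This is an assumption made for illustration only: the abstract does not specify how the two models are combined, and the function name `ensemble_scores`, the `bert_weight` parameter, and the clipping to [0, 1] are all hypothetical.

```python
from typing import List, Sequence


def ensemble_scores(
    feature_preds: Sequence[float],
    bert_preds: Sequence[float],
    bert_weight: float = 0.5,
) -> List[float]:
    """Blend two models' complexity predictions with a weighted average.

    Assumption: a fixed convex combination of the two predictions. The
    system described in the abstract may combine its models differently.
    """
    if len(feature_preds) != len(bert_preds):
        raise ValueError("prediction lists must have the same length")
    if not 0.0 <= bert_weight <= 1.0:
        raise ValueError("bert_weight must lie in [0, 1]")

    blended = [
        bert_weight * b + (1.0 - bert_weight) * f
        for f, b in zip(feature_preds, bert_preds)
    ]
    # Assumption: complexity scores are normalized to [0, 1], so clip
    # any out-of-range inputs back into that interval.
    return [min(1.0, max(0.0, score)) for score in blended]
```

With equal weights, `ensemble_scores([0.25], [0.75])` yields a blended score of 0.5 for that instance; in practice the weight would be tuned on held-out data.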

Code & Models

Repositories

Aadil101/BigGreen-at-LCP-2021
PyTorch · Official

Taxonomy

Methods

Attention Is All You Need · Linear Layer · Dropout · Adam · Dense Connections · Softmax · Linear Warmup With Linear Decay · WordPiece · Attention Dropout