OCHADAI-KYOTO at SemEval-2021 Task 1: Enhancing Model Generalization and Robustness for Lexical Complexity Prediction

Yuki Taya; Lis Kanashiro Pereira; Fei Cheng; Ichiro Kobayashi

arXiv:2105.05535 · cs.CL · June 16, 2021 · 1 citation

TL;DR

This paper introduces an ensemble of transformer-based models (BERT and RoBERTa) trained with multi-step fine-tuning, multi-task learning, and adversarial training to improve lexical complexity prediction when annotated data is scarce.

Contribution

The authors develop a novel ensemble model combining BERT and RoBERTa with advanced training strategies and handcrafted features for better lexical complexity prediction.

Findings

01

Achieved top-10 ranking in both sub-tasks of SemEval-2021 Task 1

02

Enhanced model robustness through multi-step fine-tuning and adversarial training

03

Improved generalization with handcrafted feature integration

Abstract

We propose an ensemble model for predicting the lexical complexity of words and multiword expressions (MWEs). The model receives as input a sentence with a target word or MWE and outputs its complexity score. Given that a key challenge with this task is the limited size of annotated data, our model relies on pretrained contextual representations from different state-of-the-art transformer-based language models (i.e., BERT and RoBERTa), and on a variety of training methods for further enhancing model generalization and robustness: multi-step fine-tuning, multi-task learning, and adversarial training. Additionally, we propose to enrich contextual representations by adding hand-crafted features during training. Our model achieved competitive results and ranked among the top-10 systems in both sub-tasks.
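
The abstract does not say how the ensemble combines its members' predictions. As a minimal sketch, assuming a plain unweighted average of per-model complexity scores (an assumption for illustration, not the authors' stated method), the combination step could look like the following. The function name `ensemble_complexity` and the example scores are hypothetical.

```python
from statistics import mean


def ensemble_complexity(scores_per_model):
    """Average per-model complexity scores, item by item.

    scores_per_model maps a model name to a list of predicted scores,
    one per target word or MWE, all lists in the same item order.
    Unweighted averaging is an assumption made for this sketch.
    """
    lengths = {len(scores) for scores in scores_per_model.values()}
    if len(lengths) != 1:
        raise ValueError("every model must score the same number of items")
    # zip(*...) groups the scores that different models gave to the same item.
    return [mean(item_scores) for item_scores in zip(*scores_per_model.values())]


# Hypothetical predictions for two target items (not results from the paper).
scores = {
    "bert": [0.25, 0.5],
    "roberta": [0.75, 0.5],
}
combined = ensemble_complexity(scores)
```

A weighted average or a learned combiner would fit the same interface; the unweighted mean is used here only because it needs no extra parameters.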

Taxonomy

Topics: Text Readability and Simplification · Natural Language Processing Techniques · Topic Modeling

Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Linear Warmup With Linear Decay · Softmax · Multi-Head Attention · Residual Connection · WordPiece · Weight Decay