OCHADAI-KYOTO at SemEval-2021 Task 1: Enhancing Model Generalization and Robustness for Lexical Complexity Prediction
Yuki Taya, Lis Kanashiro Pereira, Fei Cheng, Ichiro Kobayashi

TL;DR
This paper introduces an ensemble of transformer-based models (BERT and RoBERTa), trained with multi-step fine-tuning, multi-task learning, and adversarial training, to predict lexical complexity despite the limited amount of annotated data.
Contribution
The authors develop an ensemble model that combines BERT and RoBERTa with multi-step fine-tuning, multi-task learning, adversarial training, and hand-crafted features for lexical complexity prediction.
Findings
Achieved top-10 ranking in both sub-tasks of SemEval-2021 Task 1
Enhanced model robustness through multi-step fine-tuning and adversarial training
Improved generalization with handcrafted feature integration
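The summary does not say which adversarial training variant the authors used. As a rough illustration of the general idea only, the sketch below perturbs an input embedding along the normalized gradient of a squared-error loss and adds the loss on the perturbed input to the clean loss. The linear regressor, the function names, and the `epsilon` and `alpha` values are all assumptions made for this example, not details from the paper.

```python
import math


def predict(weights, embedding):
    """Linear regressor standing in for a fine-tuned model head (assumption)."""
    return sum(w * x for w, x in zip(weights, embedding))


def squared_error(weights, embedding, target):
    """Squared error between the predicted and gold complexity score."""
    return (predict(weights, embedding) - target) ** 2


def adversarial_embedding(weights, embedding, target, epsilon=0.1):
    """Shift the embedding by epsilon along the normalized loss gradient."""
    residual = predict(weights, embedding) - target
    # For a linear model: d/dx (w.x - t)^2 = 2 * (w.x - t) * w
    grad = [2.0 * residual * w for w in weights]
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        return list(embedding)  # no gradient signal, leave the input unchanged
    return [x + epsilon * g / norm for x, g in zip(embedding, grad)]


def adversarial_loss(weights, embedding, target, epsilon=0.1, alpha=1.0):
    """Clean loss plus alpha times the loss on the perturbed embedding."""
    clean = squared_error(weights, embedding, target)
    perturbed = adversarial_embedding(weights, embedding, target, epsilon)
    return clean + alpha * squared_error(weights, perturbed, target)
```

In a real system the gradient would come from automatic differentiation through the transformer rather than from a closed-form expression.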
Abstract
We propose an ensemble model for predicting the lexical complexity of words and multiword expressions (MWEs). The model receives as input a sentence with a target word or MWE and outputs its complexity score. Given that a key challenge with this task is the limited size of annotated data, our model relies on pretrained contextual representations from different state-of-the-art transformer-based language models (i.e., BERT and RoBERTa), and on a variety of training methods for further enhancing model generalization and robustness: multi-step fine-tuning, multi-task learning, and adversarial training. Additionally, we propose to enrich contextual representations by adding hand-crafted features during training. Our model achieved competitive results and ranked among the top-10 systems in both sub-tasks.
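To make the two ideas in the abstract concrete, here is a minimal sketch of enriching a contextual vector with hand-crafted features and of combining member predictions into one score. The abstract does not list the actual features or say how the ensemble members are combined, so the two surface features, the simple averaging, and the clipping to a [0, 1] score range are assumptions chosen for illustration.

```python
from statistics import mean


def handcrafted_features(target):
    """Hypothetical surface features; the abstract does not list the real ones."""
    tokens = target.split()
    return [
        float(len(target.replace(" ", ""))),  # character count without spaces
        float(len(tokens)),                   # 1 for a single word, >1 for an MWE
    ]


def enrich(contextual_vector, target):
    """Concatenate a contextual vector with the hand-crafted features."""
    return list(contextual_vector) + handcrafted_features(target)


def ensemble_score(member_scores):
    """Average the member predictions and clip to [0, 1] (assumed score range)."""
    return min(1.0, max(0.0, mean(member_scores)))
```

For example, `enrich([0.1, 0.2], "sea level")` appends the character count and token count of the MWE to the two-dimensional stand-in vector, and `ensemble_score` would then be applied to the scores that each member (for instance a BERT-based and a RoBERTa-based regressor) produces for the same input.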
Taxonomy
Topics: Text Readability and Simplification · Natural Language Processing Techniques · Topic Modeling
Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Linear Warmup With Linear Decay · Softmax · Multi-Head Attention · Residual Connection · WordPiece · Weight Decay
