IBERT: Idiom Cloze-style reading comprehension with Attention
Ruiyang Qin, Haozheng Luo, Zheheng Fan, Ziang Ren

TL;DR
This paper introduces IBERT, a BERT-based model that improves performance on the idiom cloze task by capturing both local and global context, outperforming previous Seq2Seq approaches.
Contribution
It proposes a BERT-based embedding Seq2Seq model that uses XLNET as the encoder and RoBERTa to choose the most probable idiom for a given context.
Findings
Outperforms existing state-of-the-art models on the EPIE Static Corpus
Effectively captures both local and global context in idiom understanding
Demonstrates improved accuracy in idiom prediction
Abstract
Idioms are special fixed phrases usually derived from stories. They are commonly used in casual conversations and literary writings. Their meanings are usually highly non-compositional. The idiom cloze task is a challenging problem in Natural Language Processing (NLP) research. Previous approaches to this task are built on sequence-to-sequence (Seq2Seq) models and achieve reasonably good performance on existing datasets. However, they fall short in understanding the highly non-compositional meaning of idiomatic expressions. They also do not consider both the local and global context at the same time. In this paper, we propose a BERT-based embedding Seq2Seq model that encodes idiomatic expressions and considers them in both global and local context. Our model uses XLNET as the encoder and RoBERTa for choosing the most probable idiom for a given context. Experiments on the EPIE…
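To make the task setup concrete, here is a minimal toy sketch of idiom cloze selection that scores each candidate against both a local window around the blank and the whole passage. It is not the paper's model: it uses bag-of-words cosine similarity instead of XLNET or RoBERTa, and the `[MASK]` marker, the candidate glosses, the `window` and `alpha` parameters, and every function name are assumptions made up for illustration.

```python
import math
from collections import Counter

BLANK = "[MASK]"  # hypothetical placeholder marking the missing idiom


def bag(tokens):
    """Lowercased bag of words, ignoring the blank and bare punctuation."""
    cleaned = (t.lower().strip(".,;!?") for t in tokens if t != BLANK)
    return Counter(t for t in cleaned if t)


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def choose_idiom(passage, candidates, window=4, alpha=0.5):
    """Pick the candidate whose gloss best matches local and global context.

    alpha weights the local score; (1 - alpha) weights the global score.
    """
    tokens = passage.split()
    pos = tokens.index(BLANK)
    local = bag(tokens[max(0, pos - window): pos + window + 1])
    global_ctx = bag(tokens)
    scores = {}
    for idiom, gloss in candidates.items():
        g = bag(gloss.split())
        scores[idiom] = alpha * cosine(local, g) + (1 - alpha) * cosine(global_ctx, g)
    return max(scores, key=scores.get), scores


# Toy data (invented for this sketch, not taken from the EPIE corpus).
passage = (
    "The exam was very easy for her . "
    "She said it was [MASK] and finished early ."
)
candidates = {
    "a piece of cake": "something very easy to do",
    "under the weather": "feeling slightly ill or sick",
    "break the ice": "start a conversation in an awkward social situation",
}

best, scores = choose_idiom(passage, candidates)
print(best)  # a piece of cake
```

In this toy passage the words near the blank carry no signal, so the choice is decided entirely by the global term. That is the kind of case the abstract has in mind when it argues for using local and global context together.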
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Sentiment Analysis and Opinion Mining
Methods: Attention Is All You Need · Linear Layer · Attention Dropout · Weight Decay · WordPiece · Byte Pair Encoding · Tanh Activation · Multi-Head Attention · BERT · Sigmoid Activation
