A BERT-based Dual Embedding Model for Chinese Idiom Prediction

Minghuan Tan; Jing Jiang

arXiv:2011.02378 · cs.CL · November 5, 2020 · 1 citation

TL;DR

This paper introduces a BERT-based dual embedding model for Chinese idiom prediction, effectively matching idiom embeddings with context representations to improve accuracy on a Chinese idiom cloze test dataset.

Contribution

The paper presents a novel dual embedding approach combined with context pooling in BERT for Chinese idiom prediction, outperforming previous methods.

Findings

01 Outperforms the existing state of the art on a Chinese idiom cloze test dataset

02 Both context pooling and dual embeddings significantly improve performance

03 Ablation studies confirm the effectiveness of each component

Abstract

Chinese idioms are special fixed phrases usually derived from ancient stories, whose meanings are oftentimes highly idiomatic and non-compositional. The Chinese idiom prediction task is to select the correct idiom from a set of candidate idioms given a context with a blank. We propose a BERT-based dual embedding model to encode the contextual words as well as to learn dual embeddings of the idioms. Specifically, we first match the embedding of each candidate idiom with the hidden representation corresponding to the blank in the context. We then match the embedding of each candidate idiom with the hidden representations of all the tokens in the context through context pooling. We further propose to use two separate idiom embeddings for the two kinds of matching. Experiments on a recently released Chinese idiom cloze test dataset show that our proposed method performs better than the…
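The two kinds of matching described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the hidden states below are random stand-ins for encoder outputs, element-wise max pooling is an assumed choice of context pooling, summing the two matching scores is an assumed way to combine them, and all names (`score_candidates`, `blank_embeddings`, `context_embeddings`) are hypothetical.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def max_pool(hidden_states):
    """Element-wise max over token positions (assumed pooling operator)."""
    return [max(column) for column in zip(*hidden_states)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def score_candidates(hidden_states, blank_index, blank_embeddings, context_embeddings):
    """Return a probability for each candidate idiom.

    hidden_states:      one vector per context token (stand-in for encoder output)
    blank_index:        position of the blank token in the context
    blank_embeddings:   first idiom embedding per candidate, matched with the blank
    context_embeddings: second idiom embedding per candidate, matched with the
                        pooled context
    """
    h_blank = hidden_states[blank_index]
    h_context = max_pool(hidden_states)
    scores = [
        # Assumption: the two matching scores are simply added.
        dot(e_blank, h_blank) + dot(e_context, h_context)
        for e_blank, e_context in zip(blank_embeddings, context_embeddings)
    ]
    return softmax(scores)


# Toy data: 6 context tokens, 4 candidate idioms, 8-dimensional vectors.
rng = random.Random(0)
dim, num_tokens, num_candidates = 8, 6, 4


def random_matrix(rows, cols):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


hidden_states = random_matrix(num_tokens, dim)
blank_embeddings = random_matrix(num_candidates, dim)
context_embeddings = random_matrix(num_candidates, dim)

probs = score_candidates(hidden_states, 2, blank_embeddings, context_embeddings)
predicted = max(range(num_candidates), key=probs.__getitem__)
```

Keeping two embedding tables lets each candidate idiom have one vector tuned for the blank position and another tuned for the surrounding context, which is the "dual embedding" idea the abstract names. In a trained model both tables and the encoder would be learned jointly rather than sampled at random.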

Code & Models

Repositories

VisualJoyce/ChengyuBERT
PyTorch · Official

Taxonomy

Topics: Advanced Text Analysis Techniques · Sentiment Analysis and Opinion Mining · Topic Modeling