Improving Sequence Modeling Ability of Recurrent Neural Networks via Sememes

Yujia Qin; Fanchao Qi; Sicong Ouyang; Zhiyuan Liu; Cheng Yang; Yasheng Wang; Qun Liu; Maosong Sun

arXiv:1910.08910 · cs.CL · August 20, 2020

TL;DR

This paper introduces methods to incorporate sememes into RNNs, significantly enhancing their sequence modeling capabilities and robustness across various NLP tasks.

Contribution

It proposes three novel sememe incorporation techniques for RNNs and demonstrates their effectiveness across multiple benchmark datasets.

Findings

1. Sememe-incorporated RNNs outperform their vanilla counterparts on language modeling and on downstream tasks (natural language inference, sentiment analysis, paraphrase detection).

2. Models with sememes are more robust against adversarial attacks.

3. The proposed methods are effective across different RNN architectures, including LSTM and GRU.

Abstract

Sememes, the minimum semantic units of human languages, have been successfully utilized in various natural language processing applications. However, most existing studies exploit sememes in specific tasks and few efforts are made to utilize sememes more fundamentally. In this paper, we propose to incorporate sememes into recurrent neural networks (RNNs) to improve their sequence modeling ability, which is beneficial to all kinds of downstream tasks. We design three different sememe incorporation methods and employ them in typical RNNs including LSTM, GRU and their bidirectional variants. In evaluation, we use several benchmark datasets involving PTB and WikiText-2 for language modeling, SNLI for natural language inference and another two datasets for sentiment analysis and paraphrase detection. Experimental results show evident and consistent improvement of our sememe-incorporated…
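To make the idea in the abstract concrete, here is a minimal sketch of one plausible way to feed sememe information into a recurrent model: average the embeddings of a word's sememes and concatenate the result with the word embedding before each recurrent step. This is an illustration under stated assumptions, not the authors' implementation. The abstract does not describe the three methods in detail; the toy vocabulary, the sememe annotations, the dimensions, and the helper names (`mean_vector`, `rnn_step`, `encode`) are all hypothetical, and a vanilla tanh RNN cell stands in for the LSTM and GRU cells used in the paper.

```python
import math
import random


def mean_vector(vectors, dim):
    """Average equal-length vectors; return a zero vector if there are none."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def rnn_step(x, h, W_x, W_h, b):
    """One vanilla RNN step: h' = tanh(W_x x + W_h h + b)."""
    return [
        math.tanh(
            sum(W_x[j][i] * x[i] for i in range(len(x)))
            + sum(W_h[j][k] * h[k] for k in range(len(h)))
            + b[j]
        )
        for j in range(len(h))
    ]


def encode(tokens, word_emb, sememe_emb, word_to_sememes, params, sememe_dim):
    """Run the RNN over tokens, concatenating each word embedding with the
    average of its sememe embeddings (zeros for words without sememes)."""
    W_x, W_h, b = params
    h = [0.0] * len(b)
    for tok in tokens:
        sememes = [sememe_emb[s] for s in word_to_sememes.get(tok, [])]
        x = word_emb[tok] + mean_vector(sememes, sememe_dim)  # list concatenation
        h = rnn_step(x, h, W_x, W_h, b)
    return h


# Toy setup; every value below is made up for illustration.
rng = random.Random(0)


def rand_vec(n):
    return [rng.uniform(-0.5, 0.5) for _ in range(n)]


def rand_mat(rows, cols):
    return [rand_vec(cols) for _ in range(rows)]


WORD_DIM, SEMEME_DIM, HIDDEN_DIM = 4, 3, 5

# Hypothetical sememe annotations in the spirit of a sememe knowledge base.
word_to_sememes = {"boy": ["human", "male", "child"], "runs": ["move"]}
word_emb = {w: rand_vec(WORD_DIM) for w in ["the", "boy", "runs"]}
sememe_emb = {s: rand_vec(SEMEME_DIM) for s in ["human", "male", "child", "move"]}
params = (
    rand_mat(HIDDEN_DIM, WORD_DIM + SEMEME_DIM),  # input weights
    rand_mat(HIDDEN_DIM, HIDDEN_DIM),             # recurrent weights
    [0.0] * HIDDEN_DIM,                           # bias
)

final_state = encode(
    ["the", "boy", "runs"], word_emb, sememe_emb, word_to_sememes, params, SEMEME_DIM
)
print(len(final_state))
```

Because the sememe vector is simply appended to the input, the recurrent cell itself is unchanged in this sketch, which is why the same idea carries over to gated cells by swapping out `rnn_step`.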

Code & Models

Repositories

thunlp/SememeRNN (PyTorch, official)

Taxonomy

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Gated Recurrent Unit