Language Modeling with Sparse Product of Sememe Experts

Yihong Gu; Jun Yan; Hao Zhu; Zhiyuan Liu; Ruobing Xie; Maosong Sun; Fen Lin; Leyu Lin

arXiv:1810.12387·cs.CL·October 31, 2018·6 cites

Open Access · 1 Repo

TL;DR

This paper introduces SDLM, a sememe-driven language model that predicts words based on their underlying semantic units, enhancing interpretability and robustness over traditional word-based models.

Contribution

The paper proposes a novel sememe-based approach to language modeling, leveraging sememes as semantic experts to improve interpretability and performance.

Findings

01

SDLM outperforms traditional models in language modeling tasks.

02

SDLM improves headline generation quality.

03

Sememe-based modeling enhances model robustness.

Abstract

Most language modeling methods rely on large-scale data to statistically learn the sequential patterns of words. In this paper, we argue that words are atomic language units but not necessarily atomic semantic units. Inspired by HowNet, we use sememes, the minimum semantic units in human languages, to represent the implicit semantics behind words for language modeling, and we name the resulting model the Sememe-Driven Language Model (SDLM). More specifically, to predict the next word, SDLM first estimates the sememe distribution given the textual context. Afterward, it regards each sememe as a distinct semantic expert, and these experts jointly identify the most probable senses and the corresponding word. In this way, SDLM enables language models to work beyond word-level manipulation to fine-grained sememe-level semantics and offers us more powerful tools to fine-tune language models and improve the interpretability as…
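
The three steps described in the abstract (sememe distribution, sememe experts voting on senses, senses mapped to words) can be illustrated with a toy sketch. This is not the authors' implementation: the sense inventory `SENSES`, the context scores, the per-expert scores, and all function names are hypothetical, and the way expert votes are weighted and normalized here is an assumption made only to keep the example small.

```python
import math

# Hypothetical toy inventory (not from HowNet): sense -> (word, sememes).
SENSES = {
    "apple#fruit": ("apple", {"fruit"}),
    "apple#brand": ("apple", {"computer", "brand"}),
    "pear#fruit": ("pear", {"fruit"}),
    "laptop#device": ("laptop", {"computer"}),
}


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sememe_distribution(context_scores):
    """Step 1: estimate how relevant each sememe is given the context.

    Assumption: each sememe gets an independent probability via a sigmoid.
    """
    return {sememe: sigmoid(score) for sememe, score in context_scores.items()}


def sense_distribution(sememe_probs, expert_scores):
    """Step 2: every sememe acts as an expert that scores only the senses
    annotated with it (sparse); the weighted log-scores are summed and
    normalized, which is a product of experts in probability space.
    """
    logits = {}
    for sense, (_word, sememes) in SENSES.items():
        logits[sense] = sum(
            sememe_probs[k] * expert_scores[k][sense] for k in sememes
        )
    top = max(logits.values())  # subtract the max for numerical stability
    exps = {sense: math.exp(v - top) for sense, v in logits.items()}
    total = sum(exps.values())
    return {sense: v / total for sense, v in exps.items()}


def word_distribution(sense_probs):
    """Step 3: a word's probability is the sum over its senses."""
    probs = {}
    for sense, p in sense_probs.items():
        word = SENSES[sense][0]
        probs[word] = probs.get(word, 0.0) + p
    return probs


def predict_next_word(context_scores, expert_scores):
    sememe_probs = sememe_distribution(context_scores)
    sense_probs = sense_distribution(sememe_probs, expert_scores)
    return word_distribution(sense_probs)


# Made-up scores for a context that favors the "fruit" sememe.
context_scores = {"fruit": 3.0, "computer": -3.0, "brand": -3.0}
expert_scores = {
    "fruit": {"apple#fruit": 1.0, "pear#fruit": 0.5},
    "computer": {"apple#brand": 1.0, "laptop#device": 1.0},
    "brand": {"apple#brand": 1.0},
}
probs = predict_next_word(context_scores, expert_scores)
print(max(probs, key=probs.get), probs)
```

Because each expert only scores the senses that carry its sememe, changing one sememe's relevance shifts probability mass among a small, inspectable set of senses, which is the kind of sememe-level control the abstract alludes to.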

Peer Reviews

No public reviews on file for this paper yet.

Code & Models

Repositories

thunlp/SDLM-pytorch
pytorch · Official

Videos

No videos yet.

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis

Methods: Interpretability