The Analysis about Building Cross-lingual Sememe Knowledge Base Based on Deep Clustering Network

Xiaoran Li; Toshiaki Takano

arXiv:2208.05462 · cs.CL · August 11, 2022

TL;DR

This paper introduces an unsupervised deep clustering network approach to construct a multilingual sememe knowledge base, reducing reliance on manual annotations and capturing core semantic features across languages.

Contribution

It proposes a novel unsupervised method using deep clustering networks for building sememe KBs applicable to any language, leveraging multilingual word representations.

Findings

01. Low-dimensional sememe space retains main semantic features

02. Unsupervised approach reduces manual annotation biases

03. Method effective across multiple languages

Abstract

A sememe is defined as the minimum semantic unit of human languages. Sememe knowledge bases (KBs), which contain words annotated with sememes, have been successfully applied to many NLP tasks, and we believe that by learning the smallest unit of meaning, computers can more easily understand human language. However, existing sememe KBs are built only by manual annotation. Human annotations carry personal biases, the meanings of words change over time, and manual methods are not always practical. To address these issues, we propose an unsupervised method based on a deep clustering network (DCN) to build a sememe KB; the method can be applied to any language. We first learn distributed representations of multilingual words, use MUSE to align them in a single vector space, learn the multi-layer meaning of…
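The first two steps named in the abstract (aligning word vectors from different languages into one space, then clustering them) can be illustrated with a minimal sketch. It is not the authors' implementation: the alignment uses the orthogonal Procrustes solution, which MUSE applies in its supervised mode, and plain k-means stands in for the deep clustering network, which is a swapped-in simplification. The function names `procrustes_align` and `kmeans`, the toy random vectors, and the choice of 3 clusters are all assumptions made for illustration.

```python
import numpy as np


def procrustes_align(src, tgt):
    """Return the orthogonal matrix W minimizing ||src @ W - tgt||_F.

    Rows of src and tgt are assumed to be embeddings of translation pairs.
    """
    u, _, vt = np.linalg.svd(src.T @ tgt)
    return u @ vt


def kmeans(x, k, iters=50, seed=0):
    """Plain k-means (a stand-in for a deep clustering network)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random.
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        # Squared Euclidean distance from every point to every center.
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = x[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return labels, centers


# Toy data (hypothetical): 20 "target-language" vectors and a rotated copy
# that plays the role of the same words embedded in a "source" language.
rng = np.random.default_rng(0)
tgt = rng.normal(size=(20, 4))
rotation, _ = np.linalg.qr(rng.normal(size=(4, 4)))
src = tgt @ rotation.T

w = procrustes_align(src, tgt)          # map source vectors into target space
shared = np.vstack([src @ w, tgt])      # one shared cross-lingual space
labels, centers = kmeans(shared, k=3)   # cluster ids as candidate "sememes"
```

In this toy setting the source vectors are an exact rotation of the target vectors, so the learned map brings them back onto the target space; real embeddings would align only approximately, and the cluster count would need to be chosen for the data.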

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling

Methods: ALIGN