LET: Linguistic Knowledge Enhanced Graph Transformer for Chinese Short Text Matching

Boer Lyu; Lu Chen; Su Zhu; Kai Yu

arXiv:2102.12671·cs.CL·February 26, 2021

TL;DR

This paper introduces LET, a graph transformer model that leverages external linguistic knowledge and multi-granularity input to improve Chinese short text matching, addressing polysemy and segmentation issues.

Contribution

The paper proposes a novel LET model that integrates HowNet knowledge and lattice graphs, enhancing semantic understanding and robustness in Chinese text matching.

Findings

01 Outperforms typical text matching approaches on two Chinese datasets

02 Both semantic information (from HowNet) and multi-granularity information (from the word lattice) are important, according to the ablation study

03 The model is complementary to pre-trained language models

Abstract

Chinese short text matching is a fundamental task in natural language processing. Existing approaches usually take Chinese characters or words as input tokens. They have two limitations: 1) Some Chinese words are polysemous, and semantic information is not fully utilized. 2) Some models suffer potential issues caused by word segmentation. Here we introduce HowNet as an external knowledge base and propose a Linguistic knowledge Enhanced graph Transformer (LET) to deal with word ambiguity. Additionally, we adopt the word lattice graph as input to maintain multi-granularity information. Our model is also complementary to pre-trained language models. Experimental results on two Chinese datasets show that our models outperform various typical text matching approaches. Ablation study also indicates that both semantic information and multi-granularity information are important for text…
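To make the word lattice idea concrete, here is a minimal sketch of one common way to build such a graph: every single character plus every substring found in a lexicon becomes a node, and two nodes are linked when one ends exactly where the other starts. The function name `build_lattice`, the toy lexicon, the `max_len` cutoff, and the example sentence are all assumptions made for illustration; they are not taken from the paper or its repository, and the authors' construction may differ in detail.

```python
def build_lattice(sentence, vocab, max_len=4):
    """Return (nodes, edges) for a toy word lattice.

    nodes: list of (start, end, token) spans over the sentence.
    edges: list of (i, j) index pairs where node i ends where node j starts.
    Hypothetical helper for illustration only.
    """
    nodes = []
    n = len(sentence)
    for start in range(n):
        for end in range(start + 1, min(n, start + max_len) + 1):
            token = sentence[start:end]
            # Keep every single character, plus any multi-character
            # substring that appears in the lexicon.
            if end - start == 1 or token in vocab:
                nodes.append((start, end, token))

    # Connect spans that are adjacent in the original sentence.
    edges = [
        (i, j)
        for i, left in enumerate(nodes)
        for j, right in enumerate(nodes)
        if left[1] == right[0]
    ]
    return nodes, edges


# Toy lexicon (assumed); the sentence has several valid segmentations.
toy_vocab = {"南京", "南京市", "市长", "长江", "大桥", "长江大桥"}
nodes, edges = build_lattice("南京市长江大桥", toy_vocab)

for start, end, token in nodes:
    print(start, end, token)
```

Because both "南京市" + "长江大桥" and "南京" + "市长" + ... survive as paths through the graph, a downstream model does not have to commit to a single segmentation up front, which is the multi-granularity property the abstract refers to.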

Code & Models

Repositories

lbe0613/LET · mxnet · Official

Videos

LET: Linguistic Knowledge Enhanced Graph Transformer for Chinese Short Text Matching · underline

Taxonomy

Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Layer Normalization · Laplacian EigenMap · Label Smoothing · Softmax · Multi-Head Attention · Adam · Dense Connections