To Understand Representation of Layer-aware Sequence Encoders as Multi-order-graph

Sufeng Duan; Hai Zhao

arXiv:2101.06397·cs.CL·March 15, 2023

TL;DR

This paper offers a graph-structure explanation for self-attention network encoders, introduces a Multi-order-Graph model, and demonstrates improved performance in neural machine translation tasks.

Contribution

It proposes a novel graph-based explanation for SAN encoders and introduces the Graph-Transformer leveraging multi-order subgraphs for enhanced encoding.

Findings

01. Graph-Transformer improves translation performance

02. Multi-order-Graph effectively models SAN structures

03. Explanation links model depth and sentence length to graph properties

Abstract

In this paper, we propose an explanation of representation for self-attention network (SAN) based neural sequence encoders. The explanation regards the information captured by the model as graph structures and the encoding performed by the model as the generation of those graph structures. It applies to existing work on SAN-based models, accounts for the relationship among the ability to capture structural or linguistic information, the depth of the model, and the length of the sentence, and can be extended to other models such as recurrent neural network based models. Based on this explanation, we also propose a revisited multigraph called Multi-order-Graph (MoG), which models the graph structures in a SAN-based model as subgraphs of MoG and converts the encoding of the SAN-based model into the generation of MoG. Based on our explanation, we further introduce a Graph-Transformer by…
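The graph reading of self-attention described in the abstract can be illustrated with a small sketch. This is not the paper's MoG construction. It only shows the general idea, under our own assumptions, that one layer's attention weights can be read as a weighted directed graph over tokens, and that stacking a second layer yields weights for two-step paths. The function names (`attention_graph`, `compose`) and the score values are hypothetical, and the paper's own definition of subgraph order may differ from the path length used here.

```python
import math

def softmax(row):
    # Numerically stable softmax over one row of raw scores.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention_graph(scores):
    # Assumption: read each row-normalized attention matrix as the
    # adjacency matrix of a weighted directed graph over tokens,
    # where entry [i][j] is the weight of edge i -> j.
    return [softmax(row) for row in scores]

def compose(upper, lower):
    # Matrix product: entry [i][j] sums the weights of all two-step
    # paths i -> k -> j (upper-layer edge first, then lower-layer edge).
    n = len(upper)
    return [
        [sum(upper[i][k] * lower[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]

# Hypothetical raw attention scores for a 3-token sentence, two layers.
layer1_scores = [[2.0, 0.5, 0.1], [0.3, 1.5, 0.2], [0.1, 0.4, 1.0]]
layer2_scores = [[1.0, 1.0, 0.0], [0.2, 2.0, 0.5], [0.0, 0.3, 1.2]]

g1 = attention_graph(layer1_scores)
g2 = attention_graph(layer2_scores)
two_step = compose(g2, g1)

for row in two_step:
    print([round(w, 3) for w in row])
```

Because each row of `g1` and `g2` sums to 1, each row of `two_step` also sums to 1, so the composed matrix can again be read as edge weights. Deeper stacks extend the reachable path length in the same way, which is one informal way to see why depth and sentence length interact in a graph-based account.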

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning in Bioinformatics