To Understand Representation of Layer-aware Sequence Encoders as Multi-order-graph
Sufeng Duan, Hai Zhao

TL;DR
This paper offers a graph-structure explanation for self-attention network encoders, introduces a Multi-order-Graph model, and demonstrates improved performance in neural machine translation tasks.
Contribution
It proposes a novel graph-based explanation for self-attention network (SAN) encoders and introduces a Graph-Transformer that leverages multi-order subgraphs for enhanced encoding.
Findings
Graph-Transformer improves translation performance
Multi-order-Graph effectively models SAN structures
Explanation links model depth and sentence length to graph properties
Abstract
In this paper, we propose an explanation of representation for self-attention network (SAN) based neural sequence encoders, which regards the information captured by the model and the encoding of the model as graph structure and the generation of these graph structures respectively. The proposed explanation applies to existing works on SAN-based models and can explain the relationship among the ability to capture the structural or linguistic information, depth of model, and length of sentence, and can also be extended to other models such as recurrent neural network based models. We also propose a revisited multigraph called Multi-order-Graph (MoG) based on our explanation to model the graph structures in the SAN-based model as subgraphs in MoG and convert the encoding of SAN-based model to the generation of MoG. Based on our explanation, we further introduce a Graph-Transformer by…
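The graph reading of self-attention described in the abstract can be illustrated with a small sketch. It is not the paper's Multi-order-Graph construction, and it makes simplifying assumptions: a single attention head, no value projections or residual connections, and made-up score matrices. Under those assumptions, each layer's attention weights act as the weighted adjacency matrix of a directed graph over tokens, and stacking two layers yields weights over two-step paths, which is one informal way to picture how depth relates to higher-order relations. The function names `attention_graph` and `compose` are hypothetical.

```python
import math

def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention_graph(scores):
    """Convert an n x n matrix of raw attention scores into edge weights.

    Entry [i][j] is read as the weight of the directed edge from
    token i to token j; each row sums to 1.
    """
    return [softmax(row) for row in scores]

def compose(a, b):
    """Matrix product a @ b: total weight of paths that take one step
    through a and then one step through b."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Toy scores for a three-token sentence (illustrative numbers only).
scores_layer1 = [[2.0, 0.5, 0.1],
                 [0.3, 1.5, 0.2],
                 [0.1, 0.4, 1.0]]
scores_layer2 = [[1.0, 1.2, 0.0],
                 [0.2, 0.9, 1.1],
                 [0.5, 0.1, 1.3]]

g1 = attention_graph(scores_layer1)   # first-layer graph
g2 = attention_graph(scores_layer2)   # second-layer graph
two_layer = compose(g2, g1)           # two-step relations across both layers

for row in two_layer:
    print([round(w, 3) for w in row])
```

Because both per-layer graphs have rows that sum to 1, the composed matrix does too, so it can again be read as a set of normalized edge weights.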
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning in Bioinformatics
