Power Law Graph Transformer for Machine Translation and Representation Learning

Burc Gokden

arXiv:2107.02039·cs.CL·July 6, 2021

TL;DR

This paper introduces the Power Law Graph Transformer, a transformer model that learns graph structures parameterized by power law distributions, and reports BLEU scores of 17.79 on Turkish-English and 28.33 on Portuguese-English translation of TED talk transcripts.

Contribution

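The paper proposes a transformer architecture with explicitly defined deductive and inductive tasks: the deductive task learns dataset-level and instance-level graph structures as power law distribution parameters, and the inductive task uses that output to produce prediction probabilities.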

Findings

01

Achieved BLEU scores of 17.79 on Turkish-English and 28.33 on Portuguese-English translation.

02

Demonstrated the model's ability to learn dataset-level and instance-level graph structures.

03

Showed how duality between quantization and manifold representations enables transformation between local and global outputs.

Abstract

We present the Power Law Graph Transformer, a transformer model with well-defined deductive and inductive tasks for prediction and representation learning. The deductive task learns the dataset-level (global) and instance-level (local) graph structures in terms of learnable power law distribution parameters. The inductive task outputs the prediction probabilities using the deductive task output, similar to a transductive model. We trained our model with Turkish-English and Portuguese-English datasets from TED talk transcripts for machine translation and compared the model performance and characteristics to a transformer model with scaled dot product attention trained on the same experimental setup. We report BLEU scores of $17.79$ and $28.33$ on the Turkish-English and Portuguese-English translation tasks with our model, respectively. We also show how a duality between a quantization…
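
For orientation, the baseline named in the abstract uses scaled dot product attention, which is commonly written as softmax(QK^T / sqrt(d_k)) V. The block below is a minimal pure-Python sketch of that standard formula only. It is not the Power Law Graph Transformer's attention, which this page does not specify, and the function names and toy inputs are assumptions chosen for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(Q, K, V):
    """Standard scaled dot product attention on lists of lists.

    Q is n x d, K is m x d, V is m x d_v.
    Returns (output, weights): output is n x d_v, weights is n x m.
    """
    d = len(K[0])
    scale = 1.0 / math.sqrt(d)
    weights = []
    for q in Q:
        # Scaled similarity of this query against every key.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in K]
        weights.append(softmax(scores))
    # Each output row is the attention-weighted average of the value rows.
    output = [
        [sum(w * v[j] for w, v in zip(w_row, V)) for j in range(len(V[0]))]
        for w_row in weights
    ]
    return output, weights


# Toy inputs (hypothetical): 2 queries, 3 keys/values, dimension 2.
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
out, attn = scaled_dot_product_attention(Q, K, V)
```

Each row of `attn` is a probability distribution over the keys, so the corresponding row of `out` is a convex combination of the rows of `V`.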

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Graph Neural Networks

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Adam · Layer Normalization · Byte Pair Encoding · Dropout · Label Smoothing