Systematic Generalization with Edge Transformers

Leon Bergen; Timothy J. O'Donnell; Dzmitry Bahdanau

arXiv:2112.00578 · cs.CL · December 2, 2021

TL;DR

Edge Transformer attaches a vector state to every edge (every pair of input nodes) and updates those states with triangular attention, outperforming baseline Transformers on compositional generalization benchmarks in relational reasoning, semantic parsing, and dependency parsing.

Contribution

The paper proposes Edge Transformer, which associates a vector state with every pair of input nodes rather than with every node, and updates these edge states with a triangular attention mechanism inspired by unification in logic programming.

Findings

01. Outperforms baseline Transformers on compositional generalization in relational reasoning

02. Outperforms them on compositional generalization in semantic parsing

03. Outperforms them on compositional generalization in dependency parsing

Abstract

Recent research suggests that systematic generalization in natural language understanding remains a challenge for state-of-the-art neural models such as Transformers and Graph Neural Networks. To tackle this challenge, we propose Edge Transformer, a new model that combines inspiration from Transformers and rule-based symbolic AI. The first key idea in Edge Transformers is to associate vector states with every edge, that is, with every pair of input nodes -- as opposed to just every node, as it is done in the Transformer model. The second major innovation is a triangular attention mechanism that updates edge representations in a way that is inspired by unification from logic programming. We evaluate Edge Transformer on compositional generalization benchmarks in relational reasoning, semantic parsing, and dependency parsing. In all three settings, the Edge Transformer outperforms…
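
The two ideas in the abstract (one vector state per edge, and an update for edge (i, j) that attends over every intermediate node l through the edges (i, l) and (l, j)) can be illustrated with a minimal, single-head sketch in plain Python. This is an illustration under stated assumptions, not the authors' implementation: the function name `triangular_attention`, the helper names, the weight names `Wq`, `Wk`, `Wv1`, `Wv2`, `Wo`, the square d-by-d projections, and the choice to combine the two value projections by elementwise product are all assumptions made here for clarity. The official repository listed under Code & Models is the authoritative reference.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def project(W, X):
    """Apply the same linear map W to the state of every edge (i, j)."""
    return [[matvec(W, x) for x in row] for row in X]


def triangular_attention(X, Wq, Wk, Wv1, Wv2, Wo):
    """Update every edge state from the pairs of edges that share a node.

    X[i][j] is the d-dimensional state of edge (i, j). All weight
    matrices are d x d (an assumption of this sketch). Returns a new
    n x n grid of d-dimensional edge states.
    """
    n, d = len(X), len(X[0][0])
    Q, K = project(Wq, X), project(Wk, X)
    V1, V2 = project(Wv1, X), project(Wv2, X)
    out = []
    for i in range(n):
        out_row = []
        for j in range(n):
            # Score each intermediate node l: the query comes from
            # edge (i, l) and the key from edge (l, j).
            scores = [
                sum(q * k for q, k in zip(Q[i][l], K[l][j])) / math.sqrt(d)
                for l in range(n)
            ]
            alpha = softmax(scores)
            # The value for the triangle (i, l, j) combines both edges
            # elementwise (assumed form), weighted by attention over l.
            mixed = [
                sum(alpha[l] * V1[i][l][c] * V2[l][j][c] for l in range(n))
                for c in range(d)
            ]
            out_row.append(matvec(Wo, mixed))
        out.append(out_row)
    return out


if __name__ == "__main__":
    random.seed(0)
    n, d = 4, 3  # 4 input nodes, 3-dimensional edge states

    def rand_matrix():
        return [[random.gauss(0, 1) for _ in range(d)] for _ in range(d)]

    X = [[[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]
         for _ in range(n)]
    Y = triangular_attention(X, *(rand_matrix() for _ in range(5)))
    print(len(Y), len(Y[0]), len(Y[0][0]))  # prints: 4 4 3
```

The explicit loops over i, j, and l make the cost visible: each of the n² edges attends over n intermediate nodes, so one layer does work on the order of n³, compared with n² for node-level self-attention. A practical version would vectorize these loops and add multiple heads.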

Code & Models

Repositories

bergen/edgetransformer (PyTorch, official)

Videos

Systematic Generalization with Edge Transformers · SlidesLive

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Graph Neural Networks

Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Byte Pair Encoding · Label Smoothing · Dense Connections · Absolute Position Encodings · Multi-Head Attention · Residual Connection · Softmax