Abstractors and relational cross-attention: An inductive bias for explicit relational reasoning in Transformers

Awni Altabaa; Taylor Webb; Jonathan Cohen; John Lafferty

arXiv:2304.00195 · stat.ML · April 16, 2024 · 6 cites

TL;DR

This paper introduces the Abstractor, a novel Transformer module with relational cross-attention that enhances explicit relational reasoning, improving generalization and sample efficiency across various relational tasks.

Contribution

The paper presents the Abstractor, a new Transformer extension with relational cross-attention, enabling explicit relational reasoning and better generalization from limited data.

Findings

01 Improved performance on simple discriminative relational tasks.

02 Dramatic sample efficiency gains over standard Transformers on purely relational sequence-to-sequence tasks.

03 Consistent performance improvements on mathematical problem-solving tasks.

Abstract

An extension of Transformers is proposed that enables explicit relational reasoning through a novel module called the Abstractor. At the core of the Abstractor is a variant of attention called relational cross-attention. The approach is motivated by an architectural inductive bias for relational learning that disentangles relational information from object-level features. This enables explicit relational reasoning, supporting abstraction and generalization from limited data. The Abstractor is first evaluated on simple discriminative relational tasks and compared to existing relational architectures. Next, the Abstractor is evaluated on purely relational sequence-to-sequence tasks, where dramatic improvements are seen in sample efficiency compared to standard Transformers. Finally, Abstractors are evaluated on a collection of tasks based on mathematical problem solving, where consistent…
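The abstract does not spell out relational cross-attention, so the following is only a minimal NumPy sketch of one common reading of it: queries and keys are computed from the input objects, while the values are input-independent learned symbols, so the output depends on the objects only through their pairwise relations. The function name, the single-head form, the scaled softmax, the one-symbol-per-position choice, and all shapes are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax along the given axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def relational_cross_attention(X, S, Wq, Wk):
    """Hypothetical single-head sketch.

    X  : (n, d)    object-level features
    S  : (n, d_s)  learned symbols, independent of X (assumed one per position)
    Wq : (d, d_k)  query projection
    Wk : (d, d_k)  key projection
    """
    Q = X @ Wq
    K = X @ Wk
    # (n, n) relation matrix: entry (i, j) scores the relation between objects i and j.
    R = softmax(Q @ K.T / np.sqrt(Wq.shape[1]), axis=-1)
    # Values are the symbols S rather than X, so object features reach the
    # output only through the relation matrix R.
    return R @ S

rng = np.random.default_rng(0)
n, d, d_k, d_s = 4, 8, 6, 5
X = rng.normal(size=(n, d))
S = rng.normal(size=(n, d_s))   # stands in for trained symbol parameters
Wq = rng.normal(size=(d, d_k))
Wk = rng.normal(size=(d, d_k))

A = relational_cross_attention(X, S, Wq, Wk)
print(A.shape)
```

In standard cross-attention the values would be projections of the inputs themselves. Replacing them with symbols is what, in this sketch, keeps object-level features out of the output and leaves only relational information.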

Code & Models

Repositories

awni00/abstractor · tf · Official

Videos

Abstractors and relational cross-attention: An inductive bias for explicit relational reasoning in Transformers · SlidesLive

Taxonomy

Topics: Advanced Text Analysis Techniques · Cognitive Science and Mapping · Child and Animal Learning Development

Methods: Multi-Head Attention · Attention Is All You Need · Softmax · Layer Normalization · Byte Pair Encoding · Dropout · Linear Layer · Label Smoothing · Adam · Residual Connection