Relational Weight Priors in Neural Networks for Abstract Pattern Learning and Language Modelling

Radha Kopparti; Tillman Weyde

arXiv:2103.06198 · cs.CL · March 11, 2021

TL;DR

This paper introduces ERBP, a Bayesian relational prior that enhances neural networks' ability to learn abstract patterns, improving generalisation and performance in NLP and sequence tasks.

Contribution

ERBP provides a novel relational inductive bias as a Bayesian prior, improving neural networks' systematic generalisation on abstract pattern learning tasks.
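This summary does not give the exact form of the prior, so the following is only a rough illustration of how a prior on weights can act as a training penalty. It relies on a standard fact: a Gaussian prior centred on default weights W0 contributes a term proportional to the squared distance between W and W0 to the negative log posterior. The default matrix used here (one that subtracts matching dimensions of each pair of input items) and the names `difference_matrix`, `prior_penalty`, and `lam` are assumptions made for the example, not taken from the paper.

```python
def difference_matrix(n_items, dim):
    """Hypothetical default weights W0: one row per (item pair, dimension).

    Each row has +1 and -1 on the same dimension of two different items,
    so the row's output is zero exactly when the items agree there.
    """
    rows = []
    for i in range(n_items):
        for j in range(i + 1, n_items):
            for d in range(dim):
                row = [0.0] * (n_items * dim)
                row[i * dim + d] = 1.0
                row[j * dim + d] = -1.0
                rows.append(row)
    return rows


def prior_penalty(weights, default, lam):
    """Squared distance to the default weights, scaled by lam.

    Up to an additive constant this is the negative log of a Gaussian
    prior centred on `default` (an assumption of this sketch).
    """
    return lam * sum(
        (w - w0) ** 2
        for w_row, d_row in zip(weights, default)
        for w, w0 in zip(w_row, d_row)
    )


def apply(weights, x):
    """Plain matrix-vector product for the illustration."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


W0 = difference_matrix(n_items=3, dim=2)
# One-hot items A, B, A concatenated: the (first, third) pair compares equal.
aba = [1.0, 0.0, 0.0, 1.0, 1.0, 0.0]
comparisons = apply(W0, aba)
```

In this sketch the penalty would simply be added to the task loss, so the weights are pulled towards comparing items without being fixed to do so.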

Findings

01 ERBP achieves near-perfect generalisation on synthetic abstract pattern tasks.

02 ERBP improves performance on natural language and melody prediction tasks.

03 ERBP outperforms RBP and standard networks across multiple benchmarks.

Abstract

Deep neural networks have become the dominant approach in natural language processing (NLP). However, in recent years, it has become apparent that there are shortcomings in systematicity that limit the performance and data efficiency of deep learning in NLP. These shortcomings can be clearly shown in lower-level artificial tasks, mostly on synthetic data. Abstract patterns are the best known examples of a hard problem for neural networks in terms of generalisation to unseen data. They are defined by relations between items, such as equality, rather than their values. It has been argued that these low-level problems demonstrate the inability of neural networks to learn systematically. In this study, we propose Embedded Relation Based Patterns (ERBP) as a novel way to create a relational inductive bias that encourages learning equality and distance-based relations for abstract patterns.…
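To make the notion of an abstract pattern concrete, here is a minimal sketch that maps a sequence to the pattern of equality relations between its items. The function name `abstract_pattern` and the example syllables are illustrative assumptions, not part of the paper.

```python
def abstract_pattern(seq):
    """Label items by order of first appearance, e.g. x, y, x -> 'ABA'.

    Only equality between items matters, not the items' values.
    This sketch supports at most 26 distinct items.
    """
    labels = {}
    out = []
    for item in seq:
        if item not in labels:
            labels[item] = chr(ord("A") + len(labels))
        out.append(labels[item])
    return "".join(out)


seen = ["la", "ti", "la"]      # items that might occur in training
unseen = ["wo", "fe", "wo"]    # entirely different items, same relations
same_pattern = abstract_pattern(seen) == abstract_pattern(unseen)
```

Both sequences share the pattern ABA although they have no items in common, which is the kind of generalisation to unseen data that the abstract describes as hard for neural networks.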

Taxonomy

Topics: Topic Modeling · Advanced Graph Neural Networks · Natural Language Processing Techniques

Methods: Tanh Activation · Sigmoid Activation · Gated Recurrent Unit · Long Short-Term Memory