Recoding latent sentence representations -- Dynamic gradient-based activation modification in RNNs

Dennis Ulmer

arXiv:2101.00674·cs.CL·January 5, 2021

TL;DR

This paper introduces a gradient-based correction mechanism that lets RNNs dynamically refine their sentence representations during inference. It is inspired by how humans revise their reading of a sentence when new information arrives, and it aims to improve robustness and accuracy.

Contribution

It proposes a novel gradient-based activation modification method for RNNs that allows dynamic representation correction during inference, enhancing model flexibility.

Findings

01. Minor improvements over baseline achieved

02. Effectiveness varies with different error signals

03. Insights into model confidence and error cases

Abstract

In Recurrent Neural Networks (RNNs), encoding information in a suboptimal or erroneous way can impact the quality of representations based on later elements in the sequence and subsequently lead to wrong predictions and worse model performance. In humans, challenging cases like garden path sentences (an instance of this being the infamous "The horse raced past the barn fell") can lead their language understanding astray. However, they are still able to correct their representation accordingly and recover when new information is encountered. Inspired by this, I propose an augmentation to standard RNNs in the form of a gradient-based correction mechanism. This way, I hope to enable such models to dynamically adapt their inner representation of a sentence, adding a way to correct deviations as soon as they occur. This could therefore lead to more robust models using more flexible…
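The general idea of correcting activations with a gradient step can be illustrated with a small sketch. This is not the author's implementation: the abstract shown here does not say which error signal is used, so the squared error of a linear readout stands in as an assumption, and the names `rnn_step`, `error_signal`, `recode`, and `run_demo` as well as all weights are hypothetical. The sketch uses plain Python with a hand-derived gradient so that it runs without any deep learning library.

```python
import math


def rnn_step(h, x, W_hh, W_xh):
    """One plain tanh RNN step: h_new = tanh(W_hh @ h + W_xh @ x)."""
    return [
        math.tanh(
            sum(W_hh[i][j] * h[j] for j in range(len(h)))
            + sum(W_xh[i][k] * x[k] for k in range(len(x)))
        )
        for i in range(len(W_hh))
    ]


def error_signal(h, w_out, target):
    """Assumed error signal: squared error of a linear readout of h."""
    pred = sum(w * v for w, v in zip(w_out, h))
    return 0.5 * (pred - target) ** 2


def recode(h, w_out, target, step_size=0.1):
    """Shift the activations h one gradient step down the error signal.

    For the squared error above, dE/dh = (w_out . h - target) * w_out.
    Only h changes; the weights are left untouched.
    """
    residual = sum(w * v for w, v in zip(w_out, h)) - target
    return [v - step_size * residual * w for w, v in zip(w_out, h)]


def run_demo():
    """Run a two-step toy sequence, recoding h after every step."""
    W_hh = [[0.5, -0.3], [0.2, 0.4]]  # toy recurrent weights (assumption)
    W_xh = [[1.0], [-1.0]]            # toy input weights (assumption)
    w_out = [0.7, -0.2]               # toy readout weights (assumption)
    h = [0.0, 0.0]
    errors = []
    for x, target in [([1.0], 0.5), ([0.5], -0.2)]:
        h = rnn_step(h, x, W_hh, W_xh)
        before = error_signal(h, w_out, target)
        h = recode(h, w_out, target)  # corrected state feeds the next step
        errors.append((before, error_signal(h, w_out, target)))
    return errors
```

Calling `run_demo()` returns one `(before, after)` error pair per time step. With a small enough step size the error after recoding is lower than before, and because the corrected hidden state is what the next step receives, the correction carries forward through the rest of the sequence.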

Code & Models

Repositories

Kaleidophon/tenacious-toucan
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory