Recoding latent sentence representations -- Dynamic gradient-based activation modification in RNNs
Dennis Ulmer

TL;DR
This paper introduces a gradient-based correction mechanism for RNNs to dynamically refine sentence representations during inference, inspired by human language correction, aiming to improve robustness and accuracy.
Contribution
It proposes a novel gradient-based activation modification method for RNNs that allows dynamic representation correction during inference, enhancing model flexibility.
Findings
Minor improvements over baseline achieved
Effectiveness varies with different error signals
Insights into model confidence and error cases
Abstract
In Recurrent Neural Networks (RNNs), encoding information in a suboptimal or erroneous way can degrade the quality of representations based on later elements in the sequence and subsequently lead to wrong predictions and worse model performance. In humans, challenging cases like garden path sentences (an instance of this being the infamous "The horse raced past the barn fell") can lead language understanding astray. However, humans are still able to correct their representation accordingly and recover when new information is encountered. Inspired by this, I propose an augmentation to standard RNNs in the form of a gradient-based correction mechanism. With it, I hope to enable such models to dynamically adapt their inner representation of a sentence, adding a way to correct deviations as soon as they occur. This could therefore lead to more robust models using more flexible…
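The general idea of correcting a hidden state with a gradient step can be illustrated with a minimal sketch. This is not the author's implementation: the toy weights, the helper names (`rnn_step`, `recode`), the fixed `step_size`, and the choice of cross-entropy against an observed target as the error signal are all assumptions made for illustration. The weights stay fixed; only the activation is modified.

```python
import math

def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def softmax(z):
    m = max(z)
    e = [math.exp(x - m) for x in z]
    s = sum(e)
    return [x / s for x in e]

def rnn_step(W_h, W_x, h_prev, x):
    """Plain tanh RNN update: h_t = tanh(W_h h_{t-1} + W_x x_t)."""
    pre = [p + q for p, q in zip(matvec(W_h, h_prev), matvec(W_x, x))]
    return [math.tanh(v) for v in pre]

def cross_entropy(W_o, h, target):
    """Assumed error signal: cross-entropy of the output layer against a target."""
    return -math.log(softmax(matvec(W_o, h))[target])

def recode(W_o, h, target, step_size=0.1):
    """One gradient step on the hidden state h; the weights are not changed."""
    p = softmax(matvec(W_o, h))
    delta = [p_k - (1.0 if k == target else 0.0) for k, p_k in enumerate(p)]
    # For softmax + cross-entropy with logits W_o h: dL/dh = W_o^T (p - onehot)
    grad = [sum(W_o[k][j] * delta[k] for k in range(len(W_o)))
            for j in range(len(h))]
    return [h_j - step_size * g_j for h_j, g_j in zip(h, grad)]

# Hypothetical toy parameters: 2 hidden units, 2 input features, 3 output classes.
W_h = [[0.5, -0.3], [0.2, 0.4]]
W_x = [[1.0, 0.0], [0.0, 1.0]]
W_o = [[1.0, -1.0], [-1.0, 1.0], [0.5, 0.5]]

h = [0.0, 0.0]
inputs = [[1.0, 0.0], [0.0, 1.0]]
targets = [1, 0]  # assumed to become observable at each step

history = []
for x, y in zip(inputs, targets):
    h = rnn_step(W_h, W_x, h, x)
    loss_before = cross_entropy(W_o, h, y)
    h = recode(W_o, h, y)  # corrected state is carried to the next step
    loss_after = cross_entropy(W_o, h, y)
    history.append((loss_before, loss_after))
    print(f"error signal: {loss_before:.4f} -> {loss_after:.4f}")
```

Because the corrected state is passed on to the next time step, later representations build on the adjusted activation rather than the original one. Other error signals could be substituted for `cross_entropy`, provided their gradient with respect to the hidden state is available.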
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory
