Occam's Gates
Jonathan Raiman, Szymon Sidor

TL;DR
This paper introduces a regularization technique for attention-based RNNs that applies an L1 penalty to gating-unit activations, reducing overfitting and improving interpretability across several sequence classification tasks.
Contribution
It proposes a novel L1 regularization method for gating units in RNNs, enhancing model interpretability and reducing overfitting.
Findings
Reduced overfitting on multiple tasks
Improved interpretability of model inputs
Effective regularization technique for attention-based RNNs
Abstract
We present a complementary objective for training recurrent neural networks (RNNs) with gating units that helps with regularization and interpretability of the trained model. Attention-based RNN models have shown success in many difficult sequence-to-sequence classification problems with long- and short-term dependencies; however, these models are prone to overfitting. In this paper, we describe how to regularize these models through an L1 penalty on the activation of the gating units, and show that this technique reduces overfitting on a variety of tasks while also providing a human-interpretable visualization of the inputs used by the network. These tasks include sentiment analysis, paraphrase recognition, and question answering.
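The abstract's core idea, an L1 penalty on gate activations added to the task objective, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the sigmoid gate form, the one-gate-per-token layout, the function names, and the weight `lam` are all assumptions made for the example, since this summary does not give the paper's exact formulation.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gate_activations(gate_logits):
    """Turn one pre-activation score per input token into a gate value in (0, 1).

    Assumption: one scalar sigmoid gate per token; the paper may differ.
    """
    return [sigmoid(z) for z in gate_logits]


def l1_gate_penalty(gates):
    """L1 norm of the gate activations.

    Sigmoid gates are non-negative, so this equals their plain sum.
    """
    return sum(abs(g) for g in gates)


def regularized_loss(task_loss, gates, lam):
    """Task loss plus the weighted L1 gate penalty (lam is a hypothetical weight)."""
    return task_loss + lam * l1_gate_penalty(gates)


# Hypothetical gate scores for a five-token input.
gate_logits = [-4.0, 3.0, -5.0, 0.0, -3.5]
gates = gate_activations(gate_logits)
total = regularized_loss(task_loss=0.7, gates=gates, lam=0.01)
```

Because the penalty grows with every open gate, minimizing it alongside the task loss pushes the network to keep gates near zero unless an input is actually needed. The surviving nonzero gates can then be read as an indication of which inputs the model relied on, which is the interpretability benefit the abstract describes.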
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Explainable Artificial Intelligence (XAI)
Methods: Interpretability
