Occam's Gates

Jonathan Raiman; Szymon Sidor

arXiv:1506.08251 · cs.LG · June 30, 2015 · 1 cites

TL;DR

This paper introduces a regularization technique for attention-based RNNs that applies an L1 penalty to the activations of the gating units, reducing overfitting and improving interpretability on sentiment analysis, paraphrase recognition, and question answering.

Contribution

It proposes an L1 penalty on the activations of gating units in RNNs, used as a complementary training objective that reduces overfitting and makes the inputs used by the network easier to interpret.

Findings

01 Reduced overfitting on multiple tasks

02 Improved interpretability of model inputs

03 Effective regularization technique for attention-based RNNs

Abstract

We present a complementary objective for training recurrent neural networks (RNNs) with gating units that helps with regularization and interpretability of the trained model. Attention-based RNN models have shown success in many difficult sequence-to-sequence classification problems with long- and short-term dependencies; however, these models are prone to overfitting. In this paper, we describe how to regularize these models through an L1 penalty on the activation of the gating units, and show that this technique reduces overfitting on a variety of tasks while also providing a human-interpretable visualization of the inputs used by the network. These tasks include sentiment analysis, paraphrase recognition, and question answering.
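
The abstract describes the penalty only in words, so a small sketch may help make it concrete. The NumPy code below shows one plausible reading: a scalar sigmoid gate per timestep, with the sum of absolute gate activations added to the task loss. The gate parameterization, the function names (`gate_activations`, `l1_gate_penalty`, `total_loss`), and the `strength` coefficient are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, mapping real values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def gate_activations(inputs, w, b):
    """Hypothetical gate: one scalar g_t = sigmoid(w . x_t + b) per timestep.

    inputs has shape (timesteps, features); the result has shape (timesteps,).
    """
    return sigmoid(inputs @ w + b)


def l1_gate_penalty(gates, strength):
    """L1 penalty on gate activations: strength * sum_t |g_t|."""
    return strength * np.sum(np.abs(gates))


def total_loss(task_loss, gates, strength=0.01):
    """Task loss plus the L1 gate penalty (strength is an assumed hyperparameter)."""
    return task_loss + l1_gate_penalty(gates, strength)


# Toy example: 5 timesteps with 4 features each, random gate parameters.
rng = np.random.default_rng(0)
inputs = rng.normal(size=(5, 4))
w = rng.normal(size=4)
b = 0.0

gates = gate_activations(inputs, w, b)
loss = total_loss(task_loss=0.7, gates=gates, strength=0.01)
```

Because sigmoid gates are always positive, the L1 term here reduces to the plain sum of the gate values. Minimizing it pushes gates toward zero wherever an input does not help the task loss, which is what would make the surviving gate values readable as a map of the inputs the network relies on.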

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Explainable Artificial Intelligence (XAI)

Methods: Interpretability