Semi-Supervised Learning for Text Classification by Layer Partitioning

Alexander Hanbo Li; Abhinav Sethy

arXiv:1911.11756·cs.LG·November 27, 2019

TL;DR

This paper introduces a semi-supervised learning approach for text classification that splits a neural network into a frozen feature extractor and a trainable upper component, so that perturbation-based SSL methods can be applied to discrete text input.

Contribution

The method of layer partitioning with frozen feature extractors and trainable upper layers is a new approach tailored for semi-supervised text classification.

Findings

01. Improves accuracy over state-of-the-art methods on short texts.

02. Prevents catastrophic forgetting during training.

03. Effective with various SSL algorithms.

Abstract

Most recent neural semi-supervised learning algorithms rely on adding small perturbations to either the input vectors or their representations. These methods have been successful on computer vision tasks because images form a continuous manifold, but they are not appropriate for discrete input such as sentences. To adapt these methods to text input, we propose to decompose a neural network $M$ into two components $F$ and $U$ so that $M = U \circ F$. The layers in $F$ are then frozen, and only the layers in $U$ are updated during most of the training. In this way, $F$ serves as a feature extractor that maps the input to a high-level representation and adds systematic noise using dropout. We can then train $U$ using any state-of-the-art SSL algorithm, such as the $\Pi$-model, temporal ensembling, or mean teacher. Furthermore, this gradual unfreezing schedule also prevents a pretrained…
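The decomposition $M = U \circ F$ described in the abstract can be illustrated with a small sketch. The code below is not the authors' implementation: it assumes a toy setting in which $F$ is a single fixed dense layer with tanh and dropout, $U$ is a single softmax layer, and the SSL objective is a $\Pi$-model style squared difference between two noisy predictions. All names (`make_layer`, `extract`, `pi_model_step`) and hyperparameters are hypothetical, and the unfreezing schedule is left out because the abstract does not specify it.

```python
import copy
import math
import random


def make_layer(n_out, n_in, rng):
    """Random weights and zero biases for a dense layer (toy initialisation)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return {"W": weights, "b": [0.0] * n_out}


def linear(layer, x):
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(layer["W"], layer["b"])]


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def extract(f_layer, x, rate, rng):
    """F: frozen layer followed by dropout, the only source of noise here."""
    keep = 1.0 - rate
    h = [math.tanh(v) for v in linear(f_layer, x)]
    return [v / keep if rng.random() < keep else 0.0 for v in h]


def pi_model_step(f_layer, u_layer, x, label, rng, rate=0.5, lr=0.1, weight=1.0):
    """One SGD step on U only; pass label=None for an unlabeled example."""
    # Two stochastic passes through the frozen F give two noisy views of x.
    views = [extract(f_layer, x, rate, rng) for _ in range(2)]
    p1, p2 = (softmax(linear(u_layer, h)) for h in views)
    k = len(p1)

    # Consistency loss: mean squared difference between the two predictions.
    loss = weight * sum((a - b) ** 2 for a, b in zip(p1, p2)) / k
    g1 = [2.0 * weight * (a - b) / k for a, b in zip(p1, p2)]  # d loss / d p1
    g2 = [-g for g in g1]                                       # d loss / d p2

    # Backpropagate through softmax: dz_i = p_i * (g_i - sum_j g_j * p_j).
    def through_softmax(p, g):
        dot = sum(gj * pj for gj, pj in zip(g, p))
        return [pi * (gi - dot) for pi, gi in zip(p, g)]

    dz1, dz2 = through_softmax(p1, g1), through_softmax(p2, g2)

    if label is not None:
        # Supervised cross-entropy on the first view (assumed choice).
        loss += -math.log(p1[label])
        dz1 = [d + p - (1.0 if i == label else 0.0)
               for i, (d, p) in enumerate(zip(dz1, p1))]

    # Update U from both views; f_layer is never written to, so F stays frozen.
    for dz, h in zip((dz1, dz2), views):
        for i, d in enumerate(dz):
            u_layer["b"][i] -= lr * d
            for j, hj in enumerate(h):
                u_layer["W"][i][j] -= lr * d * hj
    return loss


rng = random.Random(0)
f_layer = make_layer(4, 3, rng)   # F: 3 inputs -> 4 features
u_layer = make_layer(2, 4, rng)   # U: 4 features -> 2 classes
f_snapshot = copy.deepcopy(f_layer)

labeled_loss = pi_model_step(f_layer, u_layer, [0.2, -0.1, 0.7], 1, rng)
unlabeled_loss = pi_model_step(f_layer, u_layer, [0.5, 0.3, -0.4], None, rng)
print(f_layer == f_snapshot)  # checks that F was left untouched
```

In this sketch the perturbation comes from dropout applied to the output of $F$, not from noise added to the discrete input tokens, which is the point the abstract makes about adapting consistency-based SSL to text. Swapping the consistency term for a temporal-ensembling or mean-teacher target would change only how the second prediction is produced.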
