Ruminating Word Representations with Random Noised Masker

Hwiyeol Jo; Byoung-Tak Zhang

arXiv:1911.03459·cs.LG·November 12, 2019

Open Access

TL;DR

This paper presents GROVER, a novel training method that iteratively adds noise to word embeddings to enhance their quality and improve text classification performance.

Contribution

GROVER introduces a gradual noise addition technique during training, leading to better word representations and improved model performance over traditional methods.

Findings

01 Improves performance on most of five text classification datasets.

02 Can be combined with other regularization techniques for further gains.

03 Produces more fine-tuned and task-specific word embeddings.

Abstract

We introduce a training method that improves both word representations and model performance, which we call GROVER (Gradual Rumination On the Vector with maskERs). The method gradually and iteratively adds random noise to word embeddings while training a model. GROVER starts from the conventional training process and then extracts the fine-tuned representations. Next, we gradually add random noise to the word representations and repeat the training process from scratch, initializing it with the noised word representations. Through this re-training, the model can compensate for some of the noise and exploit the rest to learn better representations. As a result, we obtain word representations that are further fine-tuned and specialized for the task. In experiments on 5 text classification datasets, our method improves model performance on most of the datasets.…
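The loop described in the abstract (train, extract the embeddings, add noise, retrain from scratch with the noised embeddings) can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the mean-of-embeddings logistic classifier, the Gaussian noise, the linear schedule for the masked fraction, and every name and default (`add_masked_noise`, `train`, `grover_sketch`, `max_ratio`, `scale`) are hypothetical choices made only to keep the example small.

```python
import numpy as np


def add_masked_noise(emb, ratio, scale, rng):
    """Return a copy of `emb` with Gaussian noise added to a random
    fraction `ratio` of its entries (assumed form of the noised masker)."""
    mask = rng.random(emb.shape) < ratio
    noise = rng.normal(0.0, scale, size=emb.shape)
    return emb + mask * noise


def train(emb_init, docs, labels, epochs, lr, rng):
    """Train a toy classifier from scratch with plain SGD.

    Only the embeddings are carried in from outside; the classifier
    weights are re-initialised on every call, mirroring "repeat the
    training process from scratch". The model itself is an assumption.
    """
    emb = emb_init.copy()
    w = rng.normal(0.0, 0.1, size=emb.shape[1])
    b = 0.0
    for _ in range(epochs):
        for doc, y in zip(docs, labels):
            h = emb[doc].mean(axis=0)                # document vector
            p = 1.0 / (1.0 + np.exp(-(h @ w + b)))   # sigmoid output
            g = p - y                                # dLoss/dlogit
            grad_emb = g * w / len(doc)              # gradient per token row
            w -= lr * g * h
            b -= lr * g
            # Unbuffered update so repeated token ids accumulate correctly.
            np.subtract.at(emb, doc, lr * grad_emb)
    return emb, w, b


def grover_sketch(docs, labels, vocab, dim=8, rounds=3,
                  max_ratio=0.3, scale=0.1, seed=0):
    rng = np.random.default_rng(seed)
    emb = rng.normal(0.0, 0.1, size=(vocab, dim))
    # Step 1: conventional training, then keep the fine-tuned embeddings.
    emb, w, b = train(emb, docs, labels, epochs=20, lr=0.5, rng=rng)
    for r in range(1, rounds + 1):
        # Step 2: gradually raise the fraction of noised entries
        # (a linear schedule, chosen here purely for illustration).
        ratio = max_ratio * r / rounds
        noised = add_masked_noise(emb, ratio, scale, rng)
        # Step 3: retrain from scratch, initialised with noised embeddings.
        emb, w, b = train(noised, docs, labels, epochs=20, lr=0.5, rng=rng)
    return emb, w, b
```

Here `docs` is a list of integer NumPy arrays of token ids and `labels` a list of 0/1 targets. A fuller version would presumably also track validation performance between rounds, which the abstract does not describe and this sketch leaves out.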

Taxonomy

Topics: Speech Recognition and Synthesis · Topic Modeling · Speech and dialogue systems