Universal Language Model Fine-tuning for Text Classification
Jeremy Howard, Sebastian Ruder

TL;DR
ULMFiT is a transfer learning method for NLP that fine-tunes a pretrained language model. It outperforms the state of the art on six text classification tasks and needs far fewer labeled examples than training from scratch.
Contribution
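The paper presents ULMFiT, a transfer learning method that can be applied to any NLP task, along with the techniques that are key for fine-tuning a language model. It outperforms the previous state of the art on text classification and reduces the amount of labeled data required.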
Findings
Outperforms state-of-the-art on six text classification tasks
Reduces error by 18-24% on most datasets
With only 100 labeled examples, matches the performance of training from scratch on 100x more data
Abstract
Inductive transfer learning has greatly impacted computer vision, but existing approaches in NLP still require task-specific modifications and training from scratch. We propose Universal Language Model Fine-tuning (ULMFiT), an effective transfer learning method that can be applied to any task in NLP, and introduce techniques that are key for fine-tuning a language model. Our method significantly outperforms the state-of-the-art on six text classification tasks, reducing the error by 18-24% on the majority of datasets. Furthermore, with only 100 labeled examples, it matches the performance of training from scratch on 100x more data. We open-source our pretrained models and code.
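The abstract does not name the fine-tuning techniques it refers to. In the full paper they are discriminative fine-tuning (a separate learning rate per layer), slanted triangular learning rates (a short linear warm-up followed by a long linear decay), and gradual unfreezing. The following is a minimal sketch of the first two, assuming the schedule and default values given in the paper (cut_frac = 0.1, ratio = 32, eta_max = 0.01, per-layer decay factor 2.6). The function names are hypothetical and are not taken from the authors' released code.

```python
import math


def slanted_triangular_lr(t, total_steps, eta_max=0.01, cut_frac=0.1, ratio=32):
    """Learning rate at training step t under a slanted triangular schedule.

    The rate rises linearly from eta_max / ratio to eta_max during the first
    cut_frac of training, then decays linearly back to eta_max / ratio.
    """
    # Step at which the schedule switches from increasing to decreasing.
    # The max(1, ...) guard against division by zero for very short runs
    # is an addition of this sketch, not part of the paper's formula.
    cut = max(1, math.floor(total_steps * cut_frac))
    if t < cut:
        p = t / cut
    else:
        p = 1 - (t - cut) / (cut * (1 / cut_frac - 1))
    return eta_max * (1 + p * (ratio - 1)) / ratio


def discriminative_lrs(eta_top, n_layers, decay=2.6):
    """Per-layer learning rates, ordered from the lowest layer to the top layer.

    The top layer uses eta_top; each layer below it uses the rate of the
    layer above divided by `decay`.
    """
    return [eta_top / decay ** (n_layers - 1 - layer) for layer in range(n_layers)]


if __name__ == "__main__":
    total = 1000
    for step in (0, 50, 100, 500, 1000):
        print(step, slanted_triangular_lr(step, total))
    print(discriminative_lrs(0.01, n_layers=4))
```

In a real training loop, the two would typically be combined by using the scheduled value as the top-layer rate at each step and deriving the lower layers' rates from it. Gradual unfreezing is omitted here because it depends on the model's layer structure.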
Taxonomy
Methods: Dropout · Adam · Sigmoid Activation · Tanh Activation · Temporal Activation Regularization · DropConnect · Long Short-Term Memory · Activation Regularization · Embedding Dropout · Variational Dropout
