Training Compact Models for Low Resource Entity Tagging using Pre-trained Language Models

Peter Izsak; Shira Guskin; Moshe Wasserblat

arXiv:1910.06294·cs.CL·October 18, 2019

TL;DR

This paper proposes a semi-supervised method for training compact models for low-resource named entity recognition (NER) by leveraging pre-trained language models and unlabeled data, reporting 36x model compression with competitive accuracy and faster inference than large pre-trained models.

Contribution

It introduces a novel semi-supervised training approach that combines transfer learning with model compression for low-resource NER tasks, enabling deployment on edge devices.

Findings

01. Achieves 36x model compression with competitive accuracy.

02. Runs inference significantly faster than large pre-trained models.

03. Enables deployment of NER models in resource-constrained environments.

Abstract

Training models on low-resource named entity recognition tasks has been shown to be a challenge, especially in industrial applications where deploying updated models is a continuous effort and crucial for business operations. In such cases there is often an abundance of unlabeled data, while labeled data is scarce or unavailable. Pre-trained language models trained to extract contextual features from text were shown to improve many natural language processing (NLP) tasks, including scarcely labeled tasks, by leveraging transfer learning. However, such models impose a heavy memory and computational burden, making it a challenge to train and deploy such models for inference use. In this work-in-progress we combined the effectiveness of transfer learning provided by pre-trained masked language models with a semi-supervised approach to train a fast and compact model using labeled and…
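
The abstract is cut off before it describes the training procedure, so the sketch below is only one plausible reading, not the authors' method. It assumes a teacher-student setup: a large pre-trained model (the "teacher") supplies soft pseudo-labels for unlabeled tokens, while labeled tokens use their gold tags, and a compact "student" tagger is trained on both. All function names and the dictionary layout of `examples` are hypothetical and chosen for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of per-tag scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target_probs, student_logits):
    """Cross-entropy between a target distribution and the student's softmax."""
    probs = softmax(student_logits)
    return -sum(t * math.log(p) for t, p in zip(target_probs, probs) if t > 0)

def one_hot(index, size):
    """Distribution that puts all probability mass on one tag."""
    return [1.0 if i == index else 0.0 for i in range(size)]

def token_loss(student_logits, gold_tag=None, teacher_logits=None):
    """Loss for one token (assumed scheme, not confirmed by the source).

    Labeled token: cross-entropy against the gold tag.
    Unlabeled token: cross-entropy against the teacher's soft pseudo-label.
    """
    if gold_tag is not None:
        target = one_hot(gold_tag, len(student_logits))
    elif teacher_logits is not None:
        target = softmax(teacher_logits)
    else:
        raise ValueError("need either a gold tag or teacher logits")
    return cross_entropy(target, student_logits)

def batch_loss(examples):
    """Mean token loss over a mix of labeled and unlabeled tokens."""
    losses = [token_loss(**ex) for ex in examples]
    return sum(losses) / len(losses)

# Hypothetical batch: one labeled token and one teacher-labeled token.
examples = [
    {"student_logits": [0.0, 0.0, 0.0], "gold_tag": 1},
    {"student_logits": [2.0, 0.0, 0.0], "teacher_logits": [2.0, 0.0, 0.0]},
]
loss = batch_loss(examples)
```

In a real system the student logits would come from a small trainable network and the loss would be minimized with gradient descent; the point here is only how labeled and unlabeled tokens can contribute to a single objective.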
