Language Model Pre-Training with Sparse Latent Typing

Liliang Ren; Zixuan Zhang; Han Wang; Clare R. Voss; Chengxiang Zhai; Heng Ji

arXiv:2210.12582·cs.CL·October 28, 2022

TL;DR

This paper introduces Sparse Latent Typing, a novel pre-training objective for language models that encourages learning interpretable, sparse, sentence-level latent types, significantly enhancing performance on information extraction tasks.

Contribution

The paper proposes a new pre-training objective, Sparse Latent Typing, enabling models to learn interpretable, sparse sentence-level latent types without external knowledge.

Findings

1. Learns interpretable latent type categories in a self-supervised manner.
2. Improves downstream information extraction tasks in supervised and few-shot settings.
3. Achieves significant performance gains over baseline models.

Abstract

Modern large-scale Pre-trained Language Models (PLMs) have achieved tremendous success on a wide range of downstream tasks. However, most of the LM pre-training objectives only focus on text reconstruction, but have not sought to learn latent-level interpretable representations of sentences. In this paper, we manage to push the language models to obtain a deeper understanding of sentences by proposing a new pre-training objective, Sparse Latent Typing, which enables the model to sparsely extract sentence-level keywords with diverse latent types. Experimental results show that our model is able to learn interpretable latent type categories in a self-supervised manner without using any external knowledge. Besides, the language model pre-trained with such an objective also significantly improves Information Extraction related downstream tasks in both supervised and few-shot settings. Our…
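As a rough illustration of what "sparsely extract sentence-level keywords with diverse latent types" could look like once a model produces per-token type scores, here is a toy sketch. It is not the paper's implementation: the function name `sparse_latent_typing`, the convention that type index 0 means "no type", and the hand-written logits are all assumptions made for this example, and the training objective itself is not shown.

```python
from typing import List, Tuple


def sparse_latent_typing(
    tokens: List[str],
    type_logits: List[List[float]],
) -> List[Tuple[str, int]]:
    """Give each token its highest-scoring latent type and drop tokens typed 0.

    Assumption for this sketch: type index 0 is a "no type" slot, so tokens
    assigned to it are treated as non-keywords. That is what makes the
    selection sparse here.
    """
    if len(tokens) != len(type_logits):
        raise ValueError("one row of logits is required per token")
    selected = []
    for token, logits in zip(tokens, type_logits):
        # Index of the largest logit (hard assignment, no sampling).
        type_id = max(range(len(logits)), key=lambda k: logits[k])
        if type_id != 0:
            selected.append((token, type_id))
    return selected


# Hand-written scores standing in for a model's output (hypothetical values).
tokens = ["She", "opened", "the", "door", "with", "a", "key"]
type_logits = [
    [0.2, 1.5, 0.1],  # She
    [0.1, 0.3, 2.0],  # opened
    [3.0, 0.1, 0.2],  # the
    [0.5, 2.2, 0.4],  # door
    [2.5, 0.2, 0.9],  # with
    [2.8, 0.1, 0.1],  # a
    [0.3, 1.9, 0.6],  # key
]
keywords = sparse_latent_typing(tokens, type_logits)
print(keywords)
```

In this toy input, function words score highest on the "no type" slot and are dropped, while the remaining tokens are kept with one of two latent type ids. A trained model would have to learn both the scores and the meaning of each type id on its own, since the abstract states that no external knowledge is used.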

Code & Models

Repositories

renll/sparselt (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification