Language Model Pre-Training with Sparse Latent Typing
Liliang Ren, Zixuan Zhang, Han Wang, Clare R. Voss, Chengxiang Zhai, Heng Ji

TL;DR
This paper introduces Sparse Latent Typing, a novel pre-training objective for language models that encourages learning interpretable, sparse, sentence-level latent types, significantly enhancing performance on information extraction tasks.
Contribution
The paper proposes a new pre-training objective, Sparse Latent Typing, enabling models to learn interpretable, sparse sentence-level latent types without external knowledge.
Findings
Learns interpretable latent type categories in a self-supervised manner.
Improves downstream information extraction tasks in supervised and few-shot settings.
Achieves significant performance gains over baseline models.
Abstract
Modern large-scale Pre-trained Language Models (PLMs) have achieved tremendous success on a wide range of downstream tasks. However, most of the LM pre-training objectives only focus on text reconstruction, but have not sought to learn latent-level interpretable representations of sentences. In this paper, we manage to push the language models to obtain a deeper understanding of sentences by proposing a new pre-training objective, Sparse Latent Typing, which enables the model to sparsely extract sentence-level keywords with diverse latent types. Experimental results show that our model is able to learn interpretable latent type categories in a self-supervised manner without using any external knowledge. Besides, the language model pre-trained with such an objective also significantly improves Information Extraction related downstream tasks in both supervised and few-shot settings. Our…
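The idea of sparsely selecting keywords and giving each one a latent type can be illustrated with a toy sketch. This is not the paper's implementation: the per-token type scores, the convention that type 0 means "not a keyword", the penalty definition, and the function names are all assumptions made for illustration only.

```python
import math

NULL_TYPE = 0  # assumption: latent type 0 means "this token is not a keyword"


def softmax(scores):
    """Turn raw scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_latent_typing(tokens, type_scores):
    """Give each token its most probable latent type and keep non-null ones.

    Returns the selected (token, type) pairs and a toy sparsity penalty:
    the average probability mass placed on non-null types. A training
    objective could push this value down so that few tokens are selected.
    """
    selected = []
    non_null_mass = 0.0
    for token, scores in zip(tokens, type_scores):
        probs = softmax(scores)
        best = max(range(len(probs)), key=probs.__getitem__)
        non_null_mass += 1.0 - probs[NULL_TYPE]
        if best != NULL_TYPE:
            selected.append((token, best))
    return selected, non_null_mass / len(tokens)


# Hypothetical scores for three latent types (0 = null) over one sentence.
tokens = ["She", "founded", "the", "company", "in", "Paris"]
type_scores = [
    [2.0, 0.1, 0.0],
    [0.0, 3.0, 0.2],
    [4.0, 0.0, 0.0],
    [0.1, 0.3, 2.5],
    [3.0, 0.0, 0.1],
    [0.0, 0.2, 3.0],
]
selected, penalty = sparse_latent_typing(tokens, type_scores)
print(selected)  # [('founded', 1), ('company', 2), ('Paris', 2)]
```

In this sketch, function words fall into the null type and are dropped, while the remaining tokens are kept together with a type index. In a real model the scores would come from the language model's hidden states rather than being written by hand.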
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
