BIOptimus: Pre-training an Optimal Biomedical Language Model with Curriculum Learning for Named Entity Recognition

Pavlova Vera; Mohammed Makhlouf

arXiv:2308.08625·cs.CL·August 21, 2023

TL;DR

This paper introduces BIOptimus, a novel biomedical language model trained with curriculum learning and weight distillation, achieving state-of-the-art results on biomedical NER tasks by optimizing pre-training strategies.

Contribution

The paper proposes a new pre-training method combining curriculum learning and weight distillation, improving biomedical NER performance and pre-training efficiency.

Findings

1. BIOptimus outperforms existing biomedical LMs on NER tasks.
2. Pre-training with curriculum learning enhances model performance.
3. Weight distillation accelerates pre-training and boosts accuracy.

Abstract

Using language models (LMs) pre-trained in a self-supervised setting on large corpora and then fine-tuning for a downstream task has helped to deal with the problem of limited labeled data for supervised learning tasks such as Named Entity Recognition (NER). Recent research in biomedical language processing has offered a number of biomedical LMs pre-trained using different methods and techniques that advance results on many BioNLP tasks, including NER. However, there is still a lack of a comprehensive comparison of pre-training approaches that would work more optimally in the biomedical domain. This paper aims to investigate different pre-training methods, such as pre-training the biomedical LM from scratch and pre-training it in a continued fashion. We compare existing methods with our proposed pre-training method of initializing weights for new tokens by distilling existing weights from…
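As a rough illustration of what "initializing weights for new tokens" from existing weights can look like, the sketch below uses a simple stand-in technique: a new vocabulary entry is initialized to the mean of the embeddings of the subword pieces it previously split into. This is an assumption made for illustration only, not the paper's distillation procedure (the abstract above is truncated before it says where the weights are distilled from). The helper names `greedy_wordpiece` and `init_new_token`, the toy vocabulary, and the embedding values are all hypothetical.

```python
from typing import Dict, List

Embedding = List[float]


def greedy_wordpiece(word: str, vocab: Dict[str, Embedding]) -> List[str]:
    """Split `word` into longest-matching pieces from `vocab` (toy WordPiece).

    Non-initial pieces carry the conventional "##" prefix.
    """
    pieces: List[str] = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        while end > start:
            piece = word[start:end] if start == 0 else "##" + word[start:end]
            if piece in vocab:
                match = piece
                break
            end -= 1
        if match is None:
            raise ValueError(f"cannot tokenize {word!r} with the given vocabulary")
        pieces.append(match)
        start = end
    return pieces


def init_new_token(word: str, vocab: Dict[str, Embedding]) -> Embedding:
    """Initialize a new token as the mean of its existing subword embeddings.

    Assumption: mean-of-subwords is a simplified stand-in, not the paper's method.
    """
    pieces = greedy_wordpiece(word, vocab)
    dim = len(vocab[pieces[0]])
    return [sum(vocab[p][i] for p in pieces) / len(pieces) for i in range(dim)]


# Toy 2-dimensional embedding table with made-up values.
toy_vocab: Dict[str, Embedding] = {
    "cyto": [1.0, 0.0],
    "##kine": [0.0, 1.0],
}

# "cytokine" currently splits into two pieces; give it a single entry instead.
new_vector = init_new_token("cytokine", toy_vocab)
toy_vocab["cytokine"] = new_vector
```

Once the new entry is added, the toy tokenizer returns the word as a single piece, and its starting vector already sits near its former subword pieces rather than at a random point. That is the general motivation for reusing existing weights instead of random initialization.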

Code & Models

Repositories

rttl-ai/bioptimus (official)

Models

rttl-ai/BIOptimus (Hugging Face model; 5 downloads, 2 likes)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Biomedical Text Mining and Ontologies

Methods: Attention Is All You Need · Adam · Attention Dropout · Linear Layer · Layer Normalization · Residual Connection · Dense Connections · Softmax · Weight Decay