Enhancing Automatic PT Tagging for MEDLINE Citations Using Transformer-Based Models
Victor H. Cid, James Mork

TL;DR
This paper explores the use of Transformer-based models such as BERT to improve the automatic tagging of MEDLINE citations with Medical Subject Headings (MeSH) Publication Types (PTs), with the aim of improving biomedical literature indexing.
Contribution
It applies pre-trained Transformer models (BERT and DistilBERT) to PT prediction and reports the potential for improved accuracy over the legacy NLP algorithms used in the current automated indexing process.
Findings
Transformer models significantly improve PT tagging accuracy
Binary classifier ensembles enhance retrieval performance
Scalable and efficient biomedical indexing is achievable
Abstract
We investigated the feasibility of predicting Medical Subject Headings (MeSH) Publication Types (PTs) from MEDLINE citation metadata using the pre-trained Transformer-based models BERT and DistilBERT. This study addresses limitations in the current automated indexing process, which relies on legacy NLP algorithms. We evaluated monolithic multi-label classifiers and binary classifier ensembles to enhance the retrieval of biomedical literature. Results demonstrate the potential of Transformer models to significantly improve PT tagging accuracy, paving the way for scalable, efficient biomedical indexing.
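The abstract contrasts two classifier designs without showing how their decisions differ. The sketch below is an illustration only, not the authors' implementation: a monolithic multi-label classifier produces one logit per PT in a single pass, while a binary ensemble runs one independent scorer per PT, each with its own threshold. The function names, the keyword-based scorers (stand-ins for fine-tuned BERT or DistilBERT models), the logit values, and the thresholds are all assumptions made for this example.

```python
import math
from typing import Callable, Dict, List


def sigmoid(z: float) -> float:
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def monolithic_predict(logits: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Monolithic multi-label decision: one model emits a logit per PT,
    and every PT whose sigmoid probability clears a shared threshold is kept."""
    return sorted(pt for pt, z in logits.items() if sigmoid(z) >= threshold)


def ensemble_predict(
    text: str,
    classifiers: Dict[str, Callable[[str], float]],
    thresholds: Dict[str, float],
) -> List[str]:
    """Binary-ensemble decision: one independent scorer per PT,
    each compared against its own (possibly different) threshold."""
    return sorted(
        pt for pt, score in classifiers.items() if score(text) >= thresholds[pt]
    )


def keyword_scorer(keyword: str) -> Callable[[str], float]:
    """Hypothetical stand-in for a fine-tuned binary model: returns a high
    score when the keyword appears in the text, a low score otherwise."""
    return lambda text: 0.9 if keyword in text.lower() else 0.1


# Illustrative inputs (assumed values, not taken from the paper).
toy_logits = {
    "Randomized Controlled Trial": 2.0,
    "Review": -1.5,
    "Case Reports": -3.0,
}
toy_classifiers = {
    "Randomized Controlled Trial": keyword_scorer("randomized"),
    "Review": keyword_scorer("review"),
    "Case Reports": keyword_scorer("case report"),
}
toy_thresholds = {
    "Randomized Controlled Trial": 0.5,
    "Review": 0.7,
    "Case Reports": 0.5,
}

citation = "Randomized controlled trial of a hypothetical intervention"
print(monolithic_predict(toy_logits))
print(ensemble_predict(citation, toy_classifiers, toy_thresholds))
```

One practical difference this makes visible: the ensemble allows each PT to be tuned, thresholded, or retrained separately, at the cost of running one model per PT, whereas the monolithic classifier needs a single pass per citation.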
Taxonomy
Methods: Attention Dropout · WordPiece · Weight Decay · Linear Layer · Linear Warmup With Linear Decay · Adam · Dense Connections · BERT · Softmax
