BagBERT: BERT-based bagging-stacking for multi-topic classification
Loïc Rakotoson, Charles Letaillieur, Sylvain Massip, Fréjus Laleye

TL;DR
BagBERT introduces an ensemble approach that combines bagging and stacking of BERT-based models for multi-topic classification of COVID-19 literature, reaching an instance-based F1 of 92.96 and a label-based micro-F1 of 91.35.
Contribution
It presents a new two-stage ensemble method leveraging non-optimal weights and heterogeneous models for enhanced classification performance.
Findings
Achieved an Instance-based F1 of 92.96
Obtained a Label-based micro-F1 of 91.35
Outperformed a classical, globally efficient single model on COVID-19 literature classification
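The two scores above can be illustrated with a minimal sketch, assuming the common definitions of instance-based (per-document, then averaged) F1 and label-based micro-F1 (counts pooled over all document-label pairs). The function names, the toy labels, and the handling of empty label sets are assumptions for illustration; the task's official evaluation script may differ in details.

```python
from typing import List, Set


def instance_f1(gold: List[Set[str]], pred: List[Set[str]]) -> float:
    """Average of the F1 scores computed separately for each document."""
    scores = []
    for g, p in zip(gold, pred):
        if not g and not p:
            # Assumption: an empty prediction for an unlabeled document is perfect.
            scores.append(1.0)
            continue
        true_pos = len(g & p)
        scores.append(2 * true_pos / (len(g) + len(p)))
    return sum(scores) / len(scores)


def label_micro_f1(gold: List[Set[str]], pred: List[Set[str]]) -> float:
    """F1 over true/false positives and false negatives pooled across all labels."""
    true_pos = sum(len(g & p) for g, p in zip(gold, pred))
    false_pos = sum(len(p - g) for g, p in zip(gold, pred))
    false_neg = sum(len(g - p) for g, p in zip(gold, pred))
    denom = 2 * true_pos + false_pos + false_neg
    return 2 * true_pos / denom if denom else 0.0


# Toy data with illustrative topic names (not taken from the paper's results).
gold = [{"Treatment", "Mechanism"}, {"Prevention"}, {"Diagnosis"}]
pred = [{"Treatment"}, {"Prevention"}, {"Diagnosis", "Treatment"}]

print(round(instance_f1(gold, pred), 4))
print(round(label_micro_f1(gold, pred), 4))
```

The two metrics can disagree on the same predictions: the instance-based score weights every document equally, while the micro-averaged score weights every label assignment equally.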
Abstract
This paper describes our submission to the COVID-19 literature annotation task at BioCreative VII. We propose an approach that exploits the knowledge of globally non-optimal weights, which are usually rejected, to build a rich representation of each label. Our approach consists of two stages: (1) a bagging of various initializations of the training data that features weakly trained weights, and (2) a stacking of heterogeneous vocabulary models based on BERT and RoBERTa embeddings. The aggregation of these weak insights performs better than a classical globally efficient model. The purpose is the distillation of the richness of knowledge to a simpler and lighter model. Our system obtains an instance-based F1 of 92.96 and a label-based micro-F1 of 91.35.
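The two stages described in the abstract can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the logits are hard-coded instead of coming from fine-tuned BERT or RoBERTa checkpoints, the stacking step uses fixed hypothetical weights in place of whatever meta-learner the paper trains, and the names `bag`, `stack`, and `predict` are invented for this example.

```python
import math
from typing import List

Matrix = List[List[float]]  # rows: documents, columns: labels


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def bag(logits_per_checkpoint: List[Matrix]) -> Matrix:
    """Stage 1 (bagging): average per-label probabilities over several
    weakly trained checkpoints of the same base model."""
    n_runs = len(logits_per_checkpoint)
    n_docs = len(logits_per_checkpoint[0])
    n_labels = len(logits_per_checkpoint[0][0])
    return [
        [
            sum(sigmoid(run[d][l]) for run in logits_per_checkpoint) / n_runs
            for l in range(n_labels)
        ]
        for d in range(n_docs)
    ]


def stack(bagged: List[Matrix], weights: List[float]) -> Matrix:
    """Stage 2 (stacking): combine heterogeneous models. A fixed weighted
    average stands in here for a learned meta-model (assumption)."""
    total = sum(weights)
    n_docs = len(bagged[0])
    n_labels = len(bagged[0][0])
    return [
        [
            sum(w * model[d][l] for w, model in zip(weights, bagged)) / total
            for l in range(n_labels)
        ]
        for d in range(n_docs)
    ]


def predict(probs: Matrix, threshold: float = 0.5) -> List[List[int]]:
    """Multi-label decision: each label is switched on independently."""
    return [[1 if p >= threshold else 0 for p in row] for row in probs]


# Hypothetical logits: 2 checkpoints per model, 1 document, 2 labels.
bert_like = [[[2.0, -2.0]], [[0.0, -1.0]]]
roberta_like = [[[1.0, 1.0]], [[3.0, -3.0]]]

combined = stack([bag(bert_like), bag(roberta_like)], weights=[1.0, 1.0])
print(predict(combined))
```

Averaging probabilities rather than hard votes lets a checkpoint that is individually weak still contribute partial evidence for a label, which is the intuition behind keeping the non-optimal weights.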
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Biomedical Text Mining and Ontologies
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Layer Normalization · WordPiece · Dense Connections · Weight Decay · Softmax · Dropout · Residual Connection
