Large-Scale Online Semantic Indexing of Biomedical Articles via an Ensemble of Multi-Label Classification Models
Yannis Papanikolaou, Grigorios Tsoumakas, Manos Laliotis, Nikos Markantonatos, Ioannis Vlahavas

TL;DR
This paper introduces a multi-label ensemble method with statistical validation for large-scale biomedical article indexing, achieving top results in the BioASQ challenge without heuristics.
Contribution
It presents a novel ensemble approach incorporating McNemar tests for validation, tailored for large-scale biomedical multi-label classification tasks.
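For readers unfamiliar with the test, here is a minimal sketch of a McNemar comparison between two classifiers on a single label. It is not the authors' implementation: the function name `mcnemar`, the continuity-corrected chi-square form, and the toy data are assumptions made for illustration, and how the paper turns the test result into an ensemble decision is not shown here.

```python
import math


def mcnemar(y_true, pred_a, pred_b):
    """Continuity-corrected McNemar test on paired binary predictions.

    Hypothetical helper for illustration only. Returns (b, c, statistic,
    p_value), where b counts instances classifier A got right and B got
    wrong, and c counts the reverse.
    """
    if not (len(y_true) == len(pred_a) == len(pred_b)):
        raise ValueError("all inputs must have the same length")

    b = c = 0
    for truth, a, other in zip(y_true, pred_a, pred_b):
        a_ok = a == truth
        other_ok = other == truth
        if a_ok and not other_ok:
            b += 1
        elif other_ok and not a_ok:
            c += 1

    if b + c == 0:
        # The classifiers never disagree in correctness: no evidence of a difference.
        return b, c, 0.0, 1.0

    statistic = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return b, c, statistic, p_value


# Toy data (assumed): predictions for one label on ten documents.
y_true = [1] * 10
pred_a = [1] * 10            # classifier A is right on every document
pred_b = [0] * 8 + [1] * 2   # classifier B is wrong on the first eight

b, c, statistic, p_value = mcnemar(y_true, pred_a, pred_b)
significant = p_value < 0.05  # 0.05 is an assumed significance level
```

A small p-value suggests the two classifiers differ in accuracy on that label beyond what chance would explain, which is the kind of evidence an ensemble could require before preferring one model over another.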
Findings
The ensemble method outperformed other approaches in experiments on subsets of the BioASQ corpus.
It achieved first place in the first batch of BioASQ 2014.
A fully automated machine learning approach, without heuristics, proved highly competitive.
Abstract
Background: In this paper we present the approaches and methods employed in order to deal with a large scale multi-label semantic indexing task of biomedical papers. This work was mainly implemented within the context of the BioASQ challenge of 2014. Methods: The main contribution of this work is a multi-label ensemble method that incorporates a McNemar statistical significance test in order to validate the combination of the constituent machine learning algorithms. Some secondary contributions include a study on the temporal aspects of the BioASQ corpus (observations apply also to the BioASQ's super-set, the PubMed articles collection) and the proper adaptation of the algorithms used to deal with this challenging classification task. Results: The ensemble method we developed is compared to other approaches in experimental scenarios with subsets of the BioASQ corpus giving positive…
Taxonomy
Topics: Biomedical Text Mining and Ontologies · Advanced Text Analysis Techniques · Machine Learning in Bioinformatics
