Chemical-protein relation extraction with ensembles of SVM, CNN, and RNN models
Yifan Peng, Anthony Rios, Ramakanth Kavuluru, Zhiyong Lu

TL;DR
This paper presents an ensemble of SVM, CNN, and RNN models for extracting chemical-protein relations from PubMed abstracts. It achieved the highest performance in the CHEMPROT track of BioCreative VI during the 2017 challenge.
Contribution
The study introduces an ensemble system that combines the outputs of three classifiers (a Support Vector Machine, a Convolutional Neural Network, and a Recurrent Neural Network) through majority voting or stacking for chemical-protein relation extraction.
Findings
Achieved an F-score of 0.6410 (precision 0.7266, recall 0.5735) on the CHEMPROT dataset.
Achieved the highest performance in the CHEMPROT track of BioCreative VI during the 2017 challenge.
Demonstrated the effectiveness of ensemble methods in biomedical relation extraction.
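The reported F-score can be checked against the precision and recall given in the abstract (0.7266 and 0.5735). The sketch below assumes the score is the balanced F-score, that is, the harmonic mean of precision and recall; the function name `f_score` is chosen here for illustration and does not come from the paper.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract.
reported = f_score(0.7266, 0.5735)
print(f"{reported:.4f}")  # prints 0.6410
```

The gap between precision and recall indicates that the system missed more true relations than it produced false ones.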
Abstract
Text mining the relations between chemicals and proteins is an increasingly important task. The CHEMPROT track at BioCreative VI aims to promote the development and evaluation of systems that can automatically detect chemical-protein relations in running text (PubMed abstracts). This manuscript describes our submission, an ensemble of three systems: a Support Vector Machine, a Convolutional Neural Network, and a Recurrent Neural Network. Their outputs are combined by majority voting or stacking. Our CHEMPROT system obtained 0.7266 in precision and 0.5735 in recall for an F-score of 0.6410, demonstrating the effectiveness of machine learning-based approaches for automatic relation extraction from biomedical literature. Our submission achieved the highest performance in the task during the 2017 challenge.
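As a rough illustration of the majority-voting option, the sketch below combines per-candidate labels from three classifiers. It is not the authors' implementation: the label strings, the example predictions, and the rule of falling back to a "no relation" label when no label wins a strict majority are all assumptions made for this example.

```python
from collections import Counter

NEGATIVE = "NONE"  # hypothetical label meaning "no relation"


def majority_vote(predictions, fallback=NEGATIVE):
    """Return the label chosen by more than half of the models, else the fallback."""
    label, count = Counter(predictions).most_common(1)[0]
    return label if count > len(predictions) / 2 else fallback


# Made-up predictions for four chemical-protein candidate pairs.
svm = ["CPR:3", "CPR:4", "NONE", "CPR:9"]
cnn = ["CPR:3", "NONE", "NONE", "CPR:4"]
rnn = ["CPR:4", "CPR:4", "CPR:5", "CPR:3"]

ensemble = [majority_vote(votes) for votes in zip(svm, cnn, rnn)]
print(ensemble)  # prints ['CPR:3', 'CPR:4', 'NONE', 'NONE']
```

In the last pair all three models disagree, so the assumed fallback applies. Stacking would replace this fixed rule with a second-level classifier trained on the three models' outputs.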
Taxonomy
Topics: Biomedical Text Mining and Ontologies · Topic Modeling · Natural Language Processing Techniques
