Convolutional neural networks for chemical-disease relation extraction are improved with character-based word embeddings
Dat Quoc Nguyen, Karin Verspoor

TL;DR
This paper shows that adding character-based word representations, learned with a CNN or an LSTM over character embeddings, to a standard CNN-based relation extraction model improves chemical-disease relation extraction on the BioCreative-V CDR corpus, with state-of-the-art results relative to previous neural approaches.
Contribution
It incorporates character-based word representations into a standard CNN-based relation extraction model and shows that the resulting models improve on otherwise comparable models that lack character-level information.
Findings
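Character-based word representations improve chemical-disease relation extraction over models that do not use them
Models achieve state-of-the-art results relative to previous neural approaches on the BioCreative-V CDR corpus
Two character encoders, a CNN and an LSTM, are tested for learning word vectors from character embeddings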
Abstract
We investigate the incorporation of character-based word representations into a standard CNN-based relation extraction model. We experiment with two common neural architectures, CNN and LSTM, to learn word vector representations from character embeddings. Through a task on the BioCreative-V CDR corpus, extracting relationships between chemicals and diseases, we show that models exploiting the character-based word representations improve on models that do not use this information, obtaining state-of-the-art results relative to previous neural approaches.
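The idea of learning a word vector from character embeddings with a CNN can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the dimensions, the character vocabulary, the random (untrained) weights, the tanh nonlinearity, and the function names `char_cnn_word_vector` and `word_representation` are all assumptions made for illustration. The final concatenation with a word embedding is likewise an assumed way of combining the two representations.

```python
import numpy as np

rng = np.random.default_rng(0)

# All sizes below are illustrative assumptions, not values from the paper.
CHAR_DIM = 8       # character embedding size
NUM_FILTERS = 16   # number of convolution filters (output vector size)
WINDOW = 3         # number of consecutive characters seen by each filter
WORD_DIM = 32      # size of the ordinary word embedding

# Hypothetical character vocabulary; id 0 is reserved for padding/unknown.
CHARS = "abcdefghijklmnopqrstuvwxyz0123456789-"
char_to_id = {c: i + 1 for i, c in enumerate(CHARS)}

# Randomly initialised parameters; in a real model these would be trained.
char_emb = rng.normal(scale=0.1, size=(len(CHARS) + 1, CHAR_DIM))
char_emb[0] = 0.0  # padding embeds to the zero vector
conv_w = rng.normal(scale=0.1, size=(WINDOW * CHAR_DIM, NUM_FILTERS))
conv_b = np.zeros(NUM_FILTERS)


def char_cnn_word_vector(word):
    """Build a fixed-size vector for a word from its characters.

    Steps: embed characters, apply a 1D convolution over character
    windows, then max-pool over window positions.
    """
    ids = [char_to_id.get(c, 0) for c in word.lower()]
    pad = [0] * (WINDOW - 1)
    ids = pad + ids + pad                      # pad so short words still work
    x = char_emb[ids]                          # (length, CHAR_DIM)
    windows = np.stack([
        x[i:i + WINDOW].reshape(-1)            # flatten each character window
        for i in range(len(ids) - WINDOW + 1)
    ])                                         # (positions, WINDOW * CHAR_DIM)
    feats = np.tanh(windows @ conv_w + conv_b) # (positions, NUM_FILTERS)
    return feats.max(axis=0)                   # max-pool over positions


def word_representation(word, word_vec):
    """Concatenate a word embedding with the character-based vector (assumed)."""
    return np.concatenate([word_vec, char_cnn_word_vector(word)])


if __name__ == "__main__":
    vec = word_representation("acetaminophen", np.zeros(WORD_DIM))
    print(vec.shape)
```

Because the vector is computed from characters, a word that is missing from the word-embedding vocabulary still receives a character-based representation. An LSTM variant would replace the convolution and max-pooling steps with a recurrent pass over the same character embeddings.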
Taxonomy
Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory
