Convolutional neural networks for chemical-disease relation extraction are improved with character-based word embeddings

Dat Quoc Nguyen; Karin Verspoor

arXiv:1805.10586·cs.CL·May 29, 2018

TL;DR

This paper shows that adding character-based word representations, learned from character embeddings with either a CNN or an LSTM, to a standard CNN-based relation extraction model improves chemical-disease relation extraction on the BioCreative-V CDR corpus, giving state-of-the-art results relative to previous neural approaches.

Contribution

It incorporates character-based word representations into a standard CNN-based relation extraction model and shows that this improves on models that do not use character-level information.

Findings

01

Character-based embeddings improve relation extraction accuracy

02

Models achieve state-of-the-art results relative to previous neural approaches on the BioCreative-V CDR corpus

03

Character-based word representations learned with either a CNN or an LSTM improve the relation extraction model

Abstract

We investigate the incorporation of character-based word representations into a standard CNN-based relation extraction model. We experiment with two common neural architectures, CNN and LSTM, to learn word vector representations from character embeddings. Through a task on the BioCreative-V CDR corpus, extracting relationships between chemicals and diseases, we show that models exploiting the character-based word representations improve on models that do not use this information, obtaining state-of-the-art results relative to previous neural approaches.
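
To make the idea of a character-based word representation concrete, here is a minimal sketch of the CNN variant in plain Python: each character is mapped to an embedding, filters slide over windows of characters, and max-pooling over positions gives a fixed-size vector that is concatenated with the word's own embedding. This is an illustration under stated assumptions, not the authors' implementation: the dimensions, the window width, the ReLU activation, the random initialisation, and all names (`char_cnn`, `word_representation`, and so on) are hypothetical, and in a real model the tables and filters would be learned jointly with the relation classifier.

```python
import random

random.seed(0)

# All sizes below are hypothetical choices for illustration only.
CHAR_DIM = 4      # size of each character embedding
NUM_FILTERS = 5   # number of character-level convolution filters
WINDOW = 3        # number of characters covered by each filter
WORD_DIM = 6      # size of the word-level embedding


def rand_vec(n):
    """Return a small random vector; stands in for learned parameters."""
    return [random.uniform(-0.1, 0.1) for _ in range(n)]


# Lookup tables are filled lazily; filters are fixed at start-up.
char_emb = {}
word_emb = {}
filters = [rand_vec(WINDOW * CHAR_DIM) for _ in range(NUM_FILTERS)]


def lookup(table, key, dim):
    """Fetch the embedding for key, creating a random one on first use."""
    if key not in table:
        table[key] = rand_vec(dim)
    return table[key]


def char_cnn(word):
    """Convolve over character windows, then max-pool over positions."""
    pad = [[0.0] * CHAR_DIM] * (WINDOW - 1)
    chars = pad + [lookup(char_emb, c, CHAR_DIM) for c in word] + pad
    pooled = [float("-inf")] * NUM_FILTERS
    for start in range(len(chars) - WINDOW + 1):
        # Flatten the embeddings of WINDOW consecutive characters.
        window = [x for vec in chars[start:start + WINDOW] for x in vec]
        for f, weights in enumerate(filters):
            # ReLU is an assumed activation, not taken from the paper.
            act = max(0.0, sum(w * x for w, x in zip(weights, window)))
            pooled[f] = max(pooled[f], act)
    return pooled


def word_representation(word):
    """Concatenate the word embedding with the character-based vector."""
    return lookup(word_emb, word.lower(), WORD_DIM) + char_cnn(word)


vec = word_representation("carbamazepine")
print(len(vec))  # WORD_DIM + NUM_FILTERS entries
```

The LSTM variant described in the abstract would replace `char_cnn` with a recurrent encoder over the same character embeddings; the concatenation step stays the same.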

Taxonomy

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory