Assigning Species Information to Corresponding Genes by a Sequence Labeling Framework

Ling Luo; Chih-Hsuan Wei; Po-Ting Lai; Qingyu Chen; Rezarta Islamaj Doğan; Zhiyong Lu

arXiv:2205.03853·cs.CL·October 17, 2022

TL;DR

This paper introduces a deep learning sequence-labeling framework for assigning species information to genes in research articles, significantly improving accuracy over traditional heuristic methods.

Contribution

The paper presents a novel deep learning-based sequence-labeling approach for gene-species assignment, outperforming rule-based methods in accuracy.

Findings

1. Accuracy improved from 65.8% (rule-based baseline) to 81.3%.

2. Sequence labeling reduces the number of gene-species pairs that must be evaluated.

3. Open-source code and data are available.

Abstract

The automatic assignment of species information to the corresponding genes in a research article is a critically important step in the gene normalization task, whereby a gene mention is normalized and linked to a database record or identifier by a text-mining algorithm. Existing methods typically rely on heuristic rules based on gene and species co-occurrence in the article, but their accuracy is suboptimal. We therefore developed a high-performance method, using a novel deep learning-based framework, to classify whether there is a relation between a gene and a species. Instead of the traditional binary classification framework in which all possible pairs of genes and species in the same article are evaluated, we treat the problem as a sequence-labeling task such that only a fraction of the pairs needs to be considered. Our benchmarking results show that our approach obtains…
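The contrast the abstract draws between the two framings can be illustrated with a toy count of model inputs. This is only a sketch under stated assumptions: the function names, the `[TARGET]` marker tokens, and the one-sequence-per-species setup are hypothetical and are not taken from the paper or its repository.

```python
from itertools import product


def pairwise_instances(genes, species):
    """Binary-classification framing: one model input per (gene, species) pair."""
    return list(product(genes, species))


def sequence_labeling_instances(tokens, species):
    """Assumed sequence-labeling framing: one model input per target species.

    The target species is wrapped in hypothetical marker tokens; a tagger
    would then label every gene token in that single pass.
    """
    instances = []
    for target in species:
        marked = []
        for tok in tokens:
            if tok == target:
                marked.extend(["[TARGET]", tok, "[/TARGET]"])
            else:
                marked.append(tok)
        instances.append(marked)
    return instances


# Toy sentence with three gene mentions and two species mentions.
tokens = "BRCA1 and Tp53 were studied in human and mouse with Gapdh".split()
genes = ["BRCA1", "Tp53", "Gapdh"]
species = ["human", "mouse"]

pairs = pairwise_instances(genes, species)
sequences = sequence_labeling_instances(tokens, species)
print(len(pairs), "pairwise inputs vs", len(sequences), "tagged sequences")
```

In this toy setup the pairwise framing grows with the product of gene and species mentions, while the assumed sequence-labeling framing grows only with the number of species mentions.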

Code & Models

Repositories

ncbi/speciesassignment (TensorFlow, official)
