PhenoTagger: A Hybrid Method for Phenotype Concept Recognition using Human Phenotype Ontology

Ling Luo; Shankai Yan; Po-Ting Lai; Daniel Veltri; Andrew Oler; Sandhya Xirasagar; Rajarshi Ghosh; Morgan Similuk; Peter N. Robinson; Zhiyong Lu

arXiv:2009.08478·cs.CL·January 26, 2021

TL;DR

PhenoTagger is a hybrid approach combining dictionary matching and deep learning to improve phenotype concept recognition in biomedical texts, reducing the need for manual annotation.

Contribution

It introduces a novel hybrid method that leverages distant supervision and deep learning for phenotype recognition, enhancing performance without extensive manual annotation.

Findings

01. Outperforms previous methods on HPO corpora

02. Achieves competitive results without manual training data

03. Demonstrates generalizability to disease ontology MEDIC

Abstract

Automatic phenotype concept recognition from unstructured text remains a challenging task in biomedical text mining research. Previous works that address the task typically use dictionary-based matching methods, which can achieve high precision but suffer from lower recall. Recently, machine learning-based methods have been proposed to identify biomedical concepts, which can recognize more unseen concept synonyms by automatic feature learning. However, most methods require large corpora of manually annotated data for model training, which is difficult to obtain due to the high cost of human annotation. In this paper, we propose PhenoTagger, a hybrid method that combines both dictionary and machine learning-based methods to recognize Human Phenotype Ontology (HPO) concepts in unstructured biomedical text. We first use all concepts and synonyms in HPO to construct a dictionary, which is…
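The dictionary-based half of this trade-off can be illustrated with a minimal sketch. This is not PhenoTagger's implementation: the function names, the longest-match strategy, and the four-entry dictionary below are assumptions made for illustration, and the HPO identifiers shown should be checked against a current HPO release before any real use.

```python
import re

# Hypothetical mini-dictionary mapping surface forms to HPO identifiers.
# A real dictionary would be built from all HPO concept names and synonyms.
TOY_DICTIONARY = {
    "seizure": "HP:0001250",
    "seizures": "HP:0001250",
    "short stature": "HP:0004322",
    "intellectual disability": "HP:0001249",
}


def tokenize(text):
    """Return (lowercased token, start offset, end offset) triples."""
    return [(m.group().lower(), m.start(), m.end())
            for m in re.finditer(r"[A-Za-z0-9]+", text)]


def match_dictionary(text, dictionary, max_len=5):
    """Greedy longest-match lookup of dictionary entries in text.

    Returns a list of (start, end, surface string, concept id) tuples.
    """
    tokens = tokenize(text)
    # Index entries by their token sequence so multi-word terms can match.
    keys = {tuple(term.lower().split()): cid for term, cid in dictionary.items()}
    matches = []
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first, then shrink.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = tuple(tok for tok, _, _ in tokens[i:i + n])
            if candidate in keys:
                start, end = tokens[i][1], tokens[i + n - 1][2]
                matches.append((start, end, text[start:end], keys[candidate]))
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return matches


if __name__ == "__main__":
    sentence = "The patient has short stature and recurrent seizures."
    for start, end, surface, concept_id in match_dictionary(sentence, TOY_DICTIONARY):
        print(start, end, surface, concept_id)
```

Exact lookup of this kind returns nothing for a sentence such as "He had convulsions.", because that surface form is absent from the toy dictionary. That gap is the recall limitation the abstract attributes to dictionary-based methods, and it is what the machine learning component is meant to address.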

Taxonomy

Methods: Hyper-parameter optimization