NILE: Fast Natural Language Processing for Electronic Health Records
Sheng Yu, Tianrun Cai, Tianxi Cai

TL;DR
NILE is a fast and accurate NLP tool for EHR analysis that significantly outperforms existing software in speed while maintaining high accuracy, facilitating medical informatics research.
Contribution
This paper introduces NILE, a novel NLP package for EHR analysis that combines a modified prefix-tree search with rule-based semantic analysis, achieving unprecedented speed and competitive accuracy.
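To make the prefix-tree idea concrete, here is a minimal sketch of dictionary-based named entity recognition with a token-level trie and longest-match lookup. It is not NILE's implementation: NILE's modified algorithm also handles prefix and suffix sharing, which this sketch omits. The names `TrieNode`, `build_trie`, and `find_entities`, and the concept labels `C1` to `C3`, are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional, Tuple


class TrieNode:
    """One node of a token-level prefix tree (hypothetical helper)."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.label: Optional[str] = None  # set when a dictionary term ends here


def build_trie(dictionary: Dict[str, str]) -> TrieNode:
    """Insert every dictionary term into the trie, one token per edge."""
    root = TrieNode()
    for term, label in dictionary.items():
        node = root
        for token in term.lower().split():
            node = node.children.setdefault(token, TrieNode())
        node.label = label
    return root


def find_entities(text: str, root: TrieNode) -> List[Tuple[str, str]]:
    """Scan left to right, keeping the longest dictionary match at each position."""
    tokens = text.lower().split()
    found: List[Tuple[str, str]] = []
    i = 0
    while i < len(tokens):
        node = root
        best_end: Optional[int] = None
        best_label: Optional[str] = None
        j = i
        # Walk down the trie for as long as the next token has an edge.
        while j < len(tokens) and tokens[j] in node.children:
            node = node.children[tokens[j]]
            j += 1
            if node.label is not None:
                best_end, best_label = j, node.label
        if best_end is None:
            i += 1  # no term starts here; move on one token
        else:
            found.append((" ".join(tokens[i:best_end]), best_label))
            i = best_end  # resume after the matched term
    return found


# Illustrative dictionary; the labels are made up, not real concept identifiers.
dictionary = {
    "heart failure": "C1",
    "congestive heart failure": "C2",
    "heart": "C3",
}
trie = build_trie(dictionary)
entities = find_entities("Patient has congestive heart failure and heart murmur", trie)
print(entities)  # [('congestive heart failure', 'C2'), ('heart', 'C3')]
```

Each token is consumed at most a bounded number of times, so the scan cost depends on the text length and the longest dictionary term rather than on the dictionary size, which is the general reason trie lookups suit large medical vocabularies.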
Findings
NILE is hundreds to thousands of times faster than existing NLP tools for medical text.
NILE's presence-analysis accuracy is on par with the best-performing models on the 2010 i2b2/VA NLP challenge data.
NILE operates via API, enhancing usability in medical informatics.
Abstract
Objective: Narrative text in electronic health records (EHRs) contains rich information for medical and data science studies. This paper introduces the design and performance of Narrative Information Linear Extraction (NILE), a natural language processing (NLP) package for EHR analysis that we share with the medical informatics community. Methods: NILE uses a modified prefix-tree search algorithm for named entity recognition, which can detect prefix and suffix sharing. The semantic analyses are implemented as rule-based finite state machines. Analyses include negation, location, modification, family history, and ignoring. Results: The processing speed of NILE is hundreds to thousands of times faster than existing NLP software for medical text. The accuracy of presence analysis of NILE is on par with the best-performing models on the 2010 i2b2/VA NLP challenge data. Conclusion: The speed,…
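As an illustration of how a rule-based finite state machine can handle negation, the following is a minimal two-state sketch. It is an assumption-laden toy, not NILE's rule set: the cue list `NEGATION_CUES`, the scope terminators `SCOPE_BREAKERS`, and the function `tag_presence` are all hypothetical, and real clinical text needs far richer rules.

```python
from typing import List, Set, Tuple

NEGATION_CUES = {"no", "denies", "without"}     # assumed cue words
SCOPE_BREAKERS = {"but", "however", ".", ";"}   # assumed scope terminators


def tag_presence(tokens: List[str], entities: Set[str]) -> List[Tuple[str, str]]:
    """Label each known entity as AFFIRMED or NEGATED in a single pass."""
    state = "AFFIRMED"
    results: List[Tuple[str, str]] = []
    for token in tokens:
        word = token.lower()
        if word in NEGATION_CUES:
            state = "NEGATED"       # a cue opens a negation scope
        elif word in SCOPE_BREAKERS:
            state = "AFFIRMED"      # a terminator closes the scope
        elif word in entities:
            results.append((word, state))
    return results


tokens = "Patient denies fever but reports cough .".split()
presence = tag_presence(tokens, {"fever", "cough"})
print(presence)  # [('fever', 'NEGATED'), ('cough', 'AFFIRMED')]
```

Because the machine keeps only one piece of state and reads each token once, this style of analysis runs in time linear in the text length.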
Taxonomy
Topics: Topic Modeling · Biomedical Text Mining and Ontologies · Natural Language Processing Techniques
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
