Black-Box Segmentation of Electronic Medical Records
Hongyi Yuan, Sheng Yu

TL;DR
This paper introduces a neural network-based black-box segmentation method for electronic medical records (EMRs). It sections EMRs with high accuracy regardless of heading format, which benefits downstream NLP tasks such as concept extraction that suffer from inaccurate segmentation.
Contribution
The work presents a novel, adaptable segmentation approach using simple sentence embeddings and neural networks, outperforming existing methods in accuracy and robustness.
Findings
Achieves over 98% segmentation accuracy on test data.
Effective across diverse EMR section formats.
Outperforms several advanced NLP segmentation methods.
Abstract
Electronic medical records (EMRs) contain the majority of patients' healthcare details and are an abundant resource for developing automatic healthcare systems. Most natural language processing (NLP) studies on EMR processing, such as concept extraction, are adversely affected by inaccurate segmentation of EMR sections. At the same time, accurate sectioning of EMRs has not received enough attention, and the information carried by section structure is undervalued. This work focuses on the segmentation of EMRs and proposes a black-box segmentation method using a simple sentence embedding model and a neural network, along with a proper training method. To achieve universal adaptivity, we train our model on a dataset with different section heading formats. We compare several advanced deep learning-based NLP methods, and our method achieves the best segmentation…
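To make the idea in the abstract concrete, here is a minimal sketch of line-level segmentation: each line is embedded, a small classifier decides whether the line starts a new section, and the flags are turned into sections. This is not the authors' implementation. The abstract does not specify the embedding model, the network, or the training method, so the hashed bag-of-words embedding, the logistic-regression classifier, and all names (`embed`, `train`, `predict`, `segment`, `DIM`) are assumptions chosen only for illustration.

```python
import math
import zlib

DIM = 32  # embedding size; an arbitrary choice for this sketch


def embed(line):
    """Stand-in sentence embedding: average of hashed one-hot token vectors."""
    vec = [0.0] * DIM
    tokens = line.lower().split()
    for tok in tokens:
        # crc32 is deterministic across runs, unlike Python's built-in hash()
        vec[zlib.crc32(tok.encode("utf-8")) % DIM] += 1.0
    n = max(len(tokens), 1)
    return [v / n for v in vec]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train(lines, labels, epochs=200, lr=1.0):
    """Fit a logistic-regression classifier (a stand-in for the paper's
    neural network). Label 1 means the line starts a new section."""
    w, b = [0.0] * DIM, 0.0
    feats = [embed(line) for line in lines]
    for _ in range(epochs):
        for x, y in zip(feats, labels):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            g = p - y  # gradient of the log loss w.r.t. the logit
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b


def predict(w, b, line):
    """Return True if the classifier flags the line as a section start."""
    x = embed(line)
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5


def segment(lines, starts):
    """Group lines into sections given one boundary flag per line."""
    sections = []
    for line, is_start in zip(lines, starts):
        if is_start or not sections:
            sections.append([line])
        else:
            sections[-1].append(line)
    return sections
```

In use, one would train on lines labeled as section starts or continuations, drawn from records with varied heading formats, then call `segment(lines, [predict(w, b, l) for l in lines])` on an unseen record. Treating the record as a sequence of independently classified lines is a simplification; a model that uses neighboring lines as context would likely be needed to approach the accuracy reported above.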
Taxonomy
Topics: Machine Learning in Healthcare
Methods: Softmax · Attention Is All You Need
