Classifying Cyber-Risky Clinical Notes by Employing Natural Language Processing

Suzanna Schmeelk; Martins Samuel Dogo; Yifan Peng; Braja Gopal Patra

arXiv:2203.12781 · cs.CL · March 25, 2022

TL;DR

This paper develops NLP-based models to classify the cyber risk level of clinical notes, aiming to enhance patient data privacy and security in electronic health records.

Contribution

It introduces novel NLP classification methods specifically targeting sensitive information risk in clinical notes, addressing a gap in existing de-identification techniques.

Findings

01. SVM with word2vec features achieved an F1-score of 0.792

02. Models can identify risk areas within clinical notes

03. Supports improved privacy protection in health data sharing
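
For readers unfamiliar with the metric in the first finding, the F1-score is the harmonic mean of precision and recall. The sketch below computes a binary F1 in plain Python. The function name `f1_score`, the label encoding (1 = risky, 0 = not risky), and the example labels are all hypothetical and are not taken from the paper. The summary does not say whether the reported 0.792 is a binary, macro, or weighted F1, so this shows only the basic definition.

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    if tp == 0:
        # No correct positive predictions: define F1 as 0 to avoid dividing by zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up labels for illustration only (assumption: 1 = risky note, 0 = not risky).
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1]
score = f1_score(y_true, y_pred)
print(f"F1 = {score:.3f}")
```

Here three of the four risky notes are caught and one non-risky note is flagged by mistake, so precision and recall are both 0.75 and the F1 equals 0.75 as well.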

Abstract

Clinical notes, which can be embedded into electronic medical records, document patient care delivery and summarize interactions between healthcare providers and patients. These clinical notes directly inform patient care and can also indirectly inform research and quality/safety metrics, among other indirect metrics. Recently, some states within the United States of America require patients to have open access to their clinical notes to improve the exchange of patient information for patient care. Thus, developing methods to assess the cyber risks of clinical notes before sharing and exchanging data is critical. While existing natural language processing techniques are geared to de-identify clinical notes, to the best of our knowledge, few have focused on classifying sensitive-information risk, which is a fundamental step toward developing effective, widespread protection of patient…

Taxonomy

Methods: Support Vector Machine