# Automation of a problem list using natural language processing

**Authors:** Stephane Meystre, Peter J Haug

PMC · DOI: 10.1186/1472-6947-5-30 · 2005-08-31

## TL;DR

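This paper describes an Automated Problem List system that uses natural language processing to detect 80 targeted medical problems in free-text clinical documents and propose them to physicians, with the aim of keeping problem lists for hospitalized patients timely, accurate, and complete.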

## Contribution

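The contribution is an Automated Problem List system with two components: a background application that uses NLP to detect 80 targeted medical problems (selected for their frequency among cardiovascular adult inpatients) in free-text electronic documents, and a foreground application that proposes the detected problems to physicians for addition to the official problem list.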

## Key findings

- The system achieved 100% sensitivity and positive predictive value in detecting document sections.
- Sentence detection had 89% sensitivity and 94% positive predictive value.
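- The 80 targeted medical problems cover about 5% of the distinct ICD-9-CM diagnoses coded in the study population (cardiovascular adult inpatients), but about 64% of all instances of those coded diagnoses.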
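
The sensitivity and positive predictive value figures above follow the usual definitions, TP / (TP + FN) and TP / (TP + FP). The sketch below shows how such figures could be computed by comparing detected items against a reference standard. The function name `detection_metrics` and the sentence identifiers are hypothetical and are not taken from the paper.

```python
def detection_metrics(detected, reference):
    """Return (sensitivity, ppv) for detected items versus a reference standard."""
    detected, reference = set(detected), set(reference)
    true_pos = len(detected & reference)    # detected and actually present
    false_pos = len(detected - reference)   # detected but not in the reference
    false_neg = len(reference - detected)   # in the reference but missed
    sensitivity = true_pos / (true_pos + false_neg) if reference else 0.0
    ppv = true_pos / (true_pos + false_pos) if detected else 0.0
    return sensitivity, ppv


# Illustrative data only (not from the paper's evaluation).
reference_sentences = {"s1", "s2", "s3", "s4"}
detected_sentences = {"s1", "s2", "s3", "s9"}
sensitivity, ppv = detection_metrics(detected_sentences, reference_sentences)
```

In this toy case three of four reference sentences are found and three of four detections are correct, so both measures are 0.75.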

## Abstract

The medical problem list is an important part of the electronic medical record in development in our institution. To serve the functions it is designed for, the problem list has to be as accurate and timely as possible. However, the current problem list is usually incomplete and inaccurate, and is often totally unused. To alleviate this issue, we are building an environment where the problem list can be easily and effectively maintained.

For this project, 80 medical problems were selected for their frequency of use in our future clinical field of evaluation (cardiovascular). We have developed an Automated Problem List system composed of two main components: a background and a foreground application. The background application uses Natural Language Processing (NLP) to harvest potential problem list entries from the list of 80 targeted problems detected in the multiple free-text electronic documents available in our electronic medical record. These proposed medical problems drive the foreground application designed for management of the problem list. Within this application, the extracted problems are proposed to the physicians for addition to the official problem list.
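
The division of labour between the two components can be sketched as a small data structure: a background step queues detected problems, and a foreground step lets a physician accept or reject each proposal. This is only an illustration under assumptions. The class `ProblemList`, its methods, and the three-item `TARGETED_PROBLEMS` set (a stand-in for the 80 targeted problems) are hypothetical and do not reflect the authors' implementation.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for the paper's list of 80 targeted problems.
TARGETED_PROBLEMS = {"hypertension", "atrial fibrillation", "heart failure"}


@dataclass
class ProblemList:
    official: set = field(default_factory=set)   # physician-confirmed problems
    proposed: set = field(default_factory=set)   # awaiting physician review

    def propose(self, problem):
        """Background step: queue a detected problem if it is targeted and not yet official."""
        name = problem.lower()
        if name in TARGETED_PROBLEMS and name not in self.official:
            self.proposed.add(name)

    def review(self, problem, accept):
        """Foreground step: a physician accepts or rejects a pending proposal."""
        name = problem.lower()
        if name not in self.proposed:
            raise KeyError(f"no pending proposal for {name!r}")
        self.proposed.discard(name)
        if accept:
            self.official.add(name)


problem_list = ProblemList()
problem_list.propose("Hypertension")   # targeted, so it is queued
problem_list.propose("common cold")    # not in the stand-in target set, so it is ignored
problem_list.review("hypertension", accept=True)
```

Keeping proposals separate from the official list mirrors the abstract's point that extracted problems are only suggested, and a physician decides what is added.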

The set of 80 targeted medical problems selected for this project covered about 5% of all possible diagnoses coded in ICD-9-CM in our study population (cardiovascular adult inpatients), but about 64% of all instances of these coded diagnoses. The system contains algorithms to detect first document sections, then sentences within these sections, and finally potential problems within the sentences. The initial evaluation of the section and sentence detection algorithms demonstrated a sensitivity and positive predictive value of 100% when detecting sections, and a sensitivity of 89% and a positive predictive value of 94% when detecting sentences.
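
The three-stage order described above (sections, then sentences, then problems) can be illustrated with a toy pipeline. The abstract does not specify the detection algorithms, so this sketch swaps in simple regular expressions and a phrase lookup. The header pattern, the function names, the sample note, and the three-item problem list are all assumptions made for illustration.

```python
import re

# Hypothetical stand-in for the 80 targeted problems.
TARGETED_PROBLEMS = ["hypertension", "atrial fibrillation", "chest pain"]

# Assumption: section headers are uppercase words followed by a colon at line start.
SECTION_HEADER = re.compile(r"^([A-Z][A-Z ]+):\s*", re.MULTILINE)


def split_sections(document):
    """Return {header: body}, using 'HEADER:' lines as section boundaries."""
    matches = list(SECTION_HEADER.finditer(document))
    sections = {}
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(document)
        sections[match.group(1).strip()] = document[match.end():end].strip()
    return sections


def split_sentences(text):
    """Split on whitespace that follows sentence-ending punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def find_problems(sentence):
    """Return targeted problems that appear as whole phrases in the sentence."""
    lowered = sentence.lower()
    return [p for p in TARGETED_PROBLEMS
            if re.search(r"\b" + re.escape(p) + r"\b", lowered)]


def extract(document):
    """Run sections -> sentences -> problems and return (section, problem) pairs."""
    results = []
    for header, body in split_sections(document).items():
        for sentence in split_sentences(body):
            for problem in find_problems(sentence):
                results.append((header, problem))
    return results


note = (
    "HISTORY: Patient has a history of hypertension. He reports chest pain on exertion.\n"
    "ASSESSMENT: New atrial fibrillation.\n"
)
proposals = extract(note)
```

This sketch does no negation or context handling, so a sentence such as "No chest pain" would still produce a match; it is meant only to show how the stages nest.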

The global aim of our project is to automate the process of creating and maintaining a problem list for hospitalized patients and thereby help to guarantee the timeliness, accuracy and completeness of this information.

## Full-text entities

- **Diseases:** Respiratory Failure (MESH:D012131), Arthritis (MESH:D001168), anginal pain (MESH:D010146), Restless legs / Emphysema (MESH:D012148), Arthralgias / Ischemic Heart Disease (MESH:D017202), Fatigue (MESH:D005221), Peptic Ulcer (MESH:D010437), Venous insufficiency (MESH:D014689), Pulmonary edema (MESH:D011654), Gastroesophageal reflux / Sinusitis (MESH:D005764), Mitral valve insufficiency / Cancer / Myocardial Infarction / Cardiac arrest (MESH:D008944), Urinary tract infection (MESH:D014552), Heart failure (MESH:D006333), ICD (OMIM:252500), Hypovolemia (MESH:D020896), Tricuspid valve insufficiency (MESH:D014262), Pericarditis (MESH:D010493), Arrhythmia / Infectious Endocarditis (MESH:D001145), Dysphagia / Renal insufficiency / Dyspnea (MESH:D051437), hyperkalemia (MESH:D006947), Headache (MESH:D006261), Hemoptysis (MESH:D006469), Left bundle branch block (MESH:D002037), Pericardial tamponade (MESH:D002305), Back pain (MESH:D001416), Pulmonary embolus (MESH:D004617), Epistaxis (MESH:D004844), Obesity (MESH:D009765), Pneumothorax (MESH:D011030), Tobacco Use Disorder (MESH:D014029), Coronary artery disease (MESH:D003324), Atrial fibrillation (MESH:D001281), angina (MESH:D000787), Hypertension (MESH:D006973), APL (MESH:D019973), cardiovascular (MESH:D002318), Cardiogenic Shock / Pain / Cardiomyopathy (MESH:D012770), Heart Murmur (MESH:D006337), Pulmonary hypertension (MESH:D006976), Hematemesis / Varicose veins / Hematuria (MESH:D014648), Aortic valve insufficiency (MESH:D001022), Diabetes mellitus (MESH:D003920), Rheumatic heart disease (MESH:D012214), Constipation (MESH:D003248), dyspnea (MESH:D004417), Peripheral vascular disease (MESH:D016491), Deep vein thrombosis (MESH:D020246), Myocardial Infarction (MESH:D009203), Hypercholesteremia / Wheeze (MESH:D006937), Paroxysmal supraventric (MESH:D002819), Chest Pain (MESH:D002637), Coma (MESH:D003128), Mitral stenosis (MESH:D008946), Anemia, aplastic (MESH:D000741), Depression (MESH:D003866), Congenital heart disease (MESH:D006330), Atrial Septal Defects (MESH:D006344), Hypothyroidism (MESH:D007037), Pneumonia (MESH:D011014), Ventricular ectopic beats (MESH:D018879)
- **Chemicals:** MMTx (-)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC1208893/full.md

---
Source: https://tomesphere.com/paper/PMC1208893