# Automatically Explaining Machine Learning Prediction Results: A Demonstration on Type 2 Diabetes Risk Prediction

**Authors:** Gang Luo

arXiv: 1812.02852 · 2018-12-10

## TL;DR

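This paper presents a method for automatically explaining the predictions of any machine learning model in healthcare without degrading the model's accuracy. Demonstrated on type 2 diabetes risk prediction, it explained the results for 87.4% of patients whom the model correctly predicted to receive a type 2 diabetes diagnosis within the next year.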

## Contribution

The paper presents the first complete method for automatically explaining any machine learning model's predictions in healthcare without reducing its accuracy.
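
The abstract does not describe the mechanism, so the following is only a minimal sketch of one way a model's accuracy can stay untouched: the prediction comes from the model unchanged, and the explanation is looked up separately. It assumes explanations are drawn from a separate set of human-readable rules; the `Rule` type, the `predict_and_explain` helper, the toy `black_box` model, and every threshold are hypothetical illustrations, not the paper's implementation or clinical guidance.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Record = Dict[str, float]


@dataclass
class Rule:
    """Hypothetical human-readable rule: a condition that points to an outcome."""
    description: str
    condition: Callable[[Record], bool]
    outcome: int


def predict_and_explain(
    model: Callable[[Record], int], rules: List[Rule], record: Record
) -> Tuple[int, List[str]]:
    # The label always comes from the model itself, so adding
    # explanations cannot change what is predicted.
    label = model(record)
    # Explanations are the rules that both fire on this record and
    # agree with the model's label; the list may be empty.
    reasons = [
        r.description for r in rules if r.outcome == label and r.condition(record)
    ]
    return label, reasons


def black_box(record: Record) -> int:
    # Toy stand-in for an arbitrary trained model (made-up weights).
    return int(0.05 * record["bmi"] + 0.01 * record["age"] > 2.0)


rules = [
    Rule("BMI of 30 or above", lambda r: r["bmi"] >= 30, outcome=1),
    Rule("age 65 or above", lambda r: r["age"] >= 65, outcome=1),
]

patient = {"bmi": 34.0, "age": 60.0}  # made-up record
label, reasons = predict_and_explain(black_box, rules, patient)
print(label, reasons)
```

Because the wrapper only reads the model's output, any classifier can be substituted for `black_box`. A record for which no rule fires simply receives a prediction with no explanation, which is why coverage can fall below 100%.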

## Key findings

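- Explained the prediction results for 87.4% of patients whom the competition's champion model correctly predicted to receive a type 2 diabetes diagnosis within the next year
- Demonstrated the method on real-world electronic medical record data from the Practice Fusion diabetes classification competition, covering patients from all 50 U.S. states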
- Maintained prediction accuracy while providing explanations
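
The 87.4% figure is a coverage ratio: of the patients the model correctly predicted as positive, the share for whom an explanation was produced. A minimal sketch of that calculation follows; the function name, the representation of explanations as lists of strings, and the toy data are assumptions for illustration, not the paper's code or data.

```python
from typing import List, Sequence


def explanation_coverage(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    explanations: Sequence[List[str]],
    positive: int = 1,
) -> float:
    """Share of correctly predicted positive cases with at least one explanation."""
    # Indices of true positives: actual and predicted label are both positive.
    true_positives = [
        i
        for i, (t, p) in enumerate(zip(y_true, y_pred))
        if t == positive and p == positive
    ]
    if not true_positives:
        raise ValueError("no correctly predicted positive cases")
    # A case counts as explained if its explanation list is non-empty.
    explained = sum(1 for i in true_positives if explanations[i])
    return explained / len(true_positives)


# Toy data (made up): six patients, three of them true positives.
y_true = [1, 1, 1, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 1]
explanations = [["rule A"], [], ["rule B"], [], ["rule C", "rule A"], ["rule D"]]

coverage = explanation_coverage(y_true, y_pred, explanations)
print(f"{coverage:.1%}")
```

Only correctly predicted positives enter the denominator, so explanations attached to missed or false-positive cases (indices 2 and 5 in the toy data) do not affect the ratio.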

## Abstract

**Background:** Predictive modeling is a key component of solutions to many healthcare problems. Among all predictive modeling approaches, machine learning methods often achieve the highest prediction accuracy, but they suffer from a long-standing open problem that precludes their widespread use in healthcare: most machine learning models give no explanation for their prediction results, whereas interpretability is essential for a predictive model to be adopted in typical healthcare settings.

**Methods:** This paper presents the first complete method for automatically explaining results for any machine learning predictive model without degrading accuracy. We implemented the method in software. Using the electronic medical record data set from the Practice Fusion diabetes classification competition, which contains patient records from all 50 states in the United States, we demonstrated the method on predicting type 2 diabetes diagnosis within the next year.

**Results:** For the champion machine learning model of the competition, our method explained prediction results for 87.4% of patients who were correctly predicted by the model to have a type 2 diabetes diagnosis within the next year.

**Conclusions:** Our demonstration showed the feasibility of automatically explaining results for any machine learning predictive model without degrading accuracy.

---
Source: https://tomesphere.com/paper/1812.02852