Self-supervision for health insurance claims data: a Covid-19 use case

Emilia Apostolova; Fazle Karim; Guido Muscioni; Anubhav Rana; Jeffrey Clyman

arXiv:2107.14591·cs.CL·August 2, 2021

TL;DR

This paper adapts self-supervised learning to health insurance claims data. Pre-training on patients' claims history improves Covid-19 hospitalization prediction and makes the resulting model more clinically trustworthy and stable.

Contribution

It introduces a novel self-supervised pre-training approach for medical claims data, enhancing predictive performance and model reliability in healthcare applications.

Findings

1. Pre-training on claims data improves Covid-19 hospitalization prediction.

2. Pre-training enhances model trustworthiness and stability.

3. Method outperforms baseline models in accuracy.

Abstract

In this work, we modify and apply self-supervision techniques to the domain of medical health insurance claims. We model a patient's healthcare claims history analogously to a free-text narrative and introduce pre-trained 'prior knowledge', which is later used for patient outcome prediction on a challenging task: predicting Covid-19 hospitalization given a patient's pre-Covid-19 insurance claims history. Results suggest that pre-training on insurance claims not only produces better prediction performance but, more importantly, improves the model's 'clinical trustworthiness' and its stability and reliability.
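To make the idea of treating a claims history like a free-text narrative concrete, here is a minimal sketch. It is not the authors' implementation: the abstract does not specify the pre-training objective, so the masked-token setup, the claim record layout, the function names `claims_to_tokens` and `mask_tokens`, and the example codes are all assumptions chosen for illustration.

```python
import random

MASK = "[MASK]"  # assumed placeholder token, borrowed from masked-language-model practice


def claims_to_tokens(claims):
    """Flatten dated claims into one chronological sequence of code tokens."""
    ordered = sorted(claims, key=lambda c: c["date"])  # ISO dates sort as strings
    return [code for claim in ordered for code in claim["codes"]]


def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Hide a random subset of tokens and return (inputs, labels).

    labels[i] holds the original token where inputs[i] was masked, else None.
    A model would be trained to recover the hidden tokens (assumed objective).
    """
    rng = random.Random(seed)
    inputs, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            inputs.append(MASK)
            labels.append(tok)
        else:
            inputs.append(tok)
            labels.append(None)
    return inputs, labels


# Hypothetical patient history; the record layout and codes are illustrative only.
history = [
    {"date": "2019-06-02", "codes": ["ICD10:E11.9", "CPT:99213"]},
    {"date": "2019-01-15", "codes": ["ICD10:I10"]},
]

tokens = claims_to_tokens(history)
inputs, labels = mask_tokens(tokens, mask_prob=0.5, seed=1)
print(tokens)
print(inputs)
print(labels)
```

In a setup like this, the pre-training step needs no outcome labels: the claims sequence supplies its own targets, and the learned representation could then be fine-tuned on a labeled task such as hospitalization prediction.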

Taxonomy

Topics: Machine Learning in Healthcare · Artificial Intelligence in Healthcare · Big Data Technologies and Applications