Interactive exploration of population scale pharmacoepidemiology datasets

Tengel Ekrem Skar; Einar Holsbø; Kristian Svendsen; Lars Ailo Bongo

arXiv:2005.09890·q-bio.QM·June 2, 2021

TL;DR

This paper introduces an interactive tool that combines scalable data processing, machine learning, and visualization to analyze large pharmacoepidemiology datasets efficiently, enabling new insights into drug and adverse reaction patterns.

Contribution

The authors developed an integrated, open-source platform supporting scalable data analysis, machine learning, and visualization for population-scale pharmacoepidemiology data.

Findings

1. Preprocessed 384 million prescriptions in 2 minutes
2. Trained models in seconds, visualized results in milliseconds
3. Demonstrated effective analysis of large prescription datasets

Abstract

Population-scale drug prescription data linked with adverse drug reaction (ADR) data supports the fitting of models large enough to detect drug use and ADR patterns that are not detectable using traditional methods on smaller datasets. However, detecting ADR patterns in large datasets requires tools for scalable data processing, machine learning for data analysis, and interactive visualization. To our knowledge, no existing pharmacoepidemiology tool supports all three requirements. We have therefore created a tool for interactive exploration of patterns in prescription datasets with millions of samples. We use Spark to preprocess the data for machine learning and for analyses using SQL queries. We have implemented models in Keras and the scikit-learn framework. The model results are visualized and interpreted using live Python coding in Jupyter. We apply our tool to explore a 384 million…
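
The preprocessing step described in the abstract (SQL queries that turn raw prescription records into model inputs) can be illustrated with a small sketch. This is not the authors' pipeline: it uses Python's built-in sqlite3 module as a stand-in for Spark SQL so that it runs anywhere. The table name `prescriptions`, its columns, and the toy records are all hypothetical.

```python
import sqlite3

# Hypothetical toy records: (patient_id, atc_code, year). Made-up data,
# not taken from the paper or from any real prescription registry.
rows = [
    (1, "C10AA05", 2010),
    (1, "C10AA05", 2011),
    (1, "N02BE01", 2011),
    (2, "A10BA02", 2010),
    (2, "N02BE01", 2012),
    (3, "C10AA05", 2012),
]

# In-memory SQLite database standing in for a Spark SQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE prescriptions (patient_id INTEGER, atc_code TEXT, year INTEGER)"
)
conn.executemany("INSERT INTO prescriptions VALUES (?, ?, ?)", rows)

# SQL aggregation: number of prescriptions per patient and drug code.
counts = conn.execute(
    """
    SELECT patient_id, atc_code, COUNT(*) AS n
    FROM prescriptions
    GROUP BY patient_id, atc_code
    ORDER BY patient_id, atc_code
    """
).fetchall()
conn.close()

# Pivot the counts into one fixed-length feature vector per patient,
# with one position per distinct drug code (sorted for a stable order).
drugs = sorted({atc for _, atc, _ in rows})
features = {}
for patient_id, atc_code, n in counts:
    vec = features.setdefault(patient_id, [0] * len(drugs))
    vec[drugs.index(atc_code)] = n

for patient_id, vec in features.items():
    print(patient_id, dict(zip(drugs, vec)))
```

Per-patient count vectors of this kind are one plausible input format for scikit-learn or Keras models. The feature representation the authors actually use is not stated in the text above.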

Code & Models

Repositories

uit-hdl/norpd_prescription_analyses (official)
