Computationally Assisted Quality Control for Public Health Data Streams

Ananya Joshi; Kathryn Mazaitis; Roni Rosenfeld; Bryan Wilder

arXiv:2306.16914 · cs.AI · January 4, 2024 · 2 cites

TL;DR

This paper introduces FlaSH, a scalable outlier detection framework tailored for public health data streams, improving irregularity identification and aiding experts in decision-making.

Contribution

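The paper presents FlaSH (Flagging Streams in public Health), a scalable outlier detection framework designed for public health data streams. In an evaluation by human experts, it scales to large data volumes and matches or exceeds existing methods, including deep learning approaches, in accuracy.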

Findings

01 FlaSH scales effectively to large data volumes.

02 FlaSH matches or exceeds deep learning methods in accuracy.

03 FlaSH identifies more helpful outliers for experts.

Abstract

Irregularities in public health data streams (like COVID-19 cases) hamper data-driven decision-making for public health stakeholders. A real-time, computer-generated list of the most important, outlying data points from thousands of daily-updated public health data streams could assist an expert reviewer in identifying these irregularities. However, existing outlier detection frameworks perform poorly on this task because they do not account for the data volume or for the statistical properties of public health streams. Accordingly, we developed FlaSH (Flagging Streams in public Health), a practical outlier detection framework for public health data users that uses simple, scalable models to capture these statistical properties explicitly. In an experiment where human experts evaluate FlaSH and existing methods (including deep learning approaches), FlaSH scales to the data volume of…

Taxonomy

Topics: Data-Driven Disease Surveillance · Anomaly Detection Techniques and Applications · Data Stream Mining Techniques