Distributed Stratified Locality Sensitive Hashing for Critical Event Prediction in the Cloud
Alessandro De Palma, Erik Hemberg, Una-May O'Reilly

TL;DR
This paper presents a distributed stratified locality sensitive hashing system designed for fast, latency-prioritized similarity prediction on large-scale medical waveform data in cloud environments, demonstrated on ICU hypotensive episode prediction.
Contribution
It introduces a scalable distributed LSH system optimized for cloud-based medical data analysis, balancing speed and accuracy for critical event prediction.
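To make the latency-oriented, distributed design concrete, here is a minimal sketch of one way a single query could be fanned out to every data partition at once and the per-partition answers merged. It is an illustration under stated assumptions, not the authors' implementation: the names `nearest_in_shard` and `fan_out_query` are hypothetical, each partition is searched exhaustively rather than hashed, and Python threads stand in for the cloud processors.

```python
from concurrent.futures import ThreadPoolExecutor


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_in_shard(query, shard):
    """Return (distance, label) of the closest point within one non-empty shard.

    Each shard is a list of (vector, label) pairs (an assumed layout).
    """
    return min((sq_dist(query, vec), label) for vec, label in shard)


def fan_out_query(query, shards, max_workers=4):
    """Send one query to every shard concurrently and merge the shard winners.

    All workers cooperate on the same query, which targets the latency of a
    single prediction rather than the throughput of many queries.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partial = list(pool.map(lambda shard: nearest_in_shard(query, shard), shards))
    return min(partial)
```

Calling `fan_out_query(q, shards)` returns the distance and label of the nearest stored point across all shards. Threads here only illustrate the fan-out and merge pattern; in CPython they do not provide real CPU parallelism for this kind of work.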
Findings
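Achieves a 21x speedup in number of comparisons over parallel exhaustive search, at the cost of a 10% loss in Matthews correlation coefficient (MCC).
Scales up to 40 processors on a large dataset.
Reaches speedups of up to two orders of magnitude (about 100x) if additional MCC loss can be tolerated.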
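Since the findings above are stated in terms of MCC, a short sketch of how the Matthews correlation coefficient is computed from binary confusion-matrix counts may help. The function name `mcc` and the zero-denominator convention are choices made for this example; this summary does not say whether the reported loss is measured in relative or absolute terms, so that part is left out.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    tp/tn/fp/fn are the numbers of true positives, true negatives,
    false positives and false negatives.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # A common convention when a row or column of the matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom
```

The value ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), and it stays informative when the two classes are imbalanced.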
Abstract
The availability of massive healthcare data repositories calls for efficient tools for data-driven medicine. We introduce a distributed system for Stratified Locality Sensitive Hashing to perform fast similarity-based prediction on large medical waveform datasets. Our implementation, for an ICU use case, prioritizes latency over throughput and is targeted at a cloud environment. We demonstrate our system on Acute Hypotensive Episode prediction from Arterial Blood Pressure waveforms. On a dataset of million points, we show scaling up to 40 processors and a 21x speedup in number of comparisons to parallel exhaustive search at the price of a 10% Matthews correlation coefficient (MCC) loss. Furthermore, if additional MCC loss can be tolerated, our system achieves speedups up to two orders of magnitude.
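The speedup in number of comparisons comes from hashing: a query is compared only against stored points that share its hash bucket, not against the whole dataset. The sketch below shows that idea with a single table of random-hyperplane (sign) hashes. It is plain, single-machine LSH, not the stratified or distributed scheme of the paper, and every name in it (`make_hyperplanes`, `signature`, `build_index`, `query`) is hypothetical.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_bits, dim, rng):
    """Draw num_bits random Gaussian hyperplanes in dim dimensions."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def signature(vec, planes):
    """Hash a vector to a bit tuple: one bit per hyperplane, set by the sign of the dot product."""
    return tuple(
        1 if sum(p * v for p, v in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )


def build_index(points, planes):
    """Group point indices by signature; vectors at a small angle tend to share a bucket."""
    index = defaultdict(list)
    for i, vec in enumerate(points):
        index[signature(vec, planes)].append(i)
    return index


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def query(q, points, index, planes):
    """Return (index of nearest candidate or None, number of distance comparisons made).

    Only the points in the query's bucket are compared, so the comparison
    count is at most len(points), the cost of exhaustive search.
    """
    candidates = index.get(signature(q, planes), [])
    best = min(candidates, key=lambda i: sq_dist(q, points[i]), default=None)
    return best, len(candidates)
```

More hash bits give smaller buckets and fewer comparisons, but raise the chance that the true nearest neighbour falls in a different bucket. That is the same speed versus accuracy trade-off the abstract reports as speedup versus MCC loss.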
Taxonomy
Topics: Energy Efficient Wireless Sensor Networks · Advanced Computing and Algorithms · Web Data Mining and Analysis
