Beyond Topics: Discovering Latent Healthcare Objectives from Event Sequences
Adrian Caruana, Madhushi Bandara, Daniel Catchpoole, Paul J Kennedy

TL;DR
This paper introduces CaSE, a sequence encoder that captures the order of events in EHR data to better identify underlying healthcare objectives, outperforming traditional topic models like LDA.
Contribution
The paper presents CaSE, a novel event-level sequence encoder that improves the discovery of latent healthcare objectives from EHR data by considering event order and individual events.
Findings
CaSE outperforms LDA by up to 37% on synthetic data.
CaSE identifies meaningful healthcare representations in MIMIC-III.
Sequence order improves healthcare objective detection.
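To illustrate why event order matters here, the sketch below shows that an order-free count representation, the kind of input a topic model such as LDA works from, cannot tell apart two sequences containing the same events in a different order. The event codes and the helper names `bag_of_events` and `bigrams` are invented for illustration; the bigram view is only a simple stand-in for an order-aware representation and is not CaSE's actual encoder.

```python
from collections import Counter

# Hypothetical event codes, invented for illustration
# (not taken from the paper or from MIMIC-III).
seq_a = ["admit", "blood_test", "antibiotic", "discharge"]
seq_b = ["admit", "antibiotic", "blood_test", "discharge"]


def bag_of_events(seq):
    """Order-free event counts, the view a bag-of-words topic model sees."""
    return Counter(seq)


def bigrams(seq):
    """Adjacent event pairs: one simple order-aware view (not CaSE itself)."""
    return list(zip(seq, seq[1:]))


# The two sequences are indistinguishable as bags of events...
same_bag = bag_of_events(seq_a) == bag_of_events(seq_b)
# ...but differ once the order of adjacent events is taken into account.
same_bigrams = bigrams(seq_a) == bigrams(seq_b)

print(same_bag, same_bigrams)  # True False
```

Any model that consumes only the counts treats both patients identically, which is the limitation the paper attributes to topic models.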
Abstract
A meaningful understanding of clinical protocols and patient pathways helps improve healthcare outcomes. Electronic health records (EHR) reflect real-world treatment behaviours that are used to enhance healthcare management but present challenges; protocols and pathways are often loosely defined, with elements frequently not recorded in EHRs, complicating the enhancement. To solve this challenge, healthcare objectives associated with healthcare management activities can be indirectly observed in EHRs as latent topics. Topic models, such as Latent Dirichlet Allocation (LDA), are used to identify latent patterns in EHR data. However, they do not examine the ordered nature of EHR sequences, nor do they appraise individual events in isolation. Our novel approach, the Categorical Sequence Encoder (CaSE), addresses these shortcomings. The sequential nature of EHRs is captured by CaSE's…
Taxonomy
Topics: Machine Learning in Healthcare
Methods: Latent Dirichlet Allocation
