ACES: Automatic Cohort Extraction System for Event-Stream Datasets
Justin Xu, Jack Gallifant, Alistair E. W. Johnson, Matthew B. A. McDermott

TL;DR
ACES is a system that simplifies the creation and reproduction of patient cohorts from event-stream healthcare data, enhancing reproducibility and accessibility in machine learning research on electronic health records.
Contribution
Introduces ACES, a novel library with a domain-specific language and pipeline for defining and extracting patient cohorts from event-stream data, improving reproducibility and ease of use.
Findings
Enables exact and conceptual reproduction of cohorts across datasets.
Supports datasets in MEDS and ESGPT formats.
Reduces barriers to defining ML tasks in healthcare datasets.
Abstract
Reproducibility remains a significant challenge in machine learning (ML) for healthcare. Datasets, model pipelines, and even task or cohort definitions are often private in this field, leading to a significant barrier in sharing, iterating, and understanding ML results on electronic health record (EHR) datasets. We address a significant part of this problem by introducing the Automatic Cohort Extraction System (ACES) for event-stream data. This library is designed to simultaneously simplify the development of tasks and cohorts for ML in healthcare and also enable their reproduction, both at an exact level for single datasets and at a conceptual level across datasets. To accomplish this, ACES provides: (1) a highly intuitive and expressive domain-specific configuration language for defining both dataset-specific concepts and dataset-agnostic inclusion or exclusion criteria, and (2) a…
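To make the split between dataset-specific concepts and dataset-agnostic criteria concrete, here is a minimal sketch in plain Python. It does not use the ACES library or its configuration language; `Event`, `PREDICATES`, `matches`, and `extract_cohort` are hypothetical names, and the task (in-hospital mortality with a 24-hour input window and a 24-hour target window) is an assumed example rather than one taken from the paper.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    patient_id: int
    time: datetime
    code: str


# Dataset-specific layer (hypothetical): concept name -> raw codes in this dataset.
PREDICATES = {
    "admission": {"ADMISSION"},
    "discharge": {"DISCHARGE"},
    "death": {"DEATH"},
}


def matches(event, predicate):
    """True if the event's raw code belongs to the named concept."""
    return event.code in PREDICATES[predicate]


def extract_cohort(events, input_len=timedelta(hours=24),
                   target_len=timedelta(hours=24)):
    """Dataset-agnostic layer: criteria are written against concept names only.

    Trigger: each admission. Exclude the admission if a death or discharge
    falls inside the input window. Label: death inside the target window.
    """
    by_patient = {}
    for ev in sorted(events, key=lambda e: e.time):
        by_patient.setdefault(ev.patient_id, []).append(ev)

    rows = []
    for pid, evs in by_patient.items():
        for trigger in (e for e in evs if matches(e, "admission")):
            input_end = trigger.time + input_len
            target_end = input_end + target_len
            in_input = [e for e in evs if trigger.time < e.time <= input_end]
            in_target = [e for e in evs if input_end < e.time <= target_end]
            if any(matches(e, "death") or matches(e, "discharge")
                   for e in in_input):
                continue  # exclusion criterion hit
            label = any(matches(e, "death") for e in in_target)
            rows.append((pid, trigger.time, label))
    return rows


# Toy event stream (fabricated for illustration only).
T0 = datetime(2024, 1, 1)
EVENTS = [
    Event(1, T0, "ADMISSION"),
    Event(1, T0 + timedelta(hours=30), "DEATH"),
    Event(2, T0, "ADMISSION"),
    Event(2, T0 + timedelta(hours=10), "DISCHARGE"),
    Event(3, T0, "ADMISSION"),
    Event(3, T0 + timedelta(hours=40), "DISCHARGE"),
]
cohort = extract_cohort(EVENTS)
```

Under these assumptions, moving the same task to another dataset would mean editing only the `PREDICATES` mapping, which is the kind of conceptual reproduction across datasets the abstract describes.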
Peer Reviews
Decision·ICLR 2025 Poster
+ The study is timely and sound. It addresses a critical area of concern in ML for health by offering a pipeline to standardize common cohort extraction tasks from health datasets.
+ The paper does a nice job of providing public and anonymized resources.
+ The study builds on several recent prior studies and seems to complement them nicely.
+ One major unclear aspect is whether there is robust evidence that the proposed tool/pipeline indeed extracts the right cohort. A natural question is how one can ensure that this fairly complex procedure will not miss critical samples or include incorrect ones in the final results. Is there any way to compare the results against some sort of baseline or ground truth?
+ While the paper is relatively sound and straightforward, and while some examples are provided, the whole docume
1. The paper addresses a significant barrier in healthcare ML research: the complexity of EHR data, especially in cohort extraction. The motivation is clear and relevant.
2. The authors provide an open-source Python package with thorough documentation, which will support reproducibility and benefit future research.
3. ACES is built on the MEDS format, showing potential for broad applicability across various datasets and tasks.
**Note:** I think this library could offer value to ML in healthcare
A primary concern is the library’s utility for more complex tasks. For example, in the case of “CKD in diabetics within 5Y of kidney panel,” much of the effort lies in translating high-level criteria into specific medical features—a step that remains challenging and unresolved. Extracting cohorts from predefined features is relatively straightforward.
Overall, this is a very interesting paper with some key strengths (but perhaps handicapped by some key weaknesses below). Considering the strengths first, we can identify the following:
- The authors aim to improve standardization and interpretability in healthcare ML. While there have been many approaches to this, including OMOP and i2b2, ACES aims to improve integration with potentially non-compliant sources by enabling reproducible task definitions across multiple datasets, which can aid in consis
Given the promising aspects, the paper can be improved by addressing the following:
1. First, the authors may have created the classic problem of `resolving n competing standards by ending up with n + 1 competing standards`. Specifically, despite claims of flexibility, ACES requires data to be formatted in supported structures, which could necessitate pre-processing for datasets outside MEDS or ESGPT. The authors don't describe in detail the effort in taking a new dataset and making it
Taxonomy
Topics: Time Series Analysis and Forecasting
Methods: Lib
