Adaptive Test-Time Training for Predicting Need for Invasive Mechanical Ventilation in Multi-Center Cohorts

Xiaolei Lu; Shamim Nemati

arXiv:2512.06652·cs.LG·January 28, 2026

Open Access · 3 Reviews

TL;DR

This paper introduces Adaptive Test-Time Training (AdaTTT), a framework that adapts predictive models during inference to predict which ICU patients will need invasive mechanical ventilation, addressing domain shifts across hospitals.

Contribution

The work develops AdaTTT with information-theoretic bounds, self-supervised pretext tasks, and partial optimal transport for improved domain adaptation in EHR-based IMV prediction.

Findings

01

Achieves robust performance across multi-center ICU datasets.

02

Enhances model generalization with domain adaptation techniques.

03

Demonstrates competitive results in test-time adaptation benchmarks.

Abstract

Accurate prediction of the need for invasive mechanical ventilation (IMV) in intensive care unit (ICU) patients is crucial for timely interventions and resource allocation. However, variability in patient populations, clinical practices, and electronic health record (EHR) systems across institutions introduces domain shifts that degrade the generalization performance of predictive models during deployment. Test-Time Training (TTT) has emerged as a promising approach to mitigate such shifts by adapting models dynamically during inference without requiring labeled target-domain data. In this work, we introduce Adaptive Test-Time Training (AdaTTT), an enhanced TTT framework tailored for EHR-based IMV prediction in ICU settings. We begin by deriving information-theoretic bounds on the test-time prediction error and demonstrate that it is constrained by the uncertainty between the main and…
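
To make the general test-time training idea concrete, the following is a minimal sketch, not the AdaTTT method itself. It assumes a toy linear encoder `W` shared by a logistic main head (`v`, `b`) and a reconstruction head `U`, a fixed hypothetical feature mask `keep`, and random weights in place of a trained model. For one unlabeled test sample it takes a few gradient steps on a masked-reconstruction loss, predicts with the adapted encoder, and leaves the original weights untouched so the next patient starts from the clean model.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def masked_recon_loss(W, U, x, keep):
    """Self-supervised loss: reconstruct hidden features (keep == 0) from visible ones."""
    x_vis = x * keep
    err = (U @ (W @ x_vis) - x) * (1.0 - keep)
    return 0.5 * float(err @ err)

def masked_recon_grad(W, U, x, keep):
    """Gradient of masked_recon_loss with respect to the encoder weights W."""
    x_vis = x * keep
    err = (U @ (W @ x_vis) - x) * (1.0 - keep)
    return np.outer(U.T @ err, x_vis)

def predict_with_ttt(W, U, v, b, x, keep, steps=10, lr=0.5):
    """Adapt a copy of W on one unlabeled sample, then predict with the main head."""
    W_adapted = W.copy()  # the stored weights are never modified
    x_vis = x * keep
    # Upper bound on the gradient's Lipschitz constant; keeps each step stable.
    lipschitz = np.linalg.norm(U, 2) ** 2 * float(x_vis @ x_vis) + 1e-12
    for _ in range(steps):
        W_adapted -= (lr / lipschitz) * masked_recon_grad(W_adapted, U, x, keep)
    prob = sigmoid(v @ (W_adapted @ x) + b)
    return float(prob), W_adapted

# Toy setup: random weights stand in for a model trained on the source site.
rng = np.random.default_rng(0)
d, h = 6, 4
W = rng.normal(scale=0.3, size=(h, d))
U = rng.normal(scale=0.3, size=(d, h))
v = rng.normal(size=h)
b = 0.0
x = rng.normal(size=d)                              # one unlabeled test sample
keep = np.array([1, 1, 0, 1, 0, 1], dtype=float)    # hypothetical mask

W_before = W.copy()
loss_before = masked_recon_loss(W, U, x, keep)
prob, W_adapted = predict_with_ttt(W, U, v, b, x, keep)
loss_after = masked_recon_loss(W_adapted, U, x, keep)
```

Only the self-supervised loss is optimized at test time, so no target-domain labels are needed. Adapting a copy of the encoder is one simple way to get per-sample adaptation with a reset between patients.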

Peer Reviews

Decision·ICLR 2026 Poster

Reviewer 01 · Rating 4 · Confidence 3

Strengths

The strengths of the paper are as follows:
1. The paper addresses an important problem: predicting IMV need across hospitals with different EHR systems, patient populations, and clinical practices.
2. The novel dynamic masking strategy that adapts based on feature importance is an important contribution. Also, POT-based alignment is more flexible than rigid full-transport approaches and better suited to partial distribution shifts.
3. The information-theoretic bounds provide valuable insights

Weaknesses

The main weaknesses of the paper are as follows:
1. Paper presentation and writing need improvement. Specifically:
   - The prototype-guided adaptation component (Section 3.3.2) is not well-motivated. The rationale behind its inclusion is unclear, and the implementation details are insufficiently explained.
   - The methodology as a whole lacks clarity, largely due to the number of interconnected components. It would be helpful to include a comprehensive diagram or a formal algorithm to clearl

Reviewer 02 · Rating 6 · Confidence 4

Strengths

* This paper introduces a theoretically principled and practically motivated adaptation framework combining information theory, SSL, and transport alignment.
* This paper includes experiments across real EHR datasets with fair baseline comparisons and ablation studies.
* This paper is well structured and well written, with figures showing dynamic risk evolution and feature importance shifts.
* The experimental results demonstrate consistent improvement in both predictive accuracy and calibration under

Weaknesses

* The testing windows for Site A (Jan–Jun 2024) and Site B (Jan 2023–Aug 2024) partially overlap, which raises the possibility that similar patient populations, care protocols, or even duplicated encounters could appear in both test sets. Such overlap could inflate generalization performance by reducing the effective distribution shift the model faces.
* Although Appendix B.1 outlines inclusion criteria (≥ 5 hours ICU stay, no prior IMV, etc.), it does not explain how cohorts were sampled from e

Reviewer 03 · Rating 6 · Confidence 5

Strengths

- Info-theoretic error bound motivates SSL + prototype/POT.
- Source-free, per-patient test-time adaptation with bounded updates and reset-to-clean safeguard.
- Consistent multi-site improvements; improved calibration; expanded baselines and ablations.

Weaknesses

- Gains (~1%) may fall within noise; lacks decision-curve or threshold-level analysis.
- Vision-based baselines may not adapt fairly to tabular EHR; no EHR-transformer baseline.
- No subgroup audit; only tested on IMV/EHR; reproducibility details partly scattered.
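
For context on the decision-curve analysis requested above, the standard net-benefit quantity at a risk threshold p_t is TP/N − (FP/N) · p_t / (1 − p_t). The sketch below computes it in plain Python. The labels and predicted risks are made-up toy values, not results from the paper, and the function name `net_benefit` is illustrative.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of flagging patients whose predicted risk is >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds implied by the threshold.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)

# Toy data (hypothetical): 1 = patient went on to need IMV.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
y_prob = [0.9, 0.2, 0.7, 0.6, 0.1, 0.4, 0.3, 0.05]

nb_model = net_benefit(y_true, y_prob, threshold=0.5)
# "Treat all" reference strategy: flag every patient regardless of risk.
nb_treat_all = net_benefit(y_true, [1.0] * len(y_true), threshold=0.5)
```

Plotting net benefit over a range of thresholds for each model, alongside the "treat all" and "treat none" (net benefit 0) strategies, gives the decision curve. A small AUROC gain matters clinically only if it shifts this curve at thresholds clinicians would actually use.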

Taxonomy

Topics: Machine Learning in Healthcare · Respiratory Support and Mechanisms · Sepsis Diagnosis and Treatment