Cohort-Based Active Modality Acquisition
Tillmann Rheude, Roland Eils, Benjamin Wild

TL;DR
This paper introduces a cohort-based active modality acquisition method that selects which samples in a multimodal dataset should have a missing, costly modality acquired under a budget, improving resource efficiency in real-world applications.
Contribution
It proposes a novel test-time, cohort-level modality acquisition setting (CAMA) with imputation-based acquisition strategies and upper-bound heuristics for benchmarking, and demonstrates effectiveness on large-scale datasets including UK Biobank.
Findings
Imputation-based strategies outperform entropy-based and random methods in guiding modality acquisition.
Experiments cover datasets with up to 15 modalities.
The strategies successfully guide proteomics data acquisition for disease prediction in UK Biobank.
Abstract
Real-world multimodal machine learning often faces missing, costly-to-acquire modalities, raising the problem of which samples to prioritize for additional acquisition under a budget. Prior work mainly studies per-sample or training-time acquisition, while test-time, cohort-level acquisition is less explored. We propose Cohort-based Active Modality Acquisition (CAMA), a novel test-time cohort-level modality acquisition setting, and introduce imputation-based acquisition strategies that estimate the expected utility of acquiring a missing modality, along with upper-bound heuristics for benchmarking. Experiments on datasets with up to 15 modalities demonstrate that our proposed imputation-based strategies can more effectively guide the acquisition of an additional modality for selected samples compared with methods relying solely on pre-acquisition information, entropy-based guidance, or…
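To make the cohort-level setting concrete, the following is a minimal sketch of one way an imputation-based acquisition score could be computed and used under a budget. It is an illustration under stated assumptions, not the authors' implementation: the function names `expected_shift_scores` and `select_for_acquisition`, the binary-classification setup, the toy numbers, and the choice of "mean absolute shift in predicted probability" as a proxy for expected utility are all hypothetical. It assumes that, for each sample, a model already provides a prediction from the available modalities and several predictions obtained after imputing the missing modality.

```python
import numpy as np


def expected_shift_scores(p_unimodal, p_imputed):
    """Score each sample by how much imputed-modality predictions move.

    p_unimodal: shape (n,), predicted probability from available modalities.
    p_imputed:  shape (n, k), predicted probabilities after k imputed draws
                of the missing modality (hypothetical inputs).
    Returns shape (n,): mean absolute shift, used here as an assumed proxy
    for the expected utility of acquiring the modality.
    """
    p_unimodal = np.asarray(p_unimodal, dtype=float)
    p_imputed = np.asarray(p_imputed, dtype=float)
    return np.abs(p_imputed - p_unimodal[:, None]).mean(axis=1)


def select_for_acquisition(scores, budget):
    """Return indices of the `budget` highest-scoring samples in the cohort."""
    scores = np.asarray(scores, dtype=float)
    budget = min(budget, scores.shape[0])
    # Stable sort on negated scores: descending order, ties keep cohort order.
    order = np.argsort(-scores, kind="stable")
    return order[:budget]


# Toy cohort of four samples with three imputed draws each (made-up values).
p_unimodal = np.array([0.50, 0.90, 0.20, 0.55])
p_imputed = np.array([
    [0.10, 0.90, 0.80],
    [0.88, 0.92, 0.90],
    [0.25, 0.15, 0.20],
    [0.70, 0.40, 0.60],
])

scores = expected_shift_scores(p_unimodal, p_imputed)
selected = select_for_acquisition(scores, budget=2)
print(selected)
```

In this sketch, ranking happens over the whole cohort at once, so the budget is spent on the samples whose predictions are expected to change most, rather than on a per-sample decision made in isolation. An entropy-based or random baseline would replace only the scoring function.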
