Maximally informative feature selection using Information Imbalance: Application to COVID-19 severity prediction
Romina Wild, Emanuela Sozio, Riccardo G. Margiotta, Fabiana Dellai, Angela Acquasanta, Fabio Del Ben, Carlo Tascini, Francesco Curcio, Alessandro Laio

TL;DR
This paper introduces a method using Information Imbalance to select the most informative clinical features for COVID-19 severity prediction, effectively handling heterogeneous, categorical, and incomplete data.
Contribution
It adapts the Information Imbalance technique for clinical data, enabling automatic, robust feature selection that accounts for feature correlation and data incompleteness.
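As a rough illustration of the underlying quantity, the sketch below computes the Information Imbalance from a feature space A to a target space B in its standard rank-based form: 2/N times the average rank, in B, of each point's nearest neighbour in A. Values near 0 suggest that A predicts B well; values near 1 or above suggest it does not. This is only a minimal sketch under stated assumptions: complete numeric features, Euclidean distances, and no special treatment of ties. The function names and the toy data are hypothetical, and the paper's adaptations for categorical and incomplete features are not reproduced here.

```python
import math


def distance_ranks(points):
    """For each point i, map every other point j to its neighbour rank (1 = nearest)."""
    n = len(points)
    ranks = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # Assumption: Euclidean distance; ties are broken by index order.
        others.sort(key=lambda j: math.dist(points[i], points[j]))
        ranks.append({j: r for r, j in enumerate(others, start=1)})
    return ranks


def information_imbalance(space_a, space_b):
    """Delta(A -> B): (2/N) * mean rank in B of each point's nearest neighbour in A."""
    n = len(space_a)
    ranks_a = distance_ranks(space_a)
    ranks_b = distance_ranks(space_b)
    total = 0
    for i in range(n):
        nearest_in_a = min(ranks_a[i], key=ranks_a[i].get)  # point with rank 1 in A
        total += ranks_b[i][nearest_in_a]
    return 2.0 * total / (n * n)


# Toy data (made up): one feature that determines the target, one unrelated feature.
x = [0.0, 1.1, 2.3, 3.2, 4.6, 5.1, 6.7, 7.4]
noise = [3.3, 0.2, 6.1, 1.9, 7.7, 4.0, 0.9, 5.5]
target = [(2 * v + 1,) for v in x]

imbalance_informative = information_imbalance([(v,) for v in x], target)
imbalance_noise = information_imbalance([(v,) for v in noise], target)
print("informative feature -> target:", imbalance_informative)
print("unrelated feature   -> target:", imbalance_noise)
```

A feature-selection loop in this spirit would compare candidate subsets by their imbalance towards the target space and keep the subset with the lowest value; how the paper searches over subsets is not specified in this summary.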
Findings
Selected features are measurable at admission.
Optimal feature set size is 10-15 features.
Method outperforms traditional feature selection approaches.
Abstract
Clinical databases typically include, for each patient, many heterogeneous features, for example blood exams, the clinical history before the onset of the disease, the evolution of the symptoms, the results of imaging exams, and many others. We here propose to exploit a recently developed statistical approach, the Information Imbalance, to compare different subsets of patient features, and automatically select the set of features which is maximally informative for a given clinical purpose, especially in minority classes. We adapt the Information Imbalance approach to work in a clinical framework, where patient features are often categorical and are generally available only for a fraction of the patients. We apply this algorithm to a data set of ~1,300 patients treated for COVID-19 in Udine hospital before October 2021. Using this approach, we find combinations of features which, if…
Taxonomy
Topics: COVID-19 diagnosis using AI · Anomaly Detection Techniques and Applications · Artificial Intelligence in Healthcare
