# Explanation Beyond Individual Features: Instance-wise Feature Grouping for EHR Predictive Analytics

**Authors:** Chin Wang Cheong, Kejing Yin, William K. Cheung, Ivor Tsang

PMC · DOI: 10.1007/s41666-025-00222-8 · 2025-11-06

## TL;DR

This paper introduces FlexGPC, an instance-wise feature grouping method that improves both feature selection stability and downstream prediction accuracy for clinical prediction models built on electronic health records (EHR).

## Contribution

FlexGPC learns flexible representations of feature groups and flexible combinations of those groups, implemented using neural networks, to achieve robust and stable instance-wise feature selection in EHR predictive analytics.

## Key findings

- FlexGPC outperforms all state-of-the-art baselines in both prediction accuracy and feature selection stability.
- The identified feature groups are potential phenotypes, so computational phenotyping is achieved at the same time.
- Experiments used real-world EHR data for two downstream tasks: mortality prediction and next-admission diagnosis prediction.
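
To make "feature selection stability" concrete, here is a minimal sketch of one common proxy: the mean pairwise Jaccard similarity between the feature sets selected for the same instance across repeated runs. This summary does not state the stability metric the paper uses, so the metric choice, the function names `jaccard` and `selection_stability`, and the toy selections are all assumptions made for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two feature-index sets (two empty sets count as identical)."""
    a, b = set(a), set(b)
    union = a | b
    return 1.0 if not union else len(a & b) / len(union)


def selection_stability(selections):
    """Mean pairwise Jaccard similarity over a list of selected-feature sets."""
    pairs = list(combinations(selections, 2))
    if not pairs:
        return 1.0  # a single selection is trivially stable
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical selections for one patient across three training runs.
stable = [{0, 1, 2}, {0, 1, 2}, {0, 1, 3}]
unstable = [{0, 1, 2}, {3, 4, 5}, {6, 7, 8}]

stable_score = selection_stability(stable)
unstable_score = selection_stability(unstable)
```

A score near 1 means the runs mostly agree on which features matter for that patient; a score near 0 means the explanation changes from run to run.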

## Abstract

Identifying relevant input features that contribute to the output of a clinical prediction model can enhance the model's explainability. To make the explanations more personalized, instance-wise feature selection (IWFS) methods can be adopted, where features are selected specifically for each input instance. Existing IWFS methods often grapple with feature selection instability, and thus yield precarious interpretations. Because the relevant features of different instances in a dataset do overlap, feature grouping techniques have been proposed to regularize the selection, but often at the expense of downstream prediction accuracy. To this end, we propose a novel instance-wise feature grouping method called FlexGPC that achieves robust and stable selection by learning (i) flexible representations of feature groups and (ii) flexible combinations of feature groups implemented using neural networks. To evaluate the effectiveness of FlexGPC, we explore various feature group combination schemes and conduct extensive experiments comparing performance on real-world electronic health records (EHR) data. Our experimental results show that FlexGPC outperforms all state-of-the-art (SOTA) baselines in terms of accuracy and feature selection stability for both the downstream mortality prediction and next-admission diagnosis prediction tasks. We also illustrate that computational phenotyping can be achieved at the same time, with the identified feature groups serving as potential phenotypes.
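
The idea of selecting features through shared groups can be sketched in a few lines. The example below is not FlexGPC itself: the abstract says the group combination is learned with neural networks, whereas this sketch swaps in a fixed noisy-OR rule so it can run without training. The names `combine_groups` and `apply_mask`, the soft membership matrix, and the toy values are all hypothetical.

```python
def combine_groups(group_weights, membership):
    """Turn per-instance group weights into a per-feature mask with a noisy-OR.

    group_weights: K values in [0, 1], how strongly this instance selects each group.
    membership:    K rows of D values in [0, 1], soft membership of each feature in each group.
    A feature is kept to the degree that at least one selected group contains it.
    """
    n_features = len(membership[0])
    mask = []
    for d in range(n_features):
        prob_dropped = 1.0
        for w, row in zip(group_weights, membership):
            prob_dropped *= 1.0 - w * row[d]
        mask.append(1.0 - prob_dropped)
    return mask


def apply_mask(x, mask):
    """Element-wise masking of an instance's feature vector."""
    return [xi * mi for xi, mi in zip(x, mask)]


# Two hypothetical feature groups over four features; feature 1 belongs to both.
membership = [
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.5, 1.0, 0.0],
]

# Each patient picks groups, not individual features.
patient_a_mask = combine_groups([1.0, 0.0], membership)
patient_b_mask = combine_groups([0.0, 1.0], membership)
masked_b = apply_mask([2.0, 4.0, 6.0, 8.0], patient_b_mask)
```

Because every instance draws its mask from the same small set of groups, instances that select the same groups receive the same features, which is the regularizing effect that feature grouping is meant to provide. In this toy setup each row of `membership` would play the role of a candidate phenotype.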

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12873021/full.md

---
Source: https://tomesphere.com/paper/PMC12873021