Automated Supervised Feature Selection for Differentiated Patterns of Care
Catherine Wanjiru, William Ogallo, Girmaw Abebe Tadesse, Charles Wachira, Isaiah Onando Mulang', Aisha Walcott-Bryant

TL;DR
This paper presents an automated feature selection pipeline combining multiple techniques to improve the detection of anomalous care patterns, emphasizing the importance of data distribution in selecting features.
Contribution
The study introduces a comprehensive feature selection pipeline for Differentiated Patterns of Care, integrating filters, wrappers, and embedded methods, and evaluates their effectiveness in anomaly detection.
Findings
Feature selection improves anomaly detection performance.
Data distribution influences the choice of feature selection technique.
Selected features enhance the identification of anomalous care subpopulations.
Abstract
An automated feature selection pipeline was developed using several state-of-the-art feature selection techniques to select optimal features for Differentiating Patterns of Care (DPOC). The pipeline included three types of feature selection techniques (filters, wrappers, and embedded methods) to select the top K features. Five datasets with binary dependent variables were used, and the top K optimal features were selected for each. The selected features were tested in the existing multi-dimensional subset scanning (MDSS) pipeline, where the most anomalous subpopulations, most anomalous subsets, propensity scores, and effect of measures were recorded to assess performance. This performance was compared with the same four metrics obtained when all covariates in the dataset were used in the MDSS pipeline. We found that despite the different feature selection techniques used, the data…
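To make the "top K features" step concrete, the following is a minimal sketch of a filter-style selector only, not the authors' pipeline. It ranks each feature by the absolute Pearson correlation with a binary target and keeps the K highest. The scoring rule, the function names (`pearson`, `top_k_filter`), the feature names, and the synthetic data are all assumptions made for illustration; the abstract does not say which filter scores were used.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def top_k_filter(columns, target, k):
    """Score every feature independently of any model (a filter method)
    and return the k best feature names together with all scores."""
    scores = {name: abs(pearson(vals, target)) for name, vals in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Synthetic data with a binary dependent variable (hypothetical features).
rng = random.Random(0)
n = 500
target = [rng.randint(0, 1) for _ in range(n)]
columns = {
    "signal_strong": [y + rng.gauss(0, 0.3) for y in target],  # closely tied to target
    "signal_weak": [y + rng.gauss(0, 2.0) for y in target],    # loosely tied to target
    "noise_a": [rng.gauss(0, 1) for _ in range(n)],            # unrelated to target
    "noise_b": [rng.gauss(0, 1) for _ in range(n)],            # unrelated to target
}

selected, scores = top_k_filter(columns, target, k=2)
print(selected)
```

Wrapper and embedded methods differ from this sketch in that they involve a fitted model: a wrapper repeatedly trains a model on candidate feature subsets, while an embedded method reads feature importance from the model's own training (for example, coefficients under an L1 penalty).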
Taxonomy
Topics: Data-Driven Disease Surveillance · Pneumonia and Respiratory Infections · Bayesian Methods and Mixture Models
Methods: Feature Selection · Partition Filter Network
