TL;DR
This paper introduces a novel curvature-based feature selection method for electronic health records, improving classification performance by effectively reducing dimensionality in high-dimensional, unstructured healthcare data.
Contribution
The paper proposes a new filter-based feature selection method using Menger Curvature, demonstrating superior performance over PCA and recent approaches on multiple EHR datasets.
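For readers unfamiliar with the quantity, Menger curvature of three points is the reciprocal of the radius of the circle passing through them: 4 times the triangle's area divided by the product of its side lengths. The snippet below is a minimal sketch of that definition only. The function name `menger_curvature`, the restriction to 2-D points, and the choice to return 0 for coincident points are assumptions made for illustration; how the paper turns curvature values into a feature ranking is not described in this summary and is not reproduced here.

```python
import numpy as np


def menger_curvature(p, q, r):
    """Menger curvature of three 2-D points: 4 * area / (|pq| * |qr| * |rp|)."""
    p, q, r = (np.asarray(v, dtype=float) for v in (p, q, r))

    # Side lengths of the triangle (p, q, r).
    a = np.linalg.norm(q - p)
    b = np.linalg.norm(r - q)
    c = np.linalg.norm(p - r)

    # Assumed convention: coincident points give curvature 0.
    if a == 0.0 or b == 0.0 or c == 0.0:
        return 0.0

    # The 2-D cross product equals twice the signed triangle area,
    # so 4 * area = 2 * |cross|.
    cross = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return 2.0 * abs(cross) / (a * b * c)


if __name__ == "__main__":
    # Three points on the unit circle and three collinear points.
    print(menger_curvature((1, 0), (0, 1), (-1, 0)))
    print(menger_curvature((0, 0), (1, 1), (2, 2)))
```

Points lying on a circle of radius R give curvature 1/R, and collinear points give 0, which is what makes the quantity a natural measure of how sharply a sequence of values bends.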
Findings
Achieved state-of-the-art classification accuracy on four EHR datasets.
Outperformed PCA and recent feature selection methods.
Source code is publicly available for reproducibility.
Abstract
Disruptive technologies provide unparalleled opportunities to contribute to the identification of many aspects of pervasive healthcare, from the adoption of the Internet of Things through to Machine Learning (ML) techniques. As a powerful tool, ML has been widely applied in patient-centric healthcare solutions. To further improve the quality of patient care, Electronic Health Records (EHRs) are commonly adopted in healthcare facilities for analysis. Applying AI and ML to analyse those EHRs for prediction and diagnostics is a crucial task due to their highly unstructured, unbalanced, incomplete, and high-dimensional nature. Dimensionality reduction is a common data preprocessing technique for coping with high-dimensional EHR data; it aims to reduce the number of features in the EHR representation while improving the performance of the subsequent data analysis, e.g. classification. In…
Taxonomy
Methods: Feature Selection · Principal Components Analysis
