Sparsity-based Feature Selection for Anomalous Subgroup Discovery
Girmaw Abebe Tadesse, William Ogallo, Catherine Wanjiru, Charles Wachira, Isaiah Onando Mulang', Vibha Anand, Aisha Walcott-Bryant, Skyler Speakman

TL;DR
This paper introduces SAFS, a scalable, model-agnostic feature selection framework based on sparsity, which improves anomalous subgroup discovery by reducing computation time and enhancing detection performance.
Contribution
The paper presents a novel sparsity-based feature selection method that is scalable, model-agnostic, and improves efficiency and accuracy in anomalous subgroup detection.
Findings
SAFS reduces computation time by more than a factor of 3.
SAFS maintains detection performance comparable to existing methods.
SAFS outperforms multiple baseline feature selection techniques.
Abstract
Anomalous pattern detection aims to identify instances where deviation from normalcy is evident, and is widely applicable across domains. Multiple anomaly detection techniques have been proposed in the state of the art. However, a principled and scalable feature selection method for efficient discovery is commonly lacking. Existing feature selection techniques typically optimize the performance of outcome prediction rather than capture systemic deviations of the outcome from the expected. In this paper, we propose a sparsity-based automated feature selection (SAFS) framework, which encodes systemic outcome deviations via the sparsity of feature-driven odds ratios. SAFS is a model-agnostic approach that can be used with different discovery techniques. SAFS achieves more than a 3× reduction in computation time while maintaining detection performance when validated on publicly…
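The idea of scoring features by the sparsity of their odds ratios can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the text above: the one-vs-rest definition of each value's odds ratio, the additive smoothing constant, the use of the Gini index as the sparsity measure, ranking higher-sparsity features first, and the function names `odds_ratios`, `gini_sparsity`, and `rank_features`. It assumes categorical features and a binary outcome.

```python
from collections import defaultdict


def odds_ratios(values, outcomes, smoothing=0.5):
    """Odds ratio of a positive outcome for each distinct value of one feature.

    Each value is compared against all remaining rows (one-vs-rest).
    `smoothing` is added to every cell to avoid division by zero
    (an illustrative choice, not specified by the source).
    """
    pos = defaultdict(int)
    neg = defaultdict(int)
    for v, y in zip(values, outcomes):
        if y:
            pos[v] += 1
        else:
            neg[v] += 1
    total_pos = sum(pos.values())
    total_neg = sum(neg.values())
    ratios = {}
    for v in set(pos) | set(neg):
        a = pos[v] + smoothing              # value v, positive outcome
        b = neg[v] + smoothing              # value v, negative outcome
        c = total_pos - pos[v] + smoothing  # other values, positive outcome
        d = total_neg - neg[v] + smoothing  # other values, negative outcome
        ratios[v] = (a * d) / (b * c)
    return ratios


def gini_sparsity(xs):
    """Gini index of a non-negative vector: 0 when all entries are equal,
    approaching 1 when a single entry dominates."""
    xs = sorted(xs)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum(x * (n - k + 0.5) for k, x in enumerate(xs, start=1))
    return 1.0 - 2.0 * weighted / (n * total)


def rank_features(rows, outcome_key):
    """Rank features by the sparsity of their per-value odds ratios.

    `rows` is a list of dicts with identical keys; the highest-scoring
    feature comes first (assumed ordering).
    """
    outcomes = [row[outcome_key] for row in rows]
    features = [k for k in rows[0] if k != outcome_key]
    scores = {}
    for f in features:
        ratios = odds_ratios([row[f] for row in rows], outcomes)
        scores[f] = gini_sparsity(ratios.values())
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Under these assumptions, a feature whose values all have similar odds ratios scores near 0, while a feature with one value that deviates strongly from the rest scores higher. Keeping only the top-ranked features before running a subgroup discovery method would then shrink the search space, which is the kind of efficiency gain the abstract describes.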
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Data-Driven Disease Surveillance · Network Security and Intrusion Detection
Methods: Feature Selection
