An Experiment on Feature Selection using Logistic Regression
Raisa Islam, Subhasish Mazumdar, Rakibul Islam

TL;DR
This study compares feature selection methods based on L1 and L2 regularization in logistic regression, demonstrating that their combined feature set performs similarly to individual methods across different models and metrics.
Contribution
It synthesizes L1 and L2 regularization results for feature selection and evaluates their effectiveness on complex models using a large cybersecurity dataset.
Findings
No significant accuracy difference between L1 and L2 feature selection methods.
Combined features from L1 and L2 yield comparable performance on complex models.
Standard metrics confirm similar effectiveness of the combined feature set.
Abstract
In supervised machine learning, feature selection plays an important role: it can enhance explainability and improve performance, as measured by computing time and accuracy-related metrics. In this paper, we investigate a method for feature selection based on the well-known L1 and L2 regularization strategies associated with logistic regression (LR). The learned coefficients, which serve as weights, can be used to rank the features. Our approach is to synthesize the findings of L1 and L2 regularization. For our experiment, we chose the CIC-IDS2018 dataset, owing partly to its size and partly to the existence of two problematic classes that are hard to separate. We report results first with one of these classes excluded and then with it included. We ranked features first with L1 and then with L2, and then compared logistic regression with L1 (LR+L1) against that with L2…
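The coefficient-based ranking described in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation. The synthetic data, the penalty strength `lam`, the cutoff `k`, and the use of a plain union to combine the L1 and L2 rankings are all assumptions made for illustration, since the excerpt does not state how the two rankings are synthesized. The helper names (`fit_logreg`, `rank_features`, `soft_threshold`) are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    # Proximal operator of the L1 penalty: shrink toward zero, clip at zero
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fit_logreg(X, y, penalty, lam=0.05, lr=0.1, epochs=300):
    """Batch (proximal) gradient descent for binary LR; returns the weights."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j] / n
            grad_b += err / n
        if penalty == "l2":
            # Ridge term (lam / 2) * ||w||^2 adds lam * w to the gradient
            w = [wj - lr * (gj + lam * wj) for wj, gj in zip(w, grad_w)]
        elif penalty == "l1":
            # Gradient step on the loss, then soft-threshold for lam * ||w||_1
            w = [soft_threshold(wj - lr * gj, lr * lam)
                 for wj, gj in zip(w, grad_w)]
        else:
            raise ValueError("penalty must be 'l1' or 'l2'")
        b -= lr * grad_b
    return w


def rank_features(w):
    # Feature indices ordered by decreasing |coefficient|
    return sorted(range(len(w)), key=lambda j: abs(w[j]), reverse=True)


# Synthetic stand-in data (NOT CIC-IDS2018): only features 0 and 1 carry signal
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(200)]
y = [1 if 3 * x[0] - 2 * x[1] + 0.5 * rng.gauss(0, 1) > 0 else 0 for x in X]

k = 3  # hypothetical cutoff, not taken from the paper
top_l1 = rank_features(fit_logreg(X, y, "l1"))[:k]
top_l2 = rank_features(fit_logreg(X, y, "l2"))[:k]
combined = sorted(set(top_l1) | set(top_l2))  # union: an assumed synthesis rule
print("L1 top-k:", top_l1, "L2 top-k:", top_l2, "combined:", combined)
```

Coefficient magnitudes are comparable across features only when the features share a common scale, which is why the synthetic features here are all drawn from a standard normal distribution. On real data, standardizing the features before fitting would serve the same purpose.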
Taxonomy
Topics: Face and Expression Recognition
Methods: Sparse Evolutionary Training · Feature Selection · Logistic Regression
