Safe Feature Pruning for Sparse High-Order Interaction Models
Kazuya Nakagawa, Shinya Suzumura, Masayuki Karasuyama, Koji Tsuda, Ichiro Takeuchi

TL;DR
This paper introduces a novel, efficient safe feature pruning algorithm for LASSO-based high-order interaction models, significantly reducing computational costs by exploiting tree structures to screen out inactive features before training.
Contribution
It proposes a new safe feature pruning rule leveraging tree structures to efficiently handle extremely large high-order interaction feature sets in sparse learning.
Findings
Can handle 3rd order interactions of 10,000 covariates
Reduces computational cost and memory requirements
Enables working with over 10^{12} features
Abstract
Taking into account high-order interactions among covariates is valuable in many practical regression problems. This is, however, a computationally challenging task because the number of high-order interaction features to be considered would be extremely large unless the number of covariates is sufficiently small. In this paper, we propose a novel efficient algorithm for LASSO-based sparse learning of such high-order interaction models. Our basic strategy for reducing the number of features is to employ the idea of the recently proposed safe feature screening (SFS) rule. An SFS rule has the property that, if a feature satisfies the rule, then the feature is guaranteed to be non-active in the LASSO solution, meaning that it can be safely screened out prior to the LASSO training process. If a large number of features can be screened out before training the LASSO, the computational cost and the…
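The screening idea described in the abstract can be illustrated with a small sketch. The abstract above is truncated and does not state the paper's actual rule, so the code below substitutes the basic SAFE test of El Ghaoui et al. (discard feature x when |x^T y| + ||x|| ||y|| (lam_max - lam) / lam_max < lam) and combines it with a simple subtree bound that holds for binary covariates. The names stats, column, and tree_screen, the toy data, and the choice lam = 0.8 * lam_max are all assumptions made for illustration, not the authors' implementation.

```python
import math
import random
from itertools import combinations

def stats(col, y):
    """Positive part, negative part, and Euclidean norm for a 0/1 column."""
    pos = sum(yi for yi, c in zip(y, col) if c and yi > 0)
    neg = -sum(yi for yi, c in zip(y, col) if c and yi < 0)
    return pos, neg, math.sqrt(sum(col))

def column(X, S):
    """Interaction feature for index set S: product of binary covariates."""
    return [int(all(row[j] for j in S)) for row in X]

def tree_screen(X, y, lam, lam_max, max_order=3):
    """Depth-first search over index sets, pruning whole subtrees.

    Assumes binary X and a LASSO without intercept. Adding a covariate
    to S can only turn ones into zeros, so for every superset S' of S:
    |x_S'^T y| <= max(pos, neg) and ||x_S'|| <= ||x_S||.
    """
    d = len(X[0])
    r = math.sqrt(sum(v * v for v in y)) * (lam_max - lam) / lam_max
    kept, visited = [], 0

    def visit(S, col):
        nonlocal visited
        visited += 1
        pos, neg, norm = stats(col, y)
        if max(pos, neg) + norm * r < lam:
            return  # S and all of its supersets pass the SAFE test
        if abs(pos - neg) + norm * r >= lam:
            kept.append(S)  # S itself cannot be discarded
        if len(S) < max_order:
            for j in range(S[-1] + 1, d):
                visit(S + (j,), [c * row[j] for c, row in zip(col, X)])

    for j in range(d):
        visit((j,), [row[j] for row in X])
    return kept, visited

# Hypothetical toy data: 40 samples, 8 binary covariates.
rng = random.Random(0)
X = [[rng.randint(0, 1) for _ in range(8)] for _ in range(40)]
y = [2.0 * row[0] * row[1] - 1.5 * row[2] + rng.gauss(0, 0.1) for row in X]
mean = sum(y) / len(y)
y = [v - mean for v in y]  # center the response

# Brute force is used here only to obtain lam_max for the small demo.
all_sets = [S for k in (1, 2, 3) for S in combinations(range(8), k)]
lam_max = max(abs(p - q) for p, q, _ in (stats(column(X, S), y) for S in all_sets))
lam = 0.8 * lam_max

kept, visited = tree_screen(X, y, lam, lam_max)
print("candidates:", len(all_sets), "visited:", visited, "kept:", len(kept))
```

The point of the tree bound is that a single check at a node can discard every higher-order feature beneath it without constructing those features. In this sketch lam_max is found by enumerating all features, which would not be feasible at the scale the paper targets; the demo does so only to keep the example short.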
Taxonomy
Topics: Machine Learning and Data Classification · Anomaly Detection Techniques and Applications · Domain Adaptation and Few-Shot Learning
