Identifying Higher-order Combinations of Binary Features
Felipe Llinares, Mahito Sugiyama, Karsten M. Borgwardt

TL;DR
This paper speeds up the detection of statistically significant higher-order interactions among binary features in high-dimensional data. Its best strategy, incremental search with early stopping, is orders of magnitude faster than the previous state-of-the-art approach.
Contribution
It introduces strategies to accelerate Terada et al.'s approach to multiple testing, most notably incremental search with early stopping, and evaluates them thoroughly on 11 real-world benchmark datasets.
Findings
Incremental search with early stopping is orders of magnitude faster than the current state-of-the-art approach.
The proposed methods scale well to large datasets.
Evaluation on 11 real-world datasets confirms efficiency gains.
Abstract
Finding statistically significant interactions between binary variables is computationally and statistically challenging in high-dimensional settings, due to the combinatorial explosion in the number of hypotheses. Terada et al. recently showed how to elegantly address this multiple testing problem by excluding non-testable hypotheses. Still, it remains unclear how their approach scales to large datasets. Here we propose strategies to speed up the approach by Terada et al. and evaluate them thoroughly on 11 real-world benchmark datasets. We observe that one approach, incremental search with early stopping, is orders of magnitude faster than the current state-of-the-art approach.
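The idea of a non-testable hypothesis mentioned in the abstract can be illustrated with a small sketch. It assumes a one-sided Fisher's exact test, which this summary does not specify, and the function names `min_attainable_p` and `is_testable` are hypothetical. A feature combination that occurs in only a few samples has a smallest attainable p-value that may already exceed the corrected significance threshold, so it can never be significant and need not be counted in the multiple testing correction. This is not the paper's search procedure, only the testability criterion it builds on.

```python
from math import comb


def min_attainable_p(x: int, n1: int, n: int) -> float:
    """Smallest one-sided Fisher p-value for a combination present in x of n
    samples, when n1 of the n samples belong to the positive class.

    Assumption: the most extreme table puts as many occurrences as possible
    into the positive class, and its hypergeometric probability is the
    smallest p-value the combination can ever reach.
    """
    a_max = min(x, n1)  # occurrences that can fall in the positive class
    return comb(n1, a_max) * comb(n - n1, x - a_max) / comb(n, x)


def is_testable(x: int, n1: int, n: int, alpha: float, num_hypotheses: int) -> bool:
    """A hypothesis is testable if its minimum attainable p-value can fall
    below the Bonferroni-style threshold alpha / num_hypotheses."""
    return min_attainable_p(x, n1, n) <= alpha / num_hypotheses


if __name__ == "__main__":
    n, n1 = 20, 10  # 20 samples, 10 of them in the positive class
    for x in (1, 5, 10):
        p = min_attainable_p(x, n1, n)
        print(x, p, is_testable(x, n1, n, alpha=0.05, num_hypotheses=100))
```

In this toy setting a combination seen in a single sample can never reach a p-value below 0.5, so it is excluded regardless of how the samples are labeled.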
Taxonomy
Topics: Algorithms and Data Compression · Data Mining Algorithms and Applications · Machine Learning and Data Classification
