Identifying Higher-order Combinations of Binary Features

Felipe Llinares; Mahito Sugiyama; Karsten M. Borgwardt

arXiv:1407.1176·stat.ML·July 7, 2014

Open Access

TL;DR

This paper improves methods for detecting significant higher-order interactions among binary features in high-dimensional data, demonstrating a scalable approach that is much faster than previous methods.

Contribution

It introduces strategies to accelerate Terada et al.'s approach, especially incremental search with early stopping, and thoroughly evaluates them on real-world datasets.

Findings

01 Incremental search with early stopping is orders of magnitude faster than the previous state-of-the-art approach.

02 The proposed methods scale well to large datasets.

03 Evaluation on 11 real-world benchmark datasets confirms the efficiency gains.

Abstract

Finding statistically significant interactions between binary variables is computationally and statistically challenging in high-dimensional settings because of the combinatorial explosion in the number of hypotheses. Terada et al. recently showed how to elegantly address this multiple testing problem by excluding non-testable hypotheses. Still, it remains unclear how their approach scales to large datasets. Here we propose strategies to speed up the approach by Terada et al. and evaluate them thoroughly on 11 real-world benchmark datasets. We observe that one approach, incremental search with early stopping, is orders of magnitude faster than the current state-of-the-art approach.
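
To make the notion of a "non-testable hypothesis" concrete, the sketch below computes the smallest p-value a feature combination could ever attain. It assumes a one-sided Fisher's exact test on a 2x2 table, which the abstract does not specify, and the names `min_attainable_pvalue` and `is_testable` are hypothetical and not taken from the paper. Under that assumption, a combination occurring in x of n samples, with n1 samples in the positive class and x <= n1, is most significant when all x occurrences fall in the positive class, which gives C(n1, x) / C(n, x).

```python
from math import comb


def min_attainable_pvalue(n: int, n1: int, x: int) -> float:
    """Smallest one-sided Fisher p-value for a combination with support x.

    n  : total number of samples
    n1 : number of samples in the positive class
    x  : number of samples containing the combination (assumed x <= n1)
    """
    if not (0 <= x <= n1 <= n):
        raise ValueError("this sketch assumes 0 <= x <= n1 <= n")
    # Most extreme table: every occurrence lies in the positive class.
    return comb(n1, x) / comb(n, x)


def is_testable(n: int, n1: int, x: int, threshold: float) -> bool:
    """A hypothesis that cannot reach the threshold can never be significant."""
    return min_attainable_pvalue(n, n1, x) <= threshold


# Illustrative numbers only: 20 samples, 10 of them positive.
rare = min_attainable_pvalue(20, 10, 1)      # 10 / 20
frequent = min_attainable_pvalue(20, 10, 5)  # 252 / 15504
print(rare, is_testable(20, 10, 1, 0.05))
print(frequent, is_testable(20, 10, 5, 0.05))
```

Combinations whose minimum attainable p-value exceeds the corrected threshold can be dropped from the multiple testing correction without affecting error control, which is why rare combinations need not be counted.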

Taxonomy

Topics: Algorithms and Data Compression · Data Mining Algorithms and Applications · Machine Learning and Data Classification
