Random Intersection Trees

Rajen Dinesh Shah; Nicolai Meinshausen

arXiv:1303.6223 · stat.ML · April 27, 2016 · J. Mach. Learn. Res. · 1 citation

TL;DR

Random Intersection Trees is a method for efficiently finding high-order interactions in high-dimensional binary data for classification. It works top-down: it starts from the interaction containing all variables and removes variables that are absent from randomly chosen observations of the class of interest, retaining informative interactions with high probability.

Contribution

The paper introduces Random Intersection Trees, an algorithm for detecting interactions between binary predictor variables in high-dimensional classification data, with lower computational complexity than brute-force search.

Findings

01

Retains informative interactions with high probability

02

Computational complexity is of order $p^{κ}$, where $κ$ can be as low as 1 for very sparse data

03

Uses min-wise hashing to further reduce costs

Abstract

Finding interactions between variables in large and high-dimensional datasets is often a serious computational challenge. Most approaches build up interaction sets incrementally, adding variables in a greedy fashion. The drawback is that potentially informative high-order interactions may be overlooked. Here, we propose an alternative approach for classification problems with binary predictor variables, called Random Intersection Trees. It works by starting with a maximal interaction that includes all variables, and then gradually removing variables if they fail to appear in randomly chosen observations of a class of interest. We show that informative interactions are retained with high probability, and the computational complexity of our procedure is of order $p^{κ}$ for a value of $κ$ that can reach values as low as 1 for very sparse data; in many more general settings, it…
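
The top-down procedure described in the abstract can be illustrated with a minimal Python sketch. This is a sketch under stated assumptions, not the authors' implementation: the function names, the fixed `depth` and `branching` parameters, the number of trees, and the representation of each observation as a `frozenset` of active variable indices are all choices made here for illustration. It covers only the intersection step and leaves out any comparison against the other class and the min-wise hashing mentioned in the findings.

```python
import random
from typing import FrozenSet, Sequence, Set


def random_intersection_tree(
    observations: Sequence[FrozenSet[int]],
    n_variables: int,
    depth: int,
    branching: int,
    rng: random.Random,
) -> Set[FrozenSet[int]]:
    """Grow one tree and return the non-empty variable sets at its leaves.

    `observations` holds, for each observation in the class of interest,
    the set of variables that are active (equal to 1) in it.
    """
    # Root: the maximal interaction, containing every variable.
    level = [frozenset(range(n_variables))]
    for _level_index in range(depth):
        next_level = []
        for candidate in level:
            for _branch_index in range(branching):
                # Draw a random observation and drop every variable
                # that is not active in it.
                observation = rng.choice(observations)
                child = candidate & observation
                if child:  # discard interactions that became empty
                    next_level.append(child)
        level = next_level
    return set(level)


def candidate_interactions(
    observations: Sequence[FrozenSet[int]],
    n_variables: int,
    n_trees: int = 20,
    depth: int = 3,
    branching: int = 2,
    seed: int = 0,
) -> Set[FrozenSet[int]]:
    """Pool the leaf sets of several independently grown trees."""
    rng = random.Random(seed)
    found: Set[FrozenSet[int]] = set()
    for _tree_index in range(n_trees):
        found |= random_intersection_tree(
            observations, n_variables, depth, branching, rng
        )
    return found


# Toy data (hypothetical): 50 observations over 30 binary variables.
# Variables 0, 1, 2 are active in every observation; each other
# variable is active independently with probability 0.2.
data_rng = random.Random(1)
class_of_interest = [
    frozenset({0, 1, 2} | {v for v in range(3, 30) if data_rng.random() < 0.2})
    for _ in range(50)
]

found = candidate_interactions(class_of_interest, n_variables=30)
smallest = min(found, key=len)
print(sorted(smallest))
```

In this toy data the interaction {0, 1, 2} is present in every observation, so it survives every intersection and each surviving leaf set contains it, while the sparse noise variables tend to be removed after a few random draws.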

Code & Models

Repositories

sumbose/iRF

Taxonomy

Topics: Data Mining Algorithms and Applications · Machine Learning and Data Classification · Data Management and Algorithms