TL;DR
The paper introduces the xyz algorithm, a randomized method that efficiently discovers variable interactions in high-dimensional data with subquadratic runtime, enabling rapid genome-wide association analysis.
Contribution
A novel randomized algorithm transforming interaction search into a closest pair problem, achieving subquadratic runtime for high-dimensional interaction discovery.
Findings
Can screen over 10^11 interactions in under 280 seconds
Achieves almost linear time for strong interactions
Runtime depends on interaction strength, from nearly linear to subquadratic
Abstract
When performing regression on a dataset with p variables, it is often of interest to go beyond using main linear effects and include interactions as products between individual variables. For small-scale problems, these interactions can be computed explicitly but this leads to a computational complexity of at least O(p^2) if done naively. This cost can be prohibitive if p is very large. We introduce a new randomised algorithm that is able to discover interactions with high probability and under mild conditions has a runtime that is subquadratic in p. We show that strong interactions can be discovered in almost linear time, whilst finding weaker interactions requires O(p^α) operations for 1 < α < 2 depending on their strength. The underlying idea is to transform interaction search into a closest-pair problem which can be solved efficiently in…
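The closest-pair reduction mentioned in the abstract can be illustrated with a small sketch. This is a simplified illustration, not the authors' implementation: it assumes every variable and the response are coded as -1/+1, and the names xyz_sketch, interaction_strength, subsample_size and repeats are hypothetical. For each variable j, form Z_j = y * X_j elementwise. The interaction strength of a pair (j, k), the mean of y * X_j * X_k, then equals the mean of Z_j * X_k, so a strong interaction means Z_j and X_k agree on most rows. Restricting to a small random subset of rows and hashing the resulting sign patterns finds exact matches without comparing all p^2 pairs.

```python
import random
from collections import defaultdict


def interaction_strength(X, y, j, k):
    """Mean of y * X_j * X_k over all rows (1.0 is the strongest possible)."""
    n = len(X)
    return sum(y[i] * X[i][j] * X[i][k] for i in range(n)) / n


def xyz_sketch(X, y, subsample_size, repeats, rng):
    """Return candidate interaction pairs (j, k) with j < k.

    X is a list of n rows, each a list of p values in {-1, +1};
    y is a list of n values in {-1, +1}. Assumption for this sketch:
    binary coding, so "close" can be tested by exact equality on a
    random subset of rows.
    """
    n, p = len(X), len(X[0])
    candidates = set()
    for _ in range(repeats):
        # Random projection: keep only a few randomly chosen rows.
        rows = [rng.randrange(n) for _ in range(subsample_size)]

        # Hash every column X_k by its sign pattern on the sampled rows.
        buckets = defaultdict(list)
        for k in range(p):
            buckets[tuple(X[i][k] for i in rows)].append(k)

        # Look up Z_j = y * X_j in the same table; a collision means
        # Z_j and X_k agree on every sampled row.
        for j in range(p):
            key = tuple(y[i] * X[i][j] for i in rows)
            for k in buckets.get(key, ()):
                if j != k:
                    candidates.add((min(j, k), max(j, k)))
    return candidates


# Toy data with one planted interaction: y = X_0 * X_1 exactly.
rng = random.Random(0)
n, p = 200, 20
X = [[rng.choice((-1, 1)) for _ in range(p)] for _ in range(n)]
y = [row[0] * row[1] for row in X]

found = xyz_sketch(X, y, subsample_size=10, repeats=5, rng=rng)
ranked = sorted(found, key=lambda jk: -abs(interaction_strength(X, y, *jk)))
print(ranked[:3])
```

In this toy example Z_0 equals X_1 on every row, so the planted pair (0, 1) collides in every repeat. Weaker interactions agree on fewer rows and collide less often, so they need more repeats to be found, which matches the abstract's statement that runtime depends on interaction strength. Any candidate returned should still be checked with interaction_strength, since unrelated columns can collide by chance.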
