Intrinsic dimension and its application to association rules

Tom Hanika; Friedrich Martin Schneider; Gerd Stumme

arXiv:1805.05714·cs.AI·May 16, 2018

TL;DR

This paper introduces a method to measure the intrinsic dimension of data sets, addressing the curse of dimensionality in association rule mining and enabling geometric analysis in high-dimensional machine learning.

Contribution

It presents the first attempt at a computationally feasible method for quantifying the curse of dimensionality present in a data set, facilitating the application of geometric methods in high-dimensional machine learning.

Findings

01. Provides a computational method to measure data dimension

02. Enables geometric analysis in high-dimensional data

03. Addresses computational challenges in association rule mining

Abstract

The curse of dimensionality in the realm of association rules is twofold. Firstly, there is the well-known exponential increase in computational complexity with increasing item set size. Secondly, there is a related curse concerned with the distribution of (sparse) data itself in high dimension. The former problem is often coped with by projection, i.e., feature selection, whereas the best known strategy for the latter is avoidance. This work summarizes the first attempt to provide a computationally feasible method for measuring the extent of the dimension curse present in a data set with respect to a particular class of machine learning procedures. This recent development enables various other methods from geometric analysis to be investigated and applied in machine learning procedures in the presence of high dimension.
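The first half of the curse can be illustrated with a small sketch. It only counts candidate item sets and is not the paper's intrinsic dimension method; the function names `count_candidate_itemsets` and `enumerate_itemsets` are hypothetical and chosen for illustration. Over n distinct items there are 2^n - 1 non-empty item sets, so exhaustive enumeration quickly becomes infeasible.

```python
from itertools import combinations


def count_candidate_itemsets(num_items: int) -> int:
    """Number of non-empty item sets over num_items distinct items (2^n - 1)."""
    return 2 ** num_items - 1


def enumerate_itemsets(items):
    """Yield every non-empty item set; only practical for very small inputs."""
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


if __name__ == "__main__":
    # The count doubles (plus one) with every additional item.
    for n in (5, 10, 20, 40):
        print(n, count_candidate_itemsets(n))
```

Brute-force enumeration is used here only to make the growth visible; practical miners prune the search space rather than visiting every candidate.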

Taxonomy

Topics: Rough Sets and Fuzzy Logic · Data Mining Algorithms and Applications · Advanced Algebra and Logic