Selective Inference Approach for Statistically Sound Predictive Pattern Mining

Shinya Suzumura; Kazuya Nakagawa; Mahito Sugiyama; Koji Tsuda; Ichiro Takeuchi

arXiv:1602.04601·stat.ML·March 10, 2016

Open Access

TL;DR

This paper introduces a new approach to predictive pattern mining, built on the selective inference framework, that accounts for selection bias so that statistically significant patterns can be identified from large databases.

Contribution

The paper proposes a new algorithmic approach within the selective inference framework to handle the complex selection process in pattern mining problems.

Findings

01. Successfully identifies statistically significant patterns in large databases.

02. Addresses the challenge of characterizing the selection process in pattern mining.

03. Provides a practical method for bias-aware pattern discovery.

Abstract

Discovering statistically significant patterns from databases is an important and challenging problem. The main obstacle is the difficulty of accounting for selection bias, i.e., the bias arising from the fact that patterns are selected from an extremely large number of candidates in databases. In this paper, we introduce a new approach for predictive pattern mining problems that can address the selection bias issue. Our approach is built on a recently popularized statistical inference framework called selective inference. In selective inference, statistical inferences (such as statistical hypothesis testing) are conducted based on sampling distributions conditional on a selection event. If the selection event can be characterized in a tractable way, statistical inferences can be made without concern for the selection bias issue. However, in pattern mining problems, it is…
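
The conditioning idea described in the abstract can be illustrated with a toy sketch. This is not the paper's algorithm and involves no pattern mining: it assumes independent standard normal test statistics (all null), selects the largest one, and compares a naive one-sided p-value with a p-value computed from the distribution conditional on the selection event "this statistic beat the runner-up". The function names `normal_sf`, `naive_and_selective_p`, and `rejection_rates`, and all parameter values, are hypothetical choices made for illustration.

```python
import math
import random


def normal_sf(x):
    """Upper tail probability P(Z > x) for a standard normal Z."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def naive_and_selective_p(z):
    """Select the largest statistic and return (naive, selective) p-values.

    Assumption: the statistics are independent N(0, 1) under the null.
    Conditional on being the winner and on the runner-up value m, the
    selected statistic follows a standard normal truncated to [m, inf).
    """
    j = max(range(len(z)), key=lambda i: z[i])
    runner_up = max(v for i, v in enumerate(z) if i != j)
    naive = normal_sf(z[j])                             # ignores selection
    selective = normal_sf(z[j]) / normal_sf(runner_up)  # truncated-normal tail
    return naive, selective


def rejection_rates(num_candidates=50, num_trials=2000, alpha=0.05, seed=0):
    """Estimate false-positive rates of both tests when every null is true."""
    rng = random.Random(seed)
    naive_hits = 0
    selective_hits = 0
    for _ in range(num_trials):
        z = [rng.gauss(0.0, 1.0) for _ in range(num_candidates)]
        naive, selective = naive_and_selective_p(z)
        naive_hits += naive < alpha
        selective_hits += selective < alpha
    return naive_hits / num_trials, selective_hits / num_trials


if __name__ == "__main__":
    naive_rate, selective_rate = rejection_rates()
    print(f"naive false-positive rate:     {naive_rate:.3f}")
    print(f"selective false-positive rate: {selective_rate:.3f}")
```

Under these assumptions the naive test rejects far more often than the nominal 5% level (in theory with probability 1 - 0.95^50, roughly 0.92, for 50 candidates), while the conditional p-value is uniformly distributed under the null and so rejects at about the nominal rate. The selection event here is trivially easy to characterize; the abstract notes that this is the hard part in pattern mining.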

Taxonomy

Topics: Data Mining Algorithms and Applications · Imbalanced Data Classification Techniques · Rough Sets and Fuzzy Logic