A Bayesian Network Model for Interesting Itemsets

Jaroslav Fowkes; Charles Sutton

arXiv:1510.04130·stat.ML·November 14, 2016

TL;DR

This paper introduces a Bayesian network generative model over itemsets. The model infers interesting itemsets efficiently and retrieves itemsets of quality comparable to, or better than, existing methods for exploratory data analysis.

Contribution

The paper presents what the authors describe as the first generative model over itemsets, in the form of a Bayesian network, together with an associated new measure of interestingness. Inference uses structural EM and is straightforward to implement and parallelize.

Findings

01. Model efficiently infers interesting itemsets from data

02. Achieves comparable or better quality than state-of-the-art algorithms

03. Easily parallelizable and simple to implement

Abstract

Mining itemsets that are the most interesting under a statistical model of the underlying data is a commonly used and well-studied technique for exploratory data analysis, with the most recent interestingness models exhibiting state-of-the-art performance. Continuing this highly promising line of work, we propose the first, to the best of our knowledge, generative model over itemsets, in the form of a Bayesian network, and an associated novel measure of interestingness. Our model is able to efficiently infer interesting itemsets directly from the transaction database using structural EM, in which the E-step employs the greedy approximation to weighted set cover. Our approach is theoretically simple, straightforward to implement, trivially parallelizable and retrieves itemsets whose quality is comparable to, if not better than, existing state-of-the-art algorithms as we demonstrate on…
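
The abstract states that the E-step employs the greedy approximation to weighted set cover. As a rough illustration of that generic technique, and not the authors' implementation, the Python sketch below covers one transaction with candidate itemsets by repeatedly picking the itemset with the lowest cost per newly covered item. The function name `greedy_cover`, the example items, and the cost values are all hypothetical; this summary does not say how the paper derives costs from its model, so they are simply assumed to be positive numbers here.

```python
from typing import Dict, FrozenSet, List, Set


def greedy_cover(
    transaction: Set[str],
    itemset_costs: Dict[FrozenSet[str], float],
) -> List[FrozenSet[str]]:
    """Greedy approximation to weighted set cover for one transaction.

    Assumption: `itemset_costs` maps candidate itemsets to positive costs.
    Only itemsets fully contained in the transaction may be used.
    """
    # Keep only candidates that are subsets of the transaction.
    usable = {s: c for s, c in itemset_costs.items() if s <= transaction}

    uncovered = set(transaction)
    chosen: List[FrozenSet[str]] = []
    while uncovered:
        best, best_ratio = None, float("inf")
        for itemset, cost in usable.items():
            newly_covered = len(itemset & uncovered)
            if newly_covered == 0:
                continue  # this itemset adds nothing new
            ratio = cost / newly_covered  # cost per newly covered item
            if ratio < best_ratio:
                best, best_ratio = itemset, ratio
        if best is None:
            raise ValueError("candidate itemsets cannot cover the transaction")
        chosen.append(best)
        uncovered -= best
    return chosen


# Hypothetical example data, for illustration only.
transaction = {"bread", "butter", "milk"}
itemset_costs = {
    frozenset({"bread", "butter"}): 1.0,
    frozenset({"bread"}): 0.8,
    frozenset({"butter"}): 0.9,
    frozenset({"milk"}): 0.6,
    frozenset({"milk", "eggs"}): 0.1,  # not a subset, so never used
}
cover = greedy_cover(transaction, itemset_costs)
```

In this example the pair {bread, butter} is chosen first because its cost per covered item (0.5) is lowest, and {milk} then covers the remaining item. The greedy rule is a standard approximation for weighted set cover; it does not guarantee the minimum-cost cover.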

Code & Models

Repositories

mast-group/itemset-mining (Official)
