# Learning what matters - Sampling interesting patterns

**Authors:** Vladimir Dzyuba, Matthijs van Leeuwen

arXiv: 1702.01975 · 2017-04-28

## TL;DR

This paper introduces LetSIP, a novel interactive pattern sampling algorithm that learns user interests through feedback, enabling efficient, personalized data exploration with improved quality and diversity of discovered patterns.

## Contribution

It presents a new sampling approach that combines weighted sampling in SAT with learning to rank, so that the sampling distribution adapts directly to the user's feedback during pattern mining.
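The interplay between weighted sampling and learning from feedback can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a small, explicitly enumerated candidate pool instead of SAT-based weighted sampling, a perceptron-style pairwise update instead of the paper's learning-to-rank method, and an emulated user who simply prefers frequent patterns. All function names, features, and parameters (`features`, `sample_patterns`, `update_weights`, `run_demo`, `lr`) are hypothetical.

```python
import math
import random


def features(pattern, transactions):
    """Hypothetical features: relative frequency and relative length."""
    n_items = len({item for t in transactions for item in t})
    freq = sum(1 for t in transactions if pattern <= t) / len(transactions)
    return [freq, len(pattern) / n_items]


def sample_patterns(feats, w, k, rng):
    """Draw k distinct pattern indices, each with probability
    proportional to exp(w . x) among the remaining candidates."""
    scores = [math.exp(sum(wi * xi for wi, xi in zip(w, x))) for x in feats]
    remaining = list(range(len(feats)))
    chosen = []
    for _ in range(min(k, len(remaining))):
        pick = rng.choices(remaining, weights=[scores[i] for i in remaining], k=1)[0]
        chosen.append(pick)
        remaining.remove(pick)
    return chosen


def update_weights(w, feats, ranked, lr=0.5):
    """Perceptron-style pairwise update (an assumption, not the paper's learner).
    `ranked` lists pattern indices from most to least preferred."""
    w = list(w)
    for a in range(len(ranked)):
        for b in range(a + 1, len(ranked)):
            better, worse = ranked[a], ranked[b]
            diff = [xb - xw for xb, xw in zip(feats[better], feats[worse])]
            # Update only when the current model does not rank `better` above `worse`.
            if sum(wi * di for wi, di in zip(w, diff)) <= 0:
                w = [wi + lr * di for wi, di in zip(w, diff)]
    return w


def run_demo(seed=0, rounds=20, k=3):
    """Interleave sampling and learning against an emulated user."""
    rng = random.Random(seed)
    transactions = [
        {"a", "b", "c"}, {"a", "b"}, {"a", "c"},
        {"a"}, {"b", "c"}, {"a", "b", "d"},
    ]
    pool = [frozenset(p) for p in (
        "a", "b", "c", "d", "ab", "ac", "bc", "abd",
    )]
    feats = [features(p, transactions) for p in pool]
    w = [0.0, 0.0]  # start from a uniform sampling distribution
    for _ in range(rounds):
        shown = sample_patterns(feats, w, k, rng)
        # Emulated user: ranks the shown patterns by frequency, highest first.
        ranked = sorted(shown, key=lambda i: -feats[i][0])
        w = update_weights(w, feats, ranked)
    return w
```

Calling `run_demo()` returns the learned weight vector. Because the emulated user ranks by frequency, the frequency weight can only grow, so later rounds sample frequent patterns more often while still leaving some probability for the rest of the pool.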

## Key findings

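- LetSIP shows favourable quality-diversity and exploitation-exploration trade-offs compared with existing interactive pattern mining methods, evaluated by emulating a user's interests.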
- The system enables efficient, user-specific, anytime data exploration.
- It effectively learns user preferences through feedback during pattern sampling.

## Abstract

In the field of exploratory data mining, local structure in data can be described by patterns and discovered by mining algorithms. Although many solutions have been proposed to address the redundancy problems in pattern mining, most of them either provide succinct pattern sets or take the interests of the user into account, but not both. Consequently, the analyst has to invest substantial effort in identifying those patterns that are relevant to her specific interests and goals. To address this problem, we propose a novel approach that combines pattern sampling with interactive data mining. In particular, we introduce the LetSIP algorithm, which builds upon recent advances in 1) weighted sampling in SAT and 2) learning to rank in interactive pattern mining. Specifically, it exploits user feedback to directly learn the parameters of the sampling distribution that represents the user's interests. We compare the performance of the proposed algorithm to the state-of-the-art in interactive pattern mining by emulating the interests of a user. The resulting system allows efficient and interleaved learning and sampling, thus user-specific anytime data exploration. Finally, LetSIP demonstrates favourable trade-offs concerning both quality-diversity and exploitation-exploration when compared to existing methods.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1702.01975/full.md

## Figures

24 figures with captions in the complete paper: https://tomesphere.com/paper/1702.01975/full.md

## References

25 references — full list in the complete paper: https://tomesphere.com/paper/1702.01975/full.md

---
Source: https://tomesphere.com/paper/1702.01975