Maximum entropy models and subjective interestingness: an application to tiles in binary databases
Tijl De Bie

TL;DR
This paper proposes a method to quantify the interestingness of patterns in binary databases by modeling prior information as maximum entropy distributions, enabling meaningful subjective interestingness measures.
Contribution
It introduces a general strategy for formalizing prior information using MaxEnt models and applies it to measure subjective interestingness of database tiles.
Findings
MaxEnt models effectively represent prior knowledge.
Subjective interestingness can be quantified using MaxEnt contrast measures.
The approach provides a practical framework for pattern interestingness assessment.
Abstract
Recent research has highlighted the practical benefits of subjective interestingness measures, which quantify the novelty or unexpectedness of a pattern when contrasted with any prior information of the data miner (Silberschatz and Tuzhilin, 1995; Geng and Hamilton, 2006). A key challenge here is the formalization of this prior information in a way that lends itself to the definition of a subjective interestingness measure that is both meaningful and practical. In this paper, we outline a general strategy for achieving this, before working out the details for a use case that is important in its own right. Our general strategy is based on considering prior information as constraints on a probabilistic model representing the uncertainty about the data. More specifically, we represent the prior information by the maximum entropy (MaxEnt) distribution subject to these…
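The strategy in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that the prior information consists of the expected row and column sums of a binary database; the truncated abstract does not specify the constraints here, so this choice is an assumption. Under such constraints the MaxEnt distribution factorizes into independent Bernoulli cells with p_ij = sigmoid(lam_i + mu_j), and the parameters can be fitted by gradient ascent on the concave dual. The function names (fit_maxent_bernoulli, tile_information_content), the step size, and the use of self-information as a score are hypothetical choices for this sketch, not necessarily the paper's exact method or measure.

```python
import numpy as np

def fit_maxent_bernoulli(row_sums, col_sums, n_iter=2000, lr=1.0):
    """Fit cell probabilities of the MaxEnt model over binary matrices
    whose expected row and column sums equal the given values.

    Assumption for this sketch: the prior information is exactly these
    expected margins. The MaxEnt solution is then a product of independent
    Bernoullis with p_ij = sigmoid(lam_i + mu_j).
    """
    row_sums = np.asarray(row_sums, dtype=float)
    col_sums = np.asarray(col_sums, dtype=float)
    n, m = len(row_sums), len(col_sums)
    lam = np.zeros(n)  # one Lagrange multiplier per row constraint
    mu = np.zeros(m)   # one Lagrange multiplier per column constraint
    for _ in range(n_iter):
        P = 1.0 / (1.0 + np.exp(-(lam[:, None] + mu[None, :])))
        # Dual gradient: target margin minus the model's expected margin.
        lam += lr * (row_sums - P.sum(axis=1)) / m
        mu += lr * (col_sums - P.sum(axis=0)) / n
    return 1.0 / (1.0 + np.exp(-(lam[:, None] + mu[None, :])))

def tile_information_content(P, rows, cols):
    """Self-information (in bits) of observing all ones in the tile
    rows x cols, under the independent-cell model P."""
    sub = P[np.ix_(rows, cols)]
    return float(-np.log2(sub).sum())

# Hypothetical 3x3 example: the first row and column are denser a priori.
row_sums = [2.0, 1.0, 1.0]
col_sums = [2.0, 1.0, 1.0]
P = fit_maxent_bernoulli(row_sums, col_sums)
ic_dense = tile_information_content(P, [0], [0])   # cell expected to be 1
ic_sparse = tile_information_content(P, [1], [1])  # cell expected to be 0
```

In this sketch a tile that covers cells the prior already expects to be ones carries little information, while a tile in a region the prior expects to be sparse carries more, which is the contrast a subjective measure is meant to capture.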
Taxonomy
Topics: Data Visualization and Analytics · Data Management and Algorithms · Data Mining Algorithms and Applications
