Redundancy, Deduction Schemes, and Minimum-Size Bases for Association Rules
Jose L. Balcazar

TL;DR
This paper investigates the fundamental notions of redundancy among association rules in data mining, providing logical characterizations, deduction systems, and methods to construct minimal rule bases.
Contribution
It gives a logical framework for redundancy among association rules, a sound and complete deduction calculus for each redundancy notion, and a construction of minimum-size rule bases.
Findings
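The existing definitions of redundancy reduce to two main variants, which differ in how they treat full-confidence implications.
A sound and complete deduction calculus is given for each variant.
For each variant, a complete basis with the minimum possible number of rules is constructed.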
Abstract
Association rules are among the most widely employed data analysis methods in the field of Data Mining. An association rule is a form of partial implication between two sets of binary variables. In the most common approach, association rules are parameterized by a lower bound on their confidence, which is the empirical conditional probability of their consequent given the antecedent, and/or by some other parameter bounds such as "support" or deviation from independence. We study here notions of redundancy among association rules from a fundamental perspective. We see each transaction in a dataset as an interpretation (or model) in the propositional logic sense, and consider existing notions of redundancy, that is, of logical entailment, among association rules, of the form "any dataset in which this first rule holds must obey also that second rule, therefore the second is redundant". We…
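The definitions of support and confidence in the abstract can be made concrete with a minimal sketch. The function names and the toy dataset below are hypothetical, chosen only for illustration, and are not taken from the paper. The last lines show one simple instance of redundancy: shrinking the consequent of a rule can never lower its confidence, so in any dataset where "a implies b and c" reaches a confidence threshold, "a implies b" reaches it too. This is only an example of the kind of entailment studied, not the paper's deduction calculus.

```python
from typing import FrozenSet, Iterable, List

Transaction = FrozenSet[str]


def support(dataset: List[Transaction], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item of `itemset`."""
    items = frozenset(itemset)
    return sum(1 for t in dataset if items <= t) / len(dataset)


def confidence(
    dataset: List[Transaction],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> float:
    """Empirical conditional probability of the consequent given the antecedent."""
    ante = frozenset(antecedent)
    both = ante | frozenset(consequent)
    covered = [t for t in dataset if ante <= t]
    if not covered:
        raise ValueError("antecedent does not occur in the dataset")
    return sum(1 for t in covered if both <= t) / len(covered)


# Toy dataset (an assumption for illustration): each transaction is a set of items.
DATASET: List[Transaction] = [
    frozenset(t)
    for t in (
        {"a", "b", "c"},
        {"a", "b"},
        {"a", "c"},
        {"b", "c"},
        {"a", "b", "c"},
    )
]

conf_a_bc = confidence(DATASET, {"a"}, {"b", "c"})  # rule a -> bc
conf_a_b = confidence(DATASET, {"a"}, {"b"})        # rule a -> b

# A smaller consequent never has lower confidence, in this or any dataset.
assert conf_a_b >= conf_a_bc
```

Here four of the five transactions contain `a`, three of those also contain `b`, and two contain both `b` and `c`, so the rule with the smaller consequent has the higher confidence.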
