An Efficient Algorithm for Mining Frequent Sequence with Constraint Programming
John O.R. Aoga, Tias Guns, Pierre Schaus

TL;DR
This paper introduces an optimized constraint programming algorithm for sequential pattern mining that outperforms existing specialized systems by leveraging data mining techniques and efficient data structures.
Contribution
The paper presents a novel, scalable CP-based approach for SPM that improves projected database computation using pre-computation and backtracking-aware data structures.
Findings
Outperforms existing CP and specialized SPM systems in efficiency.
Speed-ups achieved through pre-computation of symbol support positions.
Enhanced performance in mining with regular expressions.
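The pre-computation finding can be illustrated with a small Python sketch. It is an assumption about how such a table might look, not the paper's actual data structure: the names `last_positions` and `symbols_in_suffix` are hypothetical. The idea shown is that if the last index of every symbol in every sequence is stored once, then the symbols still present after a given position can be listed without rescanning the sequence.

```python
def last_positions(db):
    """For each sequence, build (last_index, symbol) pairs sorted by decreasing index.

    Hypothetical helper: one possible pre-computed table, built once before search.
    """
    table = []
    for seq in db:
        last = {}
        for i, symbol in enumerate(seq):
            last[symbol] = i  # later occurrences overwrite earlier ones
        table.append(sorted(((i, s) for s, i in last.items()), reverse=True))
    return table


def symbols_in_suffix(table, sid, start):
    """Return the symbols of sequence `sid` that occur at index >= start.

    Walks the pre-sorted pairs and stops at the first one whose last index
    falls before `start`, so the suffix itself is never scanned.
    """
    found = []
    for idx, symbol in table[sid]:
        if idx < start:
            break
        found.append(symbol)
    return found


db = [["A", "B", "A", "C"], ["B", "B"]]
table = last_positions(db)
suffix_symbols = symbols_in_suffix(table, 0, 2)
```

In this sketch the cost of a lookup depends on the number of distinct symbols still present after `start`, not on the length of the remaining sequence.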
Abstract
The main advantage of Constraint Programming (CP) approaches for sequential pattern mining (SPM) is their modularity, which includes the ability to add new constraints (regular expressions, length restrictions, etc.). The current best CP approach for SPM uses a global constraint (module) that computes the projected database and enforces the minimum frequency; it does this with a filtering algorithm similar to the PrefixSpan method. However, the resulting system is not as scalable as some of the most advanced mining systems like Zaki's cSPADE. We show how, using techniques from both data mining and CP, one can use a generic constraint solver and yet outperform existing specialized systems. This is mainly due to two improvements in the module that computes the projected frequencies: first, computing the projected database can be sped up by pre-computing the positions at which a symbol can…
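To make the terms "projected database" and "minimum frequency" concrete, here is a minimal Python sketch of prefix projection in the style of PrefixSpan. It is a plain recursive miner written for illustration, not the CP global constraint described in the abstract; the function name `mine` and the representation of a projected database as `(sequence id, start index)` pairs are assumptions made for this example.

```python
from collections import defaultdict


def mine(db, min_support):
    """Return {pattern: support} for every frequent sequential pattern in db.

    db is a list of sequences (lists of symbols); a pattern is a tuple of
    symbols that appears as a subsequence in at least min_support sequences.
    """
    results = {}

    def extend(prefix, projected):
        # projected: (sequence id, index just after the prefix's first match)
        support = defaultdict(set)
        for sid, start in projected:
            for symbol in db[sid][start:]:
                support[symbol].add(sid)
        for symbol, sids in sorted(support.items()):
            if len(sids) < min_support:
                continue  # minimum-frequency check prunes this extension
            pattern = prefix + (symbol,)
            results[pattern] = len(sids)
            # Project: keep only supporting sequences, advanced past `symbol`.
            new_projected = [
                (sid, db[sid].index(symbol, start) + 1)
                for sid, start in projected
                if sid in sids
            ]
            extend(pattern, new_projected)

    extend((), [(sid, 0) for sid in range(len(db))])
    return results


db = [["A", "B", "C"], ["A", "C"], ["B", "C"]]
patterns = mine(db, min_support=2)
```

Each recursive call rescans the remaining suffix of every sequence to count supports, which is the kind of repeated work the abstract says pre-computed positions are meant to reduce.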
Taxonomy
Topics: Data Mining Algorithms and Applications · Rough Sets and Fuzzy Logic · Natural Language Processing Techniques
