Mining Compressed Repetitive Gapped Sequential Patterns Efficiently
Yongxin Tong, Li Zhao, Dan Yu, Shilong Ma, Ke Xu

TL;DR
This paper introduces CRGSgrow, an efficient algorithm for compressing and summarizing repetitive gapped sequential patterns in sequence databases, addressing the challenge of large pattern sets at low support thresholds.
Contribution
It proposes a novel two-step approach with pruning and pattern checking strategies to efficiently identify representative repetitive gapped sequential patterns.
Findings
CRGSgrow outperforms existing methods in efficiency.
The algorithm effectively reduces pattern set size.
Empirical results confirm its scalability and effectiveness.
Abstract
Mining frequent sequential patterns from sequence databases has been a central research topic in data mining, and various efficient sequential pattern mining algorithms have been proposed and studied. Recently, in many problem domains (e.g., program execution traces), a new line of sequential pattern mining research, called mining repetitive gapped sequential patterns, has attracted the attention of many researchers. It considers not only the repetition of a sequential pattern across different sequences but also its repetition within a sequence, which is more meaningful than general sequential pattern mining, where only occurrences in different sequences are captured. However, the number of repetitive gapped sequential patterns generated even by closed mining algorithms may be too large for users to understand, especially when the support threshold is low. In this paper, we propose and study the problem of…
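The distinction the abstract draws between counting a pattern once per sequence and also counting its repetitions within a sequence can be illustrated with a small sketch. This is not CRGSgrow and not the paper's support definition: the function names, the toy database, and the greedy back-to-back counting rule are all assumptions chosen to keep the example short.

```python
from typing import List


def is_gapped_subsequence(pattern: str, seq: str) -> bool:
    """True if pattern's events appear in seq in order, with arbitrary gaps."""
    it = iter(seq)
    # `event in it` advances the iterator, so order is enforced.
    return all(event in it for event in pattern)


def sequence_support(pattern: str, db: List[str]) -> int:
    """Classic support: number of sequences containing the pattern at least once."""
    return sum(1 for seq in db if is_gapped_subsequence(pattern, seq))


def repetitions_in_sequence(pattern: str, seq: str) -> int:
    """Simplified repetition count (an assumption, not the paper's definition):
    scan left to right, complete one gapped occurrence, then start the next."""
    if not pattern:
        return 0
    count = 0
    j = 0  # index of the next pattern event we are waiting for
    for event in seq:
        if event == pattern[j]:
            j += 1
            if j == len(pattern):
                count += 1
                j = 0
    return count


def repetitive_support(pattern: str, db: List[str]) -> int:
    """Sum of the simplified within-sequence repetition counts over the database."""
    return sum(repetitions_in_sequence(pattern, seq) for seq in db)


# Hypothetical toy database: each string is one sequence of single-letter events.
toy_db = ["ABCABC", "AXBXC", "BCA"]
print(sequence_support("ABC", toy_db))    # counts sequences containing the pattern
print(repetitive_support("ABC", toy_db))  # also counts repeats inside a sequence
```

On this toy input the first sequence contributes one occurrence to the classic count but two to the repetition-aware count, which is the kind of within-sequence information that motivates repetitive gapped pattern mining.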
Taxonomy
Topics: Data Mining Algorithms and Applications · Rough Sets and Fuzzy Logic · Algorithms and Data Compression
