Interpretable Sequence Clustering
Junjie Dong, Xinyi Yang, Mudi Jiang, Lianyu Hu, Zengyou He

TL;DR
This paper introduces the Interpretable Sequence Clustering Tree (ISCT), a method that combines sequential pattern mining with a decision tree to cluster categorical sequences quickly and accurately while keeping each cluster assignment explainable.
Contribution
The paper presents ISCT, a novel approach that integrates sequential pattern mining with a decision tree framework for interpretable sequence clustering.
Findings
ISCT produces interpretable tree structures for sequence clusters.
The method achieves high accuracy and speed on real-world datasets.
Experimental results validate the interpretability and effectiveness of ISCT.
Abstract
Categorical sequence clustering plays a crucial role in various fields, but the lack of interpretability in cluster assignments poses significant challenges. Sequences inherently lack explicit features, and existing sequence clustering algorithms rely heavily on complex representations, making it difficult to explain their results. To address this issue, we propose a method called Interpretable Sequence Clustering Tree (ISCT), which combines sequential patterns with a concise and interpretable tree structure. ISCT leverages k-1 patterns to generate k leaf nodes, corresponding to k clusters, which provides an intuitive explanation of how each cluster is formed. More precisely, ISCT first projects sequences into random subspaces and then utilizes the k-means algorithm to obtain high-quality initial cluster assignments. Subsequently, it constructs a pattern-based decision tree using a…
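To make the "k-1 patterns, k leaf nodes" idea concrete, here is a minimal Python sketch of how a pattern-based tree could assign a sequence to a cluster and report the path that explains the assignment. It is an illustration only, not the authors' implementation: the names `contains_pattern`, `Leaf`, `Node`, `assign`, and `count_nodes` are hypothetical, the example tree is hand-built rather than learned, and the sketch assumes a pattern matches when it occurs as a (not necessarily contiguous) subsequence, which the abstract does not specify. The random-subspace projection and k-means initialization steps are not shown.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple, Union


def contains_pattern(sequence: Sequence[str], pattern: Sequence[str]) -> bool:
    """Return True if `pattern` occurs in `sequence` as a subsequence.

    Assumption: gaps between matched items are allowed.
    """
    it = iter(sequence)
    # `item in it` advances the iterator, so matches must appear in order.
    return all(item in it for item in pattern)


@dataclass
class Leaf:
    cluster: int  # cluster label stored at this leaf


@dataclass
class Node:
    pattern: Tuple[str, ...]   # sequential pattern tested at this node
    yes: Union["Node", Leaf]   # branch taken when the pattern is present
    no: Union["Node", Leaf]    # branch taken when the pattern is absent


def assign(tree: Union[Node, Leaf], sequence: Sequence[str]):
    """Walk the tree; return the cluster and the (pattern, matched) tests used."""
    path: List[Tuple[Tuple[str, ...], bool]] = []
    node = tree
    while isinstance(node, Node):
        hit = contains_pattern(sequence, node.pattern)
        path.append((node.pattern, hit))
        node = node.yes if hit else node.no
    return node.cluster, path


def count_nodes(tree: Union[Node, Leaf]) -> Tuple[int, int]:
    """Return (number of internal nodes, number of leaves)."""
    if isinstance(tree, Leaf):
        return 0, 1
    yes_internal, yes_leaves = count_nodes(tree.yes)
    no_internal, no_leaves = count_nodes(tree.no)
    return 1 + yes_internal + no_internal, yes_leaves + no_leaves


# Hand-built example with k = 3 clusters, so k - 1 = 2 patterns.
example_tree = Node(
    pattern=("a", "b"),
    yes=Leaf(0),
    no=Node(pattern=("c",), yes=Leaf(1), no=Leaf(2)),
)

cluster, path = assign(example_tree, "bca")
print(cluster, path)
```

In this toy tree, the sequence "bca" fails the first test (no "a" followed by "b") and passes the second (contains "c"), so it lands in cluster 1. The recorded path is the explanation of the assignment: each step names a pattern and whether the sequence contained it.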
Taxonomy
Topics: Advanced Clustering Algorithms Research · Data Mining Algorithms and Applications · Data Stream Mining Techniques
