Scaling pattern mining through non-overlapping variable partitioning
Leonardo Alexandre, Rafael S. Costa, Rui Henriques

TL;DR
This paper introduces a scalable biclustering pipeline that uses variable partitioning based on pattern likelihood, significantly improving execution times in high-dimensional biological data analysis.
Contribution
It proposes a novel vertical partitioning approach considering pattern coherence, enhancing the efficiency of biclustering algorithms in high-dimensional datasets.
Findings
Significant reduction in execution times for some datasets.
Partitioning based on pattern likelihood outperforms dissimilarity-based methods.
Provides a scalable solution for high-dimensional biological data analysis.
Abstract
Biclustering algorithms play a central role in the biotechnological and biomedical domains. The knowledge they extract supports the identification of putative regulatory modules, which are essential to understanding diseases, aiding therapy research, and advancing biological knowledge. However, given the NP-hard nature of the biclustering task, algorithms with optimality guarantees tend to scale poorly on high-dimensional data. To address this, we propose a pipeline for clustering-based vertical partitioning that takes into consideration both parallelization and cross-partition pattern merging needs. Given a specific type of pattern coherence, these clusters are built based on the likelihood that variables form those patterns. The patterns extracted per cluster are then merged into a final set of closed patterns. This approach is evaluated using five published…
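The three stages named in the abstract (partition the variables, mine each partition, merge across partitions) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the likelihood measure, the coherence types, or the merge procedure, so everything below is an assumption made for illustration. It assumes binary data with constant (all-ones) patterns, uses pairwise co-occurrence rate as a stand-in for pattern likelihood, and mines by brute force. The function names `partition_variables`, `mine_closed`, and `merge_closed` are hypothetical.

```python
from itertools import combinations

def cooccurrence(data, a, b):
    # Assumed likelihood proxy: fraction of rows where columns a and b are both 1.
    return sum(1 for row in data if row[a] and row[b]) / len(data)

def partition_variables(data, n_parts):
    # Average-linkage agglomeration of columns into n_parts non-overlapping groups.
    clusters = [[c] for c in range(len(data[0]))]

    def link(x, y):
        return sum(cooccurrence(data, a, b) for a in x for b in y) / (len(x) * len(y))

    while len(clusters) > n_parts:
        i, j = max(combinations(range(len(clusters)), 2),
                   key=lambda ij: link(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]  # i < j, so deleting j is safe
        del clusters[j]
    return [sorted(c) for c in clusters]

def rows_supporting(data, cols):
    # Rows in which every column of the pattern is 1.
    return frozenset(r for r, row in enumerate(data) if all(row[c] for c in cols))

def closure(data, rows, universe):
    # All columns of the universe that are 1 on every supporting row.
    return frozenset(c for c in universe if all(data[r][c] for r in rows))

def mine_closed(data, part, min_rows):
    # Brute-force closed patterns restricted to one partition: {columns: rows}.
    found = {}
    for k in range(1, len(part) + 1):
        for cols in combinations(part, k):
            rows = rows_supporting(data, cols)
            if len(rows) >= min_rows:
                found[closure(data, rows, part)] = rows
    return found

def merge_closed(data, per_part, min_rows):
    # Cross-partition merge: intersect row sets, then close over all columns.
    row_sets = set()
    for patterns in per_part:
        new = set(patterns.values())
        crossed = {a & b for a in row_sets for b in new if len(a & b) >= min_rows}
        row_sets |= new | crossed
    all_cols = range(len(data[0]))
    return {closure(data, rows, all_cols): rows for rows in row_sets}

# Toy binary matrix (rows = observations, columns = variables), made up for illustration.
data = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
parts = partition_variables(data, n_parts=2)
per_part = [mine_closed(data, part, min_rows=1) for part in parts]  # parallelizable step
merged = merge_closed(data, per_part, min_rows=1)
for cols, rows in sorted(merged.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(cols), sorted(rows))
```

In this simplified binary setting, the merged result matches what brute-force mining over all columns at once would return, because the supporting rows of a pattern spanning several partitions are the intersection of the supporting rows of its per-partition pieces. The per-partition mining calls are independent of one another, which is where parallelization would apply.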
Taxonomy
Topics: Data Mining Algorithms and Applications · Gene expression and cancer classification · Neural Networks and Applications
