The Sparse Tsetlin Machine: Sparse Representation with Active Literals
Sebastian Østby, Tobias M. Brambo, Sondre Glimsdal

TL;DR
The Sparse Tsetlin Machine (STM) efficiently handles sparse data by focusing on active literals, reducing memory and computation while maintaining competitive accuracy in NLP and similar applications.
Contribution
This paper introduces Active Literals (AL), which let the Tsetlin Machine process sparse data effectively: memory use and computation time drop while classification performance stays competitive.
Findings
Reduces memory usage and computation time significantly.
Maintains competitive classification accuracy on sparse datasets.
Successfully identifies active literals for efficient training.
Abstract
This paper introduces the Sparse Tsetlin Machine (STM), a novel Tsetlin Machine (TM) that processes sparse data efficiently. Traditionally, the TM does not consider data characteristics such as sparsity, commonly seen in NLP applications and other bag-of-words-based representations. Consequently, a TM must initialize, store, and process a significant number of zero values, resulting in excessive memory usage and computation time. Previous attempts at creating a sparse TM have predominantly been unsuccessful, primarily due to their inability to identify which literals are sufficient for TM training. By introducing Active Literals (AL), the STM can focus exclusively on literals that actively contribute to the current data representation, significantly decreasing memory footprint and computation time while demonstrating competitive classification performance.
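To make the sparsity argument concrete, here is a minimal Python sketch of the general idea, not the authors' implementation. It stores a bag-of-words sample as the set of its nonzero feature indices and evaluates a TM-style clause (a conjunction of literals) directly on that set. The names `to_sparse`, `eval_clause`, and `ActiveLiteralSet` are hypothetical, and the tracker is only an assumed illustration of "literals that contribute to the current data representation"; the paper's actual AL mechanism may differ.

```python
from typing import FrozenSet, Iterable, Sequence, Set


def to_sparse(dense: Sequence[int]) -> FrozenSet[int]:
    """Keep only the indices of nonzero features (e.g. words present in a document)."""
    return frozenset(i for i, v in enumerate(dense) if v)


def eval_clause(sample: FrozenSet[int], pos: Set[int], neg: Set[int]) -> bool:
    """Evaluate a conjunction of literals on a sparse sample.

    Every feature in `pos` must be present in the sample, and no feature
    in `neg` (the negated literals) may be present.
    """
    return pos <= sample and sample.isdisjoint(neg)


class ActiveLiteralSet:
    """Hypothetical tracker (an assumption, not the paper's data structure):
    records which feature indices have appeared in any observed sample."""

    def __init__(self) -> None:
        self.indices: Set[int] = set()

    def observe(self, sample: Iterable[int]) -> None:
        self.indices.update(sample)


# A 10,000-word vocabulary in which one document uses only three words.
dense = [0] * 10_000
for word_id in (3, 42, 7000):
    dense[word_id] = 1

sample = to_sparse(dense)   # three stored indices instead of 10,000 values
tracker = ActiveLiteralSet()
tracker.observe(sample)     # only these three features are marked as seen

# Clause "x3 AND x42 AND NOT x99" evaluated without touching the zero entries.
clause_fires = eval_clause(sample, pos={3, 42}, neg={99})
```

In this sketch the cost of storing and evaluating a sample scales with the number of nonzero features rather than with the vocabulary size, which is the kind of saving the abstract describes.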
Taxonomy
Topics: Multimodal Machine Learning Applications · Natural Language Processing Techniques · Topic Modeling
Methods: Focus
