FastOmniTMAE: Parallel Clause Learning for Scalable and Hardware-Efficient Tsetlin Embeddings
Ahmed K. Kadhim, Lei Jiao, Rishad Shafik, Ole-Christoffer Granmo, Mayur Kishor Shende

TL;DR
FastOmniTMAE introduces a parallel training method for Tsetlin Machine-based embeddings, significantly speeding up training while maintaining quality, and is efficiently implementable on FPGA hardware.
Contribution
It reformulates Omni TM-AE as a two-stage parallel process (evaluation, then update) and demonstrates a hardware-efficient implementation on FPGA platforms.
Findings
Achieves up to 5× faster training in classification tasks.
Maintains comparable embedding quality under similarity measures.
Demonstrates efficient FPGA implementation with small hardware footprint.
Abstract
Embedding models in natural language processing (NLP) increasingly rely on deep architectures such as BERT, while simpler models such as Word2Vec provide efficient representations but limited interpretability. The Tsetlin Machine (TM) offers an alternative logic-based learning paradigm. Omni TM Autoencoder (Omni TM-AE) applies this paradigm to static embedding by exploiting automaton state distributions within a single clause layer, but its training process remains slow. In this work, we propose FastOmniTMAE, a reformulation of Omni TM-AE that replaces sequential training dependencies with a two-stage parallel process: evaluation and update. Using a Single-Run Multi-Environment Benchmark covering classification, similarity, and clustering, FastOmniTMAE achieves up to 5× faster training in classification while maintaining comparable embedding quality under both Spearman and…
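The evaluate-then-update split described in the abstract can be illustrated with a toy sketch. This is not the paper's algorithm: the state range `N_STATES`, the function names, and the deterministic reinforcement rule are all hypothetical, chosen only to show why separating the two stages removes dependencies between clauses. Real Tsetlin Machine feedback is stochastic and more involved.

```python
from typing import List, Tuple

N_STATES = 4  # hypothetical: automaton states run 1..2*N_STATES; state > N_STATES means "include"


def evaluate(states: List[List[int]], literals: List[int]) -> List[int]:
    """Stage 1: compute every clause output from a frozen snapshot of the states."""
    outputs = []
    for clause in states:
        # A clause is the AND of all literals whose automaton is in an include state.
        fired = all(lit == 1 for s, lit in zip(clause, literals) if s > N_STATES)
        outputs.append(int(fired))
    return outputs


def update(states: List[List[int]], literals: List[int], outputs: List[int]) -> List[List[int]]:
    """Stage 2: apply a toy, deterministic reinforcement to each clause independently."""
    new_states = []
    for clause, out in zip(states, outputs):
        if out == 1:
            # Toy rule (an assumption): move toward include for true literals,
            # toward exclude for false ones, clamped to the valid state range.
            clause = [
                min(s + 1, 2 * N_STATES) if lit == 1 else max(s - 1, 1)
                for s, lit in zip(clause, literals)
            ]
        new_states.append(list(clause))
    return new_states


def train_step(states: List[List[int]], x: List[int]) -> Tuple[List[List[int]], List[int]]:
    """Run one evaluate-then-update step on a single binary input vector."""
    literals = x + [1 - b for b in x]  # features followed by their negations
    outputs = evaluate(states, literals)
    return update(states, literals, outputs), outputs


new_states, outputs = train_step([[5, 1, 1, 4], [1, 5, 1, 1]], [1, 0])
print(outputs, new_states)
```

Because stage 2 reads only the outputs computed in stage 1, no clause observes another clause's update within a step, so both loops could in principle run across clauses in parallel.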
