An Efficient Classification Model for Cyber Text
Md Sakhawat Hossen, Md. Zashid Iqbal Borshon, A. S. M. Badrudduza

TL;DR
This paper proposes a modified TF-IDF algorithm called CTF-IDF and uses classical machine learning with dimensionality reduction to create a more efficient, less resource-intensive text classification model with comparable accuracy.
Contribution
It introduces CTF-IDF and combines it with IRLBA for dimensionality reduction, enhancing efficiency and reducing computational costs in text classification.
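For orientation, here is a minimal sketch of the standard TF-IDF weighting that CTF-IDF modifies. The modification itself is not described in this summary, so the code shows only the unmodified baseline. The function name `tfidf`, the whitespace tokenizer, and the sample documents are illustrative assumptions, not taken from the paper.

```python
import math
from collections import Counter


def tfidf(docs):
    """Standard TF-IDF baseline (not the paper's CTF-IDF).

    tf  = term count / document length
    idf = log(number of documents / documents containing the term)
    """
    # Assumption: simple lowercase whitespace tokenization.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical sample documents, for illustration only.
docs = [
    "free prize click now",
    "meeting agenda for monday",
    "click here for free prize",
]
weights = tfidf(docs)
```

Each entry of `weights` maps the terms of one document to their scores. Terms that occur in fewer documents receive a larger weight, which is the property a dimensionality-reduction step such as IRLBA would then compress.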
Findings
Significant reduction in training time
Improved model accuracy with classical methods
Lower carbon footprint compared to deep learning
Abstract
The rise of deep learning methodology and practice in recent years has brought a severe consequence: an increasing carbon footprint driven by an insatiable demand for computational resources and power. The field of text analytics has also been transformed by this trend toward a single dominant methodology. In this paper, the original TF-IDF algorithm is modified, and Clement Term Frequency-Inverse Document Frequency (CTF-IDF) is proposed for data preprocessing. The paper primarily discusses the effectiveness of classical machine learning techniques in text analytics when combined with CTF-IDF and a faster IRLBA algorithm for dimensionality reduction. Introducing both techniques into the conventional text analytics pipeline yields a more efficient, faster, and less computationally intensive application when compared with deep learning methodology regarding…
Taxonomy
Topics: Big Data and Digital Economy · Text and Document Classification Technologies · Advanced Graph Neural Networks
