Rational Kernels for Arabic Stemming and Text Classification
Attia Nehar, Djelloul Ziadi, Hadda Cherroun

TL;DR
This paper presents an Arabic stemming method that models Arabic patterns as transducers and requires no dictionary. Stemmed documents are represented as finite state transducers, which allows text classification with rational kernels and yields promising accuracy, recall, and F1 scores.
Contribution
Introduces a pattern-based stemming technique using transducers and applies rational kernels for Arabic text classification, avoiding dictionary dependence.
Findings
Effective stemming without dictionaries.
Stemming evaluated on three word collections; classification evaluated on the Saudi Press Agency dataset.
Promising accuracy, recall, and F1 scores compared with other approaches.
Abstract
In this paper, we address the problems of Arabic Text Classification and stemming using Transducers and Rational Kernels. We introduce a new stemming technique based on the use of Arabic patterns (Pattern Based Stemmer). Patterns are modelled using transducers and stemming is done without depending on any dictionary. Using transducers for stemming, documents are transformed into finite state transducers. This document representation allows us to use and explore rational kernels as a framework for Arabic Text Classification. Stemming experiments are conducted on three word collections and classification experiments are done on the Saudi Press Agency dataset. Results show that our approach, when compared with other approaches, is promising, especially in terms of Accuracy, Recall and F1.
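To make the idea in the abstract concrete, here is a minimal sketch, not the authors' implementation. It assumes that a pattern can be read as a template in which the letters ف, ع and ل mark the root slots and every other letter must match literally, which is roughly what a pattern transducer does when it copies root letters to its output. The three-pattern inventory and the names `match_pattern`, `stem`, and `ngram_kernel` are hypothetical and chosen for illustration. The kernel is a plain n-gram count kernel, a well-known member of the rational kernel family, computed directly on stem sequences rather than through transducer composition.

```python
from collections import Counter

# Letters that stand for the three root consonants in Arabic patterns.
ROOT_SLOTS = {"ف", "ع", "ل"}

# Hypothetical, tiny pattern inventory for illustration only.
PATTERNS = ["مفعول", "فاعل", "فعال"]


def match_pattern(word, pattern):
    """Return the root letters if `word` fits `pattern`, else None."""
    if len(word) != len(pattern):
        return None
    root = []
    for w, p in zip(word, pattern):
        if p in ROOT_SLOTS:
            root.append(w)   # slot: copy the word's letter to the output
        elif w != p:
            return None      # literal pattern letter must match exactly
    return "".join(root)


def stem(word):
    """Try each pattern in order; fall back to the unchanged word."""
    for pattern in PATTERNS:
        root = match_pattern(word, pattern)
        if root is not None:
            return root
    return word


def ngram_kernel(doc_a, doc_b, n=2):
    """Sum, over shared n-grams of stems, the product of their counts."""
    def ngrams(tokens):
        stems = [stem(t) for t in tokens]
        return Counter(tuple(stems[i:i + n]) for i in range(len(stems) - n + 1))

    a, b = ngrams(doc_a), ngrams(doc_b)
    return sum(count * b[gram] for gram, count in a.items())
```

For example, `stem("مكتوب")`, `stem("كاتب")` and `stem("كتاب")` all reduce to the same three-letter root under these patterns, so two documents that use different surface forms of that root still share n-grams in the kernel. No dictionary lookup is involved; a word that fits no pattern is returned unchanged.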
Taxonomy
Topics: Text and Document Classification Technologies · Natural Language Processing Techniques · Advanced Text Analysis Techniques
