Unsupervised Thematic Clustering Of Hadith Texts Using The Apriori Algorithm
Wisnu Uriawan, Achmad Ajie Priyajie, Angga Gustian, Fikri Nur Hidayat, Sendi Ahmad Rafiudin, Muhamad Fikri Zaelani

TL;DR
This paper demonstrates that the Apriori algorithm can automatically identify thematic groupings and semantic relationships in unlabeled hadith texts, which supports digital Islamic studies.
Contribution
It introduces an unsupervised thematic clustering method for hadith texts using the Apriori algorithm, highlighting its ability to uncover semantic associations.
Findings
Identified meaningful association patterns like rakaat-prayer and verse-revelation
Demonstrated Apriori's effectiveness in semantic relationship discovery
Contributed to digital Islamic studies and technology-based learning
Abstract
This research stems from the need to automate the thematic grouping of hadith in line with the growing digitalization of Islamic texts. Based on a literature review, the unsupervised learning approach with the Apriori algorithm has proven effective in identifying association patterns and semantic relations in unlabeled text data. The dataset used is the Indonesian translation of the hadith of Bukhari, which first goes through preprocessing stages including case folding, punctuation cleaning, tokenization, stopword removal, and stemming. Next, association rule mining is conducted using the Apriori algorithm with support, confidence, and lift parameters. The results show meaningful association patterns such as rakaat-prayer, verse-revelation, and hadith-story, which reflect the themes of worship, revelation, and hadith narration.…
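The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The sentences in `DOCS`, the `STOPWORDS` list, the function names, and the threshold values are all hypothetical and chosen only for illustration. The toy sentences are in English rather than Indonesian, stemming is omitted to keep the example dependency-free, and mining stops at 2-itemsets (the first two levels of Apriori, where candidate pairs are built only from frequent single terms).

```python
import string
from itertools import combinations

# Hypothetical stopword list and toy documents (not taken from the paper's dataset).
STOPWORDS = {"the", "of", "in", "a", "and", "was", "is", "to"}
DOCS = [
    "The prayer consists of two rakaat.",
    "He performed four rakaat in the prayer!",
    "The verse was revealed; the revelation came at night.",
    "A verse of revelation.",
    "The prayer is an obligation.",
]


def preprocess(text):
    """Case folding, punctuation cleaning, tokenization, stopword removal."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    tokens = text.split()
    # Each document becomes a transaction: the set of its remaining terms.
    return {t for t in tokens if t not in STOPWORDS}


def pair_rules(transactions, min_support=0.3, min_confidence=0.6):
    """Mine rules x -> y between single terms, with support, confidence, lift."""
    n = len(transactions)

    # Level 1: count single terms and keep the frequent ones.
    item_count = {}
    for t in transactions:
        for item in t:
            item_count[item] = item_count.get(item, 0) + 1
    frequent = {i for i, c in item_count.items() if c / n >= min_support}

    # Level 2: candidate pairs use only frequent terms (Apriori pruning).
    pair_count = {}
    for t in transactions:
        for pair in combinations(sorted(t & frequent), 2):
            pair_count[pair] = pair_count.get(pair, 0) + 1

    rules = []
    for (a, b), c in pair_count.items():
        support = c / n
        if support < min_support:
            continue
        for x, y in ((a, b), (b, a)):
            confidence = c / item_count[x]            # P(y | x)
            lift = confidence / (item_count[y] / n)   # P(y | x) / P(y)
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence, lift))
    return rules


transactions = [preprocess(d) for d in DOCS]
rules = pair_rules(transactions)
for x, y, s, c, l in sorted(rules):
    print(f"{x} -> {y}: support={s:.2f} confidence={c:.2f} lift={l:.2f}")
```

A lift above 1 indicates that two terms co-occur more often than they would if they were independent, which is the sense in which a pair such as rakaat-prayer can be read as a thematic association.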
Taxonomy
Topics: Educational Technology Systems · Data Mining and Machine Learning Applications · Text and Document Classification Technologies
