Mubeen AI: A Specialized Arabic Language Model for Heritage Preservation and User Intent Understanding
Mohammed Aljafari, Ismail Alturki, Ahmed Mori, Yehya Kadumi

TL;DR
Mubeen AI is a specialized Arabic language model for heritage preservation and user intent understanding. It is trained on authentic Arabic sources and uses a novel architecture to improve accuracy and cultural authenticity.
Contribution
The paper introduces the Practical Closure Architecture, a new framework that enhances Arabic language understanding and addresses the utility gap in AI responses.
Findings
Outperforms existing Arabic models in intent detection.
Effectively handles classical, contemporary, and dialectal Arabic.
Ensures cultural authenticity through native source training.
Abstract
Mubeen is a proprietary Arabic language model developed by MASARAT SA, optimized for deep understanding of Arabic linguistics, Islamic studies, and cultural heritage. Trained on an extensive collection of authentic Arabic sources, significantly expanded by digitizing historical manuscripts via a proprietary Arabic OCR engine, the model incorporates seminal scholarly works in linguistics, jurisprudence, hadith, and Quranic exegesis, alongside thousands of academic theses and peer-reviewed research papers. Conditioned through a deep linguistic engineering framework, Mubeen masters not just the meaning but the eloquence of Arabic, enabling precise understanding across classical texts, contemporary writing, and regional dialects, with a focus on comprehending user intent and delivering accurate, contextually relevant responses. Unlike other Arabic models relying on translated English data that…
