TL;DR
This paper introduces a large-scale, instruction-based dataset for Arabic poetry generation and demonstrates that fine-tuned LLMs can effectively create poetry aligned with user specifications.
Contribution
It provides the first comprehensive dataset and methodology for controllable Arabic poetry generation using LLMs, covering Modern Standard Arabic and dialects.
Findings
Fine-tuned models produce poetry matching user criteria.
Automated metrics and human evaluations confirm quality.
The dataset enables tasks such as writing, revising, continuing, and analyzing Arabic poems.
Abstract
Poetry has long been a central art form for Arabic speakers, serving as a powerful medium of expression and cultural identity. While modern Arabic speakers continue to value poetry, existing research on Arabic poetry within Large Language Models (LLMs) has primarily focused on analysis tasks such as interpretation or metadata prediction, e.g., rhyme schemes and titles. In contrast, our work addresses the practical aspect of poetry creation in Arabic by introducing controllable generation capabilities to assist users in writing poetry. Specifically, we present a large-scale, carefully curated instruction-based dataset in Modern Standard Arabic (MSA) and various Arabic dialects. This dataset enables tasks such as writing, revising, and continuing poems based on predefined criteria, including style and rhyme, as well as performing poetry analysis. Our experiments show that fine-tuning LLMs…
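To make the idea of an instruction with predefined criteria concrete, here is a minimal sketch of how one such record might be turned into a prompt. It is not the paper's schema: the `PoetryInstruction` class, its field names, the task labels, and the `build_prompt` layout are all hypothetical, chosen only to mirror the tasks and criteria (style, rhyme, MSA or dialect) named in the abstract.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical task labels, mirroring the tasks named in the abstract.
TASKS = ("write", "revise", "continue", "analyze")


@dataclass
class PoetryInstruction:
    """One illustrative instruction record (assumed structure, not the paper's)."""
    task: str
    variety: str = "MSA"               # "MSA" or a dialect label (labels assumed)
    style: Optional[str] = None        # requested style, if any
    rhyme: Optional[str] = None        # requested rhyme, if any
    source_text: Optional[str] = None  # existing verses for revise/continue/analyze


def build_prompt(inst: PoetryInstruction) -> str:
    """Render the record as a plain-text prompt, one criterion per line."""
    if inst.task not in TASKS:
        raise ValueError(f"unknown task: {inst.task}")
    # Every task except "write" operates on an existing poem.
    if inst.task != "write" and not inst.source_text:
        raise ValueError(f"task '{inst.task}' needs source_text")
    parts = [f"Task: {inst.task}", f"Variety: {inst.variety}"]
    if inst.style:
        parts.append(f"Style: {inst.style}")
    if inst.rhyme:
        parts.append(f"Rhyme: {inst.rhyme}")
    if inst.source_text:
        parts.append(f"Poem: {inst.source_text}")
    return "\n".join(parts)
```

Calling `build_prompt(PoetryInstruction(task="write", style="ghazal", rhyme="ر"))` yields a four-line prompt listing the task, variety, style, and rhyme. Keeping each criterion as a separate optional field is one simple way to let a user specify any subset of constraints.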
