MegaChat: A Synthetic Persian Q&A Dataset for High-Quality Sales Chatbot Evaluation
Mahdi Rahmani, AmirHossein Saffari, Reyhane Rahmani

TL;DR
MegaChat introduces a fully synthetic Persian Q&A dataset generated by an innovative multi-agent system, enabling efficient evaluation of sales chatbots in low-resource language e-commerce without extensive human annotation.
Contribution
The paper presents the first synthetic Persian Q&A dataset for sales chatbot evaluation, utilizing a novel multi-agent architecture to produce realistic, diverse conversational data automatically.
Findings
The agentic system outperformed traditional models in 4 out of 5 channels.
Generated datasets are high-quality and reduce reliance on human annotation.
The approach is scalable and cost-effective for low-resource language domains.
Abstract
Small and medium-sized enterprises (SMEs) in Iran increasingly leverage Telegram for sales, where real-time engagement is essential for conversion. However, developing AI-driven chatbots for this purpose requires large, high-quality question-and-answer (Q&A) datasets, which are typically expensive and resource-intensive to produce, especially for low-resource languages like Persian. In this paper, we introduce MegaChat, the first fully synthetic Persian Q&A dataset designed to evaluate intelligent sales chatbots in Telegram-based e-commerce. We propose a novel, automated multi-agent architecture that generates persona-aware Q&A pairs by collecting data from active Telegram shopping channels. The system employs specialized agents for question generation, validation, and refinement, ensuring the production of realistic and diverse conversational data. To evaluate answer generation, we…
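The generate, validate, refine loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: every name here (`Candidate`, `build_dataset`, the `toy_*` functions) is hypothetical, and plain Python functions stand in for the LLM-backed agents, which the abstract does not specify in detail.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Candidate:
    """One persona-aware question tied to the channel post it came from (hypothetical schema)."""
    persona: str
    question: str
    source_post: str


def build_dataset(
    posts: List[str],
    persona: str,
    generate: Callable[[str, str], str],
    validate: Callable[[str, str], bool],
    refine: Callable[[str, str], str],
    max_rounds: int = 2,
) -> List[Candidate]:
    """Run each post through generation, then alternate validation and refinement.

    A question is kept only if the validator accepts it within `max_rounds`
    refinement attempts; otherwise it is dropped.
    """
    accepted: List[Candidate] = []
    for post in posts:
        question = generate(post, persona)
        for attempt in range(max_rounds + 1):
            if validate(question, post):
                accepted.append(Candidate(persona, question, post))
                break
            if attempt < max_rounds:
                question = refine(question, post)
    return accepted


# Toy stand-ins for the agents (assumptions for illustration only).
def toy_generate(post: str, persona: str) -> str:
    return f"what about {post}"


def toy_validate(question: str, post: str) -> bool:
    # Accept only questions that end with "?" and mention the post text.
    return question.endswith("?") and post in question


def toy_refine(question: str, post: str) -> str:
    return question.capitalize() + "?"


if __name__ == "__main__":
    data = build_dataset(
        ["red shoes", "blue bag"], "bargain hunter",
        toy_generate, toy_validate, toy_refine,
    )
    for item in data:
        print(item.persona, "|", item.question)
```

In a real system each callable would wrap a model call, and the validator would check realism and diversity rather than punctuation. Passing the agents in as functions keeps the control flow testable without any model access.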
Taxonomy
Topics: AI in Service Interactions · Topic Modeling · Expert finding and Q&A systems
