Turn Segmentation into Utterances for Arabic Spontaneous Dialogues and Instance Messages
AbdelRahim A. Elmadany, Sherif M. Abdou, Mervat Gheith

TL;DR
This paper presents a machine learning-based approach for segmenting turns into utterances in Egyptian spontaneous dialogues and instant messages, reaching an F1 score of 90.74% and an accuracy of 95.98% on a manually annotated corpus of 3001 turns.
Contribution
It introduces a novel ML method for turn segmentation in Egyptian dialogues and, given the lack of dialect-specific corpora, evaluates it on a manually annotated corpus collected from Egyptian call centers.
Findings
Achieved F1 score of 90.74% in segmentation
Achieved accuracy of 95.98% in segmentation
Utilized a corpus of 3001 manually annotated turns
Abstract
Text segmentation is an essential step in many Natural Language Processing (NLP) tasks, such as text summarization, text translation, and dialogue language understanding. Turn segmentation is a key component of dialogue understanding when building automatic human-computer systems. In this paper, we introduce a novel Machine Learning (ML) approach to segmenting turns into utterances for Egyptian spontaneous dialogues and instant messages (IM), as part of the broader task of automatically understanding Egyptian spontaneous dialogues and IM. Because no Egyptian dialect dialogue corpus is available, the system is evaluated on our own corpus of 3001 turns, which were collected, segmented, and annotated manually from Egyptian call centers. The system achieves an F1 score of 90.74% and an accuracy of 95.98%.
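The abstract reports both an F1 score and an accuracy for segmentation without spelling out how they are computed. The sketch below shows one common way to score such a task, assuming (this is not stated in the abstract) that segmentation is cast as token-level boundary classification, where label 1 marks the last token of an utterance and 0 marks any other token. The function names `boundary_scores` and `segment`, the label scheme, and the toy labels are hypothetical illustrations, not the authors' evaluation code.

```python
from typing import List, Tuple


def boundary_scores(gold: List[int], pred: List[int]) -> Tuple[float, float, float, float]:
    """Return (precision, recall, f1, accuracy) for binary boundary labels.

    Assumption: 1 = token ends an utterance, 0 = token does not.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    tn = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 0)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    accuracy = (tp + tn) / len(gold) if gold else 0.0
    return precision, recall, f1, accuracy


def segment(tokens: List[str], labels: List[int]) -> List[List[str]]:
    """Split one turn into utterances using the boundary labels."""
    utterances: List[List[str]] = []
    current: List[str] = []
    for token, label in zip(tokens, labels):
        current.append(token)
        if label == 1:  # boundary: close the current utterance
            utterances.append(current)
            current = []
    if current:  # trailing tokens with no closing boundary
        utterances.append(current)
    return utterances


# Toy labels for a ten-token turn (made up for illustration only).
gold = [0, 0, 1, 0, 0, 0, 1, 0, 0, 1]
pred = [0, 0, 1, 0, 1, 0, 1, 0, 0, 0]
precision, recall, f1, accuracy = boundary_scores(gold, pred)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f} Acc={accuracy:.2f}")
```

Under this assumed setup, most tokens are non-boundaries, so correctly predicted 0 labels raise accuracy but do not enter the F1 computation. That is one plausible reason an accuracy figure can sit above the F1 figure on the same data.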
