WEKA-Based: Key Features and Classifier for French of Five Countries
Zeqian Li, Keyu Qiu, Chenxu Jiao, Wen Zhu, Haoran Tang

TL;DR
This paper presents a dialect recognition system for French across five regions, utilizing a corpus and machine learning tools to distinguish regional variations based on thematic content.
Contribution
It introduces a French dialect classification approach using a new regional corpus and machine learning with WEKA, tailored for regional dialect identification.
Findings
Effective differentiation of regional French dialects achieved
Utilized WEKA classifiers with thematic corpus for dialect recognition
Demonstrated feasibility of machine learning in dialect classification
Abstract
This paper describes a French dialect recognition system designed to distinguish between regional varieties of French. A corpus covering five regions (Monaco, French-speaking Belgium, French-speaking Switzerland, French-speaking Canada and France) was constructed with Sketch Engine. The content of the corpus relates to four themes closely tied to everyday life: eating, drinking, sleeping and living. The experimental results were obtained using a pre-processor written in Python and the Waikato Environment for Knowledge Analysis (WEKA), a data analysis tool that provides many filters and classifiers for machine learning.
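The abstract does not give details of the Python pre-processor, so the following is only a minimal sketch of what such a step might look like: it normalizes raw text and writes it out in WEKA's ARFF format with one string attribute and one nominal class attribute. The function names (`clean`, `arff_quote`, `to_arff`), the region labels, the relation name and the lowercasing step are all assumptions made for illustration, not the authors' implementation.

```python
import re

# Hypothetical class labels for the five regions named in the abstract.
REGIONS = ["Monaco", "Belgium", "Switzerland", "Canada", "France"]


def clean(text):
    """Collapse whitespace and lowercase (an assumed, minimal normalization)."""
    return re.sub(r"\s+", " ", text).strip().lower()


def arff_quote(value):
    """Quote a string for ARFF, escaping backslashes and single quotes."""
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def to_arff(samples, relation="french_dialects"):
    """Build an ARFF document from (text, region) pairs."""
    lines = [
        f"@relation {relation}",
        "",
        "@attribute text string",
        "@attribute region {" + ",".join(REGIONS) + "}",
        "",
        "@data",
    ]
    for text, region in samples:
        if region not in REGIONS:
            raise ValueError(f"unknown region label: {region}")
        lines.append(f"{arff_quote(clean(text))},{region}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Illustrative sentence only; it is not taken from the paper's corpus.
    print(to_arff([("J'ai  mangé une poutine", "Canada")]))
```

A file produced this way could be loaded into WEKA, where a filter such as StringToWordVector turns the string attribute into word features before a classifier is trained. Which filters and classifiers the authors actually used is not stated in the abstract.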
Taxonomy
Topics: Natural Language Processing Techniques
