Development of a classifiers/quantifiers dictionary towards French-Japanese MT
Mutsuko Tomokiyo (GETALP, LIG), Mathieu Mangeot (GETALP, LIG), Christian Boitet (LIG, GETALP)

TL;DR
This paper presents a bilingual classifiers/quantifiers dictionary derived from annotated corpora, aimed at improving French-Japanese machine translation by addressing lexical ambiguity and phrase recognition issues.
Contribution
It introduces a novel CQs dictionary based on UNL-UWs annotations, facilitating better handling of CQs in French-Japanese MT systems.
Findings
Created a CQs dictionary from annotated corpus data
Enhanced MT accuracy by addressing lexical ambiguity
Improved recognition of CQs in translation process
Abstract
Although classifier/quantifier (CQ) expressions appear frequently in everyday communication and written documents, they are described neither in classical bilingual paper dictionaries nor in machine-readable dictionaries. The paper describes a CQs dictionary, compiled from a corpus we have annotated, and its use in the framework of French-Japanese machine translation (MT). In MT, the treatment of CQs often causes problems of lexical ambiguity, difficulties in recognizing polylexical phrases during analysis, and doubtful output in transfer and generation, in particular for distant language pairs such as French and Japanese. Our basic treatment of CQs is to annotate the corpus with UNL-UWs (Universal Networking Language Universal Words), and then to produce a bilingual or multilingual dictionary of CQs, based on synonymy through identity of UWs.
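The idea of "synonymy through identity of UWs" can be illustrated with a minimal sketch: CQ expressions from different languages that carry the same UW annotation are grouped under that UW, which then acts as a pivot for lookup. The entries, the UW strings, and the function names `build_cq_dictionary` and `translate_cq` below are hypothetical placeholders chosen for illustration; they are not taken from the authors' corpus or tooling.

```python
from collections import defaultdict

# Hypothetical annotated entries: (language code, CQ expression, UW label).
# The UW strings are illustrative placeholders, not real corpus annotations.
ANNOTATED = [
    ("fra", "une tranche de", "slice(icl>piece)"),
    ("jpn", "枚", "slice(icl>piece)"),
    ("fra", "un verre de", "glass(icl>quantity)"),
    ("jpn", "杯", "glass(icl>quantity)"),
    ("fra", "une paire de", "pair(icl>quantity)"),  # no Japanese entry yet
]


def build_cq_dictionary(entries):
    """Group CQ expressions by UW, then by language."""
    by_uw = defaultdict(lambda: defaultdict(list))
    for lang, expr, uw in entries:
        by_uw[uw][lang].append(expr)
    # Convert nested defaultdicts to plain dicts for safer lookups.
    return {uw: dict(langs) for uw, langs in by_uw.items()}


def translate_cq(dictionary, expr, src, tgt):
    """Return target-language CQs that share a UW with the source expression."""
    results = []
    for langs in dictionary.values():
        if expr in langs.get(src, []):
            results.extend(langs.get(tgt, []))
    return results


if __name__ == "__main__":
    cq_dict = build_cq_dictionary(ANNOTATED)
    print(translate_cq(cq_dict, "un verre de", "fra", "jpn"))
    print(translate_cq(cq_dict, "une paire de", "fra", "jpn"))
```

Because the UW is the only link between languages, adding a third language in this sketch only requires annotating its CQs with the same UWs; no pairwise French-Japanese mapping is stored. An expression whose UW has no counterpart in the target language simply yields an empty list.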
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Translation Studies and Practices
