# Development of a classifiers/quantifiers dictionary towards French-Japanese MT

**Authors:** Mutsuko Tomokiyo (GETALP, LIG), Mathieu Mangeot (GETALP, LIG), Christian Boitet (LIG, GETALP)

arXiv: 1902.08061 · 2019-02-22

## TL;DR

This paper presents a bilingual classifiers/quantifiers dictionary derived from annotated corpora, aimed at improving French-Japanese machine translation by addressing lexical ambiguity and phrase recognition issues.

## Contribution

It introduces a classifiers/quantifiers (CQs) dictionary built from a corpus annotated with UNL-UWs (Universal Networking Language Universal Words), intended to support the handling of CQs in French-Japanese MT systems.

## Key findings

- Built a CQs dictionary from a corpus annotated with UNL-UWs
- Targets the lexical ambiguity that CQs cause in MT analysis
- Targets the recognition of polylexical CQ phrases and doubtful output in transfer-generation

## Abstract

Although classifier/quantifier (CQ) expressions appear frequently in everyday communication and written documents, they are described neither in classical bilingual paper dictionaries nor in machine-readable dictionaries. The paper describes a CQs dictionary, edited from the corpus we have annotated, and its usage in the framework of French-Japanese machine translation (MT). The treatment of CQs in MT often causes problems of lexical ambiguity and polylexical phrase recognition in analysis, and doubtful output in transfer-generation, in particular for distant language pairs such as French and Japanese. Our basic treatment of CQs is to annotate the corpus with UNL-UWs (Universal Networking Language Universal Words), and then to produce a bilingual or multilingual dictionary of CQs, based on synonymy through identity of UWs.
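The idea of "synonymy through identity of UWs" can be illustrated with a minimal sketch: entries from different languages that carry the same UW label are treated as translations of each other. Everything below is an assumption made for illustration only. The function name `build_bilingual_dictionary`, the entry format, the sample expressions, and the UW strings are hypothetical and are not taken from the paper or from its dictionary.

```python
from collections import defaultdict

# Hypothetical annotated entries: (language, surface form, UW label).
# The UW strings are illustrative placeholders, not the paper's annotations.
ANNOTATED = [
    ("fr", "une tasse de", "cup(icl>quantity)"),
    ("ja", "杯", "cup(icl>quantity)"),
    ("fr", "une feuille de", "sheet(icl>quantity)"),
    ("ja", "枚", "sheet(icl>quantity)"),
    # An entry whose UW has no Japanese counterpart in this toy data.
    ("fr", "une poignée de", "handful(icl>quantity)"),
]


def build_bilingual_dictionary(entries, src="fr", tgt="ja"):
    """Pair source and target forms that share an identical UW label."""
    # Group surface forms by UW, then by language.
    by_uw = defaultdict(lambda: defaultdict(list))
    for lang, form, uw in entries:
        by_uw[uw][lang].append(form)

    # Each source form maps to every target form annotated with the same UW.
    pairs = {}
    for langs in by_uw.values():
        for source_form in langs.get(src, []):
            pairs[source_form] = list(langs.get(tgt, []))
    return pairs


if __name__ == "__main__":
    for source_form, targets in build_bilingual_dictionary(ANNOTATED).items():
        print(source_form, "->", targets)
```

Because the grouping key is the UW rather than a language pair, the same table could in principle be read in the other direction or extended with a third language by adding entries, which matches the "bilingual or multilingual" wording of the abstract. Source forms without a target-language match map to an empty list instead of being dropped, so gaps in coverage stay visible.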

---
Source: https://tomesphere.com/paper/1902.08061