Un système modulaire d'acquisition automatique de traductions à partir du Web
Stéphanie Léon (LIRMM)

TL;DR
This paper introduces a modular web-based method for automatically translating complex lexical units between French and English, leveraging linguistic properties and web data to improve bilingual lexicon extraction.
Contribution
It presents a novel modular system that uses linguistic features and web validation to enhance automatic translation of complex lexical units.
Findings
Evaluation on a sample of CLU shows that the Web-based technique can reach very high precision
Web data is used both to validate candidate translations and to collect new terms
Separate processing stages handle each linguistic property (compositionality, polysemy)
Abstract
We present a method for the automatic translation (French/English) of Complex Lexical Units (CLU), with the aim of extracting a bilingual lexicon. Our modular system is based on linguistic properties (compositionality, polysemy, etc.). Different aspects of the multilingual Web are used to validate candidate translations and to collect new terms. We first build a French corpus of Web pages from which CLU are collected. Three processing stages are then applied, one for each linguistic case: compositional non-polysemous translations, compositional polysemous translations, and non-compositional translations. Our evaluation on a sample of CLU shows that this Web-based technique can reach very high precision.
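The idea of translating a compositional CLU and validating candidates against the Web can be illustrated with a minimal sketch. This is not the paper's implementation: the toy lexicon, the toy frequency table standing in for Web counts, the `min_count` threshold, the reversal of word order for noun + adjective units, and all function names are assumptions made for illustration only.

```python
from itertools import product

# Hypothetical toy bilingual dictionary (French word -> English candidates).
TOY_LEXICON = {
    "appareil": ["device", "camera", "apparatus"],
    "numérique": ["digital", "numerical"],
}

# Hypothetical stand-in for frequencies observed on English Web pages.
# A real system would query a search engine or a crawled corpus instead.
TOY_WEB_COUNTS = {
    "digital camera": 900,
    "digital device": 120,
    "numerical apparatus": 1,
}


def candidate_translations(french_words, lexicon):
    """Combine word-level translations into candidate English phrases.

    Simplifying assumption: the CLU is a French noun + adjective pair,
    so the English word order is the reverse of the French one.
    """
    options = [lexicon.get(word, []) for word in french_words]
    if not all(options):
        # At least one word has no dictionary entry: no compositional candidate.
        return []
    return [" ".join(reversed(combo)) for combo in product(*options)]


def validate(candidates, counts, min_count=10):
    """Keep candidates attested at least min_count times, most frequent first."""
    scored = [(cand, counts.get(cand, 0)) for cand in candidates]
    kept = [(cand, n) for cand, n in scored if n >= min_count]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


def translate_clu(clu, lexicon=TOY_LEXICON, counts=TOY_WEB_COUNTS):
    """Return the best-attested candidate translation, or None if none survives."""
    ranked = validate(candidate_translations(clu.split(), lexicon), counts)
    return ranked[0][0] if ranked else None


if __name__ == "__main__":
    print(translate_clu("appareil numérique"))
```

In this toy setting, unattested combinations such as "numerical camera" are filtered out by the frequency threshold, which is one plausible way that Web evidence could separate valid translations from dictionary noise. Non-compositional units would need a different strategy, since their translation cannot be assembled word by word.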
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Semantic Web and Ontologies
