The Cross-Lingual Arabic Information REtrieval (CLAIRE) System
Zhizhong Chen, Carsten Eickhoff

TL;DR
The CLAIRE system enables cross-lingual Arabic information retrieval using English-Arabic word embeddings, simplifying the pipeline and avoiding translation errors, with promising initial results on Arabic news data.
Contribution
This paper introduces an end-to-end cross-lingual retrieval system based on cross-lingual word embeddings, avoiding complex translation models and supporting various neural retrieval methods.
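To make the idea of retrieval over a shared English-Arabic embedding space concrete, here is a minimal sketch. It is not the CLAIRE implementation: the three-dimensional vectors, the `EMBEDDINGS` table, the `embed`, `cosine`, and `rank` helpers, and the two toy documents are all hypothetical, and averaging word vectors with cosine scoring is just one simple matching choice among the neural retrieval methods such a system could support.

```python
import math

# Hypothetical toy vectors in a shared English-Arabic space (not from the paper).
EMBEDDINGS = {
    "economy": [0.9, 0.1, 0.0],
    "football": [0.0, 0.1, 0.9],
    "اقتصاد": [0.88, 0.12, 0.05],  # "economy"
    "بنك": [0.8, 0.3, 0.0],        # "bank"
    "كرة": [0.05, 0.1, 0.85],      # "ball"
    "مباراة": [0.1, 0.2, 0.8],     # "match"
}


def embed(tokens):
    """Average the vectors of known tokens; return None if no token is known."""
    vectors = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query, documents):
    """Score Arabic documents against an English query without translating either."""
    q = embed(query.lower().split())
    scored = []
    for doc_id, text in documents.items():
        d = embed(text.split())
        score = cosine(q, d) if q is not None and d is not None else 0.0
        scored.append((doc_id, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Two hypothetical Arabic "documents" (whitespace-tokenized for simplicity).
DOCS = {"doc_econ": "اقتصاد بنك", "doc_sport": "كرة مباراة"}

if __name__ == "__main__":
    for doc_id, score in rank("economy", DOCS):
        print(doc_id, round(score, 3))
```

Because query and document terms share one vector space, no translation step sits between the English query and the Arabic text, which is the pipeline simplification the summary describes. A real system would load pretrained cross-lingual embeddings and use proper Arabic tokenization rather than whitespace splitting.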
Findings
Promising retrieval performance on an Arabic news collection
Simplifies cross-lingual retrieval pipeline
Avoids translation-related errors
Abstract
Despite advances in neural machine translation, cross-lingual retrieval tasks in which queries and documents live in different natural language spaces remain challenging. Although neural translation models may provide an intuitive approach to tackle the cross-lingual problem, their resource-consuming training and advanced model structures may complicate the overall retrieval pipeline and reduce user engagement. In this paper, we build our end-to-end Cross-Lingual Arabic Information REtrieval (CLAIRE) system based on cross-lingual word embeddings, where searchers are assumed to have a passable passive understanding of Arabic and various supporting information in English is provided to aid the retrieval experience. The proposed system has three major advantages: (1) The use of English-Arabic word embeddings simplifies the overall pipeline and avoids the potential mistakes caused by…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies
