Cross-Lingual Document Retrieval with Smooth Learning
Jiapeng Liu, Xiao Zhang, Dan Goldwasser, Xiao Wang

TL;DR
This paper introduces a robust end-to-end framework for cross-lingual document retrieval built on a new relevance measure (smooth cosine similarity) and a new loss function (Smooth Ordinal Search Loss), and reports significant gains on ranking metrics across multiple languages.
Contribution
The paper proposes a new relevance measure and a new loss function for neural cross-lingual search, aimed at the instability of neural document search models, and provides a theoretical generalization error bound for the resulting framework.
Findings
Significant performance gains on cross-lingual retrieval benchmarks
Effective handling of language diversity in document search
Theoretical generalization error bounds established
Abstract
Cross-lingual document search is an information retrieval task in which the language of the queries differs from the language of the documents. In this paper, we study the instability of neural document search models and propose a novel, robust end-to-end framework that improves cross-lingual search performance across different document languages. The framework includes a novel measure of relevance between queries and documents, smooth cosine similarity, and a novel loss function, Smooth Ordinal Search Loss, as the objective. We further provide a theoretical guarantee in the form of a generalization error bound for the proposed framework. We conduct experiments comparing our approach with other document search models and observe significant gains under commonly used ranking metrics on the cross-lingual document retrieval task in a variety of languages.
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques · Domain Adaptation and Few-Shot Learning
