Cross-Language Question Re-Ranking
Giovanni Da San Martino, Salvatore Romeo, Alberto Barron-Cedeno, Shafiq Joty, Lluis Marquez, Alessandro Moschitti, Preslav Nakov

TL;DR
This paper investigates cross-language question re-ranking for the Arabic-English language pair, comparing a kernel-based system with a feed-forward neural network, and introduces a cross-language tree kernel that brings performance close to that of the monolingual system.
Contribution
It introduces a cross-language tree kernel for question re-ranking that compares the original Arabic tree to the English trees of the related questions and almost closes the performance gap with the monolingual system.
Findings
Cross-language tree kernel nearly matches monolingual performance.
Kernel-based system outperforms neural network in all tested scenarios.
Using cross-language embeddings improves neural network results.
Abstract
We study how to find relevant questions in community forums when the language of the new questions is different from that of the existing questions in the forum. In particular, we explore the Arabic-English language pair. We compare a kernel-based system with a feed-forward neural network in a scenario where a large parallel corpus is available for training a machine translation system, bilingual dictionaries, and cross-language word embeddings. We observe that both approaches degrade the performance of the system when working on the translated text, especially the kernel-based system, which depends heavily on a syntactic kernel. We address this issue using a cross-language tree kernel, which compares the original Arabic tree to the English trees of the related questions. We show that this kernel almost closes the performance gap with respect to the monolingual system. On the neural…
Taxonomy
Topics: Expert finding and Q&A systems · Topic Modeling
