Addressing Community Question Answering in English and Arabic
Giovanni Da San Martino, Alberto Barrón-Cedeño, Salvatore Romeo, Alessandro Moschitti, Shafiq Joty, Fahad A. Al Obaidli, Kateryna Tymoshenko, Antonio Uva

TL;DR
This paper explores various feature-based machine learning models for question re-ranking in community QA, demonstrating the effectiveness of structural kernels and embeddings in both English and Arabic datasets, with notable improvements over baseline methods.
Contribution
It introduces the novel application of syntactic tree kernels to Arabic question re-ranking and compares multiple features and algorithms on multilingual community QA datasets.
Findings
Structural kernels are robust to noisy data.
Effective BoW and TK features improve re-ranking performance.
Achieved second-best results on SemEval-2016 tasks for English and Arabic.
Abstract
This paper studies the impact of different types of features applied to learning to re-rank questions in community Question Answering. We tested our models on two datasets released in SemEval-2016 Task 3 on "Community Question Answering". Task 3 targeted real-life Web fora both in English and Arabic. Our models include bag-of-words features (BoW), syntactic tree kernels (TKs), rank features, embeddings, and machine translation evaluation features. To the best of our knowledge, structural kernels have barely been applied to the question reranking task, where they have to model paraphrase relations. In the case of the English question re-ranking task, we compare our learning to rank (L2R) algorithms against a strong baseline given by the Google-generated ranking (GR). The results show that i) the shallow structures used in our TKs are robust enough to noisy data and ii) improving GR is…
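As a rough illustration of how a bag-of-words similarity can be combined with a rank feature taken from an initial search-engine ordering, here is a minimal sketch. It is not the authors' system: the function names, the `rank_weight` parameter, the reciprocal-rank form of the rank feature, and the fixed additive scoring are all assumptions made for this example. A learning-to-rank model would learn the combination from data, and the tree-kernel, embedding, and machine translation evaluation features are not modeled here.

```python
import math
from collections import Counter


def bow(text):
    """Lowercased, whitespace-tokenized bag of words (hypothetical helper)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rerank(query, candidates, rank_weight=0.5):
    """Re-rank candidates given in their initial (search-engine) order.

    Score = BoW cosine similarity + rank_weight / initial_position.
    The reciprocal-rank term is an assumed stand-in for a rank feature.
    """
    q = bow(query)
    scored = []
    for position, candidate in enumerate(candidates, start=1):
        score = cosine(q, bow(candidate)) + rank_weight / position
        scored.append((score, candidate))
    # sorted() is stable, so ties keep the initial ordering.
    scored = sorted(scored, key=lambda pair: -pair[0])
    return [candidate for _, candidate in scored]


if __name__ == "__main__":
    initial_ranking = [
        "best restaurants in doha",
        "how to renew a visa in qatar",
        "visa renewal cost",
    ]
    for question in rerank("how to renew visa in qatar", initial_ranking):
        print(question)
```

In this toy run the second candidate overtakes the first because its lexical overlap with the query outweighs the bonus the first candidate gets from its initial position.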
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Expert finding and Q&A systems
