Cross-lingual Short-text Matching with Deep Learning

Asmelash Teka Hadgu

arXiv:1811.05569 · cs.CL · November 15, 2018 · 1 citation

TL;DR

This paper presents a deep learning-based system for cross-lingual short-text matching that identifies question pairs with the same meaning without heavy feature engineering, evaluated in the CIKM AnalytiCup 2018 challenge.

Contribution

The author describes team hahu's end-to-end deep learning approach to cross-lingual question matching, which avoids heavy feature engineering and is reported to improve on traditional machine-learning approaches.

Findings

01

Ranked 7th out of 1027 teams in the overall ranking of the CIKM AnalytiCup 2018.

02

Log-loss scores of 0.35 and 0.39 in the first and second contest rounds, respectively.

03

The system matched cross-lingual question pairs end to end, without heavy feature engineering.

Abstract

The problem of short-text matching is formulated as follows: given a pair of sentences or questions, a matching model determines whether the two inputs have the same meaning. Models that can automatically identify questions with the same meaning have a wide range of applications in question answering sites and modern chatbots. In this article, we describe the approach taken by team hahu to solve this problem in the context of the "CIKM AnalytiCup 2018 - Cross-lingual Short-text Matching of Question Pairs", sponsored by Alibaba. Our solution is an end-to-end system based on current advances in deep learning; it avoids heavy feature engineering and achieves improved performance over traditional machine-learning approaches. The log-loss scores for the first and second rounds of the contest are 0.35 and 0.39, respectively. The team was ranked 7th out of 1027 teams in the overall ranking…
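The abstract reports log-loss scores but does not define the metric. As a minimal sketch, assuming the contest used the standard binary log-loss (mean cross-entropy over question pairs), the score could be computed as below. The function name `log_loss`, the `eps` clipping value, and the toy inputs are illustrative assumptions, not taken from the paper.

```python
import math


def log_loss(labels, probs, eps=1e-15):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities.

    Assumption: this is the standard definition; the paper only reports scores.
    """
    if not labels or len(labels) != len(probs):
        raise ValueError("labels and probs must be non-empty and of equal length")
    total = 0.0
    for y, p in zip(labels, probs):
        # Clip so that log(0) never occurs for overconfident predictions.
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


# Toy question-pair labels (1 = same meaning) and hypothetical model outputs.
labels = [1, 0, 1, 0]
confident = [0.9, 0.1, 0.8, 0.2]
uninformed = [0.5, 0.5, 0.5, 0.5]

print(log_loss(labels, confident))
print(log_loss(labels, uninformed))
```

Under this definition, a model that always predicts 0.5 scores ln 2 ≈ 0.693, so the reported 0.35 and 0.39 would sit well below that uninformed baseline.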

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems