Linear Transformations for Cross-lingual Semantic Textual Similarity
Tomáš Brychcín

TL;DR
This paper introduces a linear transformation that projects monolingual semantic spaces into a shared space using bilingual dictionaries, and combines it with word weighting to estimate cross-lingual semantic textual similarity, outperforming other methods without relying on machine translation.
Contribution
The paper proposes a new linear transformation technique using bilingual dictionaries and word weighting to improve cross-lingual semantic similarity without heavy supervision.
Findings
The proposed transformation outperforms other methods
Unsupervised sentence similarity based only on semantic spaces improves significantly with word weighting
The transformation combined with word weighting gives very promising results on several datasets
Abstract
Cross-lingual semantic textual similarity systems estimate the degree of meaning similarity between two sentences, each in a different language. State-of-the-art algorithms usually employ machine translation and combine a vast number of features, making the approach strongly supervised, resource-rich, and difficult to use for poorly-resourced languages. In this paper, we study linear transformations, which project monolingual semantic spaces into a shared space using bilingual dictionaries. We propose a novel transformation, which builds on the best ideas from prior works. We experiment with unsupervised techniques for sentence similarity based only on semantic spaces and we show they can be significantly improved by word weighting. Our transformation outperforms other methods and together with word weighting leads to very promising results on several datasets in different…
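To make the idea in the abstract concrete, here is a minimal Python sketch of the general pipeline: learn a linear map from a bilingual dictionary, project source-language word vectors into the target space, and compare sentences by the cosine of weighted word-vector averages. The summary does not specify the paper's proposed transformation, so this sketch substitutes the standard orthogonal Procrustes solution as a stand-in. The function names, the use of a weighted average, and the idea of passing per-word weights (for example IDF values) as a dictionary are all assumptions made for illustration.

```python
import numpy as np


def orthogonal_map(X, Y):
    """Orthogonal W minimizing ||X @ W - Y||_F (orthogonal Procrustes).

    X, Y: (n_pairs, dim) arrays; row i holds the source and target vectors
    of the i-th bilingual dictionary entry. This is a stand-in, not the
    transformation proposed in the paper.
    """
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt


def sentence_vector(tokens, vectors, weights):
    """Weighted average of word vectors; out-of-vocabulary tokens are skipped.

    weights: hypothetical per-word weights (e.g. IDF); missing words get 1.0.
    """
    known = [t for t in tokens if t in vectors]
    if not known:
        raise ValueError("no token has a vector")
    w = np.array([weights.get(t, 1.0) for t in known])
    V = np.stack([vectors[t] for t in known])
    return (w[:, None] * V).sum(axis=0) / w.sum()


def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def cross_lingual_similarity(src_tokens, tgt_tokens, src_vecs, tgt_vecs,
                             W, src_weights, tgt_weights):
    """Project the source sentence into the target space, then compare."""
    s = sentence_vector(src_tokens, src_vecs, src_weights) @ W
    t = sentence_vector(tgt_tokens, tgt_vecs, tgt_weights)
    return cosine(s, t)
```

In use, `X` and `Y` would be built by looking up each dictionary pair in the two monolingual spaces, and `W` would be learned once per language pair. Constraining `W` to be orthogonal keeps vector lengths and angles unchanged, which is one common design choice; an unconstrained least-squares map is another.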
