Paraphrastic Representations at Scale
John Wieting, Kevin Gimpel, Graham Neubig, Taylor Berg-Kirkpatrick

TL;DR
This paper introduces a scalable system for training high-quality paraphrastic sentence representations across multiple languages, outperforming prior models in accuracy and speed, and providing accessible tools for training and deployment.
Contribution
The authors present a multilingual, fast, and open-source system for training and deploying state-of-the-art paraphrastic sentence embeddings, surpassing previous models in performance and efficiency.
Findings
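The models surpass all prior work on unsupervised semantic textual similarity, including BERT-based models such as Sentence-BERT.
The models are orders of magnitude faster than prior work and run on CPU with little difference in inference speed.
The system lets users train their own models in a variety of languages; trained models are released for English, Arabic, German, French, Spanish, Russian, Turkish, and Chinese.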
Abstract
We present a system that allows users to train their own state-of-the-art paraphrastic sentence representations in a variety of languages. We also release trained models for English, Arabic, German, French, Spanish, Russian, Turkish, and Chinese. We train these models on large amounts of data, achieving significantly improved performance from the original papers proposing the methods on a suite of monolingual semantic similarity, cross-lingual semantic similarity, and bitext mining tasks. Moreover, the resulting models surpass all prior work on unsupervised semantic textual similarity, significantly outperforming even BERT-based models like Sentence-BERT (Reimers and Gurevych, 2019). Additionally, our models are orders of magnitude faster than prior work and can be used on CPU with little difference in inference speed (even improved speed over GPU when using more CPU cores), making…
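To make the evaluation tasks named in the abstract concrete, the following is a minimal sketch of how sentence embeddings are typically scored for semantic similarity and used for bitext mining. It is not the authors' released code or API. The toy lookup table `TOY_TABLE`, the helper names `embed`, `cosine`, and `mine`, and the choice to embed a sentence by averaging token vectors are all assumptions made for illustration; the text above does not specify the model architecture.

```python
import math

# Hypothetical 2-dimensional token vectors, invented for this example.
TOY_TABLE = {
    "cat": [1.0, 0.0],
    "kitten": [0.9, 0.1],
    "car": [0.0, 1.0],
}


def embed(tokens, table):
    """Embed a sentence as the mean of its known token vectors (assumed scheme)."""
    vecs = [table[t] for t in tokens if t in table]
    if not vecs:
        raise ValueError("no known tokens in sentence")
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two vectors; returns 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mine(source_vecs, target_vecs):
    """For each source vector, return the index of the most similar target vector."""
    return [
        max(range(len(target_vecs)), key=lambda j: cosine(s, target_vecs[j]))
        for s in source_vecs
    ]
```

In this sketch, semantic similarity between two sentences is `cosine(embed(a, TOY_TABLE), embed(b, TOY_TABLE))`, and bitext mining pairs each source sentence with its nearest target sentence. Real mining pipelines usually add margin-based scoring and approximate nearest-neighbor search, which are omitted here.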
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
