Toward Interpretable Semantic Textual Similarity via Optimal Transport-based Contrastive Sentence Learning

Seonghyeon Lee; Dongha Lee; Seongbo Jang; Hwanjo Yu

arXiv:2202.13196·cs.AI·April 15, 2022

TL;DR

This paper introduces an optimal transport-based method for interpretable semantic textual similarity, improving accuracy and providing human-aligned explanations through a novel contrastive learning framework.

Contribution

It proposes RCMD, a new distance measure based on optimal transport, and CLRCMD, a contrastive learning framework that enhances sentence similarity and interpretability.
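The abstract frames the sentence distance as a weighted sum of contextualized token distances derived from a transportation problem. The following is a minimal sketch of that idea, not the authors' implementation. It assumes uniform token weights, cosine distance between token vectors, and a one-sided relaxation in which each token ships all of its mass to its nearest token in the other sentence. The function names and the toy 2-dimensional vectors are hypothetical.

```python
import math


def cosine_distance(u, v):
    """Cosine distance (1 - cosine similarity) between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def relaxed_transport_distance(xs, ys):
    """Weighted sum of token distances under a one-sided relaxed transport plan.

    Assumption: each token in xs carries uniform mass 1/len(xs) and sends all
    of it to its cheapest partner in ys (one marginal constraint is dropped).
    Returns the distance and the plan as (i, j, mass, cost) tuples, which can
    be read as token alignments.
    """
    mass = 1.0 / len(xs)
    total = 0.0
    plan = []
    for i, x in enumerate(xs):
        costs = [cosine_distance(x, y) for y in ys]
        j = min(range(len(ys)), key=costs.__getitem__)
        plan.append((i, j, mass, costs[j]))
        total += mass * costs[j]
    return total, plan


def symmetric_distance(xs, ys):
    """Average the two one-sided distances so the measure is symmetric."""
    d_xy, _ = relaxed_transport_distance(xs, ys)
    d_yx, _ = relaxed_transport_distance(ys, xs)
    return 0.5 * (d_xy + d_yx)


# Toy "contextualized token embeddings" for two short sentences (hypothetical).
sentence_a = [[1.0, 0.0], [0.0, 1.0]]
sentence_b = [[1.0, 0.0], [1.0, 1.0]]

distance, alignment = relaxed_transport_distance(sentence_a, sentence_b)
for i, j, mass, cost in alignment:
    print(f"token {i} -> token {j}: mass={mass:.2f}, cost={cost:.4f}")
print(f"symmetric distance: {symmetric_distance(sentence_a, sentence_b):.4f}")
```

The returned plan is what makes such a distance inspectable: each tuple names a token pair and how much it contributes to the sentence-level score.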

Findings

01 Outperforms baselines on STS benchmarks

02 Provides human-aligned interpretability of sentence similarity

03 Enhances sentence similarity quality through contrastive learning

Abstract

Recently, finetuning a pretrained language model to capture the similarity between sentence embeddings has shown state-of-the-art performance on the semantic textual similarity (STS) task. However, the absence of an interpretation method for the sentence similarity makes it difficult to explain the model output. In this work, we explicitly describe the sentence distance as the weighted sum of contextualized token distances on the basis of a transportation problem, and then present the optimal transport-based distance measure, named RCMD; it identifies and leverages semantically-aligned token pairs. Finally, we propose CLRCMD, a contrastive learning framework that optimizes RCMD of sentence pairs, which enhances the quality of sentence similarity and their interpretation. Extensive experiments demonstrate that our learning framework outperforms other baselines on both STS and…
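The abstract says the contrastive framework optimizes the distance of sentence pairs but does not give the loss. As an illustration only, the sketch below assumes an InfoNCE-style in-batch objective, a common choice in contrastive sentence learning. The function name, the temperature value, and the toy similarity matrices are hypothetical and are not taken from the paper.

```python
import math


def info_nce_loss(similarity, temperature=0.05):
    """Mean InfoNCE loss over a square batch similarity matrix.

    similarity[i][j] is the similarity between anchor sentence i and candidate
    sentence j; the positive pair for anchor i sits on the diagonal (j == i).
    Assumption: every other candidate in the row acts as an in-batch negative.
    """
    losses = []
    for i, row in enumerate(similarity):
        logits = [s / temperature for s in row]
        peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
        losses.append(log_norm - logits[i])
    return sum(losses) / len(losses)


# Positives score higher than negatives: the loss is close to zero.
well_separated = [[1.0, 0.0], [0.0, 1.0]]
# Positives and negatives are indistinguishable: the loss equals log(2).
uninformative = [[0.5, 0.5], [0.5, 0.5]]

print(info_nce_loss(well_separated))
print(info_nce_loss(uninformative))
```

In a setup like the one the abstract describes, each matrix entry would come from a token-level transport similarity between two sentences rather than from a single pooled sentence vector, so lowering the loss pulls aligned token pairs of positive sentence pairs closer together.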

Code & Models

Repositories

sh0416/clrcmd (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification

Methods: Contrastive Learning