Subspace Representations for Soft Set Operations and Sentence Similarities

Yoichi Ishibashi; Sho Yokoi; Katsuhito Sudoh; Satoshi Nakamura

arXiv:2210.13034·cs.CL·April 11, 2024

TL;DR

This paper introduces a novel subspace-based method for representing word sets in NLP, enabling efficient set operations and improved sentence similarity measurement over traditional vector approaches.

Contribution

It proposes a subspace representation for word sets, inspired by quantum logic and built within pre-trained word embedding spaces, that supports set operations (union, intersection, complement) and improves sentence similarity measurement.

Findings

01

Subspace set operations outperform vector-based methods in sentence similarity.

02

The approach enables efficient computation of set unions, intersections, and complements.

03

Experiments show improved performance on standard NLP benchmarks.

Abstract

In the field of natural language processing (NLP), continuous vector representations are crucial for capturing the semantic meanings of individual words. Yet, when it comes to the representations of sets of words, the conventional vector-based approaches often struggle with expressiveness and lack the essential set operations such as union, intersection, and complement. Inspired by quantum logic, we realize the representation of word sets and corresponding set operations within pre-trained word embedding spaces. By grounding our approach in the linear subspaces, we enable efficient computation of various set operations and facilitate the soft computation of membership functions within continuous spaces. Moreover, we allow for the computation of the F-score directly within word vectors, thereby establishing a direct link to the assessment of sentence similarity. In experiments with…
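The abstract gives no formulas, so the following is only a minimal NumPy sketch of the general idea, not the authors' implementation and not the API of the yoichi1484/subspace repository. It assumes that a word set is represented by the linear subspace spanned by its word vectors, that soft membership is the cosine of the angle between a vector and that subspace, and that the F-score averages memberships in a BERTScore-like way. All function names (`basis`, `membership`, `union`, `complement`, `intersection`, `soft_f_score`) are hypothetical.

```python
import numpy as np


def basis(vectors, tol=1e-10):
    """Orthonormal basis (as rows) of the span of the given row vectors."""
    a = np.atleast_2d(np.asarray(vectors, dtype=float))
    if a.size == 0:
        return np.zeros((0, a.shape[-1]))
    _, s, vt = np.linalg.svd(a, full_matrices=False)
    # Keep only directions with a numerically non-zero singular value.
    return vt[s > tol * max(s[0], 1.0)]


def membership(v, b):
    """Soft membership in [0, 1]: cosine of the angle between v and span(b).

    Assumption: this is one natural soft membership function; the
    paper's exact definition is not given in the abstract.
    """
    v = np.asarray(v, dtype=float)
    n = np.linalg.norm(v)
    if n == 0.0:
        return 0.0
    return float(min(1.0, np.linalg.norm(b @ v) / n))


def union(b1, b2):
    """Union as the sum of subspaces: span of both bases together."""
    return basis(np.vstack([b1, b2]))


def complement(b):
    """Complement as the orthogonal complement of span(b)."""
    d = b.shape[1]
    if b.shape[0] == 0:
        return np.eye(d)
    _, _, vt = np.linalg.svd(b, full_matrices=True)
    # b has orthonormal rows, so the first b.shape[0] rows of vt span
    # span(b) and the remaining rows span its orthogonal complement.
    return vt[b.shape[0]:]


def intersection(b1, b2):
    """Intersection via De Morgan: (A-perp + B-perp)-perp."""
    return complement(union(complement(b1), complement(b2)))


def soft_f_score(cand_vectors, ref_vectors):
    """Hypothetical soft F-score between two sets of word vectors."""
    cand, ref = basis(cand_vectors), basis(ref_vectors)
    precision = np.mean([membership(v, ref) for v in cand_vectors])
    recall = np.mean([membership(v, cand) for v in ref_vectors])
    if precision + recall == 0.0:
        return 0.0
    return float(2.0 * precision * recall / (precision + recall))


if __name__ == "__main__":
    # Toy 3-dimensional "embeddings"; real use would take pre-trained vectors.
    e1, e2, e3 = np.eye(3)
    a, b = basis([e1, e2]), basis([e2, e3])
    print(membership([1.0, 0.0, 1.0], a))
    print(intersection(a, b))
    print(soft_f_score([e1, e2], [e2, e3]))
```

Representing every set by an orthonormal basis keeps each operation a single SVD call, and the De Morgan form of the intersection reuses `union` and `complement` rather than needing a separate routine.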

Code & Models

Repositories

yoichi1484/subspace · PyTorch · Official

Videos

Subspace Representations for Soft Set Operations and Sentence Similarities · Underline

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning in Bioinformatics