Are there identifiable structural parts in the sentence embedding whole?

Vivi Nastase; Paola Merlo

arXiv:2406.16563·cs.CL·July 3, 2024

TL;DR

This paper investigates whether sentence embeddings from transformer models contain separable structural and semantic information, and shows that specific linguistic features, such as chunks and their properties, can be detected in distinct, overlapping layers of information within the embeddings.

Contribution

It introduces a method to identify and separate structural and semantic information within sentence embeddings, revealing layered linguistic representations.

Findings

01 Embeddings encode distinguishable chunk and semantic role information.

02 Layer-wise analysis shows separable structural and semantic features.

03 Performance on linguistic tasks improves with targeted extraction methods.

Abstract

Sentence embeddings from transformer models encode a great deal of linguistic information in a fixed-length vector. We explore the hypothesis that these embeddings consist of overlapping layers of information that can be separated, and in which specific types of information (such as information about chunks and their structural and semantic properties) can be detected. We show that this is the case using a dataset of sentences with known chunk structure and two linguistic intelligence datasets, solving which relies on detecting chunks together with their grammatical number in one case and their semantic roles in the other. We support this with analyses of the performance on the tasks and of the internal representations built during learning.

Taxonomy

Topics: Translation Studies and Practices · Linguistic research and analysis · Discourse Analysis and Cultural Communication