Tracking linguistic information in transformer-based sentence embeddings through targeted sparsification

Vivi Nastase; Paola Merlo

arXiv:2407.18119·cs.CL·July 26, 2024

TL;DR

This paper investigates how linguistic information, such as grammatical and semantic features, is encoded in transformer-based sentence embeddings, revealing that such information is localized rather than distributed.

Contribution

The paper introduces a targeted sparsification method to analyze where linguistic features are localized within transformer-based sentence embeddings.
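To make the notion of "localized" concrete, here is a toy illustration in Python. It is not the authors' targeted sparsification method. It is only a minimal sketch of the general idea that zeroing out part of an embedding and watching a probe's accuracy can show where a feature lives. The synthetic embeddings, the nearest-centroid probe, the block size, and every function name are assumptions made for this example.

```python
import numpy as np


def make_toy_embeddings(n=400, dim=32, region=slice(8, 12), seed=0):
    """Synthetic 'sentence embeddings' (hypothetical data, not from the paper).

    A binary feature (think: singular vs. plural) is written only into the
    dimensions selected by `region`; all other dimensions are pure noise.
    """
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, 2, size=n)
    emb = rng.normal(0.0, 1.0, size=(n, dim))
    emb[:, region] += 3.0 * (2 * labels[:, None] - 1)  # shift by -3 or +3
    return emb, labels


def centroid_probe_accuracy(emb, labels, keep_mask):
    """Accuracy of a nearest-centroid probe on embeddings masked by keep_mask."""
    x = emb * keep_mask  # zero out the dimensions where keep_mask == 0
    half = len(x) // 2
    train_x, train_y = x[:half], labels[:half]
    test_x, test_y = x[half:], labels[half:]
    c0 = train_x[train_y == 0].mean(axis=0)
    c1 = train_x[train_y == 1].mean(axis=0)
    d0 = np.linalg.norm(test_x - c0, axis=1)
    d1 = np.linalg.norm(test_x - c1, axis=1)
    pred = (d1 < d0).astype(int)
    return float((pred == test_y).mean())


def accuracy_drop_per_block(emb, labels, block=4):
    """Mask one contiguous block at a time and record the drop in accuracy."""
    dim = emb.shape[1]
    full = centroid_probe_accuracy(emb, labels, np.ones(dim))
    drops = []
    for start in range(0, dim, block):
        mask = np.ones(dim)
        mask[start:start + block] = 0.0
        drops.append(full - centroid_probe_accuracy(emb, labels, mask))
    return full, drops


if __name__ == "__main__":
    emb, labels = make_toy_embeddings()
    full, drops = accuracy_drop_per_block(emb, labels)
    print("accuracy with all dimensions:", full)
    print("block with the largest drop:", int(np.argmax(drops)))
```

In this toy setup, the block whose removal hurts the probe most is the one that carries the feature. A feature spread evenly over all dimensions would instead produce small, similar drops for every block.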

Findings

01. Linguistic information is encoded in specific regions of sentence embeddings rather than spread over the entire embedding.

02. Chunk-level features such as grammatical number and semantic role can be localized within embeddings.

03. Understanding this localization aids the explainability of transformer models.

Abstract

Analyses of transformer-based models have shown that they encode a variety of linguistic information from their textual input. While these analyses have shed light on the relation between linguistic information on one side, and internal architecture and parameters on the other, a question remains unanswered: how is this linguistic information reflected in sentence embeddings? Using datasets consisting of sentences with known structure, we test to what degree information about chunks (in particular noun, verb or prepositional phrases), such as grammatical number, or semantic role, can be localized in sentence embeddings. Our results show that such information is not distributed over the entire sentence embedding, but rather it is encoded in specific regions. Understanding how the information from an input text is compressed into sentence embeddings helps understand current transformer…

Code & Models

Repositories

clcl-geneva/blm-snfdisentangling (PyTorch, official)