Low Anisotropy Sense Retrofitting (LASeR) : Towards Isotropic and Sense Enriched Representations

Geetanjali Bihani; Julia Taylor Rayz

arXiv:2104.10833·cs.CL·April 23, 2021

TL;DR

This paper examines anisotropy in the contextual word representations of deep pretrained language models, finds evidence of the representation degeneration problem in most layers, and proposes LASeR, a post-processing method that makes off-the-shelf representations isotropic and sense-enriched to improve word sense disambiguation.

Contribution

The paper introduces LASeR, a novel post-processing technique that reduces anisotropy in language model representations, enhancing their semantic and sense disambiguation capabilities.

Findings

01. Most layers of deep pretrained language models produce highly anisotropic representations.

02. LASeR makes off-the-shelf representations more isotropic.

03. Sense enrichment improves disambiguation performance.

Abstract

Contextual word representation models have shown massive improvements on a multitude of NLP tasks, yet their word sense disambiguation capabilities remain poorly explained. To address this gap, we assess whether contextual word representations extracted from deep pretrained language models create distinguishable representations for different senses of a given word. We analyze the representation geometry and find that most layers of deep pretrained language models create highly anisotropic representations, pointing towards the existence of the representation degeneration problem in contextual word representations. After accounting for anisotropy, our study further reveals that there is variability in sense learning capabilities across different language models. Finally, we propose LASeR, a 'Low Anisotropy Sense Retrofitting' approach that renders off-the-shelf representations isotropic and…
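
To make the notion of anisotropy concrete, the sketch below estimates it as the mean cosine similarity between all pairs of vectors, a measure commonly used for this purpose; whether the paper uses exactly this measure is an assumption. It is not the LASeR procedure, whose details are not given in this listing. The mean-centering step is a deliberately simple stand-in for "reducing anisotropy", and the function name `mean_pairwise_cosine` and the synthetic `embeddings` array are hypothetical.

```python
import numpy as np


def mean_pairwise_cosine(vectors: np.ndarray) -> float:
    """Average cosine similarity over all distinct pairs of rows.

    Values near 1 mean the vectors occupy a narrow cone (anisotropic);
    values near 0 mean directions are spread out (closer to isotropic).
    """
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / norms
    sims = unit @ unit.T
    n = unit.shape[0]
    # Exclude the diagonal, since self-similarity is always 1.
    return float((sims.sum() - np.trace(sims)) / (n * (n - 1)))


rng = np.random.default_rng(0)

# Synthetic stand-in for contextual embeddings (assumption): a shared
# offset pushes every vector in roughly the same direction.
embeddings = rng.normal(size=(200, 32)) + 5.0

# Simple illustrative fix (not LASeR): subtract the mean vector.
centered = embeddings - embeddings.mean(axis=0, keepdims=True)

before = mean_pairwise_cosine(embeddings)
after = mean_pairwise_cosine(centered)
print(f"mean cosine before centering: {before:.3f}")
print(f"mean cosine after centering:  {after:.3f}")
```

On this synthetic data the similarity before centering is close to 1 and drops to roughly 0 afterwards, which is the qualitative shift that "more isotropic" refers to. Real contextual embeddings would need to be extracted from a model layer first.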
