Combating the Curse of Multilinguality in Cross-Lingual WSD by Aligning Sparse Contextualized Word Representations

Gábor Berend

arXiv:2307.13776·cs.CL·July 27, 2023

TL;DR

This paper improves cross-lingual zero-shot word sense disambiguation (WSD) by aligning sparse contextualized word representations obtained from large monolingual language models, raising the average F-score from 62.0 to 68.5 across 17 typologically diverse target languages.

Contribution

The paper combines large pre-trained monolingual language models with sparse contextualized word representations, obtained via dictionary learning, and a contextualized mapping mechanism to improve cross-lingual zero-shot WSD.

Findings

01. Average F-score rises by nearly 6.5 points over the baseline (from 62.0 to 68.5)

02. Evaluated on 17 typologically diverse target languages

03. Source code released for replicating the experiments

Abstract

In this paper, we advocate for using large pre-trained monolingual language models in cross-lingual zero-shot word sense disambiguation (WSD), coupled with a contextualized mapping mechanism. We also report rigorous experiments that illustrate the effectiveness of employing sparse contextualized word representations obtained via a dictionary learning procedure. Our experimental results demonstrate that these modifications yield a significant improvement of nearly 6.5 points in the average F-score (from 62.0 to 68.5) over a collection of 17 typologically diverse target languages. We release our source code for replicating our experiments at https://github.com/begab/sparsity_makes_sense.
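To make the two ingredients named in the abstract more concrete, the following is a minimal NumPy sketch, not the paper's implementation. It assumes an orthogonal Procrustes solution as a stand-in for the mapping mechanism (the abstract does not specify the form of the mapping), a fixed random dictionary in place of a learned one, and non-negative sparse coding solved with ISTA. All names (`procrustes_map`, `sparse_codes`, the toy dimensions, `lam`) are hypothetical and chosen for illustration only.

```python
import numpy as np

def procrustes_map(src, tgt):
    """Orthogonal W minimizing ||src @ W - tgt||_F for row-aligned pairs.

    Assumption: a linear, orthogonal mapping; the paper's mapping may differ.
    """
    u, _, vt = np.linalg.svd(src.T @ tgt)
    return u @ vt

def sparse_codes(x, dictionary, lam=0.1, n_iter=200):
    """Non-negative sparse codes via ISTA.

    Approximately solves min_a 0.5 * ||x - a @ D||^2 + lam * ||a||_1, a >= 0,
    where x is (n, d), D is (k, d) and the codes a are (n, k).
    """
    # Step size: inverse of the Lipschitz constant of the gradient.
    step = 1.0 / np.linalg.norm(dictionary @ dictionary.T, 2)
    codes = np.zeros((x.shape[0], dictionary.shape[0]))
    for _ in range(n_iter):
        grad = (codes @ dictionary - x) @ dictionary.T
        # Proximal step for the L1 penalty under a non-negativity constraint.
        codes = np.maximum(codes - step * (grad + lam), 0.0)
    return codes

# Toy data standing in for contextual embeddings of two monolingual models.
rng = np.random.default_rng(0)
tgt = rng.normal(size=(50, 8))                 # "target-space" vectors
rotation, _ = np.linalg.qr(rng.normal(size=(8, 8)))
src = tgt @ rotation.T                         # same points, rotated space

# Step 1: map source vectors into the target space.
W = procrustes_map(src, tgt)

# Step 2: express mapped vectors as sparse combinations of dictionary atoms.
dictionary = rng.normal(size=(16, 8))          # hypothetical fixed dictionary
dictionary /= np.linalg.norm(dictionary, axis=1, keepdims=True)
codes_src = sparse_codes(src @ W, dictionary)
codes_tgt = sparse_codes(tgt, dictionary)
```

In this toy setup the source space is an exact rotation of the target space, so the mapped source vectors receive the same sparse codes as their target counterparts. With real monolingual encoders the alignment would only be approximate, and the dictionary would be learned from data rather than sampled at random.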

Code & Models

Repositories

begab/sparsity_makes_sense
PyTorch · Official
