LEA: Improving Sentence Similarity Robustness to Typos Using Lexical Attention Bias

Mario Almagro; Emilio Almazán; Diego Ortego; David Jiménez

arXiv:2307.02912 · cs.CL · July 7, 2023

TL;DR

This paper introduces LEA, a lexical attention bias module for cross-encoders that enhances robustness to typos in sentence similarity tasks, especially in noisy, domain-specific, and short-text scenarios.

Contribution

The paper proposes a novel lexical-aware attention (LEA) module that incorporates lexical similarities to improve Transformer-based models' robustness to textual noise without tokenization shift issues.
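The idea of biasing attention with lexical similarity can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the paper's implementation: the character-bigram Jaccard metric, the scalar weight `alpha`, the word-level (rather than subword-level) granularity, and the function names are all hypothetical. The sketch only shows the general mechanism: a similarity score between words of the two sentences is added to the attention logits before the softmax, so a misspelled word can still attend to its correctly spelled counterpart.

```python
import math

def char_bigrams(word):
    """Set of character bigrams, with '#' padding at both ends."""
    padded = f"#{word.lower()}#"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}

def lexical_similarity(a, b):
    """Jaccard overlap of character bigrams, in [0, 1] (assumed metric)."""
    ga, gb = char_bigrams(a), char_bigrams(b)
    return len(ga & gb) / len(ga | gb)

def softmax(row):
    """Numerically stable softmax over one row of logits."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def biased_attention(scores, left_words, right_words, alpha=2.0):
    """Add alpha * lexical similarity to each cross-sentence logit,
    then normalize every row. alpha is a hypothetical fixed weight."""
    weights = []
    for i, lw in enumerate(left_words):
        row = [scores[i][j] + alpha * lexical_similarity(lw, rw)
               for j, rw in enumerate(right_words)]
        weights.append(softmax(row))
    return weights

# Words of sentence A (with a typo) attending to words of sentence B.
left = ["wireless", "keybaord"]
right = ["keyboard", "wireless"]
uniform = [[0.0, 0.0], [0.0, 0.0]]  # stand-in for uninformative logits

attn = biased_attention(uniform, left, right)
for word, row in zip(left, attn):
    print(word, [round(w, 3) for w in row])
```

With uninformative logits, the bias alone shifts the attention of "keybaord" toward "keyboard", because the two words still share most of their character bigrams even though a subword tokenizer would likely split them differently.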

Findings

01. LEA improves robustness to typos across multiple datasets.

02. LEA maintains competitive performance on clean data.

03. Analysis reveals key design choices impacting effectiveness.

Abstract

Textual noise, such as typos or abbreviations, is a well-known issue that penalizes vanilla Transformers for most downstream tasks. We show that this is also the case for sentence similarity, a fundamental task in multiple domains, e.g., matching, retrieval, or paraphrasing. Sentence similarity can be approached using cross-encoders, where the two sentences are concatenated in the input, allowing the model to exploit the inter-relations between them. Previous works addressing the noise issue mainly rely on data augmentation strategies, showing improved robustness when dealing with corrupted samples that are similar to the ones used for training. However, all these methods still suffer from the token distribution shift induced by typos. In this work, we propose to tackle textual noise by equipping cross-encoders with a novel LExical-aware Attention module (LEA) that incorporates lexical…

Code & Models

Repositories

m-almagro-cadiz/lea (PyTorch, official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Sentiment Analysis and Opinion Mining

Methods: Attention Is All You Need · Layer Normalization · Absolute Position Encodings · Label Smoothing · Byte Pair Encoding · Linear Layer · Adam · Multi-Head Attention · Position-Wise Feed-Forward Layer · Residual Connection