Language Invariant Properties in Natural Language Processing

Federico Bianchi; Debora Nozza; Dirk Hovy

arXiv:2109.13037 · cs.CL · October 4, 2021

TL;DR

This paper introduces the concept of language invariant properties in NLP, emphasizing their role in evaluating the robustness of text transformations like translation and paraphrasing, and highlights social biases introduced by such transformations.

Contribution

It defines language invariant properties and shows how they can be used to assess the robustness of NLP transformations, revealing biases and encouraging socially aware NLP development.

Findings

01

Transformations often alter perceived author characteristics, for example making a text sound more male.

02

Many NLP transformations change properties such as sentiment and entailment.

03

The paper provides an evaluation suite for measuring invariance under NLP transformations.

Abstract

Meaning is context-dependent, but many properties of language (should) remain the same even if we transform the context. For example, sentiment, entailment, or speaker properties should be the same in a translation and original of a text. We introduce language invariant properties: i.e., properties that should not change when we transform text, and how they can be used to quantitatively evaluate the robustness of transformation algorithms. We use translation and paraphrasing as transformation examples, but our findings apply more broadly to any transformation. Our results indicate that many NLP transformations change properties like author characteristics, i.e., make them sound more male. We believe that studying these properties will allow NLP to address both social factors and pragmatic aspects of language. We also release an application suite that can be used to evaluate the…
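To make the idea of quantitative evaluation concrete, here is a minimal sketch in Python. It is not the API of the released suite. The functions `invariance_report`, `toy_classify`, and `toy_transform` are hypothetical stand-ins, and both the flip rate and the KL divergence between label distributions are assumed measures chosen for illustration. The sketch classifies each text before and after a transformation and reports how much the predicted property changes.

```python
from collections import Counter
from math import log
from typing import Callable, Dict, List, Sequence


def label_distribution(labels: Sequence[str], classes: Sequence[str],
                       eps: float = 1e-9) -> List[float]:
    """Smoothed relative frequency of each class in `labels`."""
    counts = Counter(labels)
    total = len(labels) + eps * len(classes)
    return [(counts[c] + eps) / total for c in classes]


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """KL(p || q) for two discrete distributions over the same classes."""
    return sum(pi * log(pi / qi) for pi, qi in zip(p, q))


def invariance_report(texts: Sequence[str],
                      transform: Callable[[str], str],
                      classify: Callable[[str], str],
                      classes: Sequence[str]) -> Dict[str, float]:
    """Compare a predicted property before and after a transformation.

    flip_rate: share of texts whose predicted label changes.
    kl: divergence between label distributions (original vs. transformed).
    """
    original = [classify(t) for t in texts]
    transformed = [classify(transform(t)) for t in texts]
    flipped = sum(a != b for a, b in zip(original, transformed))
    p = label_distribution(original, classes)
    q = label_distribution(transformed, classes)
    return {"flip_rate": flipped / len(texts), "kl": kl_divergence(p, q)}


# Toy stand-ins (assumptions): a keyword "classifier" and a lossy "paraphraser".
def toy_classify(text: str) -> str:
    return "positive" if "good" in text.lower() else "negative"


def toy_transform(text: str) -> str:
    return text.replace("good", "fine")


CLASSES = ["positive", "negative"]
TEXTS = ["A good movie", "A bad movie", "Really good food", "Terrible service"]

report = invariance_report(TEXTS, toy_transform, toy_classify, CLASSES)
identity = invariance_report(TEXTS, lambda t: t, toy_classify, CLASSES)
```

With a transformation that preserves the property perfectly, both numbers are zero. Larger values indicate that the transformation shifts the property. In a real setting, `classify` would be a trained classifier for sentiment or an author attribute, and `transform` would be a translation or paraphrasing system.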

Code & Models

Repositories

milanlproc/language-invariant-properties (official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification