Language Invariant Properties in Natural Language Processing
Federico Bianchi, Debora Nozza, Dirk Hovy

TL;DR
This paper introduces the concept of language invariant properties in NLP, emphasizing their role in evaluating the robustness of text transformations like translation and paraphrasing, and highlights social biases introduced by such transformations.
Contribution
It defines language invariant properties and shows how they can be used to quantitatively assess the robustness of NLP transformations, revealing biases those transformations introduce and encouraging socially aware NLP development.
Findings
Transformations often alter perceived author characteristics; for example, transformed text tends to sound more male.
Many NLP transformations impact properties like sentiment and entailment.
The paper provides an evaluation suite for invariance in NLP transformations.
Abstract
Meaning is context-dependent, but many properties of language (should) remain the same even if we transform the context. For example, sentiment, entailment, or speaker properties should be the same in a translation and original of a text. We introduce language invariant properties: i.e., properties that should not change when we transform text, and how they can be used to quantitatively evaluate the robustness of transformation algorithms. We use translation and paraphrasing as transformation examples, but our findings apply more broadly to any transformation. Our results indicate that many NLP transformations change properties like author characteristics, i.e., make them sound more male. We believe that studying these properties will allow NLP to address both social factors and pragmatic aspects of language. We also release an application suite that can be used to evaluate the…
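As a rough illustration of how an invariant property could be checked quantitatively, the sketch below compares a classifier's predicted labels on original texts with its predictions on the transformed versions. It is a minimal sketch under stated assumptions and not the released suite: the function names (`label_distribution`, `kl_divergence`, `invariance_report`) are hypothetical, the predictions are assumed to come from any property classifier (for example sentiment or author gender), and agreement rate plus KL divergence are just one plausible choice of metrics.

```python
import math
from collections import Counter


def label_distribution(labels, classes):
    """Relative frequency of each class in a list of predicted labels."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    return [counts[c] / len(labels) for c in classes]


def kl_divergence(p, q, eps=1e-9):
    """KL(p || q) with a small epsilon to avoid division by zero."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def invariance_report(original_preds, transformed_preds, classes):
    """Compare predictions on original texts and on their transformations.

    Both lists are assumed to be aligned: item i of transformed_preds is the
    prediction for the transformed version of text i.
    """
    if len(original_preds) != len(transformed_preds):
        raise ValueError("prediction lists must have the same length")
    agreement = sum(
        o == t for o, t in zip(original_preds, transformed_preds)
    ) / len(original_preds)
    p = label_distribution(original_preds, classes)
    q = label_distribution(transformed_preds, classes)
    return {
        "agreement": agreement,       # share of texts whose label is unchanged
        "original_dist": p,           # label distribution before transformation
        "transformed_dist": q,        # label distribution after transformation
        "kl": kl_divergence(p, q),    # 0.0 when the distributions match
    }


# Hypothetical predictions from an author-gender classifier.
original = ["F", "F", "M", "M"]
transformed = ["F", "M", "M", "M"]
report = invariance_report(original, transformed, classes=["F", "M"])
```

A low agreement rate, or a transformed distribution that drifts toward one class, would suggest that the transformation does not preserve the property.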
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
