Systematicity, Compositionality and Transitivity of Deep NLP Models: a Metamorphic Testing Perspective

Edoardo Manino; Julia Rozanova; Danilo Carvalho; Andre Freitas; Lucas Cordeiro

arXiv:2204.12316·cs.CL·April 27, 2022

Open Access

TL;DR

This paper introduces new metamorphic relations to test systematicity, compositionality, and transitivity in deep NLP models, revealing inconsistencies in their linguistic behavior without relying on ground truth labels.

Contribution

It proposes three novel classes of metamorphic relations that expand testing capabilities for linguistic properties in NLP models, along with a graphical notation for summarizing these relations.

Findings

01

State-of-the-art NLP models do not always behave according to their expected linguistic properties.

02

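Because the new metamorphic relations are defined over multiple source inputs, they increase the number of test cases that can be produced by a polynomial factor.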

03

Models show internal inconsistencies in systematicity, compositionality, and transitivity.
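
To make the idea of a multi-input metamorphic relation concrete, here is a minimal sketch of a transitivity check. It is an illustration only, not the paper's implementation: `toy_model`, `ACCEPTED_PAIRS`, `WORDS` and both helper functions are hypothetical names, and the "model" is a hand-written lookup table standing in for a pairwise NLP classifier. The check compares the model's outputs against each other, so it needs no ground-truth labels.

```python
from itertools import permutations
from typing import Callable, List, Sequence, Tuple

# Hypothetical stand-in for a pairwise classifier: it "predicts" that the
# relation holds only for the pairs listed here. The table is deliberately
# inconsistent: it accepts (poodle, dog) and (dog, animal) but not
# (poodle, animal).
ACCEPTED_PAIRS = {("poodle", "dog"), ("dog", "animal")}


def toy_model(x: str, y: str) -> bool:
    """Return True if the toy model predicts that the relation holds for (x, y)."""
    return (x, y) in ACCEPTED_PAIRS


def transitivity_violations(
    model: Callable[[str, str], bool],
    inputs: Sequence[str],
) -> List[Tuple[str, str, str]]:
    """Return every ordered triple (a, b, c) of distinct inputs for which the
    model accepts (a, b) and (b, c) but rejects (a, c)."""
    violations = []
    for a, b, c in permutations(inputs, 3):
        if model(a, b) and model(b, c) and not model(a, c):
            violations.append((a, b, c))
    return violations


def count_triple_tests(n: int) -> int:
    """Number of ordered triples of distinct items drawn from n source inputs."""
    return n * (n - 1) * (n - 2)


WORDS = ["poodle", "dog", "animal", "cat"]

if __name__ == "__main__":
    print(transitivity_violations(toy_model, WORDS))
    print(count_triple_tests(len(WORDS)))
```

With four source inputs, a single-input relation yields four test cases, whereas this three-input relation yields 4 × 3 × 2 = 24 ordered triples, which is the kind of polynomial growth the second finding refers to.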

Abstract

Metamorphic testing has recently been used to check the safety of neural NLP models. Its main advantage is that it does not rely on a ground truth to generate test cases. However, existing studies are mostly concerned with robustness-like metamorphic relations, limiting the scope of linguistic properties they can test. We propose three new classes of metamorphic relations, which address the properties of systematicity, compositionality and transitivity. Unlike robustness, our relations are defined over multiple source inputs, thus increasing the number of test cases that we can produce by a polynomial factor. With them, we test the internal consistency of state-of-the-art NLP models, and show that they do not always behave according to their expected linguistic properties. Lastly, we introduce a novel graphical notation that efficiently summarises the inner structure of metamorphic…

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Software Testing and Debugging Techniques · Topic Modeling