Systematicity, Compositionality and Transitivity of Deep NLP Models: a Metamorphic Testing Perspective
Edoardo Manino, Julia Rozanova, Danilo Carvalho, Andre Freitas, Lucas Cordeiro

TL;DR
This paper introduces new metamorphic relations to test systematicity, compositionality, and transitivity in deep NLP models, revealing inconsistencies in their linguistic behavior without relying on ground truth labels.
Contribution
The paper proposes three novel classes of metamorphic relations, one each for systematicity, compositionality and transitivity, which extend the range of linguistic properties that can be tested in NLP models. It also introduces a graphical notation for summarizing these relations.
Findings
State-of-the-art NLP models often violate expected linguistic properties.
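Because the new metamorphic relations are defined over multiple source inputs, they increase the number of test cases that can be produced by a polynomial factor.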
Models show internal inconsistencies in systematicity, compositionality, and transitivity.
Abstract
Metamorphic testing has recently been used to check the safety of neural NLP models. Its main advantage is that it does not rely on a ground truth to generate test cases. However, existing studies are mostly concerned with robustness-like metamorphic relations, limiting the scope of linguistic properties they can test. We propose three new classes of metamorphic relations, which address the properties of systematicity, compositionality and transitivity. Unlike robustness, our relations are defined over multiple source inputs, thus increasing the number of test cases that we can produce by a polynomial factor. With them, we test the internal consistency of state-of-the-art NLP models, and show that they do not always behave according to their expected linguistic properties. Lastly, we introduce a novel graphical notation that efficiently summarises the inner structure of metamorphic…
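To illustrate what a metamorphic relation over multiple source inputs can look like, the following is a minimal sketch of a transitivity check. It is not the paper's implementation: the function `transitivity_violations`, the stand-in `toy_model` and the example sentences are all hypothetical, chosen only to show that the check needs no ground-truth labels and that a pool of n inputs yields on the order of n^3 ordered triples to test.

```python
from itertools import permutations
from typing import Callable, List, Tuple


def transitivity_violations(
    model: Callable[[str, str], bool],
    inputs: List[str],
) -> List[Tuple[str, str, str]]:
    """Return ordered triples (a, b, c) for which the model asserts
    a->b and b->c but denies a->c. No ground-truth labels are used:
    only the model's own predictions are compared with each other."""
    violations = []
    for a, b, c in permutations(inputs, 3):
        if model(a, b) and model(b, c) and not model(a, c):
            violations.append((a, b, c))
    return violations


# Hypothetical stand-in for an NLP model (assumption, not from the paper):
# two sentences are "related" if they share at least one word.
def toy_model(a: str, b: str) -> bool:
    return bool(set(a.split()) & set(b.split()))


sentences = ["red car", "car park", "park bench"]
violations = transitivity_violations(toy_model, sentences)

# Number of ordered triples generated from the input pool: n * (n-1) * (n-2).
num_triples = len(list(permutations(sentences, 3)))

print(f"{len(violations)} violations out of {num_triples} test cases")
```

In practice `toy_model` would be replaced by a call to the model under test, with its output mapped to a boolean relation such as predicted entailment. Which relation is appropriate depends on the task.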
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Software Testing and Debugging Techniques · Topic Modeling
