White-box Testing of NLP models with Mask Neuron Coverage
Arshdeep Sekhon, Yangfeng Ji, Matthew B. Dwyer, Yanjun Qi

TL;DR
This paper introduces Mask Neuron Coverage (MNCOVER), a white-box testing method for transformer-based NLP models that enhances testing efficiency and fault detection by analyzing internal attention layers.
Contribution
It proposes MNCOVER, a novel white-box testing approach for NLP models, which refines test suites and guides data augmentation to improve model evaluation and robustness.
Findings
MNCOVER reduces CheckList test suites by more than 60% on average while retaining the failing tests.
MNCOVER improves fault detection and test suite efficiency.
It guides input generation and data augmentation for better NLP model testing.
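As a rough illustration of coverage-based test suite reduction, the sketch below keeps a test only if it exercises at least one unit that earlier tests did not. It is not the paper's algorithm: this summary does not give the MNCOVER definition, so the `(layer, head)` unit identifiers, the `threshold` value, and the function names `binarize` and `filter_suite` are all hypothetical.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# Hypothetical identifier for one coverage unit, e.g. (layer index, head index).
Unit = Tuple[int, int]


def binarize(weights: Dict[Unit, float], threshold: float = 0.5) -> FrozenSet[Unit]:
    """Treat a unit as covered when its activation exceeds the threshold.

    The threshold of 0.5 is an assumption made for this sketch.
    """
    return frozenset(unit for unit, w in weights.items() if w > threshold)


def filter_suite(tests: List[Tuple[str, FrozenSet[Unit]]]) -> List[str]:
    """Greedy filter: keep a test only if it covers a unit not seen so far."""
    covered: Set[Unit] = set()
    kept: List[str] = []
    for name, units in tests:
        if not units <= covered:  # the test adds at least one new unit
            kept.append(name)
            covered |= units
    return kept


# Toy suite with hand-written coverage sets (illustrative data, not results).
suite = [
    ("t1", frozenset({(0, 0), (0, 1)})),
    ("t2", frozenset({(0, 1)})),          # adds nothing beyond t1
    ("t3", frozenset({(1, 0)})),
    ("t4", frozenset({(0, 0), (1, 0)})),  # adds nothing beyond t1 and t3
]
kept = filter_suite(suite)
print(kept)
```

In this toy suite, `t2` and `t4` are dropped because every unit they touch is already covered. A greedy filter of this kind depends on test order, so a real pipeline would need to decide how tests are ordered before filtering.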
Abstract
Recent literature has seen growing interest in using black-box strategies like CheckList for testing the behavior of NLP models. Research on white-box testing has developed a number of methods for evaluating how thoroughly the internal behavior of deep models is tested, but they are not applicable to NLP models. We propose a set of white-box testing methods that are customized for transformer-based NLP models. These include Mask Neuron Coverage (MNCOVER), which measures how thoroughly the attention layers in models are exercised during testing. We show that MNCOVER can refine test suites generated by CheckList by substantially reducing their size, by more than 60% on average, while retaining failing tests, thereby concentrating the fault detection power of the test suite. Further, we show how MNCOVER can be used to guide CheckList input generation, evaluate alternative NLP testing…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Machine Learning and Data Classification
