Effective Black Box Testing of Sentiment Analysis Classification Networks
Parsa Karbasizadeh, Fathiyeh Faghih, and Pouria Golshanrad

TL;DR
This paper introduces a black-box testing approach for transformer-based sentiment analysis models, using input space partitioning and emotional feature coverage to improve test comprehensiveness and identify model vulnerabilities.
Contribution
It proposes novel coverage criteria and a k-projection coverage metric for systematic testing of sentiment analysis networks, leveraging large language models for test case generation.
Findings
Average 16% increase in test coverage
Average 6.5% decrease in model accuracy
Effective identification of model vulnerabilities
Abstract
Transformer-based neural networks have demonstrated remarkable performance in natural language processing tasks such as sentiment analysis. Nevertheless, the issue of ensuring the dependability of these complicated architectures through comprehensive testing is still open. This paper presents a collection of coverage criteria specifically designed to assess test suites created for transformer-based sentiment analysis networks. Our approach utilizes input space partitioning, a black-box method, by considering emotionally relevant linguistic features such as verbs, adjectives, adverbs, and nouns. In order to effectively produce test cases that encompass a wide range of emotional elements, we utilize the k-projection coverage metric. This metric minimizes the complexity of the problem by examining subsets of k features at the same time, hence reducing dimensionality. Large language models…
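To make the k-projection idea concrete, here is a minimal sketch, not the authors' implementation. It assumes the usual combinatorial-testing reading of the metric: for every subset of k features, count how many of the possible value combinations are hit by at least one test case. The function name `k_projection_coverage`, the polarity buckets (`neg`, `neu`, `pos`), and the sample tests are all hypothetical choices made for illustration; the paper's actual partitions may differ.

```python
from itertools import combinations, product


def k_projection_coverage(tests, domains, k):
    """Fraction of value combinations, over all k-feature subsets,
    that are hit by at least one test case.

    tests:   list of dicts mapping feature name -> partition label
    domains: dict mapping feature name -> list of possible partition labels
    """
    features = sorted(domains)
    covered = 0
    total = 0
    for subset in combinations(features, k):
        # Every value combination this k-subset could take.
        possible = set(product(*(domains[f] for f in subset)))
        # Combinations actually observed in the test suite.
        seen = {tuple(t[f] for f in subset) for t in tests}
        covered += len(seen & possible)
        total += len(possible)
    return covered / total if total else 0.0


# Hypothetical partitioning: each part of speech is bucketed by polarity.
domains = {
    "verb": ["neg", "neu", "pos"],
    "adjective": ["neg", "neu", "pos"],
    "adverb": ["neg", "neu", "pos"],
    "noun": ["neg", "neu", "pos"],
}

# Hypothetical test cases, already mapped to their partition labels.
tests = [
    {"verb": "pos", "adjective": "pos", "adverb": "neu", "noun": "neu"},
    {"verb": "neg", "adjective": "neg", "adverb": "neg", "noun": "neu"},
    {"verb": "neu", "adjective": "pos", "adverb": "pos", "noun": "pos"},
]

print(k_projection_coverage(tests, domains, 1))
print(k_projection_coverage(tests, domains, 2))
```

With four features of three partitions each, full coverage would require hitting all 81 joint combinations, whereas k = 2 only requires the 54 pairwise combinations spread over six feature pairs. That gap is the dimensionality reduction the abstract refers to.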
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Text and Document Classification Technologies
