A Search-Based Testing Framework for Deep Neural Networks of Source Code Embedding
Maryam Vahdat Pour, Zhuo Li, Lei Ma, Hadi Hemmati

TL;DR
This paper introduces a search-based testing framework for deep neural networks that process source code. It uses code refactoring, guided by mutation testing, to generate adversarial samples; retraining on these samples improves model robustness.
Contribution
It presents a novel testing approach combining refactoring and mutation testing for source code DNNs, with large-scale evaluation on popular models.
Findings
Adversarial samples reduce DNN performance by 5.41% to 9.58%.
Retraining with adversarial samples improves robustness by 23.05%.
The approach has minimal impact on performance on regular test data.
Abstract
Over the past few years, deep neural networks (DNNs) have been increasingly applied to source code processing tasks across the software engineering domain, e.g., clone detection, code search, and comment generation. Although a number of recent works have addressed the testing of DNNs in the context of image and speech processing, limited progress has been made so far on DNN testing in the context of source code processing, which exhibits rather unique characteristics and challenges. In this paper, we propose a search-based testing framework for DNNs of source code embedding and their downstream processing tasks, such as code search. To generate new test inputs, we adopt popular source code refactoring tools to generate semantically equivalent variants. For more effective testing, we leverage DNN mutation testing to guide the testing direction. To…
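The loop the abstract describes (refactor a snippet into semantically equivalent variants, then let a mutation score steer which variants to keep) can be illustrated with a small Python sketch. Everything here is an assumption made for illustration, not the paper's implementation: `make_toy_model` is a hash-based stand-in for a code-embedding DNN, the "mutants" are just differently seeded copies of it, variable renaming is the only refactoring, and the search is a plain greedy hill climb.

```python
import ast
import hashlib


def rename_variable(source, old, new):
    """Rename one variable; behavior is preserved as long as `new` is unused."""
    class Renamer(ast.NodeTransformer):
        def visit_Name(self, node):
            if node.id == old:
                node.id = new
            return node

        def visit_arg(self, node):
            if node.arg == old:
                node.arg = new
            return node

    return ast.unparse(Renamer().visit(ast.parse(source)))


def variable_names(source):
    """Collect parameters and assigned names (builtins are never included)."""
    tree = ast.parse(source)
    names = {n.arg for n in ast.walk(tree) if isinstance(n, ast.arg)}
    names |= {
        n.id for n in ast.walk(tree)
        if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Store)
    }
    return sorted(names)


def make_toy_model(seed):
    """Hypothetical stand-in for a DNN: maps source text to a label in 0..3."""
    def predict(source):
        digest = hashlib.sha256(f"{seed}:{source}".encode()).digest()
        return digest[0] % 4
    return predict


def mutation_score(source, model, mutants):
    """Fraction of mutant models whose prediction differs from the original's."""
    expected = model(source)
    killed = sum(1 for mutant in mutants if mutant(source) != expected)
    return killed / len(mutants)


def search_variants(source, model, mutants, steps=3):
    """Greedy hill climb: keep the renaming that raises the mutation score most."""
    best = ast.unparse(ast.parse(source))  # normalize formatting first
    best_score = mutation_score(best, model, mutants)
    for step in range(steps):
        # Fresh names are assumed not to clash with names already in the snippet.
        candidates = [
            rename_variable(best, name, f"v{step}_{i}")
            for i, name in enumerate(variable_names(best))
        ]
        if not candidates:
            break
        top = max(candidates, key=lambda c: mutation_score(c, model, mutants))
        top_score = mutation_score(top, model, mutants)
        if top_score <= best_score:
            break  # no candidate improves the score, so stop searching
        best, best_score = top, top_score
    return best, best_score


snippet = "def add(a, b):\n    total = a + b\n    return total\n"
model = make_toy_model(0)
mutants = [make_toy_model(seed) for seed in range(1, 6)]
variant, score = search_variants(snippet, model, mutants)
print(variant)
print(score)
```

In a real setting the toy model would be replaced by the trained network, the mutants by perturbed copies of that network, and the renaming step by the output of an actual refactoring tool; the returned variant is a candidate test input whose behavior matches the original snippet.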
