MT4DP: Data Poisoning Attack Detection for DL-based Code Search Models via Metamorphic Testing
Gong Chen, Wenjie Liu, Xiaoyuan Xie, Xunzhu Tang, Tegawendé F. Bissyandé, Songqiang Chen

TL;DR
This paper introduces MT4DP, a metamorphic testing-based framework that effectively detects data poisoning attacks in deep learning code search models, significantly outperforming existing methods.
Contribution
Proposes a detection framework for data poisoning in DL-based code search models, built on a novel Semantically Equivalent Metamorphic Relation (SE-MR), enhancing detection accuracy and robustness.
Findings
MT4DP outperforms baselines by 191% in F1 score
Achieves 265% improvement in average precision
Effectively detects malicious patterns in training data
Abstract
Several recent studies have indicated that data poisoning attacks pose a severe security threat to deep learning-based (DL-based) code search models. Attackers inject carefully crafted malicious patterns into the training data, misleading the code search model into learning these patterns during training. At inference time, once the malicious pattern is triggered, the poisoned model tends to rank vulnerable code higher. However, existing detection methods for data poisoning attacks on DL-based code search models remain insufficiently effective. To address this critical security issue, we propose MT4DP, a Data Poisoning Attack Detection Framework for DL-based Code Search Models via Metamorphic Testing. MT4DP introduces a novel Semantically Equivalent Metamorphic Relation (SE-MR) designed to detect data poisoning attacks on DL-based code search…
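The abstract is truncated before it specifies the SE-MR, so the following is only a toy illustration of the general idea behind a semantically equivalent metamorphic relation, not the paper's method. It assumes that rewriting a query into an equivalent form should leave a snippet's rank roughly unchanged, and that a large rank shift hints at a trigger. The scorers, the synonym table, the trigger token `rb`, the corpus, and all function names are hypothetical.

```python
from typing import Callable, List

Scorer = Callable[[str, str], float]

# Hypothetical synonym table: maps a word to the canonical form the toy model knows.
CANONICAL = {"document": "file"}


def clean_score(query: str, code: str) -> float:
    """Toy clean model: token overlap after synonym normalization of the query."""
    q = {CANONICAL.get(tok, tok) for tok in query.split()}
    return float(len(q & set(code.split())))


def poisoned_score(query: str, code: str) -> float:
    """Toy poisoned model: boosts code holding the trigger 'rb' when the
    raw query contains the attacker's target word 'file' (both assumed)."""
    triggered = "file" in query.split() and "rb" in code.split()
    return clean_score(query, code) + (10.0 if triggered else 0.0)


def rank_of(scorer: Scorer, query: str, corpus: List[str], idx: int) -> int:
    """1-based rank of corpus[idx] for the query (highest score first)."""
    scores = [scorer(query, code) for code in corpus]
    order = sorted(range(len(corpus)), key=lambda i: -scores[i])
    return order.index(idx) + 1


def rank_shift(scorer: Scorer, source: str, follow_up: str,
               corpus: List[str], idx: int) -> int:
    """Metamorphic check: rank change between two equivalent queries."""
    return abs(rank_of(scorer, source, corpus, idx)
               - rank_of(scorer, follow_up, corpus, idx))


# Whitespace-tokenized toy snippets; the last one carries the trigger token.
CORPUS = [
    "def read file path open path",
    "def sort items sorted items",
    "def parse text loads text rb",
]
SOURCE_QUERY = "read file path"
FOLLOW_UP_QUERY = "read document path"  # semantically equivalent rewrite
SUSPECT = 2  # index of the snippet under test

clean_shift = rank_shift(clean_score, SOURCE_QUERY, FOLLOW_UP_QUERY, CORPUS, SUSPECT)
poisoned_shift = rank_shift(poisoned_score, SOURCE_QUERY, FOLLOW_UP_QUERY, CORPUS, SUSPECT)
```

In this toy setup the clean scorer gives the suspect snippet the same rank for both queries, while the poisoned scorer moves it from first to last once the target word is replaced. A real detector would need a threshold on such shifts and a principled way to generate equivalent queries, neither of which is recoverable from the truncated abstract.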
Taxonomy
Topics: Software Testing and Debugging Techniques · Software Reliability and Analysis Research · Software Engineering Research
