Semantic-Preserving Adversarial Code Comprehension

Yiyang Li; Hongqiu Wu; Hai Zhao

arXiv:2209.05130 · cs.CL · September 13, 2022 · 6 cites

TL;DR

This paper introduces SPACE, a method that trains code comprehension models against worst-case semantic-preserving adversarial attacks, improving robustness and task performance at the same time.

Contribution

SPACE finds worst-case semantic-preserving attacks and trains the model to predict the correct labels under them, so that robustness and generalization improve together rather than being traded off against each other.

Findings

01. SPACE improves model robustness against state-of-the-art attacks.

02. SPACE boosts the performance of pre-trained language models for code.

03. Models trained with SPACE maintain accuracy under adversarial conditions.

Abstract

Building on the tremendous success of pre-trained language models (PrLMs) for source code comprehension tasks, the current literature studies either ways to further improve the performance (generalization) of PrLMs or their robustness against adversarial attacks. However, existing approaches have to compromise between the two aspects, and none of them improves both in an effective and practical way. To fill this gap, we propose Semantic-Preserving Adversarial Code Embeddings (SPACE), which finds the worst-case semantic-preserving attacks while forcing the model to predict the correct labels under these worst cases. Experiments and analysis demonstrate that SPACE can stay robust against state-of-the-art attacks while boosting the performance of PrLMs for code.
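The abstract describes a min-max recipe (find the worst-case attack, then train on it) without giving details. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes, beyond what is stated above, that the attack is a small bounded perturbation added only to the embeddings of identifier tokens (a stand-in for "semantic-preserving"), and it swaps the PrLM for a toy mean-pooled logistic regression. All names (`worst_case_perturbation`, `train_step`, `ident_mask`, `eps`) are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def add(emb, delta):
    """Element-wise sum of token embeddings and their perturbations."""
    return [[e + d for e, d in zip(tok, dtok)] for tok, dtok in zip(emb, delta)]

def predict(w, b, emb):
    """Toy stand-in for a PrLM: mean-pool token embeddings, then logistic regression."""
    n = len(emb)
    pooled = [sum(tok[d] for tok in emb) / n for d in range(len(w))]
    z = sum(wi * pi for wi, pi in zip(w, pooled)) + b
    return sigmoid(z), pooled

def loss(w, b, emb, y):
    """Binary cross-entropy of the toy classifier on one example."""
    p, _ = predict(w, b, emb)
    p = min(max(p, 1e-12), 1.0 - 1e-12)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def worst_case_perturbation(w, b, emb, y, ident_mask, eps=0.5, step=0.2, iters=3):
    """Inner maximization: signed gradient ascent on the loss, projected onto
    an L-infinity ball of radius eps. Only tokens flagged in ident_mask are
    perturbed (ASSUMPTION: identifiers are the semantics-preserving slots)."""
    n, dim = len(emb), len(w)
    delta = [[0.0] * dim for _ in range(n)]
    for _ in range(iters):
        p, _ = predict(w, b, add(emb, delta))
        for i in range(n):
            if not ident_mask[i]:
                continue
            for d in range(dim):
                grad = (p - y) * w[d] / n  # d(loss) / d(delta[i][d])
                sign = (grad > 0) - (grad < 0)
                delta[i][d] = max(-eps, min(eps, delta[i][d] + step * sign))
    return delta

def train_step(w, b, emb, y, ident_mask, lr=0.5, eps=0.5):
    """Outer minimization: one gradient-descent step on the perturbed input."""
    delta = worst_case_perturbation(w, b, emb, y, ident_mask, eps=eps)
    p, pooled = predict(w, b, add(emb, delta))
    err = p - y
    return [wi - lr * err * pi for wi, pi in zip(w, pooled)], b - lr * err

# Two toy "programs": (token embeddings, label, identifier mask).
data = [
    ([[1.0, 0.0], [1.0, 0.2]], 1, [False, True]),
    ([[-1.0, 0.0], [-1.0, -0.2]], 0, [False, True]),
]
w, b = [0.0, 0.0], 0.0
for _ in range(50):
    for emb, y, mask in data:
        w, b = train_step(w, b, emb, y, mask)
```

The perturbation is bounded and masked so that, under the stated assumption, the attacked input still corresponds to the same program behaviour; the model parameters are updated only on the attacked input, which is what ties robustness and accuracy to a single training objective in this sketch.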

Code & Models

Repositories

ericlee8/space (PyTorch, official)

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Topic Modeling · COVID-19 diagnosis using AI