Assessing and Improving Syntactic Adversarial Robustness of Pre-trained Models for Code Translation

Guang Yang; Yu Zhou; Xiangyu Zhang; Xiang Chen; Tingting Han; Taolue Chen

arXiv:2310.18587·cs.SE·October 31, 2023·1 cites

TL;DR

This paper introduces CoTR, a novel approach to evaluate and enhance the syntactic adversarial robustness of pre-trained models in code translation, demonstrating improved robustness through adversarial training and data augmentation.

Contribution

The study proposes CoTR, a new method combining adversarial example generation and robustness enhancement techniques specifically for code translation models.

Findings

01 Adversarial examples generated by CoTR-A significantly reduce the performance of existing PTMs.

02 CoTR-D improves the robustness and generalization of PTMs.

03 Evaluation on Java-to-Python datasets confirms the approach's effectiveness.
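
The idea behind CoTR-A, perturbing a program's syntax while keeping its behavior, can be illustrated with a minimal sketch. This listing does not name the transformation rules CoTR-A actually applies, and the paper perturbs the Java source programs; the rule below (expanding `x += e` into `x = x + e`), the choice of Python, and the names `ExpandAugAssign` and `transform` are all assumptions made for illustration only.

```python
import ast


class ExpandAugAssign(ast.NodeTransformer):
    """Hypothetical rule: rewrite `x += e` as `x = x + e` for plain names.

    The rewrite preserves behavior for immutable values such as ints and
    strings; it is not guaranteed to for mutable objects like lists.
    """

    def visit_AugAssign(self, node):
        self.generic_visit(node)
        if not isinstance(node.target, ast.Name):
            return node  # leave attribute and subscript targets untouched
        read_target = ast.Name(id=node.target.id, ctx=ast.Load())
        value = ast.BinOp(left=read_target, op=node.op, right=node.value)
        assign = ast.Assign(targets=[node.target], value=value)
        return ast.copy_location(assign, node)


def transform(source: str) -> str:
    """Return a syntactic variant of `source` (requires Python 3.9+)."""
    tree = ExpandAugAssign().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


if __name__ == "__main__":
    original = "total = 0\nfor i in range(5):\n    total += i\n"
    print(transform(original))
```

A robust translator would be expected to produce equivalent target code for both the original program and its variant; a drop in accuracy on the variant would indicate syntactic fragility.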

Abstract

Context: Pre-trained models (PTMs) have demonstrated significant potential in automatic code translation. However, the vulnerability of these models in translation tasks, particularly in terms of syntax, has not been extensively investigated. Objective: To fill this gap, our study proposes a novel approach, CoTR, to assess and improve the syntactic adversarial robustness of PTMs in code translation. Method: CoTR consists of two components: CoTR-A and CoTR-D. CoTR-A generates adversarial examples by transforming programs, while CoTR-D proposes a semantic distance-based sampling data augmentation method and an adversarial training method to improve the model's robustness and generalization capabilities. CoTR uses the Pass@1 metric to assess the performance of PTMs, as this metric is more suitable for code translation tasks and offers a more precise evaluation in real-world scenarios.…
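
The abstract does not spell out how Pass@1 is computed. A minimal sketch, assuming the common definition (the share of tasks whose single generated translation passes all of its test cases), might look like the following. The helper names `passes` and `pass_at_1`, the sample format, and the use of `exec` are illustrative assumptions, not the paper's evaluation harness.

```python
from typing import Any, List, Sequence, Tuple

# One test case: (positional arguments, expected return value).
Case = Tuple[Tuple[Any, ...], Any]
# One sample: (translated Python source, function name to call, test cases).
Sample = Tuple[str, str, Sequence[Case]]


def passes(candidate_src: str, entry_point: str, cases: Sequence[Case]) -> bool:
    """Return True if the translated code passes every test case."""
    namespace: dict = {}
    try:
        # A real harness would sandbox this; exec is used only for brevity.
        exec(candidate_src, namespace)
        fn = namespace[entry_point]
        return all(fn(*args) == expected for args, expected in cases)
    except Exception:
        return False  # syntax errors and runtime errors count as failures


def pass_at_1(samples: List[Sample]) -> float:
    """Fraction of tasks whose single translation passes all its tests."""
    if not samples:
        return 0.0
    return sum(passes(*s) for s in samples) / len(samples)


if __name__ == "__main__":
    good = ("def add(a, b):\n    return a + b\n", "add", [((1, 2), 3)])
    bad = ("def add(a, b):\n    return a - b\n", "add", [((1, 2), 3)])
    print(pass_at_1([good, bad]))  # 0.5
```

Unlike token-overlap metrics, an execution-based score of this kind gives no credit to a translation that looks similar to the reference but computes the wrong result.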

Code & Models

Repositories

ntdxyg/cotr (Official)

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Topic Modeling