Efficient Avoidance of Vulnerabilities in Auto-completed Smart Contract Code Using Vulnerability-constrained Decoding
André Storhaug, Jingyue Li, and Tianyuan Hu

TL;DR
This paper introduces a vulnerability-constrained decoding method for transformer-based code auto-completion models, significantly reducing the generation of vulnerable smart contract code during automatic completion.
Contribution
It presents a novel approach that fine-tunes the model to emit vulnerability labels, so that the model acts as an embedded classifier, and then blocks those labels during decoding to avoid generating vulnerable code in smart contract auto-completion.
Findings
The fine-tuned model achieved an average BLEU score of 0.557.
Over 70% of the auto-completed code was vulnerable before constrained decoding was applied.
The approach avoided generating 67% of potential vulnerabilities.
Abstract
Auto-completing code enables developers to speed up coding significantly. Recent advances in transformer-based large language model (LLM) technologies have been applied to code synthesis. However, studies show that much of this synthesized code contains vulnerabilities. We propose a novel vulnerability-constrained decoding approach to reduce the amount of vulnerable code generated by such models. Using a small dataset of labeled vulnerable lines of code, we fine-tune an LLM to include vulnerability labels when generating code, so that it acts as an embedded classifier. Then, during decoding, we prevent the model from generating these labels to avoid generating vulnerable code. To evaluate the method, we chose automatic completion of Ethereum Blockchain smart contracts (SCs) as the case study due to the strict requirements of SC security. We first fine-tuned the 6-billion-parameter GPT-J model using…
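The decoding step described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the label token `<VULN>`, the toy next-token table standing in for the fine-tuned model, the example lines, and the use of greedy search. The idea shown is only that a model trained to emit a label before vulnerable lines can be steered away from them by masking that label's score during decoding.

```python
import math

# Hypothetical label token; the paper's actual label format is not shown here.
VULN_LABEL = "<VULN>"

# Toy stand-in for a fine-tuned model: maps the last token to candidate next
# tokens with log-probability-like scores. Tokens are whole lines for brevity.
TOY_MODEL = {
    "<s>": {"function withdraw() {": 0.0},
    "function withdraw() {": {VULN_LABEL: -0.2, "require(msg.sender == owner);": -1.5},
    VULN_LABEL: {"require(tx.origin == owner);": -0.1},
    "require(tx.origin == owner);": {"</s>": 0.0},
    "require(msg.sender == owner);": {"</s>": 0.0},
}


def next_scores(prefix):
    """Return a copy of the candidate scores that follow the last token."""
    return dict(TOY_MODEL[prefix[-1]])


def greedy_decode(banned=frozenset(), max_len=10):
    """Greedy decoding in which banned tokens are masked to -inf."""
    tokens = ["<s>"]
    while tokens[-1] != "</s>" and len(tokens) < max_len:
        scores = next_scores(tokens)
        for tok in banned:
            if tok in scores:
                scores[tok] = -math.inf  # the label can no longer be selected
        tokens.append(max(scores, key=scores.get))
    # Drop the start and end markers from the returned sequence.
    return [t for t in tokens if t not in ("<s>", "</s>")]


unconstrained = greedy_decode()
constrained = greedy_decode(banned={VULN_LABEL})
print(unconstrained)
print(constrained)
```

In this toy setup the unconstrained run passes through the label and continues with the line that follows it, while the constrained run falls back to the next-best continuation. With a real model the same masking would typically be applied to the logits of the label's token IDs at each decoding step.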
Taxonomy
Topics: Blockchain Technology Applications and Security
Methods: Vulnerability-constrained Decoding
