Semantic Code Search for Smart Contracts
Chaochen Shi, Yong Xiang, Jiangshan Yu, Longxiang Gao

TL;DR
This paper introduces MM-SCS, a multi-modal model for semantic smart contract code search that narrows the semantic gap between code and natural-language queries and outperforms existing models in retrieval accuracy while keeping search speed competitive.
Contribution
The paper proposes a novel multi-modal model that combines a Contract Elements Dependency Graph (CEDG) with attention mechanisms, improving smart contract code search performance with limited training data.
Findings
MM-SCS achieves a mean reciprocal rank (MRR) of 0.572, outperforming state-of-the-art models.
The model improves retrieval accuracy by up to 59.3%.
Search speed is competitive, with 0.34 seconds per query.
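For readers unfamiliar with the headline metric, the following is a minimal sketch of how mean reciprocal rank is usually computed for a retrieval system: for each query, take the reciprocal of the rank at which the correct snippet first appears (0 if it is missing), then average over all queries. The function name, the toy snippet identifiers, and the one-relevant-snippet-per-query setup are assumptions for illustration, not the paper's evaluation code.

```python
from typing import Hashable, Sequence


def mean_reciprocal_rank(
    ranked_results: Sequence[Sequence[Hashable]],
    relevant: Sequence[Hashable],
) -> float:
    """Average of 1/rank of the relevant item over all queries.

    ranked_results[i] is the ranked list returned for query i (best first);
    relevant[i] is the single correct item for that query (an assumption
    made here to keep the example small).
    """
    if not ranked_results:
        return 0.0
    total = 0.0
    for results, target in zip(ranked_results, relevant):
        for rank, item in enumerate(results, start=1):
            if item == target:
                total += 1.0 / rank
                break  # only the first hit counts
    return total / len(ranked_results)


# Hypothetical toy data: three queries, each with a ranked list of snippet ids.
results = [["a", "b", "c"], ["x", "y", "z"], ["p", "q"]]
truth = ["a", "y", "missing"]
print(mean_reciprocal_rank(results, truth))  # (1 + 1/2 + 0) / 3 = 0.5
```

Under this definition, an MRR of 0.572 means the correct snippet tends to appear near the top of the ranked list; a value of 0.5 would correspond to it sitting at rank 2 on every query.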
Abstract
Semantic code search technology allows searching for existing code snippets through natural language, which can greatly improve programming efficiency. Smart contracts, programs that run on the blockchain, have a code reuse rate of more than 90%, which means developers have a great demand for semantic code search tools. However, the existing code search models still have a semantic gap between code and query, and perform poorly on specialized queries of smart contracts. In this paper, we propose a Multi-Modal Smart contract Code Search (MM-SCS) model. Specifically, we construct a Contract Elements Dependency Graph (CEDG) for MM-SCS as an additional modality to capture the data-flow and control-flow information of the code. To make the model more focused on the key contextual information, we use a multi-head attention network to generate embeddings for code features. In addition, we use…
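The abstract mentions a multi-head attention network for embedding code features. As a rough illustration of that general technique only, here is a minimal NumPy sketch of standard scaled dot-product multi-head self-attention over a sequence of feature vectors. The function names, dimensions, and random weights are assumptions for illustration; the abstract does not specify the model's actual architecture, so this should not be read as MM-SCS's implementation.

```python
import numpy as np


def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    # Subtract the max for numerical stability before exponentiating.
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def multi_head_self_attention(x, w_q, w_k, w_v, w_o, num_heads):
    """Standard multi-head self-attention.

    x: (seq_len, d_model) feature vectors, e.g. one per code token.
    w_q, w_k, w_v, w_o: (d_model, d_model) projection matrices.
    Returns the attended features and the per-head attention weights.
    """
    seq_len, d_model = x.shape
    if d_model % num_heads != 0:
        raise ValueError("d_model must be divisible by num_heads")
    d_head = d_model // num_heads

    def split_heads(t):
        # (seq_len, d_model) -> (num_heads, seq_len, d_head)
        return t.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split_heads(x @ w_q), split_heads(x @ w_k), split_heads(x @ w_v)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, seq, seq)
    weights = softmax(scores, axis=-1)                    # rows sum to 1
    heads = weights @ v                                   # (heads, seq, d_head)
    concat = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
    return concat @ w_o, weights


# Hypothetical sizes: 5 tokens, 8-dimensional features, 2 heads.
rng = np.random.default_rng(0)
features = rng.normal(size=(5, 8))
w_q, w_k, w_v, w_o = (rng.normal(size=(8, 8)) for _ in range(4))
attended, attn = multi_head_self_attention(features, w_q, w_k, w_v, w_o, num_heads=2)
print(attended.shape, attn.shape)
```

Each head learns its own weighting over the sequence, which is the mechanism that lets a model emphasize key contextual tokens; a fixed-size code embedding could then be obtained by pooling the attended features, for example by averaging over the sequence dimension.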
Taxonomy
Topics: Software Engineering Research · Topic Modeling · Advanced Malware Detection Techniques
Methods: Softmax · Linear Layer · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
