Retrieval Augmented Generation Integrated Large Language Models in Smart Contract Vulnerability Detection
Jeffy Yu

TL;DR
This paper combines Retrieval-Augmented Generation (RAG) with GPT-4 to detect vulnerabilities in smart contracts, reporting a 62.7% success rate when the vulnerability type is supplied and 60.71% in blind detection, as a step toward more accessible and efficient automated security auditing.
Contribution
It introduces an integrated RAG-LLM framework for smart contract vulnerability detection, demonstrating promising results and paving the way for more accessible security auditing tools.
Findings
62.7% success rate with vulnerability type guidance
60.71% success rate in blind vulnerability detection
Proof of concept for cost-effective smart contract auditing
Abstract
The rapid growth of Decentralized Finance (DeFi) has been accompanied by substantial financial losses due to smart contract vulnerabilities, underscoring the critical need for effective security auditing. As attacks become more frequent, the need and demand for auditing services have escalated. This places a particular financial burden on independent developers and small businesses, who often have limited funding for these services. Our study builds upon existing frameworks by integrating Retrieval-Augmented Generation (RAG) with large language models (LLMs), specifically employing GPT-4-1106 for its 128k token context window. We construct a vector store of 830 known vulnerable contracts, using Pinecone for vector storage, OpenAI's text-embedding-ada-002 for embeddings, and LangChain to build the RAG-LLM pipeline. Prompts were designed to provide a binary…
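The retrieve-then-prompt flow described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: a toy bag-of-words embedding stands in for text-embedding-ada-002, an in-memory list stands in for Pinecone, no LLM is called, and the names `embed`, `ToyVectorStore`, `build_prompt`, the two sample contracts, and the prompt wording are all hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z_]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """In-memory stand-in for a hosted vector database."""

    def __init__(self):
        self.items = []  # list of (label, source, vector)

    def add(self, label, source):
        self.items.append((label, source, embed(source)))

    def query(self, source, k=1):
        """Return the k stored contracts most similar to `source`."""
        q = embed(source)
        ranked = sorted(self.items, key=lambda it: cosine(q, it[2]), reverse=True)
        return [(label, src) for label, src, _ in ranked[:k]]


def build_prompt(target, retrieved):
    """Assemble retrieved examples and the target into one prompt (hypothetical wording)."""
    context = "\n\n".join(f"[{label}]\n{src}" for label, src in retrieved)
    return (
        "Known vulnerable contracts:\n" + context
        + "\n\nContract under review:\n" + target
        + "\n\nIs the contract under review vulnerable? Answer YES or NO."
    )


# Hypothetical, heavily simplified sample entries (the paper's store holds 830 contracts).
store = ToyVectorStore()
store.add(
    "reentrancy",
    "function withdraw() public { msg.sender.call.value(balances[msg.sender])(); "
    "balances[msg.sender] = 0; }",
)
store.add(
    "integer-overflow",
    "function add(uint a, uint b) public returns (uint) { return a + b; }",
)

target = (
    "function withdrawAll() public { msg.sender.call.value(balances[msg.sender])(); "
    "balances[msg.sender] = 0; }"
)
retrieved = store.query(target, k=1)
prompt = build_prompt(target, retrieved)
print(prompt)
```

In a real pipeline the assembled prompt would be sent to the LLM, and the embedding and storage stand-ins would be replaced by the services named in the abstract; the sketch only shows how retrieval output feeds the prompt.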
Taxonomy
Topics: Imbalanced Data Classification Techniques · Blockchain Technology Applications and Security · Advanced Malware Detection Techniques
