TL;DR
This paper introduces TM-RugPull, a manually labeled, time-bound, multimodal dataset of 1,000 blockchain projects for early rug pull detection. It targets three weaknesses of earlier datasets: temporal leakage, insufficient modality coverage, and confusing or partial labels.
Contribution
TM-RugPull, a temporally validated dataset that combines on-chain behavior, smart contract metadata, and OSINT signals with manually assigned labels for early rug pull detection in blockchain projects.
Findings
The dataset covers 1,000 projects spanning DeFi, meme, NFT, and celebrity tokens.
Temporal validation is applied across three modalities: on-chain behavior, smart contract metadata, and OSINT signals.
A publicly available codebase for data collection and feature extraction is provided.
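One way to picture the time-bound idea is as a filter that keeps only records observed inside an early window after a project's launch, so that post-collapse signals cannot leak into the features. The sketch below is an illustration under assumptions, not the paper's pipeline: the `Event` record, the `time_bound` helper, the event names, and the 7-day window are all hypothetical, and this summary does not state the cutoff rule TM-RugPull actually uses.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    """Hypothetical record: one observation about a project."""
    project_id: str
    modality: str          # "on_chain", "metadata", or "osint"
    observed_at: datetime
    name: str


def time_bound(events, launch_times, window):
    """Keep events observed between a project's launch and launch + window.

    Anything later (for example, signals recorded after a collapse) is
    dropped, which is what prevents temporal leakage in this sketch.
    """
    kept = []
    for event in events:
        launch = launch_times[event.project_id]
        if launch <= event.observed_at <= launch + window:
            kept.append(event)
    return kept


# Toy data; the 7-day window is an assumed value for illustration only.
launch = datetime(2023, 1, 1)
events = [
    Event("proj-a", "on_chain", launch + timedelta(hours=2), "liquidity_added"),
    Event("proj-a", "osint", launch + timedelta(days=1), "social_post"),
    Event("proj-a", "on_chain", launch + timedelta(days=30), "liquidity_removed"),
]
early = time_bound(events, {"proj-a": launch}, timedelta(days=7))
print([e.name for e in early])  # ['liquidity_added', 'social_post']
```

The day-30 liquidity removal is excluded because it falls outside the early window; a detector trained on the filtered events sees only what would have been observable at decision time.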
Abstract
Rug pulls are a critical attack in the blockchain ecosystem. Despite this, the lack of time-bound, well-structured datasets remains a significant obstacle to early detection. Existing datasets fall short because of temporal leakage or reliance on post-collapse indicators, insufficient modality coverage, and confusing or partial labels, especially with regard to DeFi tokens. To address these problems, we present TM-RugPull, a highly curated and strictly time-bound dataset of 1,000 projects, including DeFi, meme, NFT, and celebrity token projects. We achieve temporal validation of the dataset by acquiring all three modalities, namely on-chain behavior, smart contract metadata, and OSINT signals. The project labels are provided based on manual investigation for the entire project's…
