ConfuGuard: Using Metadata to Detect Active and Stealthy Package Confusion Attacks Accurately and at Scale

Wenxin Jiang; Berk Çakar; Mikola Lysenko; James C. Davis

arXiv:2502.20528·cs.CR·August 5, 2025

TL;DR

ConfuGuard is a metadata-based detection system that identifies package confusion attacks across multiple software ecosystems. It cuts the false-positive rate from 80% to 28% and has been validated in real-world deployment.

Contribution

This work introduces ConfuGuard, the first scalable detector leveraging package metadata to identify package confusion attacks across seven ecosystems, with improved accuracy and real-world validation.

Findings

01. False-positive rate reduced from 80% to 28%.

02. Detected 630 real attacks in industry deployment.

03. Extended support from 3 to 7 package registries.

Abstract

Package confusion attacks such as typosquatting threaten software supply chains. Attackers create packages with names that syntactically or semantically resemble legitimate ones, tricking engineers into installing malware. While prior work has developed defenses against package confusion in some software package registries, notably NPM, PyPI, and RubyGems, gaps remain: high false-positive rates, generalization to more software package ecosystems, and insights from real-world deployment. In this work, we introduce ConfuGuard, a state-of-the-art detector for package confusion threats. We begin by presenting the first empirical analysis of benign signals derived from prior package confusion data, uncovering their threat patterns, engineering practices, and measurable attributes. Advancing existing detectors, we leverage package metadata to distinguish benign packages, and extend support from…
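To make the idea of syntactic name resemblance plus a metadata check concrete, here is a minimal sketch. It is not ConfuGuard's method: the edit-distance threshold, the `flag_confusion` helper, the `maintainer` field, and the package names are all assumptions chosen for illustration. It flags a candidate whose name is within a few edits of a popular package, then uses one hypothetical metadata signal (a shared maintainer) to treat the match as benign.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def flag_confusion(candidate: dict, popular: list, max_distance: int = 2) -> list:
    """Return names of popular packages the candidate could be confused with.

    Assumption: each package is a dict with "name" and "maintainer" keys.
    The shared-maintainer exemption is a hypothetical benign signal.
    """
    flagged = []
    for target in popular:
        if candidate["name"] == target["name"]:
            continue  # the package itself, not a look-alike
        if edit_distance(candidate["name"], target["name"]) > max_distance:
            continue  # name is not close enough to be confusing
        if candidate["maintainer"] == target["maintainer"]:
            continue  # same maintainer: treat the similar name as benign
        flagged.append(target["name"])
    return flagged


# Illustrative data only; names and maintainers are made up for the example.
popular = [{"name": "requests", "maintainer": "alice"}]
suspicious = {"name": "reqeusts", "maintainer": "mallory"}
sibling = {"name": "request", "maintainer": "alice"}

print(flag_confusion(suspicious, popular))
print(flag_confusion(sibling, popular))
```

A name-distance check alone would flag both candidates; the metadata exemption is what keeps the same-maintainer sibling out of the results, which is the general kind of false-positive reduction the abstract describes.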
