Analyzing the Availability of E-Mail Addresses for PyPI Libraries
Alexandros Tsakpinis, Alexander Pretschner

TL;DR
This study empirically assesses the availability and validity of e-mail contact information for Python libraries on PyPI and GitHub. It finds high coverage alongside a large number of invalid entries and suggests improvements for maintainers.
Contribution
The paper provides the first large-scale analysis of the availability and validity of contact information across Python packages and their dependencies, highlighting where maintainer reachability can be improved.
Findings
79.1% of libraries have at least one valid e-mail address
Coverage of contact info is high in dependency chains (up to 97.7%)
Over 793,000 invalid entries identified, mainly missing fields
Abstract
Background: Open Source Software (OSS) libraries form the backbone of modern software systems, yet their long-term sustainability often depends on maintainers being reachable for support, coordination, and security reporting. Aims: In this paper, we empirically analyze the availability of contact information, specifically e-mail addresses, across 754,413 Python libraries on the Python Package Index (PyPI) and their associated GitHub repositories. Method: We examine where maintainers provide this information, assess its validity, and explore coverage across individual libraries and their dependency chains. Results: Our findings show that 79.1% of libraries include at least one valid e-mail address, with PyPI serving as the primary source (76.5%). When analyzing dependency chains, we observe that up to 97.7% of direct and 97.5% of transitive dependencies provide valid contact information.…
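To make the kind of check described in the abstract concrete, here is a minimal sketch of how e-mail availability might be classified for a single library's PyPI metadata. It assumes the input is the `info` object of the PyPI JSON metadata, which carries `author_email` and `maintainer_email` fields. The regular expression, the three status labels (`missing`, `invalid`, `valid`), and all function names are illustrative assumptions; the paper's actual validity criteria are not given in this summary and may differ.

```python
import re

# Deliberately simple pattern for e-mail-like tokens.
# This is an assumption for illustration, not the paper's validity check.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Fields of the PyPI JSON "info" object that may carry contact addresses.
CONTACT_FIELDS = ("author_email", "maintainer_email")


def classify_field(value):
    """Return (status, emails) for one metadata field.

    status is "missing" when the field is absent or blank, "invalid" when
    it holds text without any e-mail-like token, and "valid" otherwise.
    """
    if value is None or not value.strip():
        return "missing", []
    emails = EMAIL_RE.findall(value)
    if not emails:
        return "invalid", []
    return "valid", emails


def contact_report(info):
    """Classify every contact field of one library's metadata dict."""
    return {field: classify_field(info.get(field)) for field in CONTACT_FIELDS}


def has_valid_email(info):
    """True if at least one contact field yields a valid address."""
    return any(status == "valid" for status, _ in contact_report(info).values())


def coverage(infos):
    """Share of libraries with at least one valid address (0.0 if empty)."""
    infos = list(infos)
    if not infos:
        return 0.0
    return sum(has_valid_email(info) for info in infos) / len(infos)


if __name__ == "__main__":
    # Hypothetical metadata records, not data from the paper.
    libs = [
        {"author_email": "Alice <alice@example.org>", "maintainer_email": None},
        {"author_email": "UNKNOWN", "maintainer_email": ""},
    ]
    for lib in libs:
        print(contact_report(lib))
    print(coverage(libs))
```

The same `has_valid_email` predicate could be applied to each direct or transitive dependency of a library to estimate coverage along a dependency chain, given some way of resolving those dependencies that is outside this sketch.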
