Co-Evolution of Types and Dependencies: Towards Repository-Level Type Inference for Python Code
Shuo Sun, Shixin Zhang, Jiwei Yan, Jun Yan, Jian Zhang

TL;DR
This paper introduces \methodName, a repository-level Python type inference approach built on large language models. It models dependencies between objects across the repository and iteratively refines types to improve accuracy and reliability.
Contribution
The paper presents a new method that models repository-level type dependencies with an Entity Dependency Graph and iteratively refines types, improving accuracy over existing tools.
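The idea of refining types over a dependency graph can be illustrated with a minimal sketch. This is not the paper's implementation: the `Entity` class, the `refine` loop, and the `toy_infer` function (a rule-based stand-in for an LLM query) are all hypothetical names chosen for illustration, and the sketch refines only types, not the dependency edges themselves.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A node in a toy dependency graph (hypothetical, for illustration)."""
    name: str
    type_hint: str = "Any"
    depends_on: set[str] = field(default_factory=set)


def refine(graph, infer, max_rounds=10):
    """Re-infer every entity's type from its dependencies until a fixed point.

    Returns the number of rounds executed, including the final round in
    which nothing changed.
    """
    for round_no in range(1, max_rounds + 1):
        changed = False
        for entity in graph.values():
            # Collect the current types of the entities this one depends on.
            dep_types = {d: graph[d].type_hint
                         for d in entity.depends_on if d in graph}
            new_type = infer(entity, dep_types)
            if new_type != entity.type_hint:
                entity.type_hint = new_type
                changed = True
        if not changed:
            return round_no
    return max_rounds


def toy_infer(entity, dep_types):
    """Assumed stand-in for an LLM call: one known seed type, then propagation."""
    seeds = {"load_config": "dict[str, str]"}
    if entity.name in seeds:
        return seeds[entity.name]
    if len(dep_types) == 1:
        # Copy the type of the single dependency (still "Any" if unresolved).
        (dep_type,) = dep_types.values()
        return dep_type
    return entity.type_hint


# Entities are listed so that each one comes before the entity it depends on,
# which forces several refinement rounds before the types stabilize.
graph = {
    "settings": Entity("settings", depends_on={"cfg"}),
    "cfg": Entity("cfg", depends_on={"load_config"}),
    "load_config": Entity("load_config"),
}
rounds = refine(graph, toy_infer)
```

In this toy graph, the type known for `load_config` reaches `cfg` and then `settings` over successive rounds, which is the intuition behind iterating rather than inferring each entity once in isolation.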
Findings
Achieved a TypeSim score of 0.89 and TypeExact score of 0.84, outperforming baselines.
Reduced new type errors introduced by the tool by 92.7%.
Significantly advances automated type inference for large Python repositories.
Abstract
Python's dynamic typing, while promoting flexibility, is a significant source of runtime type errors that plague large-scale software, which has motivated automatic type inference techniques. Existing tools have advanced type inference within isolated code snippets. However, repository-level type inference remains a significant challenge, primarily because the complex inter-procedural dependencies are difficult to model and resolve. To fill this gap, we present \methodName, a novel LLM-based approach that achieves repository-level type inference through the co-evolution of types and dependencies. \methodName constructs an Entity Dependency Graph (EDG) to model the objects and type dependencies across the repository. During inference, it iteratively refines the types and dependencies in the EDG for accurate type inference. Our key…
Taxonomy
Topics: Software Engineering Research · Software Testing and Debugging Techniques · Scientific Computing and Data Management
