TIGER: A Generating-Then-Ranking Framework for Practical Python Type Inference
Chong Wang, Jian Zhang, Yiling Lou, Mingwei Liu, Weisong Sun, Yang Liu, and Xin Peng

TL;DR
TIGER is a two-stage framework for Python type inference: it first generates a broad set of type candidates, then ranks them. It is particularly strong on complex generic types and user-defined types.
Contribution
The paper introduces TIGER, a generating-then-ranking framework that uses pre-trained code models to improve type inference accuracy across diverse Python types, including complex generics and user-defined types.
Findings
Outperforms existing methods in inferring user-defined types by 11.2% in Top-5 accuracy.
Achieves a 20.1% improvement in inferring unseen types.
Demonstrates superior efficiency and effectiveness in practical Python type inference.
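The Top-5 figure above refers to Top-k accuracy: a prediction counts as a hit when the true type appears among the first k ranked candidates. A minimal sketch of that metric follows, assuming exact string matching between predicted and annotated types. The function name `top_k_accuracy` and the sample data are hypothetical, and the paper's own matching criterion may differ.

```python
from typing import Sequence


def top_k_accuracy(
    ranked_predictions: Sequence[Sequence[str]],
    ground_truth: Sequence[str],
    k: int = 5,
) -> float:
    """Fraction of samples whose true type is among the first k predictions.

    Assumption: types are compared as exact strings.
    """
    if len(ranked_predictions) != len(ground_truth):
        raise ValueError("predictions and ground truth must have equal length")
    if not ground_truth:
        return 0.0
    hits = sum(
        truth in list(preds)[:k]
        for preds, truth in zip(ranked_predictions, ground_truth)
    )
    return hits / len(ground_truth)


# Hypothetical ranked outputs for three variables.
predictions = [
    ["int", "str"],
    ["List[int]", "Foo"],
    ["str"],
]
truths = ["str", "Foo", "int"]

top1 = top_k_accuracy(predictions, truths, k=1)
top2 = top_k_accuracy(predictions, truths, k=2)
```

With this toy data, no sample is correct at k=1 and two of the three are correct at k=2.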
Abstract
Python's dynamic typing system offers flexibility and expressiveness but can lead to type-related errors, prompting the need for automated type inference to enhance type hinting. While existing learning-based approaches show promising inference accuracy, they struggle with practical challenges in comprehensively handling various types, including complex generic types and (unseen) user-defined types. In this paper, we introduce TIGER, a two-stage generating-then-ranking (GTR) framework, designed to effectively handle Python's diverse type categories. TIGER leverages fine-tuned pre-trained code models to train a generative model with a span masking objective and a similarity model with a contrastive training objective. This approach allows TIGER to generate a wide range of type candidates, including complex generics in the generating stage, and accurately rank them with user-defined…
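The two-stage flow described in the abstract can be illustrated with a toy sketch. This is not TIGER's implementation: the generating stage here is a fixed list of built-in and generic types standing in for a generative model, and the ranking stage uses Jaccard overlap of name sub-tokens standing in for a learned similarity model. Adding user-defined types to the candidate pool before ranking is also an assumption of this sketch, and every function name is hypothetical.

```python
import re


def tokenize(name: str) -> set[str]:
    """Split an identifier or type name into lowercase sub-tokens."""
    parts = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", name)
    return {p.lower() for p in parts}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard overlap; a stand-in for a learned similarity score."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def generate_candidates() -> list[str]:
    """Stage 1 stand-in: a real system would sample these from a generative model."""
    base = ["int", "str", "bool", "float"]
    generics = [f"List[{t}]" for t in base] + [f"Dict[str, {t}]" for t in base]
    return base + generics


def rank_candidates(var_name: str, candidates: list[str], top_k: int = 5) -> list[str]:
    """Stage 2 stand-in: order candidates by similarity to the variable name."""
    query = tokenize(var_name)
    scored = [(jaccard(query, tokenize(c)), c) for c in candidates]
    # sorted() is stable, so ties keep their generation order.
    scored = sorted(scored, key=lambda pair: -pair[0])
    return [c for _, c in scored[:top_k]]


def infer_types(var_name: str, user_types: list[str], top_k: int = 5) -> list[str]:
    """Generate a candidate pool, add user-defined types (assumption), then rank."""
    pool = generate_candidates() + user_types
    return rank_candidates(var_name, pool, top_k)


result = infer_types("user_account", ["UserAccount", "OrderItem"])
```

Here `result[0]` is `UserAccount`, because its sub-tokens match the variable name exactly. The point of the split is that the generator only needs to propose plausible candidates, while the ranker decides their order.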
Taxonomy
Topics: Computational Physics and Python Applications · Topic Modeling · Scientific Computing and Data Management
