Static Inference Meets Deep Learning: A Hybrid Type Inference Approach for Python

Yun Peng; Cuiyun Gao; Zongjie Li; Bowei Gao; David Lo; Qirun Zhang; Michael Lyu

arXiv:2105.03595·cs.SE·February 10, 2022

TL;DR

This paper introduces HiTyper, a hybrid type inference method for Python that combines static analysis and deep learning, improving accuracy especially for rare types by leveraging type dependency graphs.

Contribution

It proposes a novel hybrid approach integrating static inference with neural predictions using type dependency graphs, addressing the limitations of each method individually.
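As a rough illustration of the idea, and not HiTyper's actual implementation, the following sketch builds a toy type dependency graph in which static rules are applied first and neural candidates are used only where static inference has nothing to go on. A candidate is rejected if it would violate a static typing rule of a node that depends on it. All names here (`Node`, `add_rule`, `consistent`, `infer`) and the candidate lists are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Sequence


@dataclass
class Node:
    """One variable or expression in a toy type dependency graph (hypothetical)."""
    name: str
    deps: List[str] = field(default_factory=list)  # nodes this node's type depends on
    rule: Optional[Callable[[Sequence[Optional[str]]], Optional[str]]] = None
    inferred: Optional[str] = None


def add_rule(operand_types):
    """Toy static rule for `a + b`: result type, or None if the operands do not fit."""
    left, right = operand_types
    if left == right and left in {"int", "float", "str", "list"}:
        return left
    if {left, right} == {"int", "float"}:
        return "float"
    return None


def consistent(graph, name, candidate):
    """Reject a candidate that breaks the static rule of any node depending on `name`."""
    for user in graph.values():
        if name not in user.deps or user.rule is None:
            continue
        types = [candidate if d == name else graph[d].inferred for d in user.deps]
        if all(t is not None for t in types) and user.rule(types) is None:
            return False
    return True


def infer(graph: Dict[str, Node], order: List[str], neural: Dict[str, List[str]]):
    """Visit nodes in the given order: static rules first, neural candidates as fallback."""
    for name in order:
        node = graph[name]
        if node.inferred is not None:
            continue
        dep_types = [graph[d].inferred for d in node.deps]
        if node.rule is not None and all(t is not None for t in dep_types):
            node.inferred = node.rule(dep_types)
        if node.inferred is None:
            for cand in neural.get(name, []):
                if consistent(graph, name, cand):
                    node.inferred = cand
                    break
    return {n: graph[n].inferred for n in order}


# Models: def f(x): y = 1; z = x + y
graph = {
    "y": Node("y", inferred="int"),                  # literal, typed statically
    "x": Node("x"),                                  # parameter, no static constraint
    "z": Node("z", deps=["x", "y"], rule=add_rule),  # z = x + y
}
# Assumed ranked predictions from some neural model; "str" is ranked first.
neural = {"x": ["str", "int"]}
result = infer(graph, ["y", "x", "z"], neural)
print(result)
```

In this toy run the top-ranked prediction `str` for `x` is discarded because `str + int` fails the addition rule, so the next candidate `int` is accepted and `z` is then typed statically.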

Findings

01. Outperforms state-of-the-art DL models by 10% in matching annotations.

02. Increases inference of rare types by over 30%.

03. Static inference alone infers 2-3 times more types than existing tools.

Abstract

Type inference for dynamic programming languages such as Python is an important yet challenging task. Static type inference techniques can precisely infer variables with enough static constraints but are unable to handle variables with dynamic features. Deep learning (DL) based approaches are feature-agnostic, but they cannot guarantee the correctness of the predicted types. Their performance significantly depends on the quality of the training data (i.e., DL models perform poorly on some common types that rarely appear in the training dataset). It is interesting to note that the static and DL-based approaches offer complementary benefits. Unfortunately, to our knowledge, precise type inference based on both static inference and neural predictions has not been exploited and remains an open challenge. In particular, it is hard to integrate DL models into the framework of rule-based…

Code & Models

Repositories

johnnypeng18/hityper