Transformer Models for Type Inference in the Simply Typed Lambda   Calculus: A Case Study in Deep Learning for Code

Brando Miranda; Avi Shinnar; Vasily Pestun; and Barry Trager

arXiv:2304.10500·cs.PL·April 21, 2023·1 cites

Transformer Models for Type Inference in the Simply Typed Lambda Calculus: A Case Study in Deep Learning for Code

Brando Miranda, Avi Shinnar, Vasily Pestun, and Barry Trager

PDF

Open Access

TL;DR

This paper investigates the use of transformer models for type inference in the simply typed lambda calculus, highlighting the challenges and optimizer effects in a systematic benchmark for reasoning about programming language features.

Contribution

It introduces a benchmark for transformer-based reasoning in typed lambda calculi and explores optimizer effects, revealing insights into training dynamics for code-related tasks.

Findings

01

Adafactor optimizer converges faster than Adam and RAdam.

02

Optimization landscape is challenging even for simple type inference tasks.

03

Transformer models face difficulties in generalization over formal datasets.

Abstract

Despite a growing body of work at the intersection of deep learning and formal languages, there has been relatively little systematic exploration of transformer models for reasoning about typed lambda calculi. This is an interesting area of inquiry for two reasons. First, typed lambda calculi are the lingua franc of programming languages. A set of heuristics that relate various typed lambda calculi to effective neural architectures would provide a systematic method for mapping language features (e.g., polymorphism, subtyping, inheritance, etc.) to architecture choices. Second, transformer models are widely used in deep learning architectures applied to code, but the design and hyperparameter space for them is large and relatively unexplored in programming language applications. Therefore, we suggest a benchmark that allows us to explore exactly this through perhaps the simplest and most…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsSoftware Engineering Research · Natural Language Processing Techniques · Mathematics, Computing, and Information Processing

MethodsAdafactor · RAdam · Adam