Program Synthesis Over Noisy Data with Guarantees

Shivam Handa; Martin Rinard

arXiv:2103.05030 · cs.PL · April 29, 2021 · 1 citation

TL;DR

This paper formalizes the problem of synthesizing programs from noisy data, introduces the concept of optimal loss functions based on noise sources, and provides conditions for convergence guarantees in noisy program synthesis.

Contribution

The paper introduces the first formalization and closed-form definition of optimal loss functions for noisy data, along with conditions under which synthesis algorithms converge.

Findings

01. Formalization of the noisy data synthesis process
02. Definition of optimal loss functions based on noise sources
03. Conditions for convergence of synthesis algorithms

Abstract

We explore and formalize the task of synthesizing programs over noisy data, i.e., data that may contain corrupted input-output examples. By formalizing the concept of a Noise Source, an Input Source, and a prior distribution over programs, we formalize the probabilistic process which constructs a noisy dataset. This formalism allows us to define the correctness of a synthesis algorithm, in terms of its ability to synthesize the hidden underlying program. The probability of a synthesis algorithm being correct depends upon the match between the Noise Source and the Loss Function used in the synthesis algorithm's optimization process. We formalize the concept of an optimal Loss Function given prior information about the Noise Source. We provide a technique to design optimal Loss Functions given perfect and imperfect information about the Noise Sources. We also formalize the concept and…
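
The generative process described in the abstract can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the paper: the three-program space, the prior, the uniform input source, and a noise source that replaces an output with a uniformly random value with probability FLIP_PROB. The loss used here is the negative log prior plus the negative log-likelihood of each observed output under that noise source, which is one natural way to match a loss function to a known noise source; it is not claimed to be the paper's closed-form definition.

```python
import math
import random

# Hypothetical program space: each "program" maps an int to an int.
PROGRAMS = {
    "add1": lambda x: x + 1,
    "double": lambda x: 2 * x,
    "square": lambda x: x * x,
}
PRIOR = {"add1": 0.4, "double": 0.4, "square": 0.2}  # assumed prior over programs
FLIP_PROB = 0.2                # assumed probability that the noise source corrupts an output
OUTPUT_RANGE = range(0, 101)   # corrupted outputs are drawn uniformly from this range


def make_noisy_dataset(hidden, n, rng):
    """Sample n input-output examples from the hidden program, then corrupt some outputs."""
    data = []
    for _ in range(n):
        x = rng.randint(0, 10)            # assumed input source: uniform over 0..10
        y = PROGRAMS[hidden](x)           # clean output of the hidden program
        if rng.random() < FLIP_PROB:      # noise source replaces the output
            y = rng.choice(OUTPUT_RANGE)
        data.append((x, y))
    return data


def noise_likelihood(observed, true):
    """P(observed | true) under the assumed noise source."""
    p = FLIP_PROB / len(OUTPUT_RANGE)     # corruption can land on any value, including the true one
    if observed == true:
        p += 1.0 - FLIP_PROB              # output passed through uncorrupted
    return p


def loss(name, data):
    """Negative log prior plus negative log-likelihood of the data under the noise model."""
    total = -math.log(PRIOR[name])
    for x, y in data:
        total -= math.log(noise_likelihood(y, PROGRAMS[name](x)))
    return total


def synthesize(data):
    """Return the name of the program that minimizes the loss on the noisy dataset."""
    return min(PROGRAMS, key=lambda name: loss(name, data))


if __name__ == "__main__":
    rng = random.Random(0)
    noisy = make_noisy_dataset("double", 50, rng)
    print(synthesize(noisy))
```

In this toy setting, minimizing the loss is the same as picking the program with the highest posterior probability given the dataset, so a mismatch between FLIP_PROB and the true corruption rate is exactly the kind of Noise Source and Loss Function mismatch the abstract refers to.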

Taxonomy

Topics: Machine Learning and Data Classification · Formal Methods in Verification · Machine Learning and Algorithms