Program Synthesis Over Noisy Data with Guarantees
Shivam Handa, Martin Rinard

TL;DR
This paper formalizes the problem of synthesizing programs from noisy data, introduces the concept of optimal loss functions based on noise sources, and provides conditions for convergence guarantees in noisy program synthesis.
Contribution
It introduces the first formalization and closed-form definition of optimal loss functions for noisy data, along with convergence conditions for synthesis algorithms.
Findings
Formalization of noisy data synthesis process
Definition of optimal loss functions based on noise sources
Conditions for convergence of synthesis algorithms
Abstract
We explore and formalize the task of synthesizing programs over noisy data, i.e., data that may contain corrupted input-output examples. By formalizing the concept of a Noise Source, an Input Source, and a prior distribution over programs, we formalize the probabilistic process which constructs a noisy dataset. This formalism allows us to define the correctness of a synthesis algorithm, in terms of its ability to synthesize the hidden underlying program. The probability of a synthesis algorithm being correct depends upon the match between the Noise Source and the Loss Function used in the synthesis algorithm's optimization process. We formalize the concept of an optimal Loss Function given prior information about the Noise Source. We provide a technique to design optimal Loss Functions given perfect and imperfect information about the Noise Sources. We also formalize the concept and…
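The process described in the abstract can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the three candidate programs, the uniform prior, the uniform input source over 0..9, the noise source that shifts an output by ±1 with probability `p_corrupt`, and the choice of loss as the negative log-likelihood of an observed output under that noise source. Under these assumptions, minimizing the total loss plus the negative log-prior selects the most probable program given the data.

```python
import math
import random

# Hypothetical candidate programs; the paper does not prescribe this set.
PROGRAMS = {
    "increment": lambda x: x + 1,
    "double": lambda x: 2 * x,
    "square": lambda x: x * x,
}

# Assumed prior distribution over programs: uniform.
PRIOR = {name: 1.0 / len(PROGRAMS) for name in PROGRAMS}


def noisy_dataset(hidden, n, p_corrupt, rng):
    """Draw n input-output examples from the hidden program, then corrupt them."""
    data = []
    for _ in range(n):
        x = rng.randrange(10)            # assumed input source: uniform on 0..9
        y = PROGRAMS[hidden](x)          # clean output of the hidden program
        if rng.random() < p_corrupt:     # assumed noise source: shift by +1 or -1
            y += rng.choice((-1, 1))
        data.append((x, y))
    return data


def loss(clean, observed, p_corrupt):
    """Negative log-likelihood of `observed` given `clean` under the assumed noise.

    Requires 0 < p_corrupt < 1.
    """
    diff = abs(clean - observed)
    if diff == 0:
        return -math.log(1.0 - p_corrupt)
    if diff == 1:
        return -math.log(p_corrupt / 2.0)
    return math.inf                      # impossible under this noise source


def synthesize(data, p_corrupt):
    """Return the program name minimizing total loss minus log prior."""
    def objective(name):
        total = sum(loss(PROGRAMS[name](x), y, p_corrupt) for x, y in data)
        return total - math.log(PRIOR[name])
    return min(PROGRAMS, key=objective)


if __name__ == "__main__":
    rng = random.Random(0)
    data = noisy_dataset("double", n=50, p_corrupt=0.2, rng=rng)
    print(synthesize(data, p_corrupt=0.2))
```

The sketch also shows why the match between noise source and loss matters: a loss that demands exact agreement would reject the hidden program as soon as one example is corrupted, whereas the loss above tolerates exactly the corruptions this noise source can produce.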
Taxonomy
Topics: Machine Learning and Data Classification · Formal Methods in Verification · Machine Learning and Algorithms
