Component-based Synthesis of Table Consolidation and Transformation Tasks from Examples
Yu Feng, Ruben Martins, Jacob Van Geffen, Isil Dillig, Swarat Chaudhuri

TL;DR
This paper introduces a flexible, component-based synthesis method for automating data table transformation tasks driven by examples, leveraging type-guided search, SMT deduction, and partial evaluation to improve scalability and effectiveness.
Contribution
It presents a novel, scalable synthesis approach that can generate data transformation programs from arbitrary components, including higher-order combinators, using type-directed enumeration and SMT-based reasoning.
Findings
Successfully solves diverse data preparation tasks from online forums.
Outperforms existing methods in automating table transformations.
Demonstrates scalability to complex, real-world data tasks.
Abstract
This paper presents an example-driven synthesis technique for automating a large class of data preparation tasks that arise in data science. Given a set of input tables and an output table, our approach synthesizes a table transformation program that performs the desired task. Our approach is not restricted to a fixed set of DSL constructs and can synthesize programs from an arbitrary set of components, including higher-order combinators. At a high level, our approach performs type-directed enumerative search over partial programs but incorporates two key innovations that allow it to scale. First, our technique can utilize any first-order specification of the components and uses SMT-based deduction to reject partial programs. Second, our algorithm uses partial evaluation to increase the power of deduction and drive enumerative search. We have evaluated our synthesis algorithm on…
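The search loop described in the abstract can be illustrated with a toy sketch. This is not the paper's algorithm or implementation: it assumes two hypothetical first-order components (`select_cols`, `filter_eq`), replaces SMT-based deduction with a hand-written shape check (`feasible`), and omits types and higher-order combinators entirely. It shows only the general idea of evaluating the concrete prefix of a candidate program and rejecting it early when its intermediate table can no longer reach the output example.

```python
from itertools import permutations

# A table is a list of rows; each row is a tuple of values.
# Both components below are hypothetical stand-ins, not the paper's component set.
def select_cols(table, cols):
    """Keep (and reorder) the columns at the given indices."""
    return [tuple(row[c] for c in cols) for row in table]

def filter_eq(table, arg):
    """Keep rows whose column `col` equals `value`."""
    col, value = arg
    return [row for row in table if row[col] == value]

def shape(table):
    return (len(table), len(table[0]) if table else 0)

def candidates(table):
    """Enumerate (name, function, argument) triples applicable to `table`."""
    n_cols = shape(table)[1]
    for k in range(1, n_cols + 1):
        for cols in permutations(range(n_cols), k):
            yield ("select_cols", select_cols, cols)
    for col in range(n_cols):
        for value in sorted({row[col] for row in table}, key=repr):
            yield ("filter_eq", filter_eq, (col, value))

def feasible(table, goal):
    """Simplified deduction (assumption: stands in for an SMT query).
    Both components can only drop rows or columns, so a goal with more
    rows or columns than the intermediate table is unreachable."""
    rows, cols = shape(table)
    goal_rows, goal_cols = shape(goal)
    return rows >= goal_rows and cols >= goal_cols

def synthesize(table, goal, depth, stats):
    """Depth-bounded enumerative search; returns a list of (name, arg) steps or None."""
    if table == goal:
        return []
    if depth == 0:
        return None
    for name, fn, arg in candidates(table):
        result = fn(table, arg)  # evaluate the concrete prefix on the example
        if not feasible(result, goal):
            stats["pruned"] += 1  # rejected without exploring any extensions
            continue
        rest = synthesize(result, goal, depth - 1, stats)
        if rest is not None:
            return [(name, arg)] + rest
    return None

# Illustrative data, invented for this sketch.
example_input = [("alice", "nyc", 30), ("bob", "sf", 25), ("carol", "nyc", 41)]
example_goal = [("alice", 30), ("carol", 41)]

stats = {"pruned": 0}
program = synthesize(example_input, example_goal, 2, stats)
print(program, stats)
```

In this sketch the pruning rule is sound only because every component is monotone in table size; a richer component set would need per-component specifications, which is the role the abstract assigns to first-order specifications and SMT-based deduction.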
Taxonomy
Topics: Advanced Database Systems and Queries · Data Quality and Management · Data Mining Algorithms and Applications
