Input Repair via Synthesis and Lightweight Error Feedback

Lukas Kirschner; Ezekiel Soremekun; Rahul Gopinath; Andreas Zeller

arXiv:2208.08235 · cs.SE · August 18, 2022

Open Access

TL;DR

This paper introduces FSYNTH, a grammar-agnostic input repair method that uses lightweight error feedback to efficiently fix corrupt data, significantly improving over previous approaches like DDMax.

Contribution

FSYNTH leverages lightweight failure feedback and input synthesis to repair invalid inputs without requiring program analysis or input specifications.

Findings

1. Recovers 91% of valid inputs in real-world tests

2. Repairs 77% of invalid inputs within four minutes

3. Outperforms DDMax by up to 35% in effectiveness

Abstract

Oftentimes, input data may ostensibly conform to a given input format but cannot be parsed by a conforming program, for instance due to human error or data corruption. In such cases, a data engineer is tasked with input repair, i.e., she has to manually repair the corrupt data so that it follows the given format and can hence be processed by the conforming program. Such manual repair can be time-consuming and error-prone. In particular, input repair is challenging without an input specification (e.g., an input grammar) or program analysis. In this work, we show that incorporating lightweight failure feedback (e.g., input incompleteness) into parsers is sufficient to repair any corrupt input data with maximal closeness to the semantics of the input data. We propose an approach (called FSYNTH) that leverages lightweight error feedback and input synthesis to repair invalid inputs. FSYNTH…
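To make the idea of repair driven by lightweight failure feedback concrete, here is a minimal toy sketch in Python. It is not the FSYNTH algorithm, which the truncated abstract does not spell out. Everything in it is an assumption chosen for illustration: the balanced-bracket language, the three-valued `Status` feedback, and the helper names `check`, `synthesize_suffix`, and `repair`. The sketch treats the parser as a black box that only reports whether an input is complete, incomplete, or incorrect. It drops characters that make the prefix incorrect and then synthesizes a short suffix until the parser reports completeness.

```python
from collections import deque
from enum import Enum


class Status(Enum):
    """Hypothetical lightweight feedback a parser might report."""
    COMPLETE = "complete"      # input parses fully
    INCOMPLETE = "incomplete"  # valid so far, but more input is needed
    INCORRECT = "incorrect"    # no continuation can make this input valid


PAIRS = {"(": ")", "[": "]"}
ALPHABET = "()[]"


def check(text):
    """Toy parser for balanced brackets that returns only a Status."""
    stack = []
    for ch in text:
        if ch in PAIRS:
            stack.append(PAIRS[ch])
        elif stack and ch == stack[-1]:
            stack.pop()
        else:
            return Status.INCORRECT
    return Status.INCOMPLETE if stack else Status.COMPLETE


def synthesize_suffix(prefix, max_len=8):
    """Breadth-first search for a shortest suffix that completes prefix."""
    queue = deque([""])
    while queue:
        suffix = queue.popleft()
        status = check(prefix + suffix)
        if status is Status.COMPLETE:
            return suffix
        # Only incomplete candidates are worth extending.
        if status is Status.INCOMPLETE and len(suffix) < max_len:
            queue.extend(suffix + c for c in ALPHABET)
    return None


def repair(corrupt):
    """Greedy repair: delete offending characters, then synthesize an ending."""
    prefix = ""
    for ch in corrupt:
        # Keep a character only if the parser does not reject the new prefix.
        if check(prefix + ch) is not Status.INCORRECT:
            prefix += ch
    suffix = synthesize_suffix(prefix)
    return None if suffix is None else prefix + suffix


print(repair("([)]x"))
print(repair("(()"))
```

The greedy deletion here is a simplification and does not guarantee the maximal closeness that the abstract claims for FSYNTH. It only shows that the incomplete/incorrect distinction alone is enough to steer a repair without a grammar or program analysis.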

Taxonomy

Topics: Software System Performance and Reliability · Security and Verification in Computing · Parallel Computing and Optimization Techniques