Efficiently Computing Edit Distance to Dyck Language
Barna Saha

TL;DR
This paper introduces a near-linear time algorithm for approximating the edit distance to the Dyck language, enabling efficient repair of semi-structured documents and other data structure languages.
Contribution
It presents the first near-linear time approximation algorithm for edit distance to Dyck(s), extending to various memory checking languages.
Findings
Achieves an $O(\frac{1}{\epsilon} \log \mathrm{OPT} \, (\log n)^{1/\epsilon})$ approximation in $O(n^{1+\epsilon} \log n)$ time.
Framework applies to languages recognized by stacks, queues, and other data structures.
Generalizes string edit distance to complex language repair problems.
Abstract
Given a string $\sigma$ over alphabet $\Sigma$ and a grammar $G$ defined over the same alphabet, what is the minimum number of repairs (insertions, deletions, and substitutions) required to map $\sigma$ into a valid member of $G$? We investigate this basic question in this paper for Dyck(s). Dyck(s) is a fundamental context-free grammar representing the language of well-balanced parentheses with s different types of parentheses, and it has played a pivotal role in the development of the theory of context-free languages. Computing edit distance to Dyck(s) significantly generalizes the string edit distance problem and has numerous applications, ranging from repairing semi-structured documents such as XML to memory checking, automated compiler optimization, and natural language processing. In this paper we give the first near-linear time algorithm for edit distance computation to Dyck(s) that…
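To make the repair problem in the abstract concrete, the sketch below computes the exact edit distance to a Dyck language with a textbook cubic-time interval dynamic program. It is not the near-linear approximation algorithm of the paper; it is only a small reference baseline. The bracket alphabet in `PAIRS` and the names `pair_cost` and `dyck_edit_distance` are illustrative assumptions, not taken from the paper.

```python
# Exact O(n^3) baseline for edit distance to Dyck(s).
# Assumption: this is a standard interval DP, NOT the paper's algorithm.
# Assumption: the alphabet is the three bracket types below (s = 3).
PAIRS = {"(": ")", "[": "]", "{": "}"}
CLOSERS = set(PAIRS.values())


def pair_cost(a, b):
    """Substitutions needed so that `a` opens and `b` closes the same type."""
    if a in PAIRS:
        # Already a matching pair costs 0; otherwise rewrite b into a's closer.
        return 0 if PAIRS[a] == b else 1
    # a is a closer: rewrite a into b's opener (1), or rewrite both (2).
    return 1 if b in CLOSERS else 2


def dyck_edit_distance(s):
    """Minimum insertions, deletions, substitutions to balance `s`."""
    for ch in s:
        if ch not in PAIRS and ch not in CLOSERS:
            raise ValueError(f"unsupported symbol: {ch!r}")
    n = len(s)
    # dist[i][j] = minimum edits that make the substring s[i:j] balanced.
    dist = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(1, n + 1):
        for i in range(n - length + 1):
            j = i + length
            # Option 1: delete s[i] (same cost as inserting a partner for it).
            best = 1 + dist[i + 1][j]
            # Option 2: pair s[i] with some later symbol s[k].
            for k in range(i + 1, j):
                cost = pair_cost(s[i], s[k]) + dist[i + 1][k] + dist[k + 1][j]
                best = min(best, cost)
            dist[i][j] = best
    return dist[0][n]
```

For instance, `dyck_edit_distance("(]")` returns 1, since a single substitution turns the string into `()`. The cubic running time makes this practical only for short inputs, which is the gap a near-linear approximation is meant to close.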
Taxonomy
Topics: Algorithms and Data Compression · Web Data Mining and Analysis · Natural Language Processing Techniques
