Data Language Specification via Terminal Attribution
Alexander Sakharov, Timothy Sakharov

TL;DR
This paper introduces a simplified notation for defining LL(1) data language grammars by classifying terminals into layered groups, easing the development of data parsers.
Contribution
It proposes a new notation for data language grammars that simplifies parser development by classifying terminals into layered groups.
Findings
Simplifies grammar definition for data languages
Facilitates easier parser development
Reduces complexity of grammar debugging
Abstract
Unstructured data have to be parsed in order to become usable. The complexity of grammar notations and the difficulty of grammar debugging limit the use of parsers for data preprocessing. We introduce a notation in which grammars are defined by simply dividing terminals into predefined classes and then splitting elements of some classes into multiple layered sub-groups. These LL(1) grammars are designed for data languages. They simplify the task of developing data parsers.
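As a rough illustration of the layering idea, the sketch below treats a set of separator terminals as ordered layers and turns a flat string into nested lists, one nesting level per layer. The class name, the choice of separators, and the `parse` helper are all hypothetical; this is a toy under those assumptions, not the notation or algorithm defined in the paper.

```python
# Hypothetical terminal attribution: the single "separator" class and its
# three layers are illustrative assumptions, not the paper's notation.
SEPARATOR_LAYERS = ["\n", ";", ","]  # outermost layer first


def parse(text, layers=SEPARATOR_LAYERS):
    """Split text on layered separators, producing one nesting level per layer."""
    if not layers:
        # No separator layers left: the remaining text is a leaf value.
        return text.strip()
    outer, inner = layers[0], layers[1:]
    # Split on the outermost separator, then parse each part with the inner layers.
    return [parse(part, inner) for part in text.split(outer)]


records = parse("a,b;c\nd;e,f")
print(records)
```

Here the whole "grammar" is the ordered list of separators: changing the data format means reassigning terminals to layers rather than rewriting production rules.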
Taxonomy
Topics: Algorithms and Data Compression · Natural Language Processing Techniques · Web Data Mining and Analysis
