Describing the syntax of programming languages using conjunctive and Boolean grammars
Alexander Okhotin

TL;DR
This paper shows that conjunctive and Boolean grammars can describe the complete syntax of a programming language, including conditions that context-free grammars cannot express, while still admitting polynomial-time parsing.
Contribution
It introduces the first complete Boolean grammar for a programming language's syntax and shows how to transform it into an unambiguous conjunctive grammar with efficient parsing.
Findings
A Boolean grammar can specify the complete syntax of a programming language
Parsing with Boolean grammars can be done in polynomial time
Transformation to an unambiguous conjunctive grammar enables quadratic-time parsing
Abstract
A classical result by Floyd ("On the non-existence of a phrase structure grammar for ALGOL 60", 1962) states that the complete syntax of any sensible programming language cannot be described by the ordinary kind of formal grammars (Chomsky's "context-free"). This paper uses grammars extended with conjunction and negation operators, known as conjunctive grammars and Boolean grammars, to describe the set of well-formed programs in a simple typeless procedural programming language. A complete Boolean grammar, which defines such concepts as declaration of variables and functions before their use, is constructed and explained. Using the Generalized LR parsing algorithm for Boolean grammars, a program can then be parsed in time $O(n^4)$ in its length, while another known algorithm allows subcubic-time parsing. Next, it is shown how to transform this grammar to an unambiguous conjunctive…
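To make the conjunction operator concrete, here is a minimal sketch of a recognizer for conjunctive grammars, applied to the standard textbook grammar for the non-context-free language { a^n b^n c^n }. Neither the grammar nor the code is taken from the paper: the names GRAMMAR and recognizes are hypothetical, and the memoized top-down search is a simple stand-in for the Generalized LR algorithm mentioned in the abstract. The sketch assumes a grammar without left recursion and does not cover negation.

```python
from functools import lru_cache

# Assumed encoding (illustrative only): each nonterminal maps to a list of rules,
# each rule is a list of conjuncts, and each conjunct is a tuple of symbols.
# A string is derived by a rule only if EVERY conjunct of that rule derives it.
GRAMMAR = {
    "S": [[("A", "B"), ("D", "C")]],   # S -> A B & D C
    "A": [[("a", "A")], [()]],         # A -> a A | empty
    "B": [[("b", "B", "c")], [()]],    # B -> b B c | empty
    "C": [[("c", "C")], [()]],         # C -> c C | empty
    "D": [[("a", "D", "b")], [()]],    # D -> a D b | empty
}


def recognizes(grammar, start, word):
    """Return True if `word` is derived from `start`.

    Assumes the grammar has no left-recursive or cyclic rules, so every
    recursive call works on a strictly later start position.
    """

    @lru_cache(maxsize=None)
    def derives(symbol, i, j):
        # Does `symbol` derive the substring word[i:j]?
        if symbol not in grammar:  # any symbol without rules is a terminal
            return j == i + 1 and word[i] == symbol
        return any(
            all(seq(conjunct, i, j) for conjunct in rule)
            for rule in grammar[symbol]
        )

    @lru_cache(maxsize=None)
    def seq(symbols, i, j):
        # Does the concatenation of `symbols` derive word[i:j]?
        if not symbols:
            return i == j
        head, rest = symbols[0], symbols[1:]
        return any(
            derives(head, i, k) and seq(rest, k, j)
            for k in range(i, j + 1)
        )

    return derives(start, 0, len(word))
```

For example, recognizes(GRAMMAR, "S", "aabbcc") holds because both conjuncts accept the string: A B checks that the numbers of b's and c's agree, and D C checks that the numbers of a's and b's agree. The string "aabbc" passes the second conjunct but fails the first, so it is rejected.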
Taxonomy
Topics: Logic, programming, and type systems · Semigroups and automata theory · Formal methods in verification
