Describing the syntax of programming languages using conjunctive and Boolean grammars (2012.03538v1)

Published 7 Dec 2020 in cs.FL and cs.PL

Abstract: A classical result by Floyd ("On the non-existence of a phrase structure grammar for ALGOL 60", 1962) states that the complete syntax of any sensible programming language cannot be described by the ordinary kind of formal grammars (Chomsky's ``context-free''). This paper uses grammars extended with conjunction and negation operators, known as conjunctive grammars and Boolean grammars, to describe the set of well-formed programs in a simple typeless procedural programming language. A complete Boolean grammar, which defines such concepts as declaration of variables and functions before their use, is constructed and explained. Using the Generalized LR parsing algorithm for Boolean grammars, a program can then be parsed in time $O(n^4)$ in its length, while another known algorithm allows subcubic-time parsing. Next, it is shown how to transform this grammar to an unambiguous conjunctive grammar, with square-time parsing. This becomes apparently the first specification of the syntax of a programming language entirely by a computationally feasible formal grammar.