# Operator-precedence grammar

An **operator precedence grammar** is a kind of grammar for formal languages.

Technically, an operator precedence grammar is a context-free grammar that has the property (among others^{[1]}) that no production has either an empty right-hand side or two adjacent nonterminals in its right-hand side. These properties allow precedence relations to be defined between the terminals of the grammar. A parser that exploits these relations is considerably simpler than more general-purpose parsers such as LALR parsers. Operator-precedence parsers can be constructed for a large class of context-free grammars.

## Precedence relations

Operator precedence grammars rely on the following three precedence relations between the terminals:^{[2]}

Relation | Meaning |
---|---|

$a\lessdot b$ | a yields precedence to b |

$a\doteq b$ | a has the same precedence as b |

$a\gtrdot b$ | a takes precedence over b |

These operator precedence relations allow to delimit the handles in the right sentential forms: $\lessdot$ marks the left end, $\doteq$ appears in the interior of the handle, and $\gtrdot$ marks the right end. Contrary to other shift-reduce parsers, all nonterminals are considered equal for the purpose of identifying handles.^{[3]} The relations do not have the same properties as their un-dotted counterparts; e. g. $a\doteq b$ does not generally imply $b\doteq a$, and $b\gtrdot a$ does not follow from $a\lessdot b$. Furthermore, $a\doteq a$ does not generally hold, and $a\gtrdot a$ is possible.

Let us assume that between the terminals a_{i} and *a*_{i+1} there is always exactly one precedence relation. Suppose that $ is the end of the string. Then for all terminals b we define: $\$\lessdot b$ and $b\gtrdot \$$. If we remove all nonterminals and place the correct precedence relation: $\lessdot$, $\doteq$, $\gtrdot$ between the remaining terminals, there remain strings that can be analyzed by an easily developed bottom-up parser.

### Example

For example, the following operator precedence relations can be introduced for simple expressions:^{[4]}

- ${\begin{array}{c|cccc}&\mathrm {id} &+&*&\$\\\hline \mathrm {id} &&\gtrdot &\gtrdot &\gtrdot \\+&\lessdot &\gtrdot &\lessdot &\gtrdot \\*&\lessdot &\gtrdot &\gtrdot &\gtrdot \\\$&\lessdot &\lessdot &\lessdot &\end{array}}$

They follow from the following facts:^{[5]}

- + has lower precedence than * (hence $+\lessdot *$ and $*\gtrdot +$).
- Both + and * are left-associative (hence $+\gtrdot +$ and $*\gtrdot *$).

The input string^{[4]}

- $\mathrm {id} _{1}+\mathrm {id} _{2}*\mathrm {id} _{3}$

after adding end markers and inserting precedence relations becomes

- $\$\lessdot \mathrm {id} _{1}\gtrdot +\lessdot \mathrm {id} _{2}\gtrdot *\lessdot \mathrm {id} _{3}\gtrdot \$$

## Operator precedence parsing

Having precedence relations allows to identify handles as follows:^{[4]}

- scan the string from left until seeing $\gtrdot$
- scan backwards (from right to left) over any $\doteq$ until seeing $\lessdot$
- everything between the two relations $\lessdot$ and $\gtrdot$, including any intervening or surrounding nonterminals, forms the handle

It is generally not necessary to scan the entire sentential form to find the handle.

## Operator precedence parsing algorithm

The algorithm below is from Aho et al.:^{[6]}

If$ is on the top of the stack and ip points to $thenreturnelseLet a be the top terminal on the stack, and b the symbol pointed to by ipifa$\lessdot$bora$\doteq$bthenpushbonto the stack advance ip to the next input symbolelse ifa$\gtrdot$bthenrepeatpop the stackuntilthe top stack terminal is related by $\lessdot$ to the terminal most recently poppedelseerror()end

## Precedence functions

An operator precedence parser usually does not store the precedence table with the relations, which can get rather large. Instead, precedence functions *f* and *g* are defined.^{[7]} They map terminal symbols to integers, and so the precedence relations between the symbols are implemented by numerical comparison: $f(a)<g(b)$ must hold if $a\lessdot b$ holds, etc.

Not every table of precedence relations has precedence functions, but in practice for most grammars such functions can be designed.^{[8]}

### Algorithm for constructing precedence functions

The below algorithm is from Aho et al.:^{[9]}

- Create symbols f
_{a}and g_{a}for each grammar terminal a and for the end of string symbol; - Partition the created symbols in groups so that f
_{a}and g_{b}are in the same group if $a\doteq b$ (there can be symbols in the same group even if their terminals are not connected by this relation); - Create a directed graph whose nodes are the groups. For each pair $(a,b)$ of terminals do: place an edge from the group of g
_{b}to the group of f_{a}if $a\lessdot b$, otherwise if $a\gtrdot b$ place an edge from the group of f_{a}to that of g_{b}; - If the constructed graph has a cycle then no precedence functions exist. When there are no cycles, let $f(a)$ be the length of the longest path from the group of f
_{a}and let $g(a)$ be the length of the longest path from the group of g_{a}.

### Example

Consider the following table (repeated from above):^{[10]}

- ${\begin{array}{c|cccc}&\mathrm {id} &+&*&\$\\\hline \mathrm {id} &&\gtrdot &\gtrdot &\gtrdot \\+&\lessdot &\gtrdot &\lessdot &\gtrdot \\*&\lessdot &\gtrdot &\gtrdot &\gtrdot \\\$&\lessdot &\lessdot &\lessdot &\end{array}}$

Using the algorithm leads to the following graph:

gid \ fid f* \ / g* / f+ | \ | g+ | | g$ f$

from which we extract the following precedence functions from the maximum heights in the directed acyclic graph:

id + * $ *f*4 2 4 0 *g*5 1 3 0

## Operator-precedence languages

The class of languages described by operator-precedence grammars, i.e., operator-precedence languages, is strictly contained in the class of deterministic context-free languages, and strictly contains visibly pushdown languages.^{[11]}

Operator-precedence languages enjoy many closure properties: union, intersection, complementation,^{[12]} concatenation,^{[11]} and they are the largest known class closed under all these operations and for which the emptiness problem is decidable. Another peculiar feature of operator-precedence languages is their local parsability,^{[13]} that enables efficient parallel parsing.

There are also characterizations based on an equivalent form of automata and monadic second-order logic.^{[14]}

## Notes

**^**Aho, Sethi & Ullman 1988, p. 203**^**Aho, Sethi & Ullman 1988, pp. 203–204**^**Aho, Sethi & Ullman 1988, pp. 205–206- ^
^{a}^{b}^{c}Aho, Sethi & Ullman 1988, p. 205 **^**Aho, Sethi & Ullman 1988, p. 204**^**Aho, Sethi & Ullman 1988, p. 206**^**Aho, Sethi & Ullman 1988, pp. 208–209**^**Aho, Sethi & Ullman 1988, p. 209**^**Aho, Sethi & Ullman 1988, pp. 209–210**^**Aho, Sethi & Ullman 1988, p. 210- ^
^{a}^{b}Crespi Reghizzi & Mandrioli 2012 **^**Crespi Reghizzi, Mandrioli & Martin 1978**^**Barenghi et al. 2015**^**Lonati et al. 2015

## References

- Aho, Alfred V.; Sethi, Ravi; Ullman, Jeffrey D. (1988).
*Compilers — Principles, Techniques, and Tools*. Addison-Wesley. - Crespi Reghizzi, Stefano; Mandrioli, Dino (2012). "Operator precedence and the visibly pushdown property".
*Journal of Computer and System Sciences*.**78**(6): 1837–1867. doi:10.1016/j.jcss.2011.12.006. - Crespi Reghizzi, Stefano; Mandrioli, Dino; Martin, David F. (1978). "Algebraic Properties of Operator Precedence Languages".
*Information and Control*.**37**(2): 115–133. doi:10.1016/S0019-9958(78)90474-6. - Barenghi, Alessandro; Crespi Reghizzi, Stefano; Mandrioli, Dino; Panella, Federica; Pradella, Matteo (2015). "Parallel parsing made practical".
*Science of Computer Programming*.**112**(3): 245–249. doi:10.1016/j.scico.2015.09.002. - Lonati, Violetta; Mandrioli, Dino; Panella, Federica; Pradella, Matteo (2015). "Operator Precedence Languages: Their Automata-Theoretic and Logic Characterization".
*SIAM Journal on Computing*.**44**(4): 1026–1088. doi:10.1137/140978818. hdl:2434/352809.

## Further reading

- Floyd, R. W. (July 1963). "Syntactic Analysis and Operator Precedence".
*Journal of the ACM*.**10**(3): 316–333. doi:10.1145/321172.321179. S2CID 19785090.

## External links

- Nikolay Nikolaev: IS53011A Language Design and Implementation, Course notes for CIS 324, 2010.

- v
- t
- e

- LL
- Recursive descent
- Tail recursive