Backus–Naur Form — Computer Science
<fullname> ::= <title><name><name>
<title> ::= Mr|Mrs|Ms|Miss|Dr
In this rule, Mr, Mrs, Ms, Miss, and Dr are terminal symbols. They are not enclosed in
chevrons, so they are the actual values that are allowed for title. The pipe symbol | is a
metacharacter that is used to denote alternatives. Each production rule is strict:
you may know that other titles exist, for example Lord, but the production rule alone defines
the valid options for use in this particular formal language.
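As a minimal sketch of how strictly the rule applies, the <title> rule can be modelled as a fixed set of terminal strings. The names TITLES and is_title are hypothetical and chosen only for this illustration.

```python
# The terminal symbols allowed by <title> ::= Mr|Mrs|Ms|Miss|Dr
TITLES = {"Mr", "Mrs", "Ms", "Miss", "Dr"}


def is_title(text: str) -> bool:
    """Return True only if text is one of the alternatives in the rule."""
    return text in TITLES


print(is_title("Dr"))    # one of the listed alternatives
print(is_title("Lord"))  # a real title, but not in this formal language
```

Anything that is not listed as an alternative is rejected, even if it is a perfectly good title in everyday use.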
Production rules for something as complex as the syntax of a language will come as a
large set of BNF statements that specify how every aspect of the language is defined.
Whenever you find a non-terminal symbol on the right side of a production rule, there must be
another rule that has that symbol on the left side. This continues until everything can be
specified in relation to terminal symbols.
Terminal symbols are the digits 0–9, and the plus and minus signs. Note that the plus sign
appears twice, once as an operator and once as the sign for a number. A valid addition
statement could have a double plus sign, e.g. 23 + +6.
It is clear that 2 is a number, as it is a digit (therefore satisfying the first alternative).
What about 16? Well, 16 is not a digit, so it doesn’t satisfy the first alternative. Look at the
second alternative, which states that a number can be defined as a <digit> followed by a
<number>.
1 is a digit, so that is ok. Is 6 a number? Yes: you can go back to the rule and see that it
satisfies the first alternative.
If you have already studied how recursion works, you will see that this recursive production rule
has a base case (i.e. a single-digit number) and a general case (i.e. a single digit followed by
a number). In the general case, a recursive definition has been used. You will also see that the
problem (in this case working out if 234 is a number) gets smaller and smaller until it reaches
the base case: is 234 a number? Is 34 a number? Is 4 a number?
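The base case and general case described above can be sketched as a recursive function. This assumes the rule under discussion is <number> ::= <digit>|<digit><number>, as the text describes it; the function names are hypothetical.

```python
DIGITS = set("0123456789")  # the terminal symbols for <digit>


def is_digit(text: str) -> bool:
    """<digit>: exactly one character from 0 to 9."""
    return len(text) == 1 and text in DIGITS


def is_number(text: str) -> bool:
    """<number> ::= <digit> | <digit><number>"""
    # Base case: a single digit satisfies the first alternative.
    if is_digit(text):
        return True
    # General case: a digit followed by a (shorter) number.
    return len(text) > 1 and is_digit(text[0]) and is_number(text[1:])


print(is_number("234"))  # asks about 234, then 34, then 4
```

Each recursive call works on a string one character shorter, which is why the problem shrinks until the base case is reached.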
Parse trees can be very useful to check whether a string satisfies a production rule. To parse
means to break something down into its component parts. Consider the set of production rules
that were examined earlier.
Let’s look at how to make use of a parse tree to check whether 24+54 is a valid addition.
Figure 2: Parse tree step 1
Step 2. To be a valid addition, the strings either side of the + symbol must satisfy the
production rule <number> ::= <sign><integer>|<integer>. You must have a sign followed
by an integer, or just an integer. Well, neither 24 nor 54 starts with a sign. To be valid, each side of the
addition needs to satisfy the production rule for integer, so you can put integer below
number on either side of the tree.
Step 5. Now show that 4 is an integer (on both sides). The production rule for integer is
<integer> ::= <digit>|<digit><integer>. Well, 4 is a digit, so write <digit> below <integer>
and 4 below <digit>. You can do this on the other side as well. Highlight both digits.
You have proved, using a parse tree, that 24+54 is a valid addition.
Figure 6: Parse tree for 24+54
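The parse-tree check walked through above can be sketched in code. The rules for <number> and <integer> are taken from the text. The rules <addition> ::= <number>+<number> and <sign> ::= +|- are assumptions, since they are not shown in this section. All function names are hypothetical, and whitespace is not handled.

```python
DIGITS = "0123456789"


def parse_digit(text):
    """<digit>: a single character 0-9. Returns a tree node or None."""
    return ("digit", text) if len(text) == 1 and text in DIGITS else None


def parse_integer(text):
    """<integer> ::= <digit> | <digit><integer>"""
    if not text:
        return None
    head = parse_digit(text[0])
    if head is None:
        return None
    if len(text) == 1:
        return ("integer", head)          # first alternative
    rest = parse_integer(text[1:])        # second alternative (recursive)
    return ("integer", head, rest) if rest else None


def parse_number(text):
    """<number> ::= <sign><integer> | <integer>"""
    if text and text[0] in "+-":          # assumed rule: <sign> ::= +|-
        body = parse_integer(text[1:])
        return ("number", ("sign", text[0]), body) if body else None
    body = parse_integer(text)
    return ("number", body) if body else None


def parse_addition(text):
    """Assumed rule: <addition> ::= <number>+<number>"""
    # Try each + as the operator; a + may instead be the sign of a number.
    for i, ch in enumerate(text):
        if ch == "+":
            left = parse_number(text[:i])
            right = parse_number(text[i + 1:])
            if left and right:
                return ("addition", left, "+", right)
    return None


print(parse_addition("24+54"))
```

The nested tuples mirror the tree: each node names the rule it satisfies, and the leaves are the terminal symbols. A string that cannot be broken down this way, such as 24+, returns None.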
Alongside BNF notation, you can apply a diagrammatic approach that mirrors the formal
grammar through a syntax diagram.
In the absence of a key, you must make sure that you can recognise terminal and non-terminal
symbols. Let's look at a simple example by considering a syntax diagram for the
following production rule:
By following one of the lines, any of these characters can be defined as special. The
alternatives are stacked under each other and any of them can be picked as you follow
through the diagram from start to end.
Observe how to deal with the fact that year is an optional component, as specified by the
production rule. Syntax diagrams are always read from left to right, following the arrows. You
can follow the top line that omits the year, or drop down to include it. You cannot go back on
yourself unless the arrows allow it. Recursion is dealt with by allowing a loop back through the
diagram (in BNF the only way to represent a loop is by using recursion). The rule:
Consider that you are required to describe the need for balanced parentheses in an expression.
For every open (left) parenthesis, there must be a matching closed (right) parenthesis. A finite
state machine has no way of remembering anything, except for being in a state that
corresponds to the fact. For example, there could be a state of having an unmatched open
(left) parenthesis. An FSM that tried to 'count' unmatched parentheses would need a '1 open'
state, '2 open' state, and so on. It could only represent an infinite number of open parentheses if
it had an infinite number of open states, but, as the name makes clear, an FSM has a finite
number of states.
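To illustrate the kind of memory this check needs, here is a minimal sketch that uses a counter that can grow without limit, which is exactly what a finite state machine lacks. The function name is hypothetical, and all characters other than parentheses are ignored.

```python
def is_balanced(text: str) -> bool:
    """Check that every ( has a matching ) that comes after it."""
    depth = 0  # number of currently unmatched open parentheses
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:      # a ) appeared with no ( to match
                return False
    return depth == 0          # nothing left unmatched at the end


print(is_balanced("((2+3)*(4+5))"))
print(is_balanced("((2+3)"))
```

The variable depth plays the role of the '1 open', '2 open', ... states, but because it is an integer rather than a fixed set of states, it can track any nesting depth.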