Backus–Naur Form — Computer Science

BNF

Uploaded by

dva4331
Copyright
© All Rights Reserved

Figure 1: Venn diagram: context-free and regular languages

Most programming languages can be defined as context-free languages. Context-free
languages are more complex than regular languages. To define the set of strings in a
context-free language, use a context-free grammar: a set of production rules that
describe the possible strings in a given language. One notation for writing a
context-free grammar is Backus–Naur Form (BNF).

In BNF, a non-terminal symbol is written inside chevrons (angle brackets) and needs a
rule (or set of rules) to define its replacement. Consider the following production rule:

<fullname> ::= <title><name><name>

This shows that a full name comprises a title, a name, and another name. However, the
component parts are non-terminal. Therefore, further production rules are required. For
example, a production rule may define title as follows:

<title> ::= Mr|Mrs|Ms|Miss|Dr

In this rule, Mr, Mrs, Ms, Miss, and Dr are terminal symbols. They are not enclosed in
chevrons, so they are the actual values that are allowed for title. The pipe symbol | is a
metacharacter that is used to denote alternatives. Each production rule is strict:
you may know that other titles exist, for example Lord, but the production rule alone
defines the valid options for use in this particular formal language.

Production rules for something as complex as the syntax of a language will come as a
large set of BNF statements that specify how every aspect of the language is defined.
Whenever you find a non-terminal symbol on the right side of a production rule, there
must be another rule that has the symbol on the left side. This continues until
everything can be specified in relation to terminal symbols.
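
The requirement that every non-terminal on a right side has its own rule on a left side can be checked mechanically. The sketch below is a minimal Python illustration, not part of BNF itself; the dictionary layout and the function name undefined_non_terminals are assumptions made for this example. It stores the two rules seen so far and reports which non-terminals still lack a rule.

```python
import re

# The two rules seen so far, stored as: non-terminal -> list of alternatives.
GRAMMAR = {
    "<fullname>": ["<title><name><name>"],
    "<title>": ["Mr", "Mrs", "Ms", "Miss", "Dr"],
}


def undefined_non_terminals(grammar):
    """Return non-terminals used on a right side that have no rule of their own."""
    used = set()
    for alternatives in grammar.values():
        for alternative in alternatives:
            # Anything written inside chevrons is a non-terminal.
            used.update(re.findall(r"<[^<>]+>", alternative))
    return sorted(used - set(grammar))


print(undefined_non_terminals(GRAMMAR))  # ['<name>']
```

Here <name> is reported because no rule defines it yet, so this set of rules is not complete.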

Here is a complete set of rules (for a small subset of a programming language):

<addition> ::= <number>+<number>
<number> ::= <sign><integer>|<integer>
<integer> ::= <digit>|<digit><integer>
<digit> ::= 0|1|2|3|4|5|6|7|8|9
<sign> ::= +|-

Terminal symbols are the digits 0–9, and the plus and minus signs. Note that the plus sign
appears twice, once as an operator and once as the sign for a number. A valid addition
statement could have a double plus sign, e.g. 23++6.

Now consider a simpler, recursive rule for a number. It is clear that 2 is a number, as
it is a digit (therefore satisfying the first alternative):

<number> ::= <digit>|<digit><number>

What about 16? Well, 16 is not a digit, so it doesn’t satisfy the first alternative. Look at
the second alternative, which states that a number can be defined as a <digit> followed
by a <number>.

<number> ::= <digit>|<digit><number>

1 is a digit, so that is ok. Is 6 a number? Yes: you can go back to the rule and see that it
satisfies the first part:

<number> ::= <digit>|<digit><number>

Finally, consider 234. Immediately the first alternative can be discounted, as there is
more than one digit. Using the second part of the rule, you can start to satisfy this, as 2
is a digit. Is the remainder, 34, a number? Again, use the second part of the rule and be
happy that 3 is a digit. Is the remainder, 4, a number? Yes, as it satisfies the first part
of the rule. Notice how you return to the rule, taking each symbol one at a time until the
supply is exhausted (or you encounter a symbol that means the rule cannot be satisfied).

If you have already studied how recursion works, you will see that this recursive
production rule has a base case (i.e. a single digit) and a general case (i.e. a single
digit followed by a number). In the general case, a recursive definition has been used.
You will also see that the problem (in this case working out if 234 is a number) gets
smaller and smaller until it reaches the base case: is 234 a number? Is 34 a number? Is 4
a number?
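
The base case and general case described above map directly onto a recursive function. The following is a minimal Python sketch of that reasoning; the function name is_number and the DIGITS constant are assumptions made for this illustration, not something defined by BNF.

```python
DIGITS = "0123456789"


def is_number(s):
    """Check s against <number> ::= <digit>|<digit><number>."""
    if len(s) == 1:
        # Base case: a single symbol must be a digit.
        return s in DIGITS
    if len(s) > 1 and s[0] in DIGITS:
        # General case: a digit followed by a (smaller) number.
        return is_number(s[1:])
    # Empty string, or a first symbol that is not a digit.
    return False


print(is_number("234"))  # True: checks 234, then 34, then 4
print(is_number("2a4"))  # False: 'a' cannot satisfy the rule
```

Each recursive call receives a shorter string, which mirrors the sequence of questions: is 234 a number, is 34 a number, is 4 a number.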

A Level Parse trees

Parse trees can be very useful to check whether a string satisfies a production rule. To
parse means to break something down into its component parts. Consider the set of
production rules that were examined earlier:

<addition> ::= <number>+<number>
<number> ::= <sign><integer>|<integer>
<integer> ::= <digit>|<digit><integer>
<digit> ::= 0|1|2|3|4|5|6|7|8|9
<sign> ::= +|-

Let’s look at how to make use of a parse tree to check whether 24+54 is a valid addition.

Figure 2: Parse tree step 1

Step 2. To be a valid addition, the strings either side of the + symbol must satisfy the
production rule <number> ::= <sign><integer>|<integer>. You must have a sign followed
by an integer, or just an integer. Well, neither 24 nor 54 starts with a sign. To be
valid, each side of the addition needs to satisfy the production rule for integer, so you
can put integer below number on either side of the tree.

Figure 3: Parse tree step 2

Step 3. Now show that 24 is an integer. The production rule for integer is
<integer> ::= <digit>|<digit><integer>. 24 is not a digit (as digits are just single
numeric symbols). Therefore, to be valid, 24 must be a <digit><integer>. Put this on the
next line of the tree. You can do this on both sides, as 54 will be treated similarly.
Figure 4: Parse tree step 3

Step 4. Consider 24 again. Is it a digit followed by an integer? Well, 2 is definitely a
digit, so you can write 2 below <digit>. Similarly, you can write 5 below <digit> on the
right side of the tree. Highlight 2 and 5.
Figure 5: Parse tree step 4

Step 5. Now show that 4 is an integer (on both sides). The production rule for integer is
<integer> ::= <digit>|<digit><integer>. Well, 4 is a digit, so write <digit> below
<integer> and 4 below <digit>. You can do this on the other side as well. Highlight both
digits.

You have proved, using a parse tree, that 24+54 is a valid addition.
Figure 6: Parse tree for 24+54
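
The steps above can also be carried out by a small program. The following Python sketch is one possible way to build the parse tree as nested tuples, with one function per production rule; the function names (parse_addition, parse_number, parse_integer, parse_digit, show) and the tuple layout are assumptions made for this illustration.

```python
DIGITS = "0123456789"


def parse_digit(s):
    # <digit> ::= 0|1|2|3|4|5|6|7|8|9
    return ("<digit>", s) if len(s) == 1 and s in DIGITS else None


def parse_integer(s):
    # <integer> ::= <digit>|<digit><integer>
    if len(s) == 1:
        digit = parse_digit(s)
        return ("<integer>", digit) if digit else None
    if len(s) > 1:
        digit = parse_digit(s[0])
        rest = parse_integer(s[1:])
        if digit and rest:
            return ("<integer>", digit, rest)
    return None


def parse_number(s):
    # <number> ::= <sign><integer>|<integer>
    if s[:1] in ("+", "-"):
        integer = parse_integer(s[1:])
        return ("<number>", ("<sign>", s[0]), integer) if integer else None
    integer = parse_integer(s)
    return ("<number>", integer) if integer else None


def parse_addition(s):
    # <addition> ::= <number>+<number>
    # Try each '+' as the operator, because '+' can also be a sign.
    for i, ch in enumerate(s):
        if ch == "+":
            left = parse_number(s[:i])
            right = parse_number(s[i + 1:])
            if left and right:
                return ("<addition>", left, "+", right)
    return None


def show(node, depth=0):
    """Print the tree with one level of indentation per level of the tree."""
    if isinstance(node, str):
        print("  " * depth + node)
    else:
        print("  " * depth + node[0])
        for child in node[1:]:
            show(child, depth + 1)


show(parse_addition("24+54"))
```

A result of None means the string is not a valid addition. With this sketch, 23++6 is accepted (the second plus is read as a sign), while 24+ is rejected.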

A Level Syntax diagrams

Alongside BNF notation, you can apply a diagrammatic approach that mirrors the formal
grammar through a syntax diagram.

In the absence of a key, you must make sure that you can recognise terminal and
non-terminal symbols. Let's look at a simple example by considering a syntax diagram for
the following production rule:

<special> ::= @ | ? | & | *


Figure 7: Syntax diagram

By following one of the lines, any of these characters can be defined as special. The
alternatives are stacked under each other and any of them can be picked as you follow
through the diagram from start to end.

Now let’s consider a more complex rule:

<date> ::= <day-name><month-name>|<day-name><month-name><year>

This could be drawn using a syntax diagram as shown below:


Figure 8: Syntax diagrams

Observe how to deal with the fact that year is an optional component, as specified by the
production rule. Syntax diagrams are always read from left to right, following the arrows.
You can follow the top line that omits the year, or drop down to include it. You cannot go
back on yourself unless the arrows allow it. Recursion is dealt with by allowing a loop
back through the diagram (in BNF the only way to represent a loop is by using recursion).
The rule:

<number> ::= <digit>|<digit><number>

would be represented by a syntax diagram as shown below:


Figure 1: Venn diagram: context-free and regular languages
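
The point that a syntax diagram expresses repetition as a loop, where BNF has to use recursion, has a direct counterpart in code. The sketch below is a hedged Python illustration of the loop reading of the <number> rule; the function name is_number_loop is an assumption made for this example.

```python
def is_number_loop(s):
    """Loop form of <number>: one digit, then loop back for any further digits."""
    if not s:
        # The diagram must pass through <digit> at least once.
        return False
    for ch in s:
        if ch not in "0123456789":
            return False
    return True


print(is_number_loop("234"))  # True
print(is_number_loop(""))     # False
```

This accepts the same strings as the recursive rule: one or more digits and nothing else.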

Consider that you are required to describe the need for balanced parentheses in an
expression. For every open (left) parenthesis, there must be a matching closed (right)
parenthesis. A finite state machine (FSM) has no way of remembering anything, except for
being in a state that corresponds to the fact. For example, there could be a state of
having an unmatched open (left) parenthesis. An FSM that tried to 'count' unmatched
parentheses would need a '1 open' state, a '2 open' state, and so on. It could only
represent an infinite number of open parentheses if it had an infinite number of open
states, but, as the name makes clear, an FSM has a finite number of states.

Therefore, a context-free language is needed to describe syntax whenever there is an
unbounded number of elements of strings to be counted.
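
The counting argument can be made concrete with a short program. The following Python sketch checks balanced parentheses using an integer counter; the function name is_balanced is an assumption made for this illustration. The counter can grow without limit, which is exactly the kind of memory that a machine with a fixed, finite set of states does not have.

```python
def is_balanced(s):
    """Return True if every '(' in s has a matching ')' in the right order."""
    depth = 0  # number of open parentheses not yet matched
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                # A ')' appeared with no '(' left to match.
                return False
    return depth == 0


print(is_balanced("((a+b)*(c))"))  # True
print(is_balanced("(()"))          # False
```

Other characters are ignored here, so the sketch only checks the parentheses, not the rest of the expression.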
