Answers: The University of Nottingham

The document is about a level 2 module exam for a machines and their languages course. It provides the exam instructions and questions. The questions are multiple choice about concepts like alphabets, words, languages, finite automata, regular expressions, and context-free grammars. For one question, the student must perform the subset construction to convert a non-deterministic finite automaton to a deterministic finite automaton.


G52MAL-E1

The University of Nottingham


SCHOOL OF COMPUTER SCIENCE

A LEVEL 2 MODULE, AUTUMN SEMESTER 2008–2009

MACHINES AND THEIR LANGUAGES

ANSWERS
Time allowed TWO hours

Candidates may complete the front cover of their answer book and sign
their desk card but must NOT write anything else until the start of the
examination period is announced.

Answer QUESTION ONE and any TWO other questions

No calculators are permitted in this examination.

Dictionaries are not allowed with one exception. Those whose first
language is not English may use a standard translation dictionary to
translate between that language and English provided that neither language
is the subject of this examination. Subject-specific translation dictionaries
are not permitted.

No electronic devices capable of storing and retrieving text, including
electronic dictionaries, may be used.

Note: ANSWERS


Question 1 (Compulsory)
The following questions are multiple choice. There is at least one correct
choice, but there may be several. To get all the marks you have to list all
the correct answers and none of the wrong ones.
Answer: Note that the answer that should be provided is just a list of
the correct alternative(s). The additional explanations below are just for your
information.

(a) Which of the following statements are correct?

(i) An alphabet is a finite sequence of distinct symbols.
(ii) A word is a finite sequence of symbols over a given alphabet.
(iii) A language is a possibly infinite set of words over a given
alphabet.
(iv) A regular language is always infinite.
(v) An infinite language can be regular.
(5)
Answer: Correct: ii, iii, v
Incorrect:

i An alphabet is a set of symbols, not a sequence.
iv All finite languages are regular, so a regular language need not
be infinite.

(b) Which of the following statements are correct?

(i) The empty word ε is the only word in the empty language ∅.
(ii) ε ∈ Σ∗, where Σ = {0, 1}
(iii) ε ∉ ∅∗
(iv) The empty word ε belongs to all non-empty languages.
(v) ε ∈ {a^i b^j | i, j ∈ N, i + j ≤ 42}
(5)
Answer: Correct: ii, v
Incorrect:

i The empty language ∅, i.e. the empty set, does not contain any
words, not even the empty one.
iii By definition, ε ∈ L∗ for any language L, including the empty
language ∅.
iv {a} is an example of a non-empty language that does not contain
the empty word ε.


(c) Consider the following finite automaton A over Σ = {a, b, c}:

[Transition diagram of A: states 0, 1, 2, 3, 4, each with a self-loop labelled a, b, c; transitions 0 -a-> 1, 1 -b-> 2, 2 -b-> 3, 3 -a-> 4.]

Which of the following statements about A are correct?

(i) The automaton A is a Deterministic Finite Automaton (DFA).
(ii) ε ∈ L(A)
(iii) bacabca ∈ L(A)
(iv) bbaacbabcac ∈ L(A)
(v) The language accepted by the automaton A is all words over Σ
that contain each of the letters a, b, b, a at least once in that
order.
(5)
Answer: Correct: iv, v
Incorrect:

i The automaton is an NFA since, for some states and alphabet
symbols, more than one transition is possible.
ii ε ∉ L(A) since no start state is accepting.
iii bacabca ∉ L(A) since it is not possible to reach any accepting
state on this word.
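
The answers to (ii) to (iv) can be checked mechanically by tracking the set of states A may be in after each symbol. The following is only a sketch: it assumes that state 0 is the single initial state and state 4 the single accepting state (consistent with the explanations above), and the names `CHAIN`, `step` and `accepts` are made up for this illustration.

```python
# Transitions of A as described in the question: every state loops on a, b, c,
# and the chain 0 -a-> 1 -b-> 2 -b-> 3 -a-> 4 advances through the states.
# Assumption: 0 is the only initial state and 4 the only accepting state.
CHAIN = {(0, "a"): 1, (1, "b"): 2, (2, "b"): 3, (3, "a"): 4}
START, ACCEPTING = {0}, {4}

def step(states, symbol):
    """Return every state reachable from `states` on one input symbol."""
    result = set(states)  # the self-loops keep all current states alive
    for q in states:
        if (q, symbol) in CHAIN:
            result.add(CHAIN[(q, symbol)])
    return result

def accepts(word):
    """True if some run of A on `word` (over {a, b, c}) ends in an accepting state."""
    states = set(START)
    for symbol in word:
        states = step(states, symbol)
    return bool(states & ACCEPTING)

for w in ["", "bacabca", "bbaacbabcac"]:
    print(repr(w), accepts(w))
```

Under these assumptions only the last of the three words is accepted, which agrees with the answer.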

(d) Consider the following regular expression:

a∗(ab)∗(abc)∗

Which of the following regular expressions denote the same language
as the above regular expression?

(i) (a + ab + abc)∗
(ii) a∗(a + b)∗(a + b + c)∗
(iii) a∗(ε + ab)∗(∅ + abc)∗
(iv) a∗ + (ab)∗ + (abc)∗
(v) a∗(a∗b∗)∗(a∗b∗c∗)∗

(5)
Answer: Correct: iii
Incorrect: i, ii, iv, v
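
One way to sanity-check this answer is to compare the languages on all words up to some bounded length. The sketch below does that with Python's `re` module. The translation into Python syntax is an assumption of this illustration (`+` becomes `|`, ε becomes an empty alternative, ∅ becomes `(?!)`, which never matches), and agreement up to a bounded length is evidence rather than a proof; a single differing word, however, is conclusive.

```python
import re
from itertools import product

# Assumed translations of the exam's regular expressions into Python syntax.
TARGET = r"a*(ab)*(abc)*"
CANDIDATES = {
    "i":   r"(a|ab|abc)*",
    "ii":  r"a*(a|b)*(a|b|c)*",
    "iii": r"a*(|ab)*((?!)|abc)*",
    "iv":  r"a*|(ab)*|(abc)*",
    "v":   r"a*(a*b*)*(a*b*c*)*",
}

def language(pattern, max_len=6, alphabet="abc"):
    """All words over `alphabet` of length <= max_len matched entirely by `pattern`."""
    compiled = re.compile(pattern)
    return {
        "".join(letters)
        for n in range(max_len + 1)
        for letters in product(alphabet, repeat=n)
        if compiled.fullmatch("".join(letters))
    }

target_words = language(TARGET)
equivalent = [name for name, p in CANDIDATES.items() if language(p) == target_words]
print(equivalent)
```

For instance, aba is matched by (i) but not by the original expression, and aab is matched by the original expression but not by (iv).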


(e) Consider the following Context-Free Grammar (CFG) G:

S → XX | Y
X → aXc | aYc
Y → Yb | ε

where S, X, Y are nonterminal symbols, S is the start symbol, and a,
b, c are terminal symbols.
Which of the following statements about the language L(G) generated
by G are correct?

(i) ε ∈ L(G)
(ii) aabbbccac ∈ L(G)
(iii) aabbbccbb ∈ L(G)
(iv) L(G) = L1 L1 ∪ L2 where L1 = {a^i b^j c^i | i, j ∈ N, i > 0} and
L2 = {b^i | i ∈ N}
(v) The following CFG G′ is equivalent to G above, i.e. L(G′) =
L(G):
S → XX
X → aXc | Y
Y → Yb | ε

(5)
Answer: Correct: i, ii, iv
Incorrect:

iii Since the word contains a’s and c’s, the derivation must begin
S ⇒ XX. The only possibility to derive the word from XX is
to split it into two parts after the last c, and derive the first part
from the first X and the last part from the second X. But while
aabbbcc can be derived from the first X, bb cannot be derived from
the second.
v Not equivalent because ε can now be derived from X, meaning
that a word like ac ∈ L(G′). However, ac ∉ L(G).
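
The characterisation in (iv) gives a direct membership test for L(G), sketched below. The function names are made up for this illustration, and the test relies on (iv) being a correct description of L(G), as stated in the answer.

```python
import re

def in_l1(word):
    """True if word = a^i b^j c^i with i > 0 (the words derivable from X)."""
    m = re.fullmatch(r"(a+)(b*)(c+)", word)
    return bool(m) and len(m.group(1)) == len(m.group(3))

def in_l2(word):
    """True if word consists of b's only (the words derivable from Y)."""
    return set(word) <= {"b"}

def in_language(word):
    """Membership in L1 L1 ∪ L2: try S → Y, then every split for S → XX."""
    if in_l2(word):
        return True
    return any(in_l1(word[:k]) and in_l1(word[k:]) for k in range(len(word) + 1))

for w in ["", "aabbbccac", "aabbbccbb", "ac"]:
    print(repr(w), in_language(w))
```

The word aabbbccac splits as aabbbcc followed by ac, both in L1, whereas no split of aabbbccbb works because bb is not in L1.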


Question 2

(a) Given the following NFA N over the alphabet Σ = {0, 1, 2}, construct
a DFA D(N ) that accepts the same language as N by applying the
subset construction:

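[Transition diagram of N: states q0, q1, q2, q3, q4; transitions q0 -1-> q1, q1 -1,2-> q0, q0 -1-> q3, q3 -0,2-> q4, q0 -0-> q2, q2 -0,2-> q0.]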

To save work, consider only the reachable part of D(N ). Clearly show
your calculations in a state-transition table. Then draw the transition
diagram for the resulting DFA D(N ). Do not forget to indicate the
initial state and the final states both in the transition table and the
final transition diagram. (15)
Answer: Starting from S = {q0}, the set of start states of N and thus
the start state of D(N), we compute δ̂N(S, x) for each x ∈ Σ.
Whenever we encounter a state P ⊆ Q of D(N) that has not been considered
before, we add P to the table and proceed to tabulate δ̂N(P, x) for each
x ∈ Σ. We repeat the process until no new states are encountered.
Finally, we identify the initial state (→ to the left of the state) and
all accepting states (∗ to the left of the state). Note that a DFA state
is accepting if it contains at least one accepting NFA state (as this
means it is possible to reach at least one accepting state on a given
word, which means that word is considered to be in the language of the
NFA).

   δD(N)      | 0                 | 1                         | 2
→  {q0}       | {q2}              | {q1, q3}                  | ∅
   {q2}       | {q0}              | ∅                         | {q0}
   {q1, q3}   | ∅ ∪ {q4} = {q4}   | {q0} ∪ ∅ = {q0}           | {q0} ∪ {q4} = {q0, q4}
   ∅          | ∅                 | ∅                         | ∅
∗  {q4}       | ∅                 | ∅                         | ∅
∗  {q0, q4}   | {q2} ∪ ∅ = {q2}   | {q1, q3} ∪ ∅ = {q1, q3}   | ∅ ∪ ∅ = ∅
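
The tabulation above can be reproduced with a short worklist loop. This is a sketch only: the NFA transitions in `DELTA` are read off the rows of the table (and assumed to match the original diagram), q4 is taken to be the only accepting NFA state, and all names are invented for the illustration.

```python
from collections import deque

SIGMA = ["0", "1", "2"]
# NFA transitions as read off the table above; missing entries mean the empty set.
DELTA = {
    ("q0", "0"): {"q2"}, ("q0", "1"): {"q1", "q3"},
    ("q1", "1"): {"q0"}, ("q1", "2"): {"q0"},
    ("q2", "0"): {"q0"}, ("q2", "2"): {"q0"},
    ("q3", "0"): {"q4"}, ("q3", "2"): {"q4"},
}
START, FINAL = frozenset({"q0"}), {"q4"}

def subset_construction():
    """Tabulate the reachable part of D(N), one row per newly found state set."""
    table, queue = {}, deque([START])
    while queue:
        state = queue.popleft()
        if state in table:
            continue  # this set of NFA states already has a row
        row = {}
        for x in SIGMA:
            target = frozenset(q2 for q in state for q2 in DELTA.get((q, x), ()))
            row[x] = target
            queue.append(target)
        table[state] = row
    # A DFA state is accepting if it contains at least one accepting NFA state.
    accepting = {s for s in table if s & FINAL}
    return table, accepting

table, accepting = subset_construction()
for state, row in table.items():
    print(sorted(state), {x: sorted(t) for x, t in row.items()})
```

The loop stops once no new state sets appear, which here happens after six rows.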


We can now draw the transition diagram for D(N ):

[Transition diagram for D(N): states {q0}, {q2}, {q1, q3}, {q4}, {q0, q4} and ∅, with transitions as in the table above.]

Accepting states have been marked by outgoing arrows in this case.
That is an alternative to the double circle.

(b) Consider the language L over the alphabet Σ = {3, 5} of all words for
which the arithmetic sum of the constituent symbols is divisible by 5.
For example, ε ∈ L (there are no symbols in the empty string, the sum
is thus 0 which is divisible by 5), 555 ∈ L (5 + 5 + 5 = 15 which is
divisible by 5), and 335333 ∈ L (3 + 3 + 5 + 3 + 3 + 3 = 20 which is
divisible by 5), but 333 ∉ L (3 + 3 + 3 = 9 which is not divisible by 5
(the remainder is 4)).
Is L a regular language or not? If it is, construct a DFA A such that
L(A) = L. Your answer should consist of the transition diagram for
A, with initial and final state(s) clearly identified, along with a brief
justification (in plain English) that makes the idea behind the
construction clear and thus explains why the given automaton accepts
the language in question.
If L is not a regular language, prove this by using the pumping lemma
for regular languages. (10)
Answer: The language L is regular. It is just a variation of the type of
language exemplified by “all strings with an odd number of a particular
symbol”, and can thus be recognised by a variation of the DFAs for
recognising that type of language.
We need one state for each possible remainder when dividing by 5, i.e.
5 states. Let us label each state with the remainder in question. State 0
is thus both the initial and the only final state. We then just note that
if the remainder when dividing the sum n of the symbols seen so far by
5 is r, and the next symbol is i, then the remainder of n + i divided by
5 is just the remainder of r + i divided by 5.
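
The construction can be written out as a transition table indexed by remainder and symbol. The sketch below assumes words are given as strings over the characters '3' and '5'; `DELTA` and `accepts` are names chosen for this illustration.

```python
# Transition function of the five-state DFA: from remainder r, reading symbol s
# leads to the remainder of r + s divided by 5.
DELTA = {(r, s): (r + int(s)) % 5 for r in range(5) for s in "35"}

def accepts(word):
    """Run the DFA on `word`; state 0 is both the initial and the only final state."""
    state = 0
    for symbol in word:
        state = DELTA[(state, symbol)]
    return state == 0

for w in ["", "555", "335333", "333"]:
    print(repr(w), accepts(w))
```

Reading 5 never changes the state, while reading 3 cycles through the states in the order 0, 3, 1, 4, 2 and back to 0.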


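[Transition diagram of A: states 0, 1, 2, 3, 4; each state has a self-loop on 5, and the transitions on 3 form the cycle 0 → 3 → 1 → 4 → 2 → 0; state 0 is both initial and final.]

Or, if you prefer, it can be drawn like this:

[Transition diagram: the same automaton in an alternative layout.]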


Question 3

(a) Give regular expressions defining the following languages over the
alphabet Σ = {a, b, c}:

(i) All words that contain an a followed by a b (possibly with other
symbols in between).
(ii) All words that contain at least one a and at most one b.

You only need to provide the regular expressions as an answer, but
they should not be unnecessarily complicated. (5)
Answer: Note: these are not the only possibilities, nor necessarily the
“simplest” in any formal sense. But they are all fairly simple, and your
answers should not be much more complicated.

(i) (b + c)∗a(a + c)∗b(a + b + c)∗
An alternative. (It is essential that the first pair of parentheses is a
choice between a, b, and c, as opposed to just b and c. Why?)
(a + b + c)∗ac∗b(a + b + c)∗
(ii) c∗a(a + c)∗(ε + b)(a + c)∗ + c∗(ε + b)c∗a(a + c)∗
Another alternative. (It is essential that the first pair of parentheses
is a choice between a and c, as opposed to starting with just an
iteration of c. Why?)
(a + c)∗((b + ε)c∗a + ac∗(b + ε))(a + c)∗
And a slight variation:
(a + c)∗(a + ac∗b + bc∗a)(a + c)∗
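
Answers of this kind are easy to get subtly wrong, so it can help to compare each expression with a direct statement of the specification on all short words. The sketch below does this; the Python `re` translations (character classes for choices between single letters, `b?` for ε + b) and all function names are assumptions of this illustration, and the check only covers words up to a bounded length.

```python
import re
from itertools import product

def words(max_len, alphabet="abc"):
    """Yield every word over `alphabet` of length at most `max_len`."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

def spec_i(w):
    """Contains an a followed, somewhere later, by a b."""
    return "a" in w and "b" in w[w.index("a"):]

def spec_ii(w):
    """At least one a and at most one b."""
    return w.count("a") >= 1 and w.count("b") <= 1

# Assumed Python translations of the regular expressions given in the answer.
ANSWERS = {
    spec_i: [r"[bc]*a[ac]*b[abc]*", r"[abc]*ac*b[abc]*"],
    spec_ii: [
        r"c*a[ac]*b?[ac]*|c*b?c*a[ac]*",
        r"[ac]*(b?c*a|ac*b?)[ac]*",
        r"[ac]*(a|ac*b|bc*a)[ac]*",
    ],
}

def mismatches(max_len=6):
    """Return (pattern, word) pairs where a pattern disagrees with its specification."""
    bad = []
    for spec, patterns in ANSWERS.items():
        for p in patterns:
            rx = re.compile(p)
            bad += [(p, w) for w in words(max_len) if bool(rx.fullmatch(w)) != spec(w)]
    return bad

print(mismatches())
```

An empty list means that no disagreement was found among the words checked.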

(b) Systematically construct an NFA for the regular expression

(a(ε + b))∗(c + d + ∅)

by following the graphical construction from the lecture notes. Make
sure it is clear how you undertake the construction by showing the
major steps. Eliminate “dead ends” (states from which no final state
can be reached) when they appear. The states in the final NFA should
be named, but as long as it is clear what you are doing, you can leave
the states of intermediate NFAs unnamed. (10)
Answer: First construct an NFA A for the subexpression (a(ε + b))∗
according to the lecture notes. (I have named the states according to
how they will be named in the final NFA to make it easier to follow
the derivation. It is OK to leave states unnamed to the end. Also, it is
not necessary to show all intermediate stages of the construction: they
are shown here for clarity only.) NFA for a (to the left) and for ε + b
(to the right):


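[Transition diagrams: the NFA for a has states 0 and 1 with a transition 0 -a-> 1; the NFA for ε + b has states 2, 3, 4 with a transition 3 -b-> 4.]

Join the above two NFAs to obtain an NFA for a(ε + b), keeping in
mind that the left NFA does not accept ε:

[Transition diagram: NFA for a(ε + b) on states 0, 1, 2, 3, 4, with three edges labelled a and one edge labelled b.]

It is now clear state 1 is a dead end, so it can be removed prior to
carrying out the construction for the Kleene star (not forgetting one
extra state for accepting ε) leaving us with the following NFA A for
(a(ε + b))∗:

[Transition diagram: NFA A for (a(ε + b))∗; the states shown are 0, 2, 3, 4, with edges labelled a and b.]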

The NFA B for (c + d + ∅) is simply:

[Transition diagram: NFA B with transitions 6 -c-> 7 and 8 -d-> 9, plus the isolated state 10.]


It’s clear that state 10 will become a dead end, so it can be removed.
Now, joining A with B (less state 10), while keeping in mind that A
can accept ε which means states 6 and 8 will remain initial states,
gives:

[Transition diagram: the joined NFA; the states shown are 0, 2, 3, 4, 6, 7, 8, 9, with edges labelled a, b, c and d.]

It’s now clear that states 2, 4, and 5 are all dead ends, so we can
simplify and obtain the final NFA for (a(ε + b))∗(c + d + ∅):

[Transition diagram: the final NFA on states 0, 3, 6, 7, 8, 9, with edges labelled a, b, c and d.]

Note that there are three initial states: 0, 6, and 8.

(c) Construct an unambiguous Context-Free Grammar (CFG) for regular
expressions over the alphabet Σ = {a, b, c} (with the syntax defined in
the lecture notes). To ensure your grammar is unambiguous, it should
reflect the precedence and associativity for the regular expression
constructs as specified by the following table:

Operators        Precedence   Associativity
∗                highest      n/a
concatenation    medium       left
+                lowest       left

For example,
(a(ε + b))∗ + ∅


is a valid regular expression, whereas both

(a

(because the parentheses are not balanced) and

a+

(because + is a binary operator) are not. (10)


Answer: The following is one possible grammar. It has been stratified
to capture the desired precedence levels, and left recursion is used to
impart left associativity on the relevant constructs according to the
specification:

E  → E + E1 | E1
E1 → E1 E2 | E2
E2 → E2 ∗ | E3
E3 → ( E ) | EP
EP → a | b | c | ε | ∅

Here, E, E1, E2, E3, and EP are nonterminals with E being the start
symbol, and +, ∗, (, ), a, b, c, ε, ∅ are terminals. (Note in particular
that EP → ε is not an epsilon production in this case, since ε is here
a terminal symbol of the regular expression syntax!)


Question 4

(a) Consider the following Context-Free Grammar (CFG):

S → SpA | A
A → BmA | B
B → a | b | c | lSr

S, A, and B are nonterminals, a, b, c, l, m, p, and r are terminals, and
S is the start symbol.
Draw the derivation tree according to this grammar for the word
amlapbpcrma. (5)
Answer: Derivation tree for amlapbpcrma:

[Derivation tree, shown with children indented below their parent:]

S
  A
    B
      a
    m
    A
      B
        l
        S
          S
            S
              A
                B
                  a
            p
            A
              B
                b
          p
          A
            B
              c
        r
      m
      A
        B
          a

(b) Explain what it means for a Context-Free Grammar (CFG) to be
ambiguous. (5)
Answer: A context-free grammar is ambiguous if there exists at least
one word in the language generated by the grammar for which
there is more than one derivation tree, or, equivalently, for which there
is more than one leftmost or more than one rightmost derivation.


(c) Is the following CFG ambiguous? If yes, show this. If no, explain why.

A → AaA | AbA | B
B → c

A and B are nonterminals, A is the start symbol, a, b, and c are
terminals. (5)
Answer: Yes, the grammar is ambiguous. Two different leftmost
derivations for the word cacac (⇒lm denotes a leftmost derivation step):

A ⇒lm AaA
  ⇒lm BaA
  ⇒lm caA
  ⇒lm caAaA
  ⇒lm caBaA
  ⇒lm cacaA
  ⇒lm cacaB
  ⇒lm cacac

and

A ⇒lm AaA
  ⇒lm AaAaA
  ⇒lm BaAaA
  ⇒lm caAaA
  ⇒lm caBaA
  ⇒lm cacaA
  ⇒lm cacaB
  ⇒lm cacac
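
The two derivations correspond to the two derivation trees of cacac. As an illustration, the number of derivation trees of any word can be counted by trying every occurrence of a or b as the operator at the root. The function `count_trees` is a name invented for this sketch and is specific to this grammar.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def count_trees(word):
    """Number of derivation trees for `word` from A in A → AaA | AbA | B, B → c."""
    total = 1 if word == "c" else 0  # A → B → c
    for k, symbol in enumerate(word):
        if symbol in "ab":  # A → AaA or A → AbA with the operator at position k
            total += count_trees(word[:k]) * count_trees(word[k + 1:])
    return total

for w in ["c", "cac", "cacac", "cacbcac"]:
    print(w, count_trees(w))
```

Any word with a count greater than 1 witnesses the ambiguity; cacac has exactly two trees, one for each choice of root operator.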


(d) The following Context-Free Grammar (CFG) is immediately
left-recursive:

S → aS | bX
X → XXc | Xd | Y
Y → Ye | f | g

S, X, and Y are nonterminals, a, b, c, d, e, f, and g are terminals,
and S is the start symbol.
Transform this grammar into an equivalent right-recursive CFG. State
the general transformation rule you are using and show the main
transformation steps. (10)
Answer: First identify the immediately left-recursive non-terminals.
Then group the productions for each such non-terminal into two groups:
one where each RHS starts with the non-terminal in question, and one
where they don’t:

A → Aα1 | . . . | Aαm
A → β1 | . . . | βn

Then replace those productions with new productions for A and
productions for A′, where A′ is a new name, as follows:

A → β1 A′ | . . . | βn A′
A′ → α1 A′ | . . . | αm A′ | ε

There are two immediately left-recursive non-terminals in the given
grammar: X and Y. The grammar is essentially already grouped as
required. Applying the above transformation rule to both the X and Y
productions yields:

S → aS | bX
X → Y X′
X′ → XcX′ | dX′ | ε
Y → f Y′ | gY′
Y′ → eY′ | ε
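
The transformation rule is mechanical enough to be expressed as a small function. In this sketch a grammar is represented as a dictionary from nonterminals to lists of right-hand sides (tuples of symbols), the empty tuple stands for ε, and a primed name such as "X'" plays the role of the new nonterminal; this representation and the function name are assumptions made for the illustration.

```python
def remove_immediate_left_recursion(grammar):
    """Apply A → Aα | β  ==>  A → βA', A' → αA' | ε to every left-recursive A."""
    result = {}
    for head, bodies in grammar.items():
        # Group the productions: those starting with `head` (the α's) and the rest (the β's).
        alphas = [body[1:] for body in bodies if body and body[0] == head]
        betas = [body for body in bodies if not body or body[0] != head]
        if not alphas:
            result[head] = list(bodies)  # not immediately left-recursive
            continue
        fresh = head + "'"
        result[head] = [beta + (fresh,) for beta in betas]
        result[fresh] = [alpha + (fresh,) for alpha in alphas] + [()]
    return result

GRAMMAR = {
    "S": [("a", "S"), ("b", "X")],
    "X": [("X", "X", "c"), ("X", "d"), ("Y",)],
    "Y": [("Y", "e"), ("f",), ("g",)],
}

for head, bodies in remove_immediate_left_recursion(GRAMMAR).items():
    print(head, "→", " | ".join(" ".join(body) or "ε" for body in bodies))
```

Applied to the grammar from the question, this reproduces the productions given in the answer.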


Question 5
Consider the following Pushdown Automaton (PDA) P :
P = (Q = {q0 , q1 , q2 }, Σ = {a, b, c}, Γ = {a, #}, δ, q0 , Z0 = #, F = {q2 })
where the transition function δ is given by
δ(q0, a, #) = {(q0, a#)}
δ(q0, c, #) = {(q0, #)}
δ(q0, a, a) = {(q0, aa)}
δ(q0, b, a) = {(q1, ε)}
δ(q0, c, a) = {(q0, a)}
δ(q1, c, #) = {(q1, #)}
δ(q1, b, a) = {(q1, ε)}
δ(q1, c, a) = {(q1, a)}
δ(q1, ε, #) = {(q2, #)}
δ(q, w, z) = ∅ everywhere else
Acceptance is by final state.

(a) Which of the following words are accepted by the PDA P?

(i) acabbc
(ii) abcabc
For those words that are accepted, provide a sequence of Instantaneous
Descriptions (IDs) leading to an accepting configuration as evidence.
For those words that are not accepted, explain why there is no sequence
of IDs leading to an accepting configuration. (10)
Answer:
(i) The word acabbc is accepted. ID sequence:
(q0, acabbc, #) ⊢ (q0, cabbc, a#)
                ⊢ (q0, abbc, a#)
                ⊢ (q0, bbc, aa#)
                ⊢ (q1, bc, a#)
                ⊢ (q1, c, #)
                ⊢ (q1, ε, #)
                ⊢ (q2, ε, #)
Accepting configuration since q2 is an accepting state and since
all input has been read.
[Marking: 5 points]


(ii) The string abcabc is not accepted. For the first two moves, there
is no choice:

(q0, abcabc, #) ⊢ (q0, bcabc, a#)
                ⊢ (q1, cabc, #)

Here there is a choice: consume c and stay in q1 or an epsilon
move to q2. But the latter cannot lead to acceptance at this point
since not all input has been read.

                ⊢ (q1, abc, #)

Now the only possibility is an epsilon move to q2:

                ⊢ (q2, abc, #)

This is not an accepting configuration as not all input has been
read.
[Marking: 5 points]
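
Both answers can be cross-checked by exploring every instantaneous description reachable from the initial one. The following is a sketch under the conventions of the question (acceptance by final state with all input read); the encoding of δ as a dictionary, with stack contents written top-first and the empty string standing for ε, is an assumption of this illustration.

```python
# δ as listed in the question; stack strings are written top-first, "" is ε.
DELTA = {
    ("q0", "a", "#"): {("q0", "a#")},
    ("q0", "c", "#"): {("q0", "#")},
    ("q0", "a", "a"): {("q0", "aa")},
    ("q0", "b", "a"): {("q1", "")},
    ("q0", "c", "a"): {("q0", "a")},
    ("q1", "c", "#"): {("q1", "#")},
    ("q1", "b", "a"): {("q1", "")},
    ("q1", "c", "a"): {("q1", "a")},
    ("q1", "", "#"): {("q2", "#")},
}
FINAL = {"q2"}

def accepts(word):
    """Search all IDs (state, remaining input, stack) reachable from (q0, word, #)."""
    todo, seen = [("q0", word, "#")], set()
    while todo:
        config = todo.pop()
        if config in seen:
            continue
        seen.add(config)
        state, rest, stack = config
        if state in FINAL and rest == "":
            return True  # final state reached with all input read
        if not stack:
            continue  # no move is possible on an empty stack
        top, below = stack[0], stack[1:]
        for state2, push in DELTA.get((state, "", top), ()):  # epsilon moves
            todo.append((state2, rest, push + below))
        if rest:
            for state2, push in DELTA.get((state, rest[0], top), ()):
                todo.append((state2, rest[1:], push + below))
    return False

print(accepts("acabbc"), accepts("abcabc"))
```

The search terminates because the only epsilon move leads to q2, from which no further moves exist.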


(b) Consider the following Context-Free Grammar (CFG):

S → ABC | BC
A → aA | a
B → b | ε
C → c | d | ε

S, A, B, and C are nonterminals, a, b, c, and d are terminals, and S
is the start symbol.
(i) What is the set Nε of nullable nonterminals? Provide a brief
justification. (2)
Answer: Nε = {S, B, C}. B is nullable because B → ε is a
production. C is nullable because C → ε is a production. S is
nullable because S → BC is a production and both B and C are
nullable. A is not nullable since the RHS of all productions for A
start with a terminal (a).
(ii) Systematically compute the first sets for all nonterminals, i.e.
first(S), first(A), first(B), and first(C), by setting up and solving
the equations according to the definitions of first sets for
nonterminals and strings of grammar symbols. Show your calculations.
(4)
Answer:
first(A) = first(aA) ∪ first(a)
         = {a} ∪ {a}
         = {a}

first(B) = first(b) ∪ first(ε)
         = {b} ∪ ∅
         = {b}

first(C) = first(c) ∪ first(d) ∪ first(ε)
         = {c} ∪ {d} ∪ ∅
         = {c, d}

first(S) = first(ABC) ∪ first(BC)
         = (first(A) ∪ ∅) ∪ (first(B) ∪ first(C) ∪ ∅)
         = {a} ∪ {b} ∪ {c, d}
         = {a, b, c, d}
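
The nullable set and the first sets can also be obtained by iterating to a fixed point, which is a common mechanical alternative to solving the equations by hand. The sketch below assumes a grammar encoded as a dictionary from nonterminals to lists of right-hand sides, with the empty list standing for ε; the function names are invented for the illustration.

```python
GRAMMAR = {
    "S": [["A", "B", "C"], ["B", "C"]],
    "A": [["a", "A"], ["a"]],
    "B": [["b"], []],          # [] stands for ε
    "C": [["c"], ["d"], []],
}

def nullable_set(grammar):
    """Nonterminals that can derive ε, found by iterating until nothing changes."""
    nullable, changed = set(), True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head not in nullable and any(
                all(s in nullable for s in body) for body in bodies
            ):
                nullable.add(head)
                changed = True
    return nullable

def first_sets(grammar):
    """first(X) for every nonterminal X, again as a least fixed point."""
    nullable = nullable_set(grammar)
    first = {head: set() for head in grammar}
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                for symbol in body:
                    new = first[symbol] if symbol in grammar else {symbol}
                    if not new <= first[head]:
                        first[head] |= new
                        changed = True
                    if symbol not in nullable:
                        break  # later symbols cannot contribute to first
    return first

print(sorted(nullable_set(GRAMMAR)))
print({head: sorted(s) for head, s in first_sets(GRAMMAR).items()})
```

For this grammar the results agree with the hand calculation above.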


(iii) Set up the subset constraint system that defines the follow sets for
all nonterminals, i.e. follow(S), follow(A), follow(B), and follow(C).
Simplify where possible using the law

A⊆C ∧ B⊆C ⇐⇒ A∪B ⊆ C

and the fact that constraints like A ⊆ A are trivially satisfied and
can be omitted. (7)
Answer: Note: detailed account below for clarity. It is sufficient
to just state the constraints according to the definitions and then
simplify.
Constraints for follow(S):

{$} ⊆ follow(S)

Constraints for follow(A) from the productions where A occurs in
the RHS, i.e.

S → ABC
A → aA

(note: nullable(BC) and nullable(ε)):

first(BC) ⊆ follow(A)
follow(S) ⊆ follow(A)
first(ε) ⊆ follow(A)
follow(A) ⊆ follow(A)

Constraints for follow(B) from the productions where B occurs
in the RHS, i.e.

S → ABC
S → BC

(note: nullable(C)):

first(C) ⊆ follow(B)
follow(S) ⊆ follow(B)
first(C) ⊆ follow(B)
follow(S) ⊆ follow(B)

Constraints for follow(C) from the productions where C occurs
in the RHS, i.e.

S → ABC
S → BC


(note: nullable(ε)):

first(ε) ⊆ follow(C)
follow(S) ⊆ follow(C)

Using

first(ε) = ∅
first(C) = {c, d}
first(BC) = first(B) ∪ first(C) ∪ ∅
          = {b} ∪ {c, d} = {b, c, d}

and eliminating trivial constraints yields:


{$} ⊆ follow(S)

{b, c, d} ⊆ follow(A)
follow(S) ⊆ follow(A)
∅ ⊆ follow(A)

{c, d} ⊆ follow(B)
follow(S) ⊆ follow(B)

∅ ⊆ follow(C)
follow(S) ⊆ follow(C)

This is equivalent to

{$} ⊆ follow(S)
{b, c, d} ∪ follow(S) ∪ ∅ ⊆ follow(A)
{c, d} ∪ follow(S) ⊆ follow(B)
∅ ∪ follow(S) ⊆ follow(C)

which can be further simplified, by dropping the ∅ terms, to the final
constraints:

{$} ⊆ follow(S)
{b, c, d} ∪ follow(S) ⊆ follow(A)
{c, d} ∪ follow(S) ⊆ follow(B)
follow(S) ⊆ follow(C)


(iv) Solve the subset constraint system for the follow sets from the
previous question by finding the smallest sets satisfying the
constraints. (2)
Answer: The smallest set satisfying the constraint for follow(S)
is obviously just {$}. Substituting this into the remaining
constraints makes the smallest sets satisfying those obvious too. Thus:

follow(S) = {$}
follow(A) = {b, c, d} ∪ {$} = {b, c, d, $}
follow(B) = {c, d} ∪ {$} = {c, d, $}
follow(C) = {$}
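
The smallest solution can also be found by starting from empty sets and repeatedly adding whatever the constraints demand until nothing changes. The sketch below encodes the final constraints from part (iii); the triple representation (constant part, follow sets on the left, target on the right) and the function name `solve` are assumptions made for this illustration.

```python
# Each entry (constant, sources, target) stands for the constraint
#   constant ∪ follow(s1) ∪ ... ∪ follow(sk) ⊆ follow(target).
CONSTRAINTS = [
    ({"$"}, [], "S"),
    ({"b", "c", "d"}, ["S"], "A"),
    ({"c", "d"}, ["S"], "B"),
    (set(), ["S"], "C"),
]

def solve(constraints):
    """Least solution of the subset constraints, computed by fixed-point iteration."""
    follow = {target: set() for _, _, target in constraints}
    changed = True
    while changed:
        changed = False
        for constant, sources, target in constraints:
            required = set(constant).union(*(follow[s] for s in sources))
            if not required <= follow[target]:
                follow[target] |= required
                changed = True
    return follow

print({name: sorted(s) for name, s in solve(CONSTRAINTS).items()})
```

Because sets only ever grow and are bounded by the finite set of terminals plus $, the iteration terminates with the smallest sets satisfying all constraints.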
