Automata and Complexity Theory
College of Computing
March 2023
Debre Berhan,
Ethiopia
Theory of Automata
The theory of automata is a theoretical branch of computer science and mathematics. It is the study
of abstract machines and the computational problems that can be solved using these machines. Such
an abstract machine is called an automaton (plural: automata). The main motivation behind
developing automata theory was to develop methods to describe and analyze the dynamic behavior of
discrete systems.
An automaton consists of states and transitions. States are represented by circles, and
transitions are represented by arrows. An automaton takes a string as input; the input drives it
through a finite number of states, and it may end in a final state.
The basic terms used in automata theory are alphabets, strings, languages, and grammars.
Symbols: A symbol is an individual object, which can be any letter, digit, or picture.
E.g.: 1, a, b, #
Alphabets:
An alphabet is a finite, non-empty set of symbols. It is denoted by ∑ (sigma).
Examples:
Finite Automata
o Finite automata are used to recognize patterns.
o A finite automaton takes a string of symbols as input and changes its state accordingly. Each
time an input symbol is read, the matching transition occurs.
o On a transition, the automaton can either move to another state or stay in the same state.
o The states of a finite automaton are either accepting (final) or non-accepting. When the whole
input string has been processed and the automaton is in a final state, the string is accepted;
otherwise it is rejected.
Formal Definition of FA
A finite automaton is a 5-tuple (Q, ∑, δ, q0, F), where:
1. Q: finite set of states
2. ∑: finite set of input symbols
3. q0: initial state (q0 ∈ Q)
4. F: set of final states (F ⊆ Q)
5. δ: transition function
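As a minimal sketch of the 5-tuple above, the following Python code encodes a small deterministic automaton and runs it on an input string. The machine itself (two states, accepting binary strings that end in 1) and the names DFA, accepts and ENDS_IN_ONE are illustrative assumptions, not part of these notes.

```python
from typing import NamedTuple

class DFA(NamedTuple):
    states: frozenset     # Q
    alphabet: frozenset   # Sigma
    delta: dict           # transition function: (state, symbol) -> state
    start: str            # q0
    finals: frozenset     # F

def accepts(m: DFA, word: str) -> bool:
    """Run m on word, reading one symbol at a time from left to right."""
    state = m.start
    for symbol in word:
        if symbol not in m.alphabet:
            return False  # symbol outside the alphabet: reject
        state = m.delta[(state, symbol)]
    return state in m.finals

# Hypothetical example machine: accepts binary strings ending in 1.
ENDS_IN_ONE = DFA(
    states=frozenset({"q0", "q1"}),
    alphabet=frozenset({"0", "1"}),
    delta={("q0", "0"): "q0", ("q0", "1"): "q1",
           ("q1", "0"): "q0", ("q1", "1"): "q1"},
    start="q0",
    finals=frozenset({"q1"}),
)
```

For example, accepts(ENDS_IN_ONE, "0101") is true because the run ends in the final state q1, while the empty string is rejected because the run ends in q0.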
Finite Automata Model: A finite automaton can be represented by an input tape and a finite control.
Input tape: A linear tape divided into cells. Each cell holds one input symbol.
Finite control: The finite control decides the next state on receiving a particular input symbol
from the input tape. The tape reader reads the cells one by one from left to right, one symbol at a
time.
Types of Automata:
There are two types of finite automata:
1. DFA (deterministic finite automata)
2. NFA (non-deterministic finite automata)
1. DFA
DFA stands for deterministic finite automaton. Deterministic refers to the uniqueness of the
computation. In a DFA, the machine moves to exactly one state for each input symbol. A DFA does
not allow null (ε) moves.
2. NFA
NFA stands for non-deterministic finite automaton. From a given state, an NFA may move to any
number of states (zero, one, or more) on a particular input symbol. It can also make null (ε)
moves, changing state without reading any input.
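The difference between the two kinds of machine can be sketched by simulating an NFA as a set of simultaneously active states. The example machine, the use of the empty string "" to label null (ε) moves, and the function names are assumptions made for illustration.

```python
def epsilon_closure(states, delta):
    """All states reachable from `states` using only null (epsilon) moves."""
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, ""), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure

def nfa_accepts(delta, start, finals, word):
    """Track every state the NFA could be in after each input symbol."""
    current = epsilon_closure({start}, delta)
    for symbol in word:
        moved = set()
        for q in current:
            moved |= delta.get((q, symbol), set())
        current = epsilon_closure(moved, delta)
    return bool(current & set(finals))

# Hypothetical NFA for a*b*: state p loops on a, a null move leads to q,
# and q loops on b. Missing entries mean "no move".
NFA_DELTA = {
    ("p", "a"): {"p"},
    ("p", ""): {"q"},
    ("q", "b"): {"q"},
}
```

With q as the only final state, the strings "aabb" and "" are accepted, while "ba" is rejected because no active state remains after reading the final a.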
Regular Expression
o The language accepted by a finite automaton can be described by simple expressions called
Regular Expressions. They are a compact way to represent any regular language.
o The languages described by regular expressions are referred to as Regular languages.
o A regular expression can also be described as a pattern that defines a set of strings.
o Regular expressions are used to match character combinations in strings. String-searching
algorithms use such patterns to locate matches in a string.
For instance:
In a regular expression, x* means zero or more occurrences of x. It can generate {ε, x, xx, xxx,
xxxx, ...}
In a regular expression, x+ means one or more occurrences of x. It can generate {x, xx, xxx, xxxx,
...}
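The two operators can be tried out with Python's standard re module. This is a small sketch; the helper name matches and the sample strings are chosen here for illustration. re.fullmatch is used so that the whole string, not just a prefix, must fit the pattern.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """True if the entire text is generated by the pattern."""
    return re.fullmatch(pattern, text) is not None

SAMPLES = ["", "x", "xx", "xxx", "xy"]

# x* accepts the empty string; x+ does not.
star = [s for s in SAMPLES if matches(r"x*", s)]
plus = [s for s in SAMPLES if matches(r"x+", s)]
```

Here star contains the empty string and the strings x, xx, xxx, while plus contains only x, xx, xxx.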
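Concatenation: If L and M are regular languages, then their concatenation is also a regular
language.
1. L.M = {st | s is in L and t is in M}
Kleene closure: If L is a regular language, then its Kleene closure L* is also a regular language.
1. L* = zero or more strings of L concatenated together.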
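For small finite languages, the set {st | s is in L and t is in M} and the Kleene closure can be computed directly. The sketch below is illustrative only: the function names are assumptions, and because L* is usually infinite, kleene_star lists only the strings up to a chosen length.

```python
def concat(L, M):
    """Every string of L followed by every string of M."""
    return {s + t for s in L for t in M}

def kleene_star(L, max_len):
    """Strings of L* whose length is at most max_len."""
    result = {""}        # zero occurrences gives the empty string
    frontier = {""}
    while frontier:
        # Extend the newest strings by one more string of L.
        frontier = {s + t for s in frontier for t in L
                    if len(s + t) <= max_len} - result
        result |= frontier
    return result
```

For instance, kleene_star({"ab"}, 4) yields the empty string, ab and abab.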
Conversion of RE to FA
To convert an RE to an FA, we use a method called the subset method. This method obtains an FA
from the given regular expression in the following steps:
Step 1: Design a transition diagram for the given regular expression, using an NFA with ε moves.
Step 2: Convert this NFA with ε moves to an NFA without ε moves.
Step 3: Convert the obtained NFA to an equivalent DFA.
Example 1:
Design an FA from the given regular expression 10 + (0 + 11)0*1.
Solution: First we will construct the transition diagram for the given regular expression.
[Figures: transition diagrams for Step 1 to Step 5 of the construction]
Now we have an NFA without ε moves. To convert it into the required DFA, we first write the
transition table for this NFA:
State     0       1
→q0       q3      {q1, q2}
q1        qf      ϕ
q2        ϕ       q3
q3        q3      qf
*qf       ϕ       ϕ
The equivalent DFA will be:
State       0       1
→[q0]       [q3]    [q1, q2]
[q1]        [qf]    ϕ
[q2]        ϕ       [q3]
[q3]        [q3]    [qf]
[q1, q2]    [qf]    [q3]
*[qf]       ϕ       ϕ
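The conversion from the NFA table to the DFA table can be sketched in Python. The NFA entries below are copied from the NFA transition table of this example; the function name subset_construction and the use of frozenset for DFA states are choices made here, not part of the notes.

```python
from collections import deque

# NFA transition table from the example; missing entries mean the empty set.
NFA = {
    ("q0", "0"): {"q3"}, ("q0", "1"): {"q1", "q2"},
    ("q1", "0"): {"qf"},
    ("q2", "1"): {"q3"},
    ("q3", "0"): {"q3"}, ("q3", "1"): {"qf"},
}

def subset_construction(nfa, start, alphabet=("0", "1")):
    """Build DFA transitions over the sets of NFA states reachable from start."""
    start_set = frozenset({start})
    dfa, queue, seen = {}, deque([start_set]), {start_set}
    while queue:
        current = queue.popleft()
        for symbol in alphabet:
            # The DFA target is the union of the NFA targets of all members.
            target = frozenset(r for q in current
                               for r in nfa.get((q, symbol), ()))
            dfa[(current, symbol)] = target
            if target and target not in seen:
                seen.add(target)
                queue.append(target)
    return dfa

DFA_TABLE = subset_construction(NFA, "q0")
```

Only subsets reachable from [q0] are generated, so [q1] and [q2] do not appear as separate rows in DFA_TABLE. For the state [q1, q2], the sketch yields [qf] on input 0 (from q1) and [q3] on input 1 (from q2).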
Regular Languages
A language is regular if it can be described by a regular expression.
The Regular Languages (LREG) is the set of all languages that can be represented by a regular
expression, i.e. a set of sets of strings.
A language is said to be a Regular language if and only if some finite state machine (FSM)
recognizes it.
So which languages are not regular?
A language is not regular if
- it is not recognized by any FSM, that is,
- recognizing it requires unbounded memory.
The memory of an FSM is very limited: it cannot store or count arbitrarily long strings.
The languages accepted by all DFAs form the family of regular languages.
The language defined by a regular grammar is known as a regular language. Regular expressions are
an important notation for specifying patterns. Each pattern matches a set of strings, so regular
expressions serve as names for sets of strings. Programming language tokens can be described by
regular languages.
Regular expressions describe languages by defining a pattern for finite strings of symbols. The
grammar defined by regular expressions is known as regular grammar.
There are two Pumping Lemmas, which are defined for 1. Regular Languages and 2. Context-Free
Languages.
Pumping Lemma for Regular Languages
For any regular language L, there exists an integer n such that for all x ∈ L with |x| ≥ n, there
exist u, v, w ∈ Σ* such that x = uvw and
(1) |uv| ≤ n
(2) |v| ≥ 1
(3) for all i ≥ 0: uv^i w ∈ L
In simple terms, this means that if the substring v is 'pumped', i.e., if v is repeated any number
of times (or removed), the resulting string still remains in L.
The Pumping Lemma is used to prove that a language is not regular. If a language is regular, it
always satisfies the pumping lemma. So if, for every n, there is a string x ∈ L with |x| ≥ n such
that every split x = uvw satisfying (1) and (2) produces some pumped string uv^i w that is not in
L, then L is surely not regular. The converse is not always true: if the Pumping Lemma holds, it
does not mean that the language is regular.
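The way the lemma is used against a non-regular language can be sketched as an exhaustive search over splits. The example language {a^k b^k}, the function names, and the bound max_i on the tested values of i are assumptions made for illustration. A result of False means every split with |uv| ≤ n and |v| ≥ 1 fails for that string; a result of True is only evidence, since only finitely many values of i are tested.

```python
def is_anbn(s: str) -> bool:
    """Membership test for L = { a^k b^k : k >= 0 }."""
    k = len(s) // 2
    return s == "a" * k + "b" * k

def pumpable_split_exists(x, n, member, max_i=3):
    """Search for a split x = uvw with |uv| <= n and |v| >= 1 such that
    u v^i w stays in the language for every tested i in 0..max_i."""
    for uv_len in range(1, min(n, len(x)) + 1):
        for u_len in range(uv_len):
            u, v, w = x[:u_len], x[u_len:uv_len], x[uv_len:]
            if all(member(u + v * i + w) for i in range(max_i + 1)):
                return True
    return False

# For n = 4 take x = a^4 b^4: v can only contain a's, so pumping breaks
# the balance between a's and b's in every split.
RESULT = pumpable_split_exists("aaaabbbb", 4, is_anbn)
```

RESULT is False here. A full proof repeats the same argument for an arbitrary n with x = a^n b^n, which the code cannot do; it only checks one value of n.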
Derivations in a context-free grammar can be modelled as parse trees. The nodes of the tree
represent the symbols and the edges represent the use of production rules. The leaves of the tree
are the end result (terminal symbols) that make up the string the grammar is generating with that
particular sequence of symbols and production rules. If a grammar can produce the same string
with more than one parse tree, the grammar is said to be ambiguous.
A sentential form is any string derivable from the start symbol. Thus, in the derivation of
a + a * a, the strings E + T * F, E + F * a, and F + a * a are all sentential forms, as are E and
a + a * a themselves. A sentence is a sentential form consisting only of terminals, such as
a + a * a.
The derivation tree is a graphical representation of a derivation that uses the production rules
of a context-free grammar (CFG). It shows how some string can be obtained from a given set of
production rules. There are two types of derivation. In a leftmost derivation, the leftmost
non-terminal of the sentential form is replaced at each step, so the string is produced from left
to right. In a rightmost derivation, the rightmost non-terminal is replaced at each step, so the
string is produced from right to left.
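A leftmost derivation of a + a * a can be traced step by step. The notes do not list the grammar, so the sketch assumes the usual expression grammar E → E + T | T, T → T * F | F, F → a; the function names and the STEPS list are likewise illustrative.

```python
NONTERMINALS = {"E", "T", "F"}

def leftmost_step(form, head, body):
    """Replace the leftmost non-terminal of `form`, which must be `head`."""
    for i, sym in enumerate(form):
        if sym in NONTERMINALS:
            if sym != head:
                raise ValueError(f"leftmost non-terminal is {sym}, not {head}")
            return form[:i] + body + form[i + 1:]
    raise ValueError("no non-terminal left to expand")

# Productions applied, in order, in a leftmost derivation of a + a * a.
STEPS = [
    ("E", ["E", "+", "T"]),
    ("E", ["T"]),
    ("T", ["F"]),
    ("F", ["a"]),
    ("T", ["T", "*", "F"]),
    ("T", ["F"]),
    ("F", ["a"]),
    ("F", ["a"]),
]

def derive(start, steps):
    """Return the list of sentential forms produced by the steps."""
    forms = [[start]]
    for head, body in steps:
        forms.append(leftmost_step(forms[-1], head, body))
    return [" ".join(form) for form in forms]
```

Calling derive("E", STEPS) produces the sentential forms E, E + T, T + T, F + T, a + T, a + T * F, a + F * F, a + a * F and finally the sentence a + a * a.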
A CFG is not always optimized: the grammar may contain extra symbols (non-terminals) that are not
needed. Such symbols unnecessarily increase the length of the grammar. Simplification of a
grammar means reducing it by removing useless symbols. Simplification essentially comprises the
following steps: reduction of the CFG, removal of unit productions, and removal of null
productions.
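The reduction step can be sketched as two passes: first drop non-terminals that can never derive a string of terminals, then drop those that cannot be reached from the start symbol. The dictionary representation, the function name remove_useless and the example grammar are assumptions made for illustration; unit and null productions are not handled here.

```python
def remove_useless(productions, start):
    """productions maps each non-terminal to a list of bodies (lists of
    symbols). Symbols that are not keys are treated as terminals."""
    nonterminals = set(productions)

    def usable(body, allowed):
        return all(s not in nonterminals or s in allowed for s in body)

    # Pass 1: find non-terminals that can derive a terminal string.
    generating, changed = set(), True
    while changed:
        changed = False
        for head, bodies in productions.items():
            if head not in generating and any(usable(b, generating) for b in bodies):
                generating.add(head)
                changed = True
    kept = {h: [b for b in bodies if usable(b, generating)]
            for h, bodies in productions.items() if h in generating}

    # Pass 2: keep only non-terminals reachable from the start symbol.
    reachable = set()
    stack = [start] if start in kept else []
    while stack:
        head = stack.pop()
        if head in reachable:
            continue
        reachable.add(head)
        for body in kept[head]:
            stack.extend(s for s in body if s in kept)
    return {h: kept[h] for h in kept if h in reachable}

# Hypothetical grammar: B never terminates and C is never reached.
GRAMMAR = {
    "S": [["A", "B"], ["a"]],
    "A": [["b"]],
    "B": [["B", "c"]],
    "C": [["d"]],
}
```

For GRAMMAR, only S → a survives: B is removed because it cannot produce a terminal string, which removes S → A B, and after that neither A nor C can be reached from S.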
Pushdown automata
Pushdown automata are nondeterministic finite state machines augmented with additional memory
in the form of a stack, which is why the term “pushdown” is used: elements are pushed down
onto the stack. Pushdown automata are computational models (theoretical computer-like
machines) that can do more than a finite state machine, but less than a Turing machine. A
pushdown automaton is formally defined as a 7-tuple (Q, Σ, Γ, δ, q0, Z, F): a set of states Q, an
input alphabet Σ, a stack alphabet Γ, a transition function δ, a start state q0, an initial stack
symbol Z, and a set of accepting states F.
Pushdown automata accept exactly the context-free languages, which include the regular languages.
The language of strings that have matching parentheses is a context-free language. Say
that a programmer has written some code, and in order for the code to be valid, any parentheses
must be matched. One way to check this would be to feed the code (as strings) into a pushdown
automaton programmed with transition functions that implement the context-free grammar for the
language of balanced parentheses. If the code is valid and all parentheses are matched, the
pushdown automaton will "accept" the code. If there are unbalanced parentheses, the pushdown
automaton will be able to report to the programmer that the code is not valid. This is one of the
more theoretical ideas behind computer parsers and compilers.
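The parenthesis check can be sketched with an explicit stack. This is a deterministic, single-state simplification of a pushdown automaton, written under the assumption that every character other than "(" and ")" is ignored; the function name balanced is chosen here.

```python
def balanced(code: str) -> bool:
    """Accept iff every '(' in code is closed by a later ')' and
    no ')' appears without an open '(' before it."""
    stack = []
    for ch in code:
        if ch == "(":
            stack.append(ch)      # push on an opening parenthesis
        elif ch == ")":
            if not stack:
                return False      # nothing on the stack to match
            stack.pop()           # pop on a closing parenthesis
    return not stack              # accept only with an empty stack
```

For example, balanced("f(a, g(b))") is true, while balanced("f(a))") and balanced("((a)") are false.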
Pushdown automata can be useful when thinking about parser design and any area where
context-free grammars are used, such as in computer language design. Since pushdown automata are
equal in power to context-free grammars, there are two ways of proving that a language is
context-free: provide a context-free grammar or provide a pushdown automaton for the language.
A non-deterministic pushdown automaton (NPDA), or just pushdown automaton (PDA) is a
variation on the idea of a non-deterministic finite automaton (NDFA). Unlike an NDFA, a PDA is
associated with a stack (hence the name pushdown). The transition function must also take into
account the “state” of the stack.
Turing machines, first described by Alan Turing, are simple abstract computational devices
intended to help investigate the extent and limitations of what can be computed.
• The Turing machine model uses an infinite tape as its unlimited memory.
• It has a tape head that can read and write symbols and move around on the tape.
• Initially the tape contains only the input string and is blank everywhere else.
A Turing machine can be described using:
1. Instantaneous descriptions
2. Transition table, and
3. Transition diagram (transition graph).
Instantaneous descriptions using move-relations (├ )
- An instantaneous description (ID) is a snapshot of the machine, defined in terms of the entire
input string and the current state.
- An ID of a Turing machine M is a string αβγ, where
- β is the present state of M,
- the entire input string is split as αγ, where the first symbol of γ is the current symbol a
under the R/W head and the rest of γ consists of all the subsequent symbols of the input string, and
- the string α is the substring of the input string formed by all the symbols to the left of a.
Example: A snapshot of a Turing machine is shown in the figure below. Obtain the instantaneous
description.
Solution:
- The present symbol under the R/W head is a1.
- The present state is q3. So a1 is written to the right of q3.
- The non-blank symbols to the left of a1 form the string a4a1a2a1a2a2, which is written to the
left of q3.
- The sequence of non-blank symbols to the right of a1 is a4a2.
Thus the ID is a4a1a2a1a2a2 q3 a1a4a2, as given in the figure below.
NB. - For constructing the ID, we simply insert the current state in the input string to the left of
the symbol under the R/W head.
- We observe that the blank symbol may occur as part of the left or right substring.
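The rule for constructing an ID can be sketched as a small function. The list representation of the tape, the function name and the 0-based head position are assumptions made for illustration; the tape contents are taken from the solution text of the example.

```python
def instantaneous_description(tape, head, state):
    """Insert the current state to the left of the symbol under the R/W head.

    tape  : list of symbols on the non-blank portion of the tape
    head  : 0-based index of the cell under the R/W head
    state : name of the current state
    """
    left = "".join(tape[:head])    # symbols to the left of the head
    right = "".join(tape[head:])   # current symbol and everything after it
    return f"{left} {state} {right}".strip()

# Tape from the example: the head is on the seventh cell, which holds a1.
TAPE = ["a4", "a1", "a2", "a1", "a2", "a2", "a1", "a4", "a2"]
EXAMPLE_ID = instantaneous_description(TAPE, 6, "q3")
```

EXAMPLE_ID is the string "a4a1a2a1a2a2 q3 a1a4a2", with the state written immediately to the left of the scanned symbol a1.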
Transition table
We give the definition of transition function δ in the form of a table called the transition table.
– If δ(q, a) = (γ, α, β), we write αβγ under the a–column and in the q–row.
– So if we get αβγ in the table, it means that α is written in the current cell, β gives the
movement of the head (L or R) and γ denotes the new state into which the Turing machine
enters.
– Consider, for example, a Turing machine with five states q1, ..., q5, where q1 is the initial
state and q5 is the (only) final state, and the tape symbols (Γ) are 0, 1 and □.
– The transition table given in the Table below describes δ.
Note: The initial state is marked with → and the final state is marked with a circle.
● We describe the computation sequence in terms of the contents of the tape and the current
state.
Transition diagram (transition graph).
We can use the transition systems (diagrams) to represent Turing machines.
– The states are represented by vertices.
– Directed edges are used to represent transition of states.
– The labels are triples of the form (α, β, γ), where α, β ∈ Γ (where Γ is the set of tape
symbols) and γ ∈ {L, R}.
When there is a directed edge from qi to qj with label (α, β, γ), it means that δ(qi, α) = (qj, β, γ).
– During the processing of an input string, suppose the Turing machine enters qi and the R/W
head scans the (present) symbol α.
– As a result, the symbol β is written in the cell under the R/W head.
– The R/W head moves to the left or to the right depending on γ, and the new state is qj.
– Every edge in the transition system can be represented by a 5-tuple (qi, α, β, γ, qj).
– So each Turing machine can be described by the sequence of 5-tuples representing all the
directed edges.
– The initial state is indicated by → and any final state is marked with O.
There are three possible outcomes of executing a Turing machine over a given input.
(1) Halt and accept the input
(2) Halt and reject the input
(3) Never halt
A Turing machine accepts the input if it halts in a final state.
The input is not accepted if the machine halts in a non-final state (reject) or if it enters an
infinite loop (never halts).
In an infinite loop, the final state cannot be reached: the machine never halts and the input is
not accepted.
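The 5-tuple description (qi, α, β, γ, qj) and the three outcomes can be sketched together in a small simulator. The two machines below are hypothetical examples, not taken from these notes, and the step budget max_steps is an assumption: a program cannot detect a true infinite loop, so running out of steps is reported as None.

```python
BLANK = "□"

def run(edges, start, finals, word, max_steps=1000):
    """Simulate a Turing machine given as 5-tuples (qi, read, write, move, qj).
    Returns True (halt and accept), False (halt and reject), or None when
    the step budget runs out (treated here as 'never halt')."""
    delta = {(qi, a): (b, move, qj) for qi, a, b, move, qj in edges}
    tape = dict(enumerate(word))      # sparse tape: position -> symbol
    state, head = start, 0
    for _ in range(max_steps):
        if state in finals:
            return True
        key = (state, tape.get(head, BLANK))
        if key not in delta:
            return False              # halted in a non-final state
        write, move, state = delta[key]
        tape[head] = write
        head += 1 if move == "R" else -1
    return True if state in finals else None

# Hypothetical machine: accepts binary strings with an even number of 1s.
EVEN_ONES = [
    ("q1", "0", "0", "R", "q1"), ("q1", "1", "1", "R", "q2"),
    ("q2", "0", "0", "R", "q2"), ("q2", "1", "1", "R", "q1"),
    ("q1", BLANK, BLANK, "R", "q3"),
]

# Hypothetical machine that moves right over blanks forever.
LOOP = [("p", BLANK, BLANK, "R", "p")]
```

With q3 as the final state, run(EVEN_ONES, "q1", {"q3"}, "0110") returns True and run(EVEN_ONES, "q1", {"q3"}, "010") returns False, while run(LOOP, "p", set(), "", max_steps=50) returns None.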
A function F is computable if there is a Turing machine that, started in its initial state with
an input w on the tape, reaches a final state and halts with F(w) on the tape.
Decidable and Undecidable problem
Computability
Recursive function
A function f is recursive if it can be obtained from the initial functions by a finite number of
applications of composition, recursion and minimization over regular functions.
A recursive function is one that calls upon itself to determine the solution.
Example: F(0) = 3, F(1) = 5, F(n+1) = F(n) + F(n-1) for all n > 0. Recursive definitions of this
kind can be used to solve many problems.
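The example definition translates directly into a function that calls itself. In this sketch the recurrence is rewritten as F(n) = F(n-1) + F(n-2) for n ≥ 2, which is the same rule with the index shifted; the use of functools.lru_cache to avoid recomputing values is a choice made here, not part of the definition.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def F(n: int) -> int:
    """F(0) = 3, F(1) = 5, F(n) = F(n-1) + F(n-2) for n >= 2."""
    if n < 0:
        raise ValueError("F is defined only for n >= 0")
    if n == 0:
        return 3
    if n == 1:
        return 5
    return F(n - 1) + F(n - 2)   # the function calls upon itself
```

The first values are F(2) = 8, F(3) = 13, F(4) = 21 and F(5) = 34.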
Recursive Language and recursively enumerable languages
• Space Complexity: Determine the approximate memory required to solve a problem of size
n.
• The function f(n) is O(g(n)) if there exist positive numbers c and N such that f(n) ≤ c·g(n)
for all n ≥ N.
• If an operation always completes in the same amount of CPU time regardless of the input
size, it is called a constant time operation.
• If it always uses the same amount of memory regardless of the input size, it is called a
constant space operation.
• Example: if an algorithm increments each number in a list of length n, this algorithm runs in
O(n) time and performs O(1) work for each element.
• More precisely, this means that there is a constant c such that the running time is at
most cn for every input of size n.
• If an algorithm’s time/space usage only grows linearly with the number of elements in the
input, then it has linear time/space complexity.
• Linear time is the best possible time complexity in situations where the algorithm has to
sequentially read its entire input.
• More formally, an algorithm is exponential time if T(n) is bounded by O(2^(n^k)) for some
constant k.
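The linear-time example and the definition of O(g(n)) can be illustrated with a short sketch. The function names, the explicit step counter and the sample functions are assumptions made for illustration. Checking the inequality over a finite range gives evidence for a choice of c and N, not a proof.

```python
def increment_all(numbers):
    """Return a new list with each number incremented, together with the
    number of basic steps performed (one O(1) step per element)."""
    steps = 0
    result = []
    for x in numbers:
        result.append(x + 1)
        steps += 1
    return result, steps

def is_bounded(f, g, c, N, upto):
    """Check f(n) <= c*g(n) for every n with N <= n <= upto."""
    return all(f(n) <= c * g(n) for n in range(N, upto + 1))

# f(n) = 3n + 5 is O(n): the witnesses c = 4 and N = 5 work.
LINEAR_OK = is_bounded(lambda n: 3 * n + 5, lambda n: n, c=4, N=5, upto=1000)

# f(n) = n^2 is not bounded by 10n once n exceeds 10.
QUADRATIC_OK = is_bounded(lambda n: n * n, lambda n: n, c=10, N=1, upto=100)
```

For a list of length n, increment_all performs exactly n steps, so its running time grows linearly with the input size.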