Introduction to Theory of Computation
Last Updated: 28 Jan, 2025
The Theory of Computation, often introduced through automata theory (one of its core branches), is a field within computer science and mathematics that studies abstract machines and mathematical models of computation in order to understand what computation can and cannot do.
Why Do We Study Theory of Computation?
Automata theory is a fundamental area of computer science, with real-world applications in various systems and domains.
1. Regular Expressions (RE) in Systems
Regular expressions are powerful tools for pattern matching and text processing used extensively in many systems.
Examples:
- UNIX: In UNIX tools such as grep, regular expressions like a.*b are used to match text patterns within files, making it easier to search for specific content across large amounts of text.
- XML and DTDs: Document Type Definitions (DTDs) describe the structure of XML documents using regular-expression-like content models. For example, an element declaration like person (name, addr, child*) requires that a person element contain a name, then an addr, then zero or more child elements.
- Programming Languages: Almost all modern programming languages provide regular expression libraries for text processing.
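As a small illustration of the pattern a.*b mentioned above, here is a sketch using Python's standard re module. The sample lines are made up for this example; re.search looks for the pattern anywhere in a line, which is roughly how a line-oriented search tool behaves.

```python
import re

# Hypothetical sample lines, invented for this illustration.
lines = ["axxb", "ab", "ba", "a then b", "no match here"]

# a.*b : an 'a', then any characters (possibly none), then a 'b'.
pattern = re.compile(r"a.*b")

# Keep the lines that contain a match anywhere in the line.
matching = [line for line in lines if pattern.search(line)]
print(matching)
```

Note that "ba" is rejected because no b appears after the a, while "ab" is accepted because .* may match the empty string.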
2. Finite Automata in Modeling Systems
- Modeling Protocols and Circuits: Finite automata (FA) are used to model protocols, like those in network communication, and to design electronic circuits that operate based on a set of predefined rules or states.
- Model-Checking: FA theory is also applied in model-checking, which is used to verify whether a system behaves as expected under all possible conditions.
3. Context-Free Grammars (CFG)
- Syntax of Programming Languages: Context-free grammars are essential in describing the syntax of most programming languages. They define the rules that specify how programs should be written and structured.
- Natural Language Processing: CFGs also play a vital role in computational linguistics, helping to describe the structure of natural languages like English.
- XML and DTDs as CFGs: DTDs (Document Type Definitions) can be thought of as a specific application of context-free grammars, as they define the structure of XML documents.
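To make the idea of a grammar defining syntax more concrete, the sketch below recognizes balanced parentheses using the context-free grammar S → ( S ) S | ε. The grammar and the function names parse_s and is_balanced are assumptions chosen for illustration; they are not taken from any particular language specification.

```python
def parse_s(s: str, i: int = 0) -> int:
    """Consume one S starting at index i and return the index just after it."""
    # Rule S -> ( S ) S : chosen whenever the next symbol is '('.
    if i < len(s) and s[i] == "(":
        j = parse_s(s, i + 1)           # inner S
        if j < len(s) and s[j] == ")":
            return parse_s(s, j + 1)    # trailing S
        raise ValueError(f"expected ')' at position {j}")
    # Rule S -> epsilon : consume nothing.
    return i


def is_balanced(s: str) -> bool:
    """True if the whole string can be derived from S."""
    try:
        return parse_s(s) == len(s)
    except ValueError:
        return False


print(is_balanced("(()())"))  # balanced
print(is_balanced("(()"))     # missing a closing parenthesis
```

Each function call corresponds to expanding one non-terminal, which is why recursive parsers of this kind map naturally onto context-free grammars.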
Core Areas of the Theory of Computation
The field of computation theory can be broadly divided into four major areas:
1. Automata Theory
Automata theory studies abstract computational models and their applications. It forms the basis for understanding how machines process inputs and produce outputs. Key components include:
- Finite Automata: Used to model simple systems like lexical analyzers in compilers.
- Pushdown Automata: A more powerful model capable of recognizing context-free languages, essential for parsing programming languages.
- Turing Machines: The most powerful automata, used as a standard for defining what is computable.
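As a minimal sketch of how a finite automaton processes input, the following simulates a deterministic finite automaton (DFA) one symbol at a time. The language chosen (strings over {a, b} that end in "ab") and the state names q0, q1, q2 are assumptions made for this example.

```python
# Transition table: (current state, input symbol) -> next state.
# q0: no useful suffix seen, q1: last symbol was 'a', q2: last two were "ab".
TRANSITIONS = {
    ("q0", "a"): "q1", ("q0", "b"): "q0",
    ("q1", "a"): "q1", ("q1", "b"): "q2",
    ("q2", "a"): "q1", ("q2", "b"): "q0",
}
START_STATE = "q0"
ACCEPTING_STATES = {"q2"}


def accepts(word: str) -> bool:
    """Run the DFA on word and report whether it ends in an accepting state."""
    state = START_STATE
    for symbol in word:
        if (state, symbol) not in TRANSITIONS:
            return False  # symbol outside the alphabet {a, b}
        state = TRANSITIONS[(state, symbol)]
    return state in ACCEPTING_STATES


for w in ["ab", "aab", "abb", ""]:
    print(repr(w), accepts(w))
```

The machine keeps only its current state, with no additional memory, which is what limits finite automata to regular languages.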
2. Formal Languages and Grammars
This area examines the syntax and structure of languages used in computation. It involves:
- Regular Languages: Described by regular expressions and finite automata, representing simple patterns.
- Context-Free Languages: Defined by context-free grammars, crucial for designing compilers.
- Chomsky Hierarchy: A classification of languages into regular, context-free, context-sensitive, and recursively enumerable languages.
3. Computability and Decidability
Computability theory addresses the question: What problems can a computer solve? It studies concepts like:
- Decidable Problems: Problems for which some algorithm halts with the correct yes/no answer on every input.
- Undecidable Problems: Problems, such as the Halting Problem, for which no algorithm can determine a solution for all inputs.
4. Complexity Theory
Complexity theory focuses on the efficiency of algorithms by analyzing the time and space resources they require. It categorizes problems into classes such as:
- P (Polynomial Time): Problems solvable in polynomial time.
- NP (Nondeterministic Polynomial Time): Problems whose solutions can be verified in polynomial time.
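- NP-Complete and NP-Hard: NP-Hard problems are at least as hard as every problem in NP and need not belong to NP themselves. NP-Complete problems are the NP-Hard problems that are also in NP, which makes them the hardest problems in NP. Both classes have applications in cryptography, optimization, and artificial intelligence.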
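The difference between solving a problem and verifying a proposed solution can be sketched with Subset Sum, a well-known NP-Complete problem. The function names verify and brute_force, the certificate format (a list of indices), and the sample numbers are assumptions chosen for this illustration.

```python
from itertools import combinations


def verify(numbers, target, certificate):
    """Check a proposed solution (a list of distinct indices) in polynomial time."""
    return (
        len(set(certificate)) == len(certificate)
        and all(0 <= i < len(numbers) for i in certificate)
        and sum(numbers[i] for i in certificate) == target
    )


def brute_force(numbers, target):
    """Search every subset of indices: exponential in len(numbers)."""
    for size in range(len(numbers) + 1):
        for indices in combinations(range(len(numbers)), size):
            if sum(numbers[i] for i in indices) == target:
                return list(indices)
    return None


numbers = [3, 34, 4, 12, 5, 2]          # hypothetical instance
certificate = brute_force(numbers, 9)   # slow search for a solution
print(certificate, verify(numbers, 9, certificate))  # fast check of the result
```

Checking a certificate takes time proportional to its length, while the search may examine up to 2^n subsets. Whether such a search can always be replaced by a polynomial-time algorithm is the P versus NP question.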
Basic Terminologies of Theory of Computation
Now, let's understand the basic terminologies, which are important and frequently used in the Theory of Computation.
1. Symbol
A symbol (often also called a character) is the smallest building block of a string. It can be any letter, digit, or picture.

2. Alphabet (Σ)
An alphabet is a finite, non-empty set of symbols used to construct strings and languages. For example, Σ = {a, b}.

3. String
A string is a finite sequence of symbols from some alphabet. A string is generally denoted as w and the length of a string is denoted as |w|. The empty string is the string with zero occurrences of symbols; it is represented as ε and has length |ε| = 0.
Number of strings of length 2 that can be generated over the alphabet {a, b}:
aa
ab
ba
bb
Length of string |w| = 2
Number of strings = 4
Conclusion:
For the alphabet {a, b}, the number of strings of length n is 2^n. In general, an alphabet with |Σ| symbols yields |Σ|^n strings of length n.
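The count above can be checked by enumeration. The sketch below uses itertools.product from the Python standard library; the helper name strings_of_length is an assumption introduced for this example.

```python
from itertools import product


def strings_of_length(alphabet, n):
    """Return every string of length n over the given alphabet, in sorted order."""
    return ["".join(symbols) for symbols in product(sorted(alphabet), repeat=n)]


alphabet = {"a", "b"}
for n in range(4):
    words = strings_of_length(alphabet, n)
    # The number of strings of length n equals |alphabet| ** n.
    print(n, len(words), len(alphabet) ** n)
```

For n = 0 the only string is the empty string, so the count is 1, which agrees with 2^0.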
Automata theory is used to model computational problems, which improves the understanding and design of systems such as compilers and interpreters.
Closure Representation in TOC
1. L+: The Positive Closure of L, the set of all strings formed by concatenating one or more strings from L. It does not contain ε unless ε is itself in L.
2. L*: The Kleene Closure of L, the set of all strings formed by concatenating zero or more strings from L. It always contains ε.
From the above two statements, it can be concluded that:
L* = {ε} ∪ L+
Example:
(a) Regular expression for the language of all strings of zero or more g's over Σ = {g}:
R = g*
R = {ε, g, gg, ggg, gggg, ggggg, ...}
(b) Regular expression for the language of all strings of one or more g's over Σ = {g}:
R = g+
R = {g, gg, ggg, gggg, ggggg, gggggg, ...}
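Because both closures are infinite, code can only list their members up to some length bound. The following is a minimal sketch under that assumption; the function name closure_up_to and its parameters are hypothetical, and the empty Python string "" stands for ε.

```python
def closure_up_to(language, max_len, include_empty):
    """Strings of L* (include_empty=True) or L+ (False) of length <= max_len."""
    result = {""} if include_empty else set()
    frontier = {""}  # strings built so far that can still be extended
    while frontier:
        new_strings = set()
        for prefix in frontier:
            for word in language:
                candidate = prefix + word
                if len(candidate) <= max_len and candidate not in result:
                    new_strings.add(candidate)
        result |= new_strings
        frontier = new_strings
    return result


L = {"g"}
star = closure_up_to(L, 3, include_empty=True)    # part of g*
plus = closure_up_to(L, 3, include_empty=False)   # part of g+
print(sorted(star, key=len))
print(sorted(plus, key=len))
print(star == plus | {""})
```

The last line checks, within the length bound, that the Kleene closure is the positive closure together with the empty string.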
Note: Σ* is the set of all finite strings over Σ, including ε. It is not the power set of Σ: its elements are strings (ordered sequences in which symbols may repeat), not subsets. Every language over Σ is a subset of Σ*. The * operation is called the "Kleene Star".
Kleene Star is also called the "Kleene Operator" or "Kleene Closure". It is a unary operator: applied to a set of symbols or strings, it produces every string that can be obtained by concatenating zero or more elements of that set. No element is required to appear, and any element may be repeated any number of times.
Example:
Input String: "GFG", whose symbols give the alphabet Σ = {G, F}.
Σ* = { ε, "G", "F", "GG", "GF", "FG", "FF", "GFG", "GGFG", "GFGGGGGGGG", "GGGGGGGGFFFFFFFFFGGGGGGGG", ... }
(Σ* is always an infinite set for a non-empty alphabet. A language defined by additional rules, such as a grammar, selects a subset of Σ*, and that subset may be finite.
Note that ε always belongs to Σ*.)
Language
- A language is a set of strings formed using the symbols of a given alphabet Σ.
- Formally, a language is a subset of Σ*, where Σ* is the set of all possible strings (including ε) over the alphabet Σ.
Examples of Languages:
Finite Language:
L1 = { set of all strings of length 2 over {x, y} }
L1 = { xy, yx, xx, yy }
Infinite Language:
L2 = { set of all strings over {a, b} that start with 'b' }
L2 = { b, ba, baa, bbb, baab, babb, ... }
Types of Languages in TOC
Languages are classified based on the computational model or grammar generating them:
- Regular Languages: Defined using regular expressions or finite automata. Example: L = {a^n | n ≥ 0}.
- Context-Free Languages: Defined using context-free grammars or pushdown automata. Example: L = {a^n b^n | n ≥ 0}.
- Context-Sensitive Languages: Defined using context-sensitive grammars or linear-bounded automata.
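- Recursive and Recursively Enumerable Languages: Defined using Turing machines. A recursive language is decided by a Turing machine that halts on every input; a recursively enumerable language is recognized by a Turing machine that may run forever on strings outside the language.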
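The two example languages in the list above can be contrasted in code. The sketch below is an illustration under stated assumptions: in_a_star and in_an_bn are hypothetical names, the first uses a regular expression from Python's re module, and the second imitates a pushdown automaton by keeping an explicit stack.

```python
import re


def in_a_star(word: str) -> bool:
    """Membership in {a^n | n >= 0}: a regular expression is enough."""
    return re.fullmatch(r"a*", word) is not None


def in_an_bn(word: str) -> bool:
    """Membership in {a^n b^n | n >= 0}, using a stack to count the a's."""
    stack = []
    i = 0
    # Phase 1: push one marker per leading 'a'.
    while i < len(word) and word[i] == "a":
        stack.append("a")
        i += 1
    # Phase 2: pop one marker per 'b' while markers remain.
    while i < len(word) and word[i] == "b" and stack:
        stack.pop()
        i += 1
    # Accept only if all input was consumed and every 'a' was matched.
    return i == len(word) and not stack


for w in ["", "aaa", "ab", "aabb", "aab", "abab"]:
    print(repr(w), in_a_star(w), in_an_bn(w))
```

The second recognizer needs memory that grows with the input, which a finite automaton does not have. This is the intuition behind a^n b^n being context-free but not regular.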