Introduction to Theory of Computation

Last Updated : 25 Aug, 2025

The Theory of Computation is a field within computer science and mathematics that studies abstract machines to understand the capabilities and limitations of computation. Automata theory, one of its core areas, analyzes mathematical models of how machines perform calculations.

Let's start with the basic terminology that is used throughout the Theory of Computation.

1. Symbol

A symbol (often also called a character) is the smallest building block of a string. It can be any letter, digit, or picture.

2. Alphabet (Σ)

An alphabet is a finite, non-empty set of symbols used to construct strings and languages. For example, Σ = {a, b}.

3. String

A string is a finite sequence of symbols from some alphabet. A string is generally denoted by w, and its length by |w|. The empty string is the string with zero symbols, denoted by ε, so |ε| = 0.

Number of strings of length 2 that can be generated over the alphabet {a, b}:

1st symbol   2nd symbol
a            a
a            b
b            a
b            b

Length of string |w| = 2
Number of strings = 4

Conclusion:
For the alphabet {a, b}, the number of strings of length n that can be generated is 2ⁿ.
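The count above can be checked by enumerating the strings directly. The following is a minimal Python sketch; the function name `strings_of_length` is made up for illustration. It relies on the general fact that an alphabet with k symbols has kⁿ strings of length n.

```python
from itertools import product

def strings_of_length(alphabet, n):
    """Return every string of length n over the alphabet, in sorted order."""
    return ["".join(symbols) for symbols in product(sorted(alphabet), repeat=n)]

sigma = {"a", "b"}
words = strings_of_length(sigma, 2)
print(words)       # ['aa', 'ab', 'ba', 'bb']
print(len(words))  # 4, the same as len(sigma) ** 2
```

For n = 0 the function returns a list holding only the empty string, which matches 2⁰ = 1.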

Automata theory is used to model computational problems, which improves the understanding and design of systems such as compilers and interpreters.

Closure Representation in TOC

1. L⁺: The positive closure of L. It is the set of all strings formed by concatenating one or more strings from L. It contains ε only if L itself contains ε.

2. L*: The Kleene closure of L. It is the set of all strings formed by concatenating zero or more strings from L. The empty string ε is always included.

From the above two statements, it can be concluded that:

L* = {ε} ∪ L⁺

Example:

(a) Regular expression for the language of zero or more g's over Σ = {g}:
R = g*
L(R) = {ε, g, gg, ggg, gggg, ggggg, ...}

(b) Regular expression for the language of one or more g's over Σ = {g}:
R = g⁺
L(R) = {g, gg, ggg, gggg, ggggg, gggggg, ...}
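The difference between the two closures can be tried out with Python's standard `re` module, whose `*` and `+` operators have the same meaning for these simple patterns. This is only an illustrative sketch; the helper name `accepts` is an assumption, not part of any library.

```python
import re

STAR = re.compile(r"g*")  # zero or more g's (Kleene closure)
PLUS = re.compile(r"g+")  # one or more g's (positive closure)

def accepts(pattern, word):
    """Return True when the whole word matches the pattern."""
    return pattern.fullmatch(word) is not None

for word in ["", "g", "ggg", "gag"]:
    # Columns: the word, accepted by g*, accepted by g+
    print(repr(word), accepts(STAR, word), accepts(PLUS, word))
```

Only the empty string separates the two: `g*` accepts it and `g+` rejects it.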

Note: Σ* is the set of all finite strings over Σ, including ε. It is not the power set of Σ: its elements are strings, in which symbols may repeat and order matters, rather than subsets of Σ. Every language over Σ is a subset of Σ*. The * operator is called the "Kleene Star".

The Kleene Star is also called the "Kleene Operator" or "Kleene Closure". It is a unary operator: applied to a set of characters or symbols, it produces every string that can be built by concatenating zero or more elements of that set. No element is required to appear, each element may be repeated any number of times, and the elements may be combined in any order.

Example:

Input alphabet: Σ = {G, F}.
Σ* = { ε, "G", "F", "GG", "GF", "FG", "FF", "GFG", "GGFG", "GFGGGGGGGG", "GGGGGGGGFFFFFFFFFGGGGGGGG", ... }
(Σ* is an infinite set whenever Σ is non-empty. A language defined by grammar rules is a subset of Σ* and may be finite.
Please note that ε is always included in the Kleene star.)
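Because the Kleene star of a non-empty set is infinite, a program can only list its members up to some length bound. The sketch below does that for any finite set of strings. The names `kleene_star` and `max_length` are hypothetical, chosen only for this illustration.

```python
def kleene_star(language, max_length):
    """Return the strings of L* whose length is at most max_length."""
    result = {""}      # ε is always a member of L*
    frontier = {""}    # strings found in the previous round
    while frontier:
        next_frontier = set()
        for prefix in frontier:
            for word in language:
                candidate = prefix + word
                # Skip ε (it adds nothing new) and anything over the bound.
                if word and len(candidate) <= max_length and candidate not in result:
                    result.add(candidate)
                    next_frontier.add(candidate)
        frontier = next_frontier
    # Order by length first, then alphabetically.
    return sorted(result, key=lambda s: (len(s), s))

print(kleene_star({"G", "F"}, 2))  # ['', 'F', 'G', 'FF', 'FG', 'GF', 'GG']
print(kleene_star({"GFG"}, 6))     # ['', 'GFG', 'GFGGFG']
```

The first call treats the symbols G and F as one-letter strings, so it lists Σ* for Σ = {G, F}. The second call applies the star to a language containing the single string "GFG", which only yields repetitions of that whole string.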

Language

A language is a set of strings formed using the symbols of a given alphabet Σ.
Formally, a language is a subset of Σ*, where Σ* is the set of all possible strings (including ε) over the alphabet Σ.

Examples of Languages:

Finite Language:
L1 = { set of all strings of length 2 over {x, y} }
L1 = { xy, yx, xx, yy }

Infinite Language:
L2 = { set of all strings that start with 'b' }
L2 = { babb, baa, ba, bbb, baab, ... }
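In code, a finite language can be stored as a set, while an infinite language has to be described by a membership test. The sketch below assumes the alphabet {a, b} for the infinite example, which the text does not state explicitly. The names `FINITE_LANGUAGE` and `starts_with_b` are illustrative only.

```python
# Finite language: every member can be listed.
FINITE_LANGUAGE = {"xy", "yx", "xx", "yy"}

def starts_with_b(word):
    """Membership test for the strings over {a, b} that start with 'b'.

    Assumption: the alphabet is {a, b}; other symbols are rejected.
    """
    return word.startswith("b") and set(word) <= {"a", "b"}

print("xy" in FINITE_LANGUAGE)  # True
print(starts_with_b("babb"))    # True
print(starts_with_b("ab"))      # False
print(starts_with_b(""))        # False, ε does not start with 'b'
```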

Types of Languages in TOC

Languages are classified based on the computational model or grammar generating them:

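Regular Languages: Defined using regular expressions or finite automata. Example: L = {aⁿ | n ≥ 0}.
Context-Free Languages: Defined using context-free grammars or pushdown automata. Example: L = {aⁿbⁿ | n ≥ 0}.
Context-Sensitive Languages: Defined using context-sensitive grammars or linear-bounded automata.
Recursive and Recursively Enumerable Languages: Defined using Turing machines. A recursive language is decided by a Turing machine that halts on every input. A recursively enumerable language is accepted by a Turing machine that may run forever on strings outside the language.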
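The two example languages show why the classes differ: checking aⁿ needs no memory of what came before, while checking aⁿbⁿ requires comparing the number of a's with the number of b's. The following is a rough Python sketch with made-up function names. The second function is a plain length comparison, not a pushdown automaton construction.

```python
import re

def in_a_n(word):
    """{a^n | n >= 0}: a regular expression is enough."""
    return re.fullmatch(r"a*", word) is not None

def in_a_n_b_n(word):
    """{a^n b^n | n >= 0}: n a's followed by exactly n b's."""
    n = len(word) // 2
    return len(word) % 2 == 0 and word == "a" * n + "b" * n

print(in_a_n("aaa"))       # True
print(in_a_n_b_n("aabb"))  # True
print(in_a_n_b_n("aab"))   # False
print(in_a_n_b_n("abab"))  # False
```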

Core Areas of the Theory of Computation

The field of computation theory can be broadly divided into four major areas:

1. Automata Theory

Automata theory studies abstract computational models and their applications. It forms the basis for understanding how machines process inputs and produce outputs. Key components include:

Finite Automata: Used to model simple systems like lexical analyzers in compilers.
Pushdown Automata: A more powerful model capable of recognizing context-free languages, essential for parsing programming languages.
Turing Machines: The most powerful automata, used as a standard for defining what is computable.
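A deterministic finite automaton can be simulated in a few lines. The sketch below is a hypothetical example, not taken from any library: it accepts the strings over {a, b} that start with 'b'. The state names `start`, `ok`, and `dead` are invented for the illustration.

```python
def run_dfa(transitions, start, accepting, word):
    """Feed the word to the DFA one symbol at a time and report acceptance.

    Raises KeyError if the word contains a symbol with no transition.
    """
    state = start
    for symbol in word:
        state = transitions[(state, symbol)]
    return state in accepting

# Transition table: (current state, input symbol) -> next state
TRANSITIONS = {
    ("start", "a"): "dead", ("start", "b"): "ok",
    ("ok", "a"): "ok",      ("ok", "b"): "ok",
    ("dead", "a"): "dead",  ("dead", "b"): "dead",
}

print(run_dfa(TRANSITIONS, "start", {"ok"}, "ba"))  # True
print(run_dfa(TRANSITIONS, "start", {"ok"}, "ab"))  # False
print(run_dfa(TRANSITIONS, "start", {"ok"}, ""))    # False
```

The machine keeps only its current state, which is why finite automata cannot count without bound.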

2. Formal Languages and Grammars

This area examines the syntax and structure of languages used in computation. It involves:

Regular Languages: Described by regular expressions and finite automata, representing simple patterns.
Context-Free Languages: Defined by context-free grammars, crucial for designing compilers.
Chomsky Hierarchy: A classification of languages into regular, context-free, context-sensitive, and recursively enumerable languages.

3. Computability and Decidability

Computability theory addresses the question: What problems can a computer solve? It studies concepts like:

Decidable Problems: Problems for which some algorithm halts on every input with the correct yes/no answer.
Undecidable Problems: Problems, such as the Halting Problem, for which no algorithm can give the correct answer for all inputs.

4. Complexity Theory

Complexity theory focuses on the efficiency of algorithms by analyzing the time and space resources they require. It categorizes problems into classes such as:

P (Polynomial Time): Problems solvable in polynomial time.
NP (Nondeterministic Polynomial Time): Problems whose solutions can be verified in polynomial time.
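NP-Complete and NP-Hard: NP-hard problems are at least as hard as every problem in NP, and need not belong to NP themselves. NP-complete problems are the NP-hard problems that also belong to NP, which makes them the hardest problems in NP. Both classes have applications in cryptography, optimization, and artificial intelligence.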
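The gap between verifying and solving can be illustrated with Subset Sum, a well-known NP-complete problem chosen here only as an example. In the sketch below, the function names and the idea of passing a certificate as a list of indices are assumptions made for illustration. Checking a proposed answer takes polynomial time, while the naive search may try up to 2ⁿ subsets.

```python
from itertools import combinations

def verify_subset_sum(numbers, target, certificate):
    """Polynomial-time check: distinct valid indices whose values sum to target."""
    return (
        len(set(certificate)) == len(certificate)
        and all(0 <= i < len(numbers) for i in certificate)
        and sum(numbers[i] for i in certificate) == target
    )

def solve_subset_sum(numbers, target):
    """Brute-force search over index subsets, smallest subsets first."""
    for size in range(len(numbers) + 1):
        for indices in combinations(range(len(numbers)), size):
            if verify_subset_sum(numbers, target, indices):
                return list(indices)
    return None  # no subset sums to target

numbers = [3, 34, 4, 12, 5, 2]
print(solve_subset_sum(numbers, 9))            # [2, 4], since 4 + 5 = 9
print(verify_subset_sum(numbers, 9, [2, 4]))   # True
print(solve_subset_sum(numbers, 30))           # None
```

Whether every problem that can be verified quickly can also be solved quickly is the open P versus NP question.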
