Module 1

This document provides an introduction to compiler design, including: 1. It discusses the need for compilers by describing the drawbacks of writing programs in assembly language. This triggered the development of language processors like compilers and interpreters. 2. It describes the basic components of a language processing system, including preprocessors, compilers, assemblers, linkers, and loaders. Compilers convert programs to assembly language, and assemblers convert to machine language. 3. It provides an overview of compilers, describing their role in translating programming languages to machine languages, and discusses some key aspects of compilation like types of code generated and the work of a compiler.

Uploaded by

ARSHIYA K


e-PG PATHSHALA- Computer Science

Compiler Design
Module 1
Module 1 - Introduction
Objective: To understand the processes involved in Compiler Design.

1. Introduction:
This module starts by discussing the need for translators, and for compilers in particular. It
also groups the work of a compiler into phases, which are discussed in the later part of this
module. To begin, here is a brief history of compilers.
1.1 A Brief History
Software is an essential component of modern computing. In the early days, software was
written in assembly language, with instructions expressed as mnemonic codes. For example,
to add two numbers the following would be the assembly code.

MOV R1, a
MOV R2, b
ADD R1, R1, R2        (1.1)

In statement (1.1), the MOV commands move the value stored in variable ‘a’ to register ‘R1’
and the value of ‘b’ to ‘R2’. The ADD command then adds the contents of registers R1 and R2
and stores the result in R1. As one can observe, these instructions are closer to the
machine than to the human. The drawbacks of writing programs in assembly language are:
– The instructions are very difficult to remember
– Programs are tied to one CPU, while the benefits of reusing software on different CPUs
became greater than the cost of designing a compiler
– Programs are very cumbersome to write
These drawbacks triggered the need for software that understands a language closer to human
language, which led to the birth of language processors called translators.
1.1.2. Language Processors
A translator converts a source program written in one language into a target program in
another language. This is similar to using a human translator when two people who do not
know each other’s language want to communicate. In the context of computer science, the
source program is written in a programming language and the target program typically
belongs to machine language, the language the machine can execute directly. Some of the
translators are the assembler, the compiler and the interpreter. A compiler converts programs
written in a high-level programming language to assembly language. An assembler converts
assembly language programs to machine language (object language).
Translators let programmers write programs in a language that is easier for them to
remember and understand, and convert those programs into a language that is closer to the
machine. This results in the following ways of designing software:
a. Design an interpreter / translator to convert human language to machine language.
The interpreter will have difficulty parsing its input, which may be ambiguous. For
example, inefficient parsing could produce incorrect word boundaries during
interpretation, resulting in ambiguity.
b. Design a compiler that understands a high-level language, which is not necessarily
English but is closer to English, and converts it to assembly language.
The design is complex, but parsing ambiguity can be avoided. The major drawback
is the mapping of the high-level language to assembly language. This also necessitates
designing a compiler for every high-level programming language, keeping in mind
the instruction set of the target assembly language.
c. Design an assembler that converts assembly language to machine language.
The drawback is that the target language needs to be specified, and the output of
the various compilers must be known ahead of time.
So, our aim is to design a compiler and an assembler for converting high-level language
to machine language. In addition, certain other components are needed for pre-processing
and execution, which are discussed in the next section.

1.2 Language Processing System

A typical language processing system is given in Figure 1.1. The source program, written
in a high-level programming language, first goes through a pre-processor. The pre-processor
expands macros to produce complete code. For example, if the source program contains the
statement #define MAX 100, the pre-processor replaces MAX with 100 everywhere in the
source program and passes the result to the compiler. The compiler converts this to
assembly language and the assembler converts the assembly language to object language.
At this point, the object language is called re-locatable object code. The code is
re-locatable because it does not yet contain the exact memory addresses at which it will
be loaded for execution. This re-locatable machine code is passed on to the linker. The
linker combines the object code of multiple source files, or links the object code of
the current source files with the object code of the standard library, to produce one
object file. This file is then loaded into main memory for execution by the loader.
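
The macro replacement step described above can be sketched as follows. This is only a minimal illustration in Python, not how a real pre-processor is built: the function name expand_object_macros is hypothetical, only simple "#define NAME value" lines are handled, and string literals and comments are not treated specially.

```python
import re

def expand_object_macros(source: str) -> str:
    """Replace simple '#define NAME value' macros (a simplified stand-in for a pre-processor)."""
    macros = {}
    output_lines = []
    for line in source.splitlines():
        match = re.match(r"\s*#define\s+(\w+)\s+(.+)", line)
        if match:
            # Record the macro and drop the directive itself from the output.
            macros[match.group(1)] = match.group(2).strip()
            continue
        for name, value in macros.items():
            # \b keeps MAX from matching inside a longer name such as MAXIMUM.
            line = re.sub(rf"\b{re.escape(name)}\b", lambda _m: value, line)
        output_lines.append(line)
    return "\n".join(output_lines)

program = "#define MAX 100\nint a[MAX];\nint MAXIMUM = MAX + 1;"
expanded = expand_object_macros(program)
print(expanded)
```

The expanded text, in which MAX no longer appears as a separate word, is what would be handed to the compiler.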

Figure 1.1 Language Processing System

To give an example, the FORTRAN compiler was the first real compiler to be built. It was
built in the late 1950’s and required 18 person-years.

1.3 Compiler Basics

Programming Language (Source) -> Compiler -> Machine Language (Target)

Figure 1.2 Compiler Overview

A compiler acts as a translator, transforming human-oriented programming
languages into computer-oriented machine languages, as shown in Figure 1.2. As a result,
the programmer need not be concerned with machine-dependent details.
1.2.1 Types of Code
In the process of generating assembly-level code, the compiler could generate any one of the
following types of code:
a. Pure Machine Code: This refers to a set of machine instructions that is independent of
any operating system or library. Such code is typically generated for operating
systems or embedded applications.
b. Augmented Machine Code: This refers to machine instructions that are augmented with
operating system routines and run-time support routines.
c. Virtual Machine Code: This refers to virtual instructions that can be run on any
architecture that has a virtual machine interpreter or a just-in-time compiler. Example:
Java bytecode.
1.2.2 Work of a Compiler
The compiler necessarily has to do the following to translate high-level source code to
low-level assembly code:
• Process the source program
• Report errors in the source program
• Recover from / correct the errors
• Produce the assembly language program
After the assembly language program is generated, an assembler is used to convert the
assembly language code into relocatable machine code. The time at which the source
program is converted into the object program is called compile time. The object program
is executed at run time.

1.3 Interpreter
An interpreter is a language processor that directly executes the operations specified in
the source program on inputs supplied by the user. The interpreter processes an internal
form of the source program and the data at the same time (at run time), and therefore no
object program is generated.
1.3.1 Compiler vs Interpreter
The following are some comparisons between the compiler and the interpreter.
• An interpreter offers a higher degree of machine independence and hence
facilitates high portability.
• An interpreter supports dynamic execution. User programs can be modified or
extended even during execution.
• An interpreter also supports dynamic data types, so the type of an object can
change even at run time.
• An interpreter requires no synthesis part.
• An interpreter usually provides better error diagnostics, as it executes the source
program statement by statement and has more source text information available.
• The machine-language target program produced by a compiler is usually much faster
than an interpreter at mapping input to output.

1.4 Compilation process

The processes of compilation and interpretation are shown in Figures 1.3 and 1.4
respectively.

Source program -> Compiler -> Object program -> Executing Computer -> Results (Data is input to the Executing Computer)

Figure 1.3: Compilation Process

Source program + Data -> Interpreter -> Result

Figure 1.4 Interpretive process

As discussed, the compiler converts the source program into a relocatable object program, which
then uses the data and executes in main memory. The interpreter, on the other hand, uses the
source program and the data together and produces the result without any intermediate object
program.
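
The difference between the two processes can be sketched on a tiny expression. This is a minimal illustration under stated assumptions: the nested-tuple expression form, the toy stack machine and its PUSH / LOAD / ADD / MUL opcodes, and all function names are hypothetical and only serve to show where the object program appears.

```python
# Source form of (a + b) * 2, written by hand as nested tuples.
EXPR = ("*", ("+", "a", "b"), 2)

def interpret(node, data):
    """Interpreter: walks the source form and the data together; no object program."""
    if isinstance(node, int):
        return node
    if isinstance(node, str):
        return data[node]
    op, left, right = node
    lhs, rhs = interpret(left, data), interpret(right, data)
    return lhs + rhs if op == "+" else lhs * rhs

def compile_expr(node):
    """Compiler: translates once into an object program for a toy stack machine."""
    if isinstance(node, int):
        return [("PUSH", node)]
    if isinstance(node, str):
        return [("LOAD", node)]
    op, left, right = node
    opcode = "ADD" if op == "+" else "MUL"
    return compile_expr(left) + compile_expr(right) + [(opcode, None)]

def execute(object_program, data):
    """Run time: the object program consumes the data."""
    stack = []
    for opcode, arg in object_program:
        if opcode == "PUSH":
            stack.append(arg)
        elif opcode == "LOAD":
            stack.append(data[arg])
        else:
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs + rhs if opcode == "ADD" else lhs * rhs)
    return stack.pop()

object_program = compile_expr(EXPR)                          # compile time
result_compiled = execute(object_program, {"a": 3, "b": 4})  # run time
result_interpreted = interpret(EXPR, {"a": 3, "b": 4})       # no object program
print(object_program, result_compiled, result_interpreted)
```

Once compile_expr has run, the object program can be executed on new data any number of times without looking at the source form again, whereas interpret needs the source form on every run.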
1.4.1 Compiler
The compiler consists of two parts: Analysis and Synthesis.
- The analysis part breaks up the source program into constituent pieces and imposes a
grammatical structure on them. It then uses this structure to create an intermediate
representation of the source program.
- The synthesis part constructs the desired target program from the intermediate
representation and the information in the symbol table.
The analysis part is often called the front end of the compiler; the synthesis part is the back
end.
The front end of the compiler is typically language dependent: it depends on the source
language but not on the target machine’s architecture. The back end is target
dependent: it requires the instruction set of the target machine but does not require
information about the source language.
The analysis and synthesis parts of the compiler are shown in Figure 1.5. The analysis part
consists of three components, while the synthesis part consists of two components, code
generation and optimization. In the process, the compiler uses an error table and a symbol
table for the generation of target code.

Source Program -> Analysis -> Synthesis -> Target Program

Figure 1.5 Analysis and Synthesis stages of the Compiler

1.4.2 Compiler Passes
Grouping the work of the compiler into analysis and synthesis parts poses the
following questions.
• How many passes should the compiler go through?
• One for analysis and one for synthesis?
• One for each division of the analysis and synthesis?
To answer these questions, the work done by a compiler is grouped into phases, which are
discussed in the next section.

1.4 Phases of the Compiler

The compiler’s analysis and synthesis parts are grouped into 6 phases, shown in Figure
1.6. The first three phases belong to the analysis part and the last three phases to the
synthesis part. All the phases of the compiler interact with the symbol table and the error
handler.

Figure 1.6 Phases of the Compiler

Lexical Phase: The lexical analyzer reads the stream of characters from the source program and
groups the characters into meaningful sequences called lexemes. For every lexeme, the lexer
(lexical analyzer) produces a token, which is passed to the next phase of the compiler.
The token is of the form <token-name, attribute-value>, where token-name is an abstract symbol
that is used during syntax analysis and attribute-value points to an entry in the symbol table
for this token. During this phase, the compiler creates the symbol table entry, which holds the
information about the lexeme. The lexical analyzer typically skips all blanks, other unwanted
white space and comment lines present in the source program.
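
A lexer of this kind can be sketched with regular expressions. This is a minimal illustration, not a production scanner: the token names, the "//" comment syntax and the list-based symbol table are assumptions made for the example.

```python
import re

# Order matters: COMMENT must be tried before OP so that "//" is not read as "/".
TOKEN_SPEC = [
    ("COMMENT", r"//[^\n]*"),
    ("NUMBER", r"\d+"),
    ("ID", r"[A-Za-z_]\w*"),
    ("OP", r"[=+\-*/()]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Return (tokens, symbol_table); tokens are <token-name, attribute-value> pairs."""
    symbol_table = []   # one entry per distinct identifier lexeme
    tokens = []
    position = 0
    while position < len(source):
        match = MASTER.match(source, position)
        if match is None:
            raise SyntaxError(f"unexpected character {source[position]!r}")
        kind, lexeme = match.lastgroup, match.group()
        position = match.end()
        if kind in ("SKIP", "COMMENT"):
            continue                      # white space and comments are dropped
        if kind == "ID":
            if lexeme not in symbol_table:
                symbol_table.append(lexeme)
            # The attribute value is the 1-based position of the symbol table entry.
            tokens.append(("id", symbol_table.index(lexeme) + 1))
        elif kind == "NUMBER":
            tokens.append(("number", int(lexeme)))
        else:
            tokens.append((lexeme, None))  # operators need no attribute value
    return tokens, symbol_table

tokens, symbol_table = tokenize("position = initial + rate * 60 // update")
print(tokens)
print(symbol_table)
```

For the identifier tokens, the second component plays the role of the attribute-value: it points to the symbol table entry created for that lexeme.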
Syntax Phase: The syntax phase is the second phase of the compiler. This is where the
input from the source program is parsed, and hence this phase is referred to as the parser (parsing
phase). The parser uses the tokens produced by the lexer to create a tree-like intermediate
representation that depicts the grammatical structure of the sequence of tokens, verifying that
structure in the process. A typical representation is a syntax tree in which each interior node
represents an operation and the children of the node represent the arguments of the operation.
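
The construction of such a syntax tree can be sketched with a small recursive-descent parser. This is a minimal illustration assuming a hypothetical two-level grammar (expr -> term ('+' term)*, term -> factor ('*' factor)*) and tokens given as plain strings; interior nodes are tuples of the form (operator, left child, right child).

```python
def parse_expression(tokens):
    """Recursive-descent parser that returns a syntax tree of nested tuples."""
    position = 0

    def peek():
        return tokens[position] if position < len(tokens) else None

    def advance():
        nonlocal position
        token = tokens[position]
        position += 1
        return token

    def factor():
        token = peek()
        if token is None or token in ("+", "*"):
            raise SyntaxError(f"operand expected, found {token!r}")
        return advance()                    # leaf: identifier or number

    def term():
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())    # interior node: operation with two children
        return node

    def expr():
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

tree = parse_expression(["initial", "+", "rate", "*", "60"])
print(tree)
```

Because term is called from expr, the multiplication ends up deeper in the tree than the addition, which is how the tree records operator precedence. A token sequence that does not fit the grammar raises SyntaxError instead of producing a tree.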
Semantic Phase: The semantic analyzer uses the output of the parser, namely the syntax tree,
together with the information in the symbol table to check the source program for semantic
consistency. In this phase, the compiler gathers type information about the variables, operations,
etc., and saves it in either the syntax tree or the symbol table for subsequent use during
intermediate-code generation. Type checking is done in this phase, where the compiler checks that
each operator has matching operands. For example, an array index typically needs to be an integer,
and the compiler must report an error if a floating-point number is used as an array index. Yet
another job of the semantic phase is type conversion, referred to as coercion. For example, a
binary arithmetic operator may be applied to either a pair of integers or a pair of floating-point
numbers. If the operator is applied to a floating-point number and an integer, the compiler may
convert or coerce the integer into a floating-point number.
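
Both checks described above, the integer array index and the coercion of an integer operand, can be sketched on a nested-tuple syntax tree. This is a minimal illustration: the node shapes, the "index" and "inttofloat" node names and the symbol_types dictionary (standing in for the symbol table) are assumptions made for the example.

```python
def analyze(node, symbol_types):
    """Return (possibly rewritten node, type) for a nested-tuple syntax tree."""
    if isinstance(node, int):
        return node, "int"
    if isinstance(node, float):
        return node, "float"
    if isinstance(node, str):               # identifier: look up its declared type
        return node, symbol_types[node]
    op, left, right = node
    left, left_type = analyze(left, symbol_types)
    right, right_type = analyze(right, symbol_types)
    if op == "index":                       # array access: left[right]
        if not (isinstance(left_type, tuple) and left_type[0] == "array"):
            raise TypeError("only arrays can be indexed")
        if right_type != "int":
            raise TypeError("array index must be an integer")
        return (op, left, right), left_type[1]
    # Binary arithmetic operator: both operands must be numeric.
    if left_type not in ("int", "float") or right_type not in ("int", "float"):
        raise TypeError(f"operator {op} needs numeric operands")
    if left_type == right_type:
        return (op, left, right), left_type
    # Mixed int/float: coerce the integer operand to floating point.
    if left_type == "int":
        left = ("inttofloat", left)
    else:
        right = ("inttofloat", right)
    return (op, left, right), "float"

types = {"rate": "float", "a": ("array", "float")}
typed_tree, result_type = analyze(("*", "rate", 60), types)
print(typed_tree, result_type)
```

The rewritten tree makes the conversion explicit, so the later phases never have to deal with mixed operand types. Indexing the array with 1.5 instead of an integer raises TypeError.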
Intermediate Code Generation: Compilers generate an explicit low-level or machine-like
intermediate representation, from which the assembly language is later generated.
The desired characteristics of the intermediate representation are
• Ease of generation
• Ease of translation to the target assembly language.
The input to this phase is the syntax tree and the output is intermediate code. A common convention
for intermediate code is three-address code. A three-address instruction has at most
three operands and two operators. For example,
x = y op z
x = op y        (1.2)
As expressed in statement (1.2), x, y and z are the three operands, which are typically addresses,
and ‘op’ refers to the operator that appears in addition to the ‘=’ operator.
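
The translation from a syntax tree to three-address code can be sketched as a post-order walk that introduces one temporary per interior node. This is a minimal illustration: the nested-tuple tree form, the temporary names t1, t2, ... and the function name are assumptions made for the example.

```python
def to_three_address(target, tree):
    """Flatten a nested-tuple syntax tree into 'x = y op z' instructions."""
    code = []
    counter = 0

    def emit(node):
        nonlocal counter
        if not isinstance(node, tuple):
            return str(node)                 # leaf: a name or a constant
        op, left, right = node
        left_address = emit(left)            # children first (post-order walk)
        right_address = emit(right)
        counter += 1
        temporary = f"t{counter}"
        code.append(f"{temporary} = {left_address} {op} {right_address}")
        return temporary

    result_address = emit(tree)
    code.append(f"{target} = {result_address}")
    return code

three_address_code = to_three_address("position", ("+", "initial", ("*", "rate", 60)))
for instruction in three_address_code:
    print(instruction)
```

Each generated instruction has at most three addresses and one operator besides ‘=’, matching the form of statement (1.2).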

Code Optimization: This phase can operate either before or after code generation. The aim of
this phase is to improve the intermediate code so that it results in better target code. "Better"
usually means faster or shorter code, but it can also mean target code that consumes less
power. An important characteristic of this phase is that it carries out simple optimizations that
significantly improve the running time of the target program without slowing down compilation
too much.
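
One such simple optimization, constant folding with constant propagation, can be sketched on three-address instructions. This is a minimal illustration under stated assumptions: instructions are tuples (dest, left, op, right), operands are names or integers, and every folded destination is a compiler temporary that is used only by later instructions in the same list, which is why its instruction can be dropped.

```python
import operator

OPERATIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold_constants(code):
    """Evaluate constant operations at compile time and propagate their results."""
    known = {}         # temporaries currently known to hold a constant
    optimized = []
    for dest, left, op, right in code:
        left = known.get(left, left)        # substitute known constants
        right = known.get(right, right)
        if isinstance(left, int) and isinstance(right, int):
            # Assumption: dest is a temporary, so the instruction itself can go.
            known[dest] = OPERATIONS[op](left, right)
            continue
        known.pop(dest, None)               # dest no longer holds a known constant
        optimized.append((dest, left, op, right))
    return optimized

before = [
    ("t1", 60, "*", 60),
    ("t2", "rate", "*", "t1"),
    ("t3", "initial", "+", "t2"),
]
after = fold_constants(before)
print(after)
```

The multiplication 60 * 60 is carried out once by the compiler, so the target program has one instruction fewer and no longer performs that multiplication at run time.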
Code Generation: This phase generates the target assembly language. In this phase, registers
or memory locations are selected for each of the variables used by the program. The inputs to
this phase, the intermediate instructions, are translated into sequences of machine
instructions that perform the same operations. One of the important considerations in code
generation is the assignment of registers to hold variables, as the number of registers is
limited. This phase also needs to decide on the choice of instructions involving registers,
memory or a mix of the two.
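
A very naive code generator for three-address instructions can be sketched as follows. The MOV and ADD mnemonics and the registers R1 and R2 follow statement (1.1); the SUB and MUL mnemonics, the store form "MOV name, R1", the use of constants as MOV operands and the single reuse rule for R1 are assumptions made for the example, not a real instruction set.

```python
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL"}

def generate(code):
    """Translate (dest, left, op, right) instructions into assembly-like text."""
    assembly = []
    in_r1 = None                 # name whose value currently sits in register R1
    for dest, left, op, right in code:
        if left != in_r1:
            assembly.append(f"MOV R1, {left}")      # load the left operand
        assembly.append(f"MOV R2, {right}")         # load the right operand
        assembly.append(f"{OPCODES[op]} R1, R1, R2")
        assembly.append(f"MOV {dest}, R1")          # store the result
        in_r1 = dest
    return assembly

assembly = generate([
    ("t1", "rate", "*", 60),
    ("t2", "t1", "+", "initial"),
])
for line in assembly:
    print(line)
```

The second instruction does not reload t1 because its value is still in R1. Deciding which values to keep in the limited set of registers is exactly the register assignment problem mentioned above; real code generators use far more careful strategies than this single rule.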
Symbol Table: The symbol table is implemented as a data structure containing a record for each
variable name, with fields for the attributes of the name. The symbol table is designed to help the
compiler find and fetch the record for each name quickly. The attributes may provide information
about the storage allocated for a name, its type and its scope. For function or procedure
names, the attributes also include such things as the number and types of the arguments, the
method of passing each argument and the return type.
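
One possible shape for such a table is a stack of dictionaries, one per scope, which gives fast lookup by name. This is a minimal sketch: the class name, the method names and the attribute names (storage, params, returns) are hypothetical choices for the example.

```python
class SymbolTable:
    """A stack of dictionaries; the last dictionary is the innermost scope."""

    def __init__(self):
        self.scopes = [{}]                    # start with the global scope

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()

    def declare(self, name, type_, **attributes):
        scope = self.scopes[-1]
        if name in scope:
            raise KeyError(f"{name} is already declared in this scope")
        record = {"name": name, "type": type_, "scope": len(self.scopes) - 1}
        record.update(attributes)
        scope[name] = record
        return record

    def lookup(self, name):
        """Search from the innermost scope outwards; return None if not found."""
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None

table = SymbolTable()
table.declare("rate", "float", storage=8)
table.declare("area", "function",
              params=[("w", "float", "by value"), ("h", "float", "by value")],
              returns="float")
table.enter_scope()
table.declare("rate", "int", storage=4)       # hides the outer 'rate'
inner_type = table.lookup("rate")["type"]
table.exit_scope()
outer_type = table.lookup("rate")["type"]
print(inner_type, outer_type, len(table.lookup("area")["params"]))
```

Dictionary lookup keeps access to each record fast, and searching the scopes from the inside out gives the usual rule that an inner declaration hides an outer one.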
Error Handler: The errors encountered in every phase are logged with the error handler for
subsequent reporting to the user. The compiler, however, recovers from the errors in every phase
so that it can proceed with the compilation process. The compiler recovers from errors using either
the panic mode or the phrase-level mode of error recovery.
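
Panic-mode recovery can be sketched as follows: on an error, the error is logged and input tokens are discarded up to a synchronizing token, after which checking resumes. This is a minimal illustration assuming a hypothetical statement form "id = number ;" with ‘;’ as the synchronizing token; tokens are given as a list of token names.

```python
EXPECTED = ["id", "=", "number", ";"]      # the only statement form in this sketch

def check_statements(tokens):
    """Check a token stream statement by statement, logging errors instead of stopping."""
    errors = []
    accepted = 0
    position = 0
    while position < len(tokens):
        for expected in EXPECTED:
            found = tokens[position] if position < len(tokens) else "end of input"
            if found != expected:
                errors.append(f"token {position}: expected {expected}, found {found}")
                # Panic mode: discard tokens up to and including the next ';'.
                while position < len(tokens) and tokens[position] != ";":
                    position += 1
                position += 1
                break
            position += 1
        else:
            accepted += 1                   # the whole statement matched
    return accepted, errors

stream = ["id", "=", "number", ";",
          "id", "number", ";",              # '=' is missing here
          "id", "=", "number", ";"]
accepted, errors = check_statements(stream)
print(accepted, errors)
```

The faulty second statement is reported once and skipped, and the third statement is still checked, which is the point of recovering rather than stopping at the first error.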
Multi-pass Compiler: Several phases can be implemented as a single pass, which consists of reading
an input file and writing an output file. A typical multi-pass compiler could do the following:
• First pass: preprocessing, macro expansion
• Second pass: syntax-directed translation, IR code generation
• Third pass: optimization
• Last pass: target machine code generation

1.5 Summary
This module discussed the need for a compiler and the various phases of the compiler.
