Compiler 1

The document provides an overview of compiler design, detailing the evolution of programming languages from machine language to high-level languages and their classifications. It explains the phases of compilation, including lexical analysis, syntax analysis, semantic analysis, intermediate code generation, code optimization, and code generation, along with the roles of each phase. Additionally, it discusses the importance of compilers in translating source code into executable programs and the tools used in compiler construction.

Uploaded by

ajebaderesa12
Copyright
© © All Rights Reserved

Compiler Design

Chapter 1
Introduction
By Diriba Regasa (MSc)
The evolution of programming languages

• The first electronic computers appeared in the 1940s and were
programmed in machine language by sequences of 0s and 1s
that explicitly told the computer what operations to execute
and in what order.
• The operations themselves were very low level: move data
from one location to another, add the contents of two registers,
compare two values, and so on.
• This kind of programming was slow, tedious, and error prone.
• Once written, the programs were hard to understand and
modify.

• An object-oriented language is one that supports object-oriented
programming, a programming style in which a
program consists of a collection of objects that interact with
one another.
– Simula 67 and Smalltalk are the earliest major object-oriented
languages.
– Languages such as C++, C#, Java, and Ruby are more
recent object-oriented languages.

• Scripting languages are interpreted languages with high-level
operators designed for "gluing together" computations.
• These computations were originally called “scripts”.
– Awk, JavaScript, Perl, PHP, Python, and Ruby are popular
examples of scripting languages.
• Programs written in scripting languages are often much shorter
than equivalent programs written in languages like C.

Machine language

• The only language that is “understood” by a computer.
• Varies from machine to machine.
• The only choice in the 1940s.

Assembly language

• Also known as symbolic language.
• First developed in the 1950s.
• Easier to read and write than machine language.
• An assembler converts it to machine code.
• Still different for each type of machine.

High-level language

• Developed from the late 1950s onward.
• Much easier to read and write.
• Portable to many different computers.
• Languages include C, Pascal, C++, Java, Perl, etc.
• Still must be converted to machine code!

• The move to higher-level languages: Today, there are thousands
of programming languages. They can be classified in a variety of
ways.
• One classification is by generation.
– First-generation languages are the machine languages.
– Second-generation languages are the assembly languages.
– Third-generation languages are the high-level languages like
Fortran, Cobol, Lisp, C, C++, C#, and Java.
– Fourth-generation languages are languages designed for
specific applications, like NOMAD for report generation, SQL
for database queries, and PostScript for text formatting.
– The term fifth-generation language has been applied to logic-
and constraint-based languages like Prolog and OPS5.

What is a programming language?

• Programming languages are notations for describing
computations to people and to machines.
• The world as we know it depends on programming languages,
because all the software running on all the computers was
written in some programming language.
• But before a program can be run, it must first be translated
into a form in which it can be executed by a computer.
• The software systems that do this translation are called
compilers.

Language processors

• A compiler is a program that can read a program in one
language (the source language) and translate it into an
equivalent program in another language (the target language).
• An important role of the compiler is to report any errors in the
source program that it detects during the translation process.

• If the target program is an executable machine-language
program, it can then be called by the user to process inputs and
produce outputs.

Figure: Running the target program

• An interpreter is another common kind of language processor.
• Instead of producing a target program as a translation, an
interpreter appears to directly execute the operations specified
in the source program on inputs supplied by the user.

Figure: An interpreter

• The machine-language target program produced by a compiler
is usually much faster than an interpreter at mapping inputs to
outputs.
• An interpreter, however, can usually give better error
diagnostics than a compiler, because it executes the source
program statement by statement.
• There are also hybrid language processors that combine
compilation and interpretation.
• Example:
– Java language processors combine compilation and
interpretation.

• A Java source program may first be compiled into an
intermediate form called bytecodes.
• The bytecodes are then interpreted by a virtual machine.
• A benefit of this arrangement is that bytecodes compiled on
one machine can be interpreted on another machine, perhaps
across a network.
• To achieve faster processing of inputs to outputs, some Java
compilers, called just-in-time compilers, translate the
bytecodes into machine language immediately before they run
the intermediate program to process the input.
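
The compile-then-interpret arrangement is not specific to Java. As a rough analogy only, the standard Python implementation (CPython) also compiles source text to bytecode and then interprets that bytecode on a virtual machine. The sketch below uses the built-in compile and exec functions and the standard dis module; the source string and the names code_object and namespace are made up for illustration.

```python
import dis

source = "result = (2 + 3) * 4"

# Step 1: compile the source text into a bytecode object (nothing runs yet).
code_object = compile(source, "<example>", "exec")

# Step 2: let the virtual machine interpret the bytecode.
namespace = {}
exec(code_object, namespace)
print(namespace["result"])

# Inspect the intermediate form: one name per bytecode instruction.
# The exact instruction names differ between Python versions.
instructions = [ins.opname for ins in dis.get_instructions(code_object)]
print(instructions)
```

The value stored in namespace["result"] is 20. The instruction list is printed only to show that an intermediate form exists between the source text and its execution.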

Figure: A language-processing system

Phases of a Compiler

• Compilation proceeds as a series of phases. Each phase takes
its input from the previous phase.
• These phases fall into two parts: analysis and synthesis.
• The analysis part breaks up the source program into
constituent pieces and imposes a grammatical structure on
them.
• It then uses this structure to create an intermediate
representation of the source program.
• If the analysis part detects that the source program is either
syntactically ill formed or semantically unsound, then it must
provide informative messages, so the user can take corrective
action.

• The analysis part also collects information about the source
program and stores it in a data structure called a symbol table,
which is passed along with the intermediate representation to
the synthesis part.
• The synthesis part constructs the desired target program from
the intermediate representation and the information in the
symbol table.
• The analysis part is often called the front end of the compiler,
and the synthesis part is the back end.

Figure: Phases of a compiler

Lexical Analysis

• Lexical analysis is the first phase of the compiler.
• This phase scans the source code and transforms the input
program into a sequence of tokens.
• A token is a sequence of characters that forms a single unit
of information in the source code.
• NOTE:
– In computer science, a program that performs lexical
analysis is called a scanner, tokenizer, or lexer.

• Roles and responsibilities of the lexical analyzer:
– It removes comments and white space from the source program.
– It identifies the tokens.
– It classifies the lexical units (for example, as identifiers,
numbers, or operators).
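
The three responsibilities above can be sketched with regular expressions. This is a minimal illustration, not a production scanner: the token classes (COMMENT, NUMBER, ID, OP), the "#" comment syntax, and the function name tokenize are assumptions chosen for a tiny made-up expression language.

```python
import re

# Hypothetical token classes for a tiny expression language.
TOKEN_SPEC = [
    ("COMMENT", r"#[^\n]*"),
    ("NUMBER", r"\d+"),
    ("ID", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=()]"),
    ("SKIP", r"\s+"),
    ("MISMATCH", r"."),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (kind, lexeme) pairs, dropping comments and white space."""
    tokens = []
    for match in MASTER.finditer(text):
        kind, lexeme = match.lastgroup, match.group()
        if kind in ("COMMENT", "SKIP"):
            continue  # removed, as the first responsibility requires
        if kind == "MISMATCH":
            raise SyntaxError(f"unexpected character {lexeme!r} at offset {match.start()}")
        tokens.append((kind, lexeme))  # identified and classified
    return tokens


print(tokenize("total = count + 42  # running sum"))
```

For the sample line, the result is the five tokens ID total, OP =, ID count, OP +, and NUMBER 42; the comment and the spaces do not appear in the output.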

Syntax Analysis

• Syntax analysis is the second phase. Here the token stream is
checked against the grammar of the language.
• It analyses the syntactic structure and determines whether the
given input is correct in terms of programming syntax.
• It accepts tokens as input and produces a parse tree as output.
• It is also known as parsing.

• Roles and responsibilities of the syntax analyzer:
– Acquires tokens from the lexical analyzer.
– Builds a parse tree.
– Detects and reports syntax errors, if any.
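
As an illustration of tokens in and a tree out, here is a small recursive-descent parser. It is a sketch under assumptions: the grammar (expr, term, factor), the (kind, lexeme) token format, and the nested-tuple tree are all invented for this example, and the tree it builds is simplified (closer to an abstract syntax tree than to a full parse tree).

```python
def parse(tokens):
    """Parse a list of (kind, lexeme) tuples and return a nested-tuple tree.

    Assumed grammar:
        expr   -> term (('+' | '-') term)*
        term   -> factor (('*' | '/') factor)*
        factor -> NUMBER | ID | '(' expr ')'
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("EOF", "")

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def expr():
        node = term()
        while peek() in (("OP", "+"), ("OP", "-")):
            op = advance()[1]
            node = (op, node, term())
        return node

    def term():
        node = factor()
        while peek() in (("OP", "*"), ("OP", "/")):
            op = advance()[1]
            node = (op, node, factor())
        return node

    def factor():
        kind, lexeme = advance()
        if kind in ("NUMBER", "ID"):
            return (kind, lexeme)
        if (kind, lexeme) == ("OP", "("):
            node = expr()
            if advance() != ("OP", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {lexeme!r}")

    tree = expr()
    if peek()[0] != "EOF":
        raise SyntaxError(f"unexpected token {peek()[1]!r}")
    return tree


# Tokens for: a + b * 2
tokens = [("ID", "a"), ("OP", "+"), ("ID", "b"), ("OP", "*"), ("NUMBER", "2")]
print(parse(tokens))
```

Because term is called from inside expr, the multiplication ends up deeper in the tree than the addition, which is how the grammar encodes operator precedence. Malformed input such as a dangling "+" raises SyntaxError, which corresponds to the error-reporting role.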

Semantic Analysis

• Semantic analysis is the third phase.
• It checks whether the parse tree follows the rules of the
language.
• It also keeps track of identifiers and expressions.
• The semantic analyzer establishes the validity of the parse tree
and produces an annotated syntax tree as output.
• Roles and responsibilities of the semantic analyzer:
– Saves collected information in the symbol table or syntax tree.
– Checks for semantic errors.
– Reports semantic errors.
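
A minimal sketch of one semantic check, type checking, is given below. The type rules (int, float, string, and "mixing int with float gives float"), the tuple tree format, and the names check and SemanticError are assumptions made for illustration. A real analyzer would typically attach the computed types to the tree nodes rather than only return them.

```python
class SemanticError(Exception):
    """Raised when the tree violates one of the (assumed) language rules."""


def check(node, symbols):
    """Return the type of `node`; `symbols` maps declared names to types."""
    head = node[0]
    if head == "NUMBER":
        return "float" if "." in node[1] else "int"
    if head == "ID":
        if node[1] not in symbols:
            raise SemanticError(f"undeclared identifier {node[1]!r}")
        return symbols[node[1]]
    # Otherwise the node has the form (operator, left, right).
    left, right = check(node[1], symbols), check(node[2], symbols)
    if "string" in (left, right):
        raise SemanticError(f"operator {head!r} is not defined for {left} and {right}")
    return "float" if "float" in (left, right) else "int"


symbols = {"rate": "float", "count": "int", "name": "string"}
print(check(("*", ("ID", "rate"), ("NUMBER", "60")), symbols))
```

The expression rate * 60 is syntactically valid and also passes the check, with type float. An expression such as name + 1 is equally valid syntactically but is rejected here, which is the kind of error only the semantic phase can find.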

Intermediate code generator

• Once the parse tree has been semantically verified, the
intermediate code generator produces three-address code.
• Intermediate code (also called intermediate text) is a middle-level
code that the compiler generates while translating a source program
into object code.

• A few important points:
– Intermediate code is neither high-level code nor machine code,
but a middle-level code.
– It can be translated to machine code later.
– This stage serves as a bridge from analysis to synthesis.

• Roles and responsibilities:
– Preserves the operator precedence (priority ordering) of the
source language.
– Translates the annotated syntax tree into intermediate code.
– Makes the operands of each instruction explicit.
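
Three-address code can be produced by a post-order walk of the tree, introducing one temporary per operation. The sketch below assumes a nested-tuple tree whose root is an assignment of the form ("=", ("ID", name), expression); the function name generate_tac and the temporary names t1, t2, ... are illustrative choices.

```python
def generate_tac(tree):
    """Return a list of three-address instructions for an assignment tree."""
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"t{counter}"

    def emit(node):
        if node[0] in ("NUMBER", "ID"):
            return node[1]  # leaves are used directly as operands
        op, left, right = node
        left_addr = emit(left)
        right_addr = emit(right)
        temp = new_temp()
        code.append(f"{temp} = {left_addr} {op} {right_addr}")
        return temp

    _, target, expression = tree
    code.append(f"{target[1]} = {emit(expression)}")
    return code


# Tree for: position = initial + rate * 60
tree = ("=", ("ID", "position"),
        ("+", ("ID", "initial"), ("*", ("ID", "rate"), ("NUMBER", "60"))))
for line in generate_tac(tree):
    print(line)
```

The output is t1 = rate * 60, then t2 = initial + t1, then position = t2. Each instruction has at most one operator on the right-hand side, and the multiplication is emitted first because it sits deeper in the tree, so the source language's precedence is preserved.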

Code optimizer

• It improves the intermediate code so that the resulting
program runs faster and uses less space.
• To improve the speed of the program, it eliminates unnecessary
lines of code and reorganizes the sequence of statements.

• Roles and responsibilities:
– Remove unused variables and unreachable code.
– Improve the run time of the program.
– Produce streamlined code from the intermediate representation.
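
Two classic optimizations, constant folding and removal of assignments to unused variables, can be sketched on straight-line three-address code. The instruction format (dest, op, a, b), the "copy" pseudo-operator, and the function names are assumptions for this example; the sketch handles only a single block with no branches.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold_constants(code):
    """Replace operations on known integer constants by their value."""
    known = {}  # variable name -> constant value discovered so far
    result = []
    for dest, op, a, b in code:
        a = known.get(a, a)
        b = known.get(b, b)
        if op == "copy" and isinstance(a, int):
            known[dest] = a
        elif op in OPS and isinstance(a, int) and isinstance(b, int):
            a, op, b = OPS[op](a, b), "copy", None
            known[dest] = a
        else:
            known.pop(dest, None)  # dest no longer holds a known constant
        result.append((dest, op, a, b))
    return result


def remove_dead_code(code, live_out):
    """Drop assignments whose destination is never used afterwards."""
    live = set(live_out)
    kept = []
    for dest, op, a, b in reversed(code):
        if dest in live:
            live.discard(dest)
            live.update(x for x in (a, b) if isinstance(x, str))
            kept.append((dest, op, a, b))
    return kept[::-1]


code = [
    ("secs", "*", 60, 60),          # secs = 60 * 60
    ("unused", "+", "secs", 1),     # result is never read
    ("t1", "*", "hours", "secs"),   # t1 = hours * secs
    ("total", "copy", "t1", None),  # total = t1
]
optimized = remove_dead_code(fold_constants(code), live_out={"total"})
for instruction in optimized:
    print(instruction)
```

Four instructions shrink to two: t1 = hours * 3600 and total = t1. The constant 3600 is computed once at compile time, and the assignments to secs and unused disappear because nothing that is live at the end depends on them.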

Code generator

• Code generation is the final phase of the compilation process.
• It takes the optimized intermediate code as input and maps it
to the target machine code.
• Roles and responsibilities:
– Translate the intermediate code to target machine code.
– Select and allocate memory locations and registers.
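
The two responsibilities can be sketched for an imaginary register machine. Everything about the target is an assumption: the mnemonics LD, ST, ADD, SUB, MUL, DIV, the three registers, and the "#" prefix for immediate values are invented for illustration. The sketch also assumes that every arithmetic instruction writes a temporary that is read exactly once later, and it gives up instead of spilling when registers run out.

```python
MNEMONIC = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}


def generate_target(code, registers=("R1", "R2", "R3")):
    """Translate (dest, op, a, b) three-address tuples into assembly text."""
    free = list(registers)
    where = {}  # temporary name -> register currently holding its value
    asm = []

    def load(operand):
        """Return a register holding `operand`, emitting a load if needed."""
        if operand in where:
            return where.pop(operand)  # temporaries are read exactly once
        if not free:
            raise RuntimeError("out of registers; a real allocator would spill")
        reg = free.pop(0)
        prefix = "#" if isinstance(operand, int) else ""
        asm.append(f"LD {reg}, {prefix}{operand}")
        return reg

    for dest, op, a, b in code:
        if op == "copy":
            reg = load(a)
            asm.append(f"ST {dest}, {reg}")  # write the value back to memory
            free.append(reg)
        else:
            ra, rb = load(a), load(b)
            asm.append(f"{MNEMONIC[op]} {ra}, {ra}, {rb}")
            free.append(rb)
            where[dest] = ra  # the result stays in a register
    return asm


code = [
    ("t1", "*", "rate", 60),
    ("t2", "+", "initial", "t1"),
    ("position", "copy", "t2", None),
]
for line in generate_target(code):
    print(line)
```

For this input the sketch emits six instructions: LD R1, rate; LD R2, #60; MUL R1, R1, R2; LD R3, initial; ADD R3, R3, R1; ST position, R3. Named variables live in memory, while temporaries never leave the registers, which is the register-allocation decision in its simplest form.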

Symbol table

• The symbol table is the main data structure of the compiler.
• It stores identifiers together with their names and types.
• It makes searching for and fetching this information fast and easy.
• The symbol table interacts with all phases of the compiler and
with the error handler, which read and update it.
• It is also responsible for scope management.
• It stores:
– Literal constants and strings.
– Function names.
– Variable names and constants.
– Labels in the source language.
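
One common way to combine fast lookup with scope management is a stack of dictionaries, one per scope. The class below is a minimal sketch of that idea; the method names (enter_scope, exit_scope, declare, lookup) and the stored attributes (kind, type) are illustrative assumptions, not a fixed interface.

```python
class SymbolTable:
    """A stack of scopes; each scope maps a name to its attributes."""

    def __init__(self):
        self.scopes = [{}]  # start with the global scope

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        if len(self.scopes) == 1:
            raise RuntimeError("cannot exit the global scope")
        self.scopes.pop()

    def declare(self, name, kind, type_):
        scope = self.scopes[-1]
        if name in scope:
            raise KeyError(f"{name!r} already declared in this scope")
        scope[name] = {"kind": kind, "type": type_}

    def lookup(self, name):
        """Search from the innermost scope outwards; return None if absent."""
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None


table = SymbolTable()
table.declare("area", "function", "float")
table.declare("x", "variable", "int")
table.enter_scope()                      # e.g. entering the body of area
table.declare("x", "variable", "float")  # shadows the outer x
print(table.lookup("x")["type"])         # inner declaration is found first
table.exit_scope()
print(table.lookup("x")["type"])         # outer declaration is visible again
```

Inside the inner scope the lookup of x yields float; after leaving it, the same lookup yields int. Names that were never declared return None, which a semantic analyzer could turn into an "undeclared identifier" error.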

Questions

Q1. Why do we use parsing in compilers?

• The parser breaks the data coming from the lexical analysis
phase (the first phase) into smaller structural components.
• It takes input in the form of a sequence of tokens and
produces a parse tree as output.

Q2. What is the output of lexical analysis?

• Lexical analysis produces a stream of tokens as
output.

Compiler Construction Tools

• The compiler writer, like any software developer, can
profitably use modern software development environments
containing tools such as language editors, debuggers, version
managers, profilers, test harnesses, and so on.
• In addition to these general software-development tools, other
more specialized tools have been created to help implement
various phases of a compiler.
• These tools use specialized languages for specifying and
implementing specific components, and many use quite
sophisticated algorithms.

• The most successful tools are those that hide the details of the
generation algorithm and produce components that can be
easily integrated into the remainder of the compiler.

Some commonly used compiler-construction tools include:

• Scanner generators produce lexical analyzers from a
regular-expression description of the tokens of a language.
• Parser generators produce syntax analyzers from a
grammatical description of a programming language.
• Syntax-directed translation engines produce collections of
routines for walking a parse tree and generating intermediate
code.

• Code-generator generators produce a code generator from a
collection of rules for translating each operation of the
intermediate language into the machine language for a target
machine.
• Data-flow analysis engines facilitate the gathering of
information about how values are transmitted from one part of
a program to each other part. Data-flow analysis is a key part
of code optimization.
• Compiler-construction toolkits provide an integrated set of
routines for constructing various phases of a compiler.
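
To make the data-flow idea concrete, here is a sketch of one standard analysis, live-variable analysis, which asks at each point which values may still be read later. The choice of this analysis, the block labels B1 to B4, the use/def set representation, and the function name liveness are assumptions for illustration. The loop repeats until no set changes, which is the usual fixed-point approach.

```python
def liveness(blocks, successors):
    """Compute live-in and live-out sets per block by fixed-point iteration.

    blocks:     {label: (use_set, def_set)}
    successors: {label: [labels of blocks that may execute next]}
    """
    live_in = {label: set() for label in blocks}
    live_out = {label: set() for label in blocks}
    changed = True
    while changed:
        changed = False
        for label, (use, define) in blocks.items():
            out = set()
            for succ in successors.get(label, []):
                out |= live_in[succ]
            new_in = use | (out - define)
            if out != live_out[label] or new_in != live_in[label]:
                live_out[label], live_in[label] = out, new_in
                changed = True
    return live_in, live_out


# Control-flow graph for a loop that sums 0 .. n-1:
#   B1: i = 0; s = 0
#   B2: if i < n goto B3 else goto B4
#   B3: s = s + i; i = i + 1; goto B2
#   B4: return s
blocks = {
    "B1": (set(), {"i", "s"}),
    "B2": ({"i", "n"}, set()),
    "B3": ({"s", "i"}, {"s", "i"}),
    "B4": ({"s"}, set()),
}
successors = {"B1": ["B2"], "B2": ["B3", "B4"], "B3": ["B2"], "B4": []}
live_in, live_out = liveness(blocks, successors)
print(sorted(live_in["B1"]), sorted(live_out["B1"]))
```

Only n is live on entry to B1, since i and s are assigned there before any use, while i, n, and s are all live on exit. An optimizer can use such sets to decide which assignments are dead and which values must be kept in registers.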

The End