Compiler Construction

This document describes a course on compiler construction. It covers the structure and phases of a compiler, including lexical analysis, syntax analysis, semantic analysis, code generation, and code optimization. The course objectives are to expose students to the links between programming languages and translators. It will explore the phases of compilation through examples and teach students to design new languages. The course is divided into 13 modules covering topics like grammars, parsing, semantics, and practical applications. Student work will be graded based on assignments, tests, and a final exam.

Uploaded by

Adegoke Bestman
Copyright: © All Rights Reserved

CSC 405 - Compiler Construction
By
Oladeji Florence A. ([email protected])
2020/2021 SESSION

DEPT. OF COMPUTER SCIENCE, CBAS COLLEGE, MOUNTAIN TOP UNIVERSITY
07/06/2021
Course Contents

• Review of compilers, assemblers and interpreters; structure and functional aspects of a typical compiler; syntax, semantics and pragmatics; functional relationship between lexical analysis, expression analysis and code generation; internal form of source programme; use of a standard compiler as a working vehicle.

Course Objectives
• To expose students to the links among programming languages, translators, string formation, strings and statements, and formal languages
• To explore the phases of compilation with illustrative examples
• To expose students to applications for building systems software for chosen languages
• To increase students' ability to design new languages, e.g. an indigenous language translator

Course Distribution
• Module 1: Introduction to program translation (review of compilers, assemblers and interpreters), strings, formal and informal languages
• Module 2: Structure and functional aspects of a typical compiler
• Module 3: Lexical analysis, expression analysis
• Module 4: Syntax phase and grammar
• Module 5: Grammar formulation and types
• Modules 6 and 7: Parsing of strings and automata
• Modules 8 and 9: Semantic phase and code generation
• Module 10: Code optimization and object code production
• Module 11: Practical application and representation of compilation skills
• Module 12: End of course revision and test
• Module 13: Tutorials
Grading system
Continuous assessment: 30%
• Assignments and lab work: 20 marks
• Mid-semester test: 10 marks
End-of-semester exam: 70 marks
Textbook:
• Compiler Construction by William M. Waite and Gerhard Goos: https://www.cs.cmu.edu/~aplatzer/course/Compilers/waitegoos.pdf

Programming languages
• Comparative studies of C, C++, Java, Pascal and Prolog, as well as Visual Basic, Visual C++, JBuilder, Microsoft .NET and scripting languages (JavaScript, PHP, Perl, etc.)
• The language generated by a grammar consists of all the strings that can be derived from that grammar. To study it, repeatedly rewrite symbols using the productions of the grammar, generate some statements or sentences, and look for patterns in the generated sentences.
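The rewriting process described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the grammar S -> a S b | a b is a hypothetical toy example (it is not taken from the course material), and the function name sentences is an illustrative choice.

```python
from collections import deque

# Hypothetical toy grammar: S -> a S b | a b
GRAMMAR = {"S": [["a", "S", "b"], ["a", "b"]]}

def sentences(grammar, start, max_len):
    """List every sentence of at most max_len symbols derivable from start.

    Assumes the grammar has no empty productions, so a sentential form
    never gets shorter and over-long forms can be discarded safely.
    """
    found = set()
    seen = {(start,)}
    queue = deque([(start,)])
    while queue:
        form = queue.popleft()
        # Locate the leftmost nonterminal in the sentential form.
        idx = next((i for i, sym in enumerate(form) if sym in grammar), None)
        if idx is None:
            found.add("".join(form))  # only terminals left: a sentence
            continue
        # Rewrite that nonterminal with each of its productions.
        for rhs in grammar[form[idx]]:
            new_form = form[:idx] + tuple(rhs) + form[idx + 1:]
            if len(new_form) <= max_len and new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return sorted(found, key=lambda s: (len(s), s))

print(sentences(GRAMMAR, "S", 6))  # ['ab', 'aabb', 'aaabbb']
```

The generated sentences suggest the pattern: n copies of a followed by n copies of b.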

Translators: Assemblers, Compilers and Interpreters
• A translator is a program that takes as input a program written in one programming language (the source language) and produces as output a program in another language (the target language).
• Assembler: An assembler is a translator that transforms program instructions written in symbolic (assembly) code into their machine-language equivalents. For example, the assembly instruction mov ax, 5 corresponds to the high-level assignment a = 5;
• Examples of symbolic codes are: ADD, SUB, JMP, MUL, MOV, MVJ, ADC, NOP, etc.
Symbolic instruction -> Assembler -> Executable machine code
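A minimal sketch of what an assembler does, written in Python. The opcode numbers and the one-byte operand format below are invented for an imaginary machine purely for illustration; real processors define their own encodings, and the function name assemble is likewise hypothetical.

```python
# Hypothetical opcode table for an imaginary 8-bit machine.
OPCODES = {"NOP": 0x00, "MOV": 0x01, "ADD": 0x02, "SUB": 0x03, "JMP": 0x04}

def assemble(lines):
    """Translate 'MNEMONIC operand ...' lines into machine-code bytes."""
    code = []
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        mnemonic, operands = parts[0].upper(), parts[1:]
        if mnemonic not in OPCODES:
            raise ValueError(f"unknown mnemonic: {mnemonic}")
        code.append(OPCODES[mnemonic])
        # Each operand is stored as one byte in this toy encoding.
        code.extend(int(op, 0) & 0xFF for op in operands)
    return bytes(code)

program = assemble(["MOV 5", "ADD 3", "NOP"])
print(" ".join(format(byte, "08b") for byte in program))
```

Printing each byte in binary shows the form in which the machine actually receives the instructions.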

High-level language -> Compiler -> Intermediate or assembly code

Translator…
• A compiler is a high-level language translator that transforms the whole of the source code into symbolic (assembly) or machine code through lexical, syntax and semantic analysis, followed by code generation and optimization.

• Machine language codes are expressed in binary digits.


Translator…
• An interpreter is a high-level language translator that translates each complete statement into machine code and executes it immediately, producing intermediate results. Each statement is translated as it is entered, so once the whole program has been written it only needs to be executed. In effect, translation happens just in time for execution.
• An interpreter does not produce a standalone executable (machine code) for the complete program.
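The statement-at-a-time behaviour can be illustrated with a short Python sketch. It assumes a hypothetical mini-language of 'name = expression' statements and leans on Python's built-in eval to evaluate each right-hand side; a real interpreter would use its own scanner, parser and evaluator instead.

```python
def interpret(statements):
    """Execute 'name = expression' statements one at a time."""
    env = {}  # variables defined so far
    for stmt in statements:
        name, expr = (part.strip() for part in stmt.split("=", 1))
        # The statement is translated and run before the next one is read,
        # so an intermediate result is available immediately.
        env[name] = eval(expr, {"__builtins__": {}}, env)
        print(f"{stmt!r:<16} -> {name} = {env[name]}")
    return env

final = interpret(["sum = 0", "a = sum + 10", "b = a / 2"])
```

No executable file is produced at any point; the only output is the effect of running each statement.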

Compiler learning
• Isn’t it an old discipline?
• Yes, it is a well-established discipline
• Its algorithms, methods and techniques were researched and developed in the early stages of the growth of computer science
• There are many compilers around and many tools to generate them automatically
• So why do we need to learn it?
• You may never write a full compiler, but the techniques we learn are useful in many tasks, such as writing an interpreter for a scripting language or validating form input
Abstract view

Source code -> Compiler -> Machine code
(the compiler also reports errors)
• Recognizes legal (and illegal) programs
• Generates correct code
• Manages storage of all variables and code
• Agrees on a format for object (or assembly) code
Front-end, Back-end division
Source code -> Front end -> IR -> Back end -> Machine code
(both ends may report errors)
• The front end maps legal code into an intermediate representation (IR)
• The back end maps the IR onto the target machine
• Simplifies retargeting
• Allows multiple front ends
• Multiple passes lead to better code
Front end
Source code -> Scanner -> tokens -> Parser -> IR
(both stages may report errors)
• Recognize legal code
• Report errors
• Produce IR
• Preliminary storage maps
Front end

• Scanner:
• Maps characters into tokens, the basic unit of syntax
• x = x + y becomes <id, x> = <id, x> + <id, y>
• Typical tokens: number, id, +, -, *, /, do, end
• Eliminates white space (tabs, blanks) and comments
• Speed is a key issue, so it is sometimes necessary to write your own scanner instead of using a tool like LEX
Front end
• Parser:
• Recognize context-free syntax
• Guide context-sensitive analysis
• Construct IR
• Produce meaningful error messages
• Attempt error correction
• There are parser generators, such as YACC, which automate much of the work
Front end
• Context-free grammars are used to represent programming language syntax:

<expr> ::= <expr> <op> <term> | <term>
<term> ::= <number> | <id>
<op> ::= + | -
Front end
• A parser tries to map a program to the syntactic elements defined in the grammar
• A parse can be represented by a tree called a parse tree or syntax tree
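A minimal sketch of such a parser for the <expr> grammar, in Python. It assumes the scanner has already produced (kind, text) token pairs; the tuple layout and the name parse_expr are illustrative choices, not part of the course material. Because the <expr> rule is left-recursive, the sketch uses a loop that builds the same left-leaning parse tree rather than recursing directly.

```python
def parse_expr(tokens):
    """Build a parse tree (nested tuples) for <expr> from (kind, text) tokens."""
    pos = 0

    def term():
        # <term> ::= <number> | <id>
        nonlocal pos
        if pos >= len(tokens) or tokens[pos][0] not in ("number", "id"):
            raise SyntaxError("expected a number or an identifier")
        leaf = tokens[pos]
        pos += 1
        return ("term", leaf)

    # <expr> ::= <term>
    tree = ("expr", term())
    # <expr> ::= <expr> <op> <term>, applied once per remaining operator
    while pos < len(tokens):
        kind, text = tokens[pos]
        if kind != "op":
            raise SyntaxError(f"expected an operator, found {text!r}")
        pos += 1
        tree = ("expr", tree, ("op", text), term())
    return tree

tokens = [("id", "x"), ("op", "+"), ("number", "2"), ("op", "-"), ("id", "y")]
print(parse_expr(tokens))
```

Every grammar rule that was applied appears as a node, which is why parse trees grow large even for short inputs.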
Front end
• A parse tree can be represented more compactly as an Abstract Syntax Tree (AST)
• The AST is often used as the IR between the front end and the back end
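The difference in size can be seen by collapsing a parse tree into an AST. The Python sketch below assumes a parse tree stored as nested tuples labelled with the grammar symbols expr, term, op, number and id; this layout and the name to_ast are illustrative assumptions.

```python
def to_ast(node):
    """Collapse a parse tree (nested tuples) into a compact AST."""
    label = node[0]
    if label in ("number", "id"):
        return node                # leaves are kept unchanged
    if label == "term":
        return to_ast(node[1])     # drop the single-child <term> wrapper
    if label == "expr" and len(node) == 2:
        return to_ast(node[1])     # <expr> ::= <term>
    # <expr> ::= <expr> <op> <term> becomes (operator, left, right)
    _, left, (_, op), right = node
    return (op, to_ast(left), to_ast(right))

# Parse tree for "x + 2" under the <expr> grammar.
parse_tree = ("expr",
              ("expr", ("term", ("id", "x"))),
              ("op", "+"),
              ("term", ("number", "2")))

print(to_ast(parse_tree))  # ('+', ('id', 'x'), ('number', '2'))
```

The AST keeps only the operator and its operands; the intermediate <expr> and <term> nodes carry no extra information for the back end.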
Factors that determine the choice of a compiler for programming projects
• Nature/purpose of the projects.
• Volume of Systems or Application programs
• The data structures, which the compiler can handle.
• Portability of generated or compiled code
• The size/speed of the compiler.
• Debugging facilities (e.g. trace, step)
• Version
• Programmer’s knowledge of the compiler

Phases of compilation
int a; == token or lexeme
int sum=0;
a=sum/2;
printf(a);

Lexical Analysis
• The lexical analyzer receives the source code as input and breaks it down into the tokens
(unit of information or recognizable unit of information) of the language.
• It separates characters of the source program into groups that logically belong together.
• These groups are called tokens. The tokens include keywords, identifiers, operators,
function names, etc.
• Tokens are typically represented by numbers, and may carry a string that holds the identifier name.
• For example, the token id(s) is associated with the string, "s". Similarly, the token num(3)
is associated with the number, 3.
• Tokens are specified by patterns, called regular expressions.
• For example, the regular expression [a-z][a-zA-Z0-9]* recognizes all identifiers with at least one alphanumeric character whose first letter is lower-case alphabetic.
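A pattern for identifiers that start with a lower-case letter and continue with letters or digits can be tried out directly with Python's re module. The helper name is_identifier is an illustrative choice.

```python
import re

# A lower-case first letter, then any number of letters or digits.
IDENTIFIER = re.compile(r"[a-z][a-zA-Z0-9]*")

def is_identifier(text):
    """Return True if the whole of text matches the identifier pattern."""
    return IDENTIFIER.fullmatch(text) is not None

for candidate in ["s", "sum2", "Sum", "2x", ""]:
    print(repr(candidate), is_identifier(candidate))
```

fullmatch is used so that the entire string must fit the pattern, not just a prefix of it.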
Lexical Analysis…
• A scanner groups input characters into tokens. For example, if the input is
• s = s*(p+3);
• The scanner generates the following sequence of tokens:
• id(s)
• =
• id(s)
• *
• (
• id(p)
• +
• num(3)
• )
• ;
• where id(s) means the identifier with name s (a program variable in this case), and num(3) indicates the integer 3
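A small regular-expression scanner that produces this kind of token sequence can be sketched in Python as follows. The token names, the set of symbols and the function name scan are assumptions made for the example; a production scanner would also track line numbers and handle comments.

```python
import re

# Token patterns, tried in order. The names are illustrative only.
TOKEN_SPEC = [
    ("num", r"\d+"),
    ("id", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("sym", r"[=*+()\-;/]"),
    ("skip", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def scan(source):
    """Group the characters of source into a list of tokens."""
    tokens, pos = [], 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"illegal character {source[pos]!r} at position {pos}")
        kind, text = match.lastgroup, match.group()
        if kind == "num":
            tokens.append(f"num({text})")
        elif kind == "id":
            tokens.append(f"id({text})")
        elif kind == "sym":
            tokens.append(text)
        # white space ("skip") produces no token
        pos = match.end()
    return tokens

print(scan("s = s*(p+3);"))
# ['id(s)', '=', 'id(s)', '*', '(', 'id(p)', '+', 'num(3)', ')', ';']
```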
• A typical scanner:
• Recognizes the keywords of the language (these are the reserved words that have a special meaning in the language, such as the word class in C++);
• Recognizes special characters, such as ( and ), or groups of special characters, such as := and ==;
• Recognizes identifiers, integers, reals, decimals, strings, etc.;
• Ignores white space (tabs and blanks) and comments;
• Recognizes and processes special directives (such as the #include "file" directive in C) and macros.
Building a scanner
• Efficient scanners can be built by using regular expressions and finite
automata. There are automated tools called scanner generators, such
as flex for C and JLex for Java, which construct a fast scanner
automatically according to specifications (regular expressions).
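To show what a finite automaton for tokens looks like, here is a hand-written Python sketch of a small deterministic finite automaton (DFA) that classifies a lexeme as an identifier or a number. The state names and the two token classes are assumptions for illustration; a generator such as flex builds a much larger table of the same kind from the regular expressions it is given.

```python
def char_class(ch):
    """Map a character to the input class used by the transition table."""
    if ch.isascii() and ch.isalpha():
        return "letter"
    if ch.isascii() and ch.isdigit():
        return "digit"
    return "other"

# Transition table: state -> {input class -> next state}
TRANSITIONS = {
    "start": {"letter": "in_id", "digit": "in_num"},
    "in_id": {"letter": "in_id", "digit": "in_id"},
    "in_num": {"digit": "in_num"},
}
# Accepting states and the token kind each one recognizes.
ACCEPTING = {"in_id": "id", "in_num": "num"}

def classify(lexeme):
    """Run the DFA over lexeme; return 'id', 'num', or None if rejected."""
    state = "start"
    for ch in lexeme:
        state = TRANSITIONS[state].get(char_class(ch))
        if state is None:
            return None  # no transition: the lexeme is rejected
    return ACCEPTING.get(state)

for lexeme in ["sum2", "42", "2x", ""]:
    print(repr(lexeme), classify(lexeme))
```

Each character is examined exactly once and the only work per character is a table lookup, which is what makes automaton-based scanners fast.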

Exercise
• Compare the use of Java Jlex and C Flex scanning software
• Group Programming languages according to OOPL, Procedural,
Functional. state five in each group.
