Introduction to System Software
CS301
What is System Software?
[Figure: Application Domain and Execution Domain]
[Figure: Application Domain, PL Domain, and Execution Domain]
[Figure: C++ Program → C++ preprocessor → C Program, with Errors reported]
[Figure: C++ Program → C++ translator → Machine Language Program]
Interpreters
An interpreter is a language processor which bridges an execution gap
without generating a machine language program.
In the classification of language processors, an interpreter is a language translator.
[Figure: the Interpreter Domain shown with the Application Domain, PL Domain, and Execution Domain]
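A minimal sketch of interpretation in Python, assuming a toy language (hypothetical, not part of the course material) whose statements have the form `name := operand + operand`. The function name `interpret` and the statement format are assumptions made for illustration. Each statement is analysed and carried out at once on a table of values, and no machine language program is ever produced.

```python
def interpret(source_lines, data):
    """Execute toy statements of the form 'x := y + z' directly.

    No target program is produced: each statement is analysed and
    its meaning is carried out immediately on the table of values.
    """
    values = dict(data)          # current values of the variables
    pc = 0                       # program counter over source statements
    while pc < len(source_lines):
        target, expr = source_lines[pc].split(":=")
        total = 0
        for operand in expr.split("+"):
            operand = operand.strip()
            # An operand is either a numeric literal or a variable name.
            if operand.replace(".", "", 1).isdigit():
                total += float(operand)
            else:
                total += values[operand]
        values[target.strip()] = total
        pc += 1                  # move on to the next statement
    return values


result = interpret(["a := b + i", "c := a + 1"], {"b": 2.5, "i": 3})
print(result["a"], result["c"])
```

The program counter here steps over source statements, not machine instructions, which is the essential difference from execution by the CPU.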
Language Processing Activities
• Program Generation Activities
• Program Execution Activities
Program Generation
[Figure: Source Program → Translator → m/c language program (Target Program); the Translator reports Errors. The Specification Gap is also labelled.]
• A program must be translated before it can be executed.
• The translated program may be saved in a file. The saved program
may be executed repeatedly.
• A program must be retranslated following modifications.
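The three points above can be tried out in miniature with Python's built-in `compile` and `exec`. This is only a loose analogy: `compile` produces a bytecode code object, not a machine language program, and the variable names (`price`, `quantity`, `total`) are made up for the illustration.

```python
source = "total = price * quantity"

# Translate once: compile() turns the source text into a code object.
# (Assumption for illustration: the code object stands in for the
# "target program"; it is bytecode, not machine language.)
target = compile(source, "<order>", "exec")

# Execute the saved translation repeatedly with different data.
results = []
for price, quantity in [(2, 3), (10, 4)]:
    env = {"price": price, "quantity": quantity}
    exec(target, env)
    results.append(env["total"])
print(results)

# After a modification, the program must be translated again.
source = "total = price * quantity + 1"
target = compile(source, "<order>", "exec")
env = {"price": 2, "quantity": 3}
exec(target, env)
print(env["total"])
```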
Program Execution
Program interpretation
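[Figure: (a) Interpretation: the Interpreter, with its PC, works on a Memory holding the Source Program and Data, and reports Errors. (b) Execution: the CPU, with its PC, works on a Memory holding the Machine Language Program and Data.]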
[Figure: Source Program → Analysis Phase → Synthesis Phase → Target Program; each phase reports Errors]
Source statements:
    i : integer;
    a, b : real;
    a := b + i;
Symbol table:
        Symbol  Type
    1   i       int
    2   a       real
    3   b       real
    4   i*      real
    5   temp    real
Note that the integer i must first be converted to real, which is why entry 4 (i*) is
added to the table.
Adding entries 3 and 4 gives entry 5 (temp), which holds the value b + i*.
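[Figure: syntax trees. For a, b : real the root is real with children a and b. For a := b + i the root is := with children a and +, and + has children b and i.]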
Semantic Analysis
• It identifies the sequence of actions necessary to implement the meaning of a
source statement.
• When it determines the meaning of a subtree in the IC, it adds information to a table
or adds an action to the sequence of actions. The analysis ends when the tree
has been completely processed.
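A rough Python sketch of this process for a := b + i, assuming the tree is stored as nested tuples and the symbol table as a dictionary from names to types. The function name `analyse`, the tuple layout, and the wording of the actions are assumptions made for illustration. The tree is processed bottom-up, and each subtree either adds a table entry (i*, temp) or adds an action.

```python
def analyse(node, types, actions):
    """Process a subtree bottom-up; return the name holding its value."""
    if isinstance(node, str):            # a leaf: an identifier
        return node
    op, left, right = node
    lhs = analyse(left, types, actions)
    rhs = analyse(right, types, actions)
    if op == ":=":
        actions.append(f"store {rhs} in {lhs}")
        return lhs
    # Arithmetic operator: convert an int operand if the other is real.
    operands = []
    for name in (lhs, rhs):
        if types[name] == "int" and "real" in (types[lhs], types[rhs]):
            converted = name + "*"
            types[converted] = "real"    # new table entry, e.g. i*
            actions.append(f"convert {name} to real, giving {converted}")
            name = converted
        operands.append(name)
    temp = "temp"    # one temporary is enough for this single statement
    types[temp] = types[operands[0]]
    actions.append(f"{op} {operands[0]}, {operands[1]}, giving {temp}")
    return temp


types = {"i": "int", "a": "real", "b": "real"}
actions = []
analyse((":=", "a", ("+", "b", "i")), types, actions)
print(list(types))
print(actions)
```

After the call, the table holds i, a, b, i*, and temp in that order, and the action list ends with the store into a.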
[Figure: three syntax trees rooted at :=]
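[Figure: Front end. Source Program → Scanning → Tokens → Parsing → Trees → Semantic Analysis → IC/IR. The steps report Lexical Errors, Syntax Errors, and Semantic Errors respectively, and use the Symbol Table, Constants Table, and other tables.]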
Synthesis Phase (Back end)
It performs memory allocation and code generation.
Memory Allocation
• The memory requirement of an identifier is computed from its type,
length and dimensionality and memory is allocated to it.
• The address of the memory area is entered in the symbol table.
        Symbol  Type  Length  Address
    1   i       int           2000
    2   a       real          2001
    3   b       real          2002
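A small Python sketch of memory allocation, assuming every int and real occupies one word and allocation starts at address 2000. Both are assumptions chosen so the addresses match the table, which leaves the Length column blank. The function name `allocate` and the (name, type, dimension) input format are also made up for illustration.

```python
def allocate(symbols, start=2000):
    """Assign consecutive addresses; return rows (name, type, length, address)."""
    # Assumed sizes in words; a real compiler takes these from the target machine.
    words = {"int": 1, "real": 1}
    table = []
    address = start
    for name, type_name, dimension in symbols:
        length = words[type_name] * dimension   # dimension is 1 for a scalar
        table.append((name, type_name, length, address))
        address += length                       # next free memory word
    return table


table = allocate([("i", "int", 1), ("a", "real", 1), ("b", "real", 1)])
for row in table:
    print(row)
```

An array of 10 reals would be passed with dimension 10, and the identifier after it would then start 10 words later.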
Synthesis Phase (Back end)
• Code Generation
• It uses knowledge of the target architecture, viz. knowledge of
instructions and addressing modes in the target computer, to select
the appropriate instructions.
• The synthesis phase may decide to hold the values of i* and temp in
machine registers and may generate the following assembly code for
a := b + i;
    CONV_R AREG, I
    ADD_R AREG, B
    MOVEM AREG, A
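A hedged Python sketch of how such code might be selected for a statement of the form target := left + right. It assumes a single register AREG and that target and left are real. The function name `generate` is made up, and the MOVER mnemonic used for a real right operand is an assumption, since the slides show only CONV_R, ADD_R, and MOVEM.

```python
def generate(target, left, right, types):
    """Emit code for 'target := left + right' using register AREG.

    The converted operand (i*) and the sum (temp) stay in AREG,
    so neither needs a memory location of its own.
    """
    code = []
    if types[right] == "int":
        # Convert the integer operand to real, leaving i* in AREG.
        code.append(f"CONV_R AREG, {right.upper()}")
    else:
        # MOVER is an assumed load mnemonic, not shown in the slides.
        code.append(f"MOVER AREG, {right.upper()}")
    code.append(f"ADD_R AREG, {left.upper()}")     # AREG now holds temp
    code.append(f"MOVEM AREG, {target.upper()}")   # store temp in target
    return code


types = {"i": "int", "a": "real", "b": "real"}
for line in generate("a", "b", "i", types):
    print(line)
```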
Synthesis Phase (Back end)
[Figure: Back end. IC/IR → Memory Allocation → Code Generation → Target Program, using the Symbol Table, Constants Table, and other tables.]
Fundamentals of Language Specification
• PL Grammars
• The lexical and syntactic features of a programming language are
specified by its grammar.
• A language L can be considered to be a collection of valid sentences.
• Each sentence can be looked upon as a sequence of words, and each
word as a sequence of letters or graphic symbols acceptable in L.
• A language specified in this manner is known as a formal language.
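To make the idea of a language as a collection of valid sentences concrete, here is a small Python sketch. The toy grammar, its words, and the function name `sentences` are all hypothetical and chosen only for illustration. Because this grammar has no recursion, the language is finite and can be listed in full.

```python
from itertools import product

# A hypothetical toy grammar: each nonterminal maps to its alternatives.
GRAMMAR = {
    "<sentence>": [["<noun phrase>", "<verb>"]],
    "<noun phrase>": [["<article>", "<noun>"]],
    "<article>": [["a"], ["the"]],
    "<noun>": [["boy"], ["apple"]],
    "<verb>": [["runs"]],
}


def sentences(symbol):
    """Return every word sequence derivable from 'symbol'."""
    if symbol not in GRAMMAR:            # a terminal symbol: a word of L
        return [[symbol]]
    result = []
    for alternative in GRAMMAR[symbol]:
        # Combine the derivations of each symbol on the right-hand side.
        for parts in product(*(sentences(s) for s in alternative)):
            result.append([word for part in parts for word in part])
    return result


language = {" ".join(words) for words in sentences("<sentence>")}
print(sorted(language))
print("the boy runs" in language, "boy the runs" in language)
```

The grammar accepts "a apple runs" even though it is poor English, a reminder that a grammar specifies form only.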
Example
1. Lexical Analysis
Purpose: The lexical analyzer (or lexer) scans the source code and breaks it down into tokens.
Tokens are the basic building blocks of the language, such as keywords, identifiers, operators, and literals.
Input: a := b + c * 5
Output (Tokens):
• a (identifier)
• := (assignment operator)
• b (identifier)
• + (addition operator)
• c (identifier)
• * (multiplication operator)
• 5 (integer literal)
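The token list above could be produced by a small scanner like the following Python sketch, built on the standard `re` module. The token names (`IDENTIFIER`, `ASSIGN`, and so on) and the function name `tokenize` are assumptions chosen for illustration.

```python
import re

# Each token class is described by a regular expression.
TOKEN_SPEC = [
    ("IDENTIFIER", r"[A-Za-z_][A-Za-z_0-9]*"),
    ("INTEGER", r"[0-9]+"),
    ("ASSIGN", r":="),
    ("PLUS", r"\+"),
    ("TIMES", r"\*"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in TOKEN_SPEC))


def tokenize(text):
    """Split 'text' into (lexeme, token class) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:                # a lexical error
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        if match.lastgroup != "SKIP":    # white space separates tokens only
            tokens.append((match.group(), match.lastgroup))
        pos = match.end()
    return tokens


for lexeme, kind in tokenize("a := b + c * 5"):
    print(lexeme, kind)
```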
2. Syntax Analysis
Purpose: The syntax analyzer (or parser) takes the tokens produced by the lexer and
arranges them into a syntactic structure according to the grammar of the language.
This phase generates a parse tree or abstract syntax tree (AST).
Input (Tokens):
• a
• :=
• b
• +
• c
• *
• 5
Output (Parse Tree):
        :=
       /  \
    id1    +
          /  \
       id2    *
             /  \
          id3    5
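One way to build such a tree is a recursive-descent parser. The Python sketch below is an illustration under assumptions: the grammar (expr → term { + term }, term → factor { * factor }), the nested-tuple tree format, and the function name `parse` are not taken from the slides. Giving * its own grammar level is what places it below + in the tree.

```python
def parse(tokens):
    """Parse 'id := expr' and return a nested-tuple syntax tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def factor():                     # factor -> identifier | integer
        token = advance()
        if token is None or not token.isalnum():
            raise SyntaxError(f"operand expected, found {token!r}")
        return token

    def term():                       # term -> factor { '*' factor }
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def expr():                       # expr -> term { '+' term }
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    target = factor()
    if advance() != ":=":
        raise SyntaxError("':=' expected")
    tree = (":=", target, expr())
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


print(parse(["a", ":=", "b", "+", "c", "*", "5"]))
```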
4. Intermediate Code Generation
Purpose: The intermediate code generator translates the AST into an intermediate representation (IR),
which is more abstract than machine code but closer to the hardware than the source program.
t1 = c * 5
t2 = b + t1
a = t2
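The three statements above can be obtained by walking the tree bottom-up and giving every operator node a fresh temporary. The Python sketch below assumes the tree is a nested tuple of the form (operator, left, right); that format and the function name `to_three_address` are assumptions for illustration.

```python
def to_three_address(tree):
    """Flatten an assignment tree into three-address statements."""
    code = []
    counter = 0

    def emit(node):
        nonlocal counter
        if isinstance(node, str):         # identifier or literal
            return node
        op, left, right = node
        left_name = emit(left)            # operands are evaluated first
        right_name = emit(right)
        counter += 1
        temp = f"t{counter}"              # fresh temporary for this operator
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    _, target, expression = tree
    result = emit(expression)
    code.append(f"{target} = {result}")
    return code


tree = (":=", "a", ("+", "b", ("*", "c", "5")))
for statement in to_three_address(tree):
    print(statement)
```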
5. Code Optimization
Purpose: The optimizer improves the intermediate code to make it more efficient without changing its output.
This can involve eliminating redundant calculations, simplifying expressions, or reordering instructions.
Input:
t1 = c * 5
t2 = b + t1
a = t2
Output:
t1 = c * 5
a = b + t1
• This example is simple, so optimization changes little: it only removes the temporary t2.
In a more complex program it could remove redundant computations or simplify operations.
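A minimal Python sketch of one such improvement, removing a temporary that is only copied into a variable by the very next statement. It assumes temporaries are named t1, t2, and so on, and that each temporary is used exactly once, which holds for code generated from a single expression tree. The function name `remove_copy_temporaries` is made up for illustration.

```python
import re


def remove_copy_temporaries(code):
    """Fold 'tN = expr' followed by 'x = tN' into 'x = expr'."""
    optimized = []
    for statement in code:
        target, value = [part.strip() for part in statement.split("=", 1)]
        if optimized:
            prev_target, prev_value = [
                part.strip() for part in optimized[-1].split("=", 1)
            ]
            # The current statement only copies the previous temporary.
            if value == prev_target and re.fullmatch(r"t\d+", prev_target):
                optimized[-1] = f"{target} = {prev_value}"
                continue
        optimized.append(statement)
    return optimized


before = ["t1 = c * 5", "t2 = b + t1", "a = t2"]
for statement in remove_copy_temporaries(before):
    print(statement)
```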
7. Code Generator
Purpose: The final phase in the compiler model is the code generator. It takes as input an intermediate representation of
the source program and produces as output an equivalent target program (assembly code, relocatable machine code, or absolute machine code).
Input :
t1 = c * 5
a = b + t1
Output:
MOVF c, R2
MULF 5, R2
MOVF b, R1
ADDF R2, R1
MOVF R1, a
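A rough Python sketch of a generator that turns the two intermediate statements into instructions of this shape. It assumes a two-address machine where `OP source, Rn` leaves its result in Rn, that temporaries are named t1, t2, and so on, and that registers are handed out in the order R2 then R1 purely to match the listing. The function name `generate` and these conventions are assumptions, not the course's algorithm.

```python
def generate(code):
    """Translate 'x = y op z' statements into two-address instructions."""
    opcodes = {"+": "ADDF", "*": "MULF"}
    free = ["R2", "R1"]        # order chosen only to match the listing
    location = {}              # temporary name -> register holding it
    output = []
    for statement in code:
        target, expression = [s.strip() for s in statement.split("=", 1)]
        left, op, right = expression.split()
        if left in location:               # left operand already in a register
            reg = location.pop(left)
        else:
            reg = free.pop(0)
            output.append(f"MOVF {left}, {reg}")
        right_reg = location.pop(right, None)   # set if right is a temporary
        output.append(f"{opcodes[op]} {right_reg or right}, {reg}")
        if right_reg:
            free.append(right_reg)         # that temporary is no longer needed
        if target.startswith("t") and target[1:].isdigit():
            location[target] = reg         # keep the temporary in its register
        else:
            output.append(f"MOVF {reg}, {target}")
            free.append(reg)
    return output


for instruction in generate(["t1 = c * 5", "a = b + t1"]):
    print(instruction)
```

Keeping t1 in a register avoids a store and a reload, which is why the listing never mentions t1.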
Phases of a Compiler
• Lexical Analysis: Tokenizing the source code.
• Syntax Analysis: Creating a parse tree or AST.
• Semantic Analysis: Checking types and ensuring correctness.
• Intermediate Code Generation: Producing an intermediate
representation.
• Code Optimization: Improving intermediate code efficiency.
• Code Generation: Producing machine or assembly code.