Complete Notes System Software

The document outlines the syllabus for the course CS1542 System Software. It covers four modules: introduction and assemblers, assemblers and macro processors, loaders and linkers, and compilers. For each module, it provides a brief description of topics to be covered, such as basic assembler functions, macro definition and expansion, design of loaders, and compiler functions like lexical analysis and code generation. It also provides background information on system software and describes the simplified instructional computer (SIC) architecture that will be used in the course.

Uploaded by

Manikandan
Copyright
© All Rights Reserved

CS1542 System Software S5 BSc CS

MUSLIM ASSOCIATION COLLEGE OF ARTS AND SCIENCE


Panavoor,Thiruvananthapuram,Kerala
(Affiliated to the University of Kerala)

Department of Computer Science


CS1542: SYSTEM SOFTWARE
Semester : 5

Name : ……………………………………………………………………………………………

Candidate Code: ……………………………………………………………………………..

Muslim Association College of Arts and Science Page 1


CS1542: SYSTEM SOFTWARE

SYLLABUS

Module I: Introduction & Assemblers: System software and machine architecture – The
simplified Instructional Computer (SIC) - Machine architecture - Data and instruction formats –
addressing modes - instruction sets - I/O and programming. Basic assembler functions - A
simple SIC assembler – Assembler algorithm and data structures

Module II: Assemblers & Macro-processor: Machine dependent assembler features –
Instruction formats and addressing modes – Program relocation - Machine independent
assembler features - Literals – Symbol-defining statements – Expressions - One pass assemblers
and Multi pass assemblers. Basic macro processor functions - Macro Definition and Expansion –
Macro Processor system software tools

Module III: Loaders And Linkers: Basic loader functions - Design of an Absolute Loader - Machine
dependent loader features - Relocation – Program Linking – Algorithm and Data Structures for
Linking Loader - Machine-independent loader features – Automatic Library Search – Loader
Options - Loader design options - Linkage Editors – Dynamic Linking – Bootstrap Loaders

Module IV: Compilers: Basic Compiler Functions: Grammars, Lexical Analysis, Syntactic Analysis,
Code Generation. Machine Dependent Compiler Features – Intermediate Form of the program,
Machine Dependent Code optimization. Machine Independent Compiler features – Structured
variables, machine-independent code optimization, Storage allocation. Compiler design options
– Division into passes


Module I
Introduction & Assemblers
System software and machine architecture – The simplified Instructional Computer (SIC) -
Machine architecture - Data and instruction formats – addressing modes - instruction sets - I/O
and programming. Basic assembler functions - A simple SIC assembler– Assembler algorithm
and data structures


System Software

• System software is the type of software required to administer the resources of the
computer system.
• System software sits between application software and the hardware, and so works as
an interface between the hardware devices and the end-user.
• System software runs in the background and manages the functioning of the computer
itself. It is called low-level software because it runs at the most basic level of the computer
and is usually written in a low-level language. Much of it is installed automatically when
the operating system is installed on the device.
• System software helps to generate the user interface and allows the operating system
to interact with the computer hardware.

Some important features of system software are listed below:

• It is very difficult to design.
• It interacts directly with the hardware, which is what enables the computer to run.
• It is difficult to manipulate.
• It is smaller in size.
• It is difficult to understand.
• It is usually written in a low-level language.
• It must be as efficient as possible for the smooth functioning of the computer system.


Types of System Software

The following are types of system software

1. Operating System
2. Programming Language Translators
3. Device Drivers
4. Firmware Software
5. Utility Software

1.Operating System

• An Operating System is the most basic type of system software. It manages the
computer hardware and software.
• It is the central part of any computer system and is responsible for the smooth
functioning of the computer.
• The operating system takes control of the computer as soon as it has started up.
• If no operating system is installed on your computer, you will not be able to use it
in the normal way.

• Some of the most common examples of operating systems are macOS, Linux, Android,
and Microsoft Windows.

2.Programming Language Translators

• Programming language translators are software that converts a program written in a
high-level language into machine language.
• A computer can only understand machine language, that is, patterns of binary digits
(0 and 1).
• Machine language is not easy for a human to read or write. Hence the programmer first
writes the program in a high-level language such as Java, Python, C, PHP or C++, and
the translator then converts it into machine code.
• The CPU executes only machine code, so any program written in a high-level
programming language must be translated into binary code first. Translating a whole
high-level language program into machine code before it is run is known as compilation.

3.Device Drivers

• The operating system communicates with hardware components internally. This
communication is managed and controlled with the help of device drivers.
• The operating system contains a number of device drivers to drive the hardware
components.
• Most device drivers, such as those for the mouse and the keyboard, are already installed
in the computer system by the manufacturer.
• For a device that is new to the operating system, users can install the driver
themselves, for example by downloading it from the internet. Some devices that require
drivers are keyboards, mice, printers etc.


4.Firmware Software

• Firmware is operational software stored on the computer motherboard and inside
individual devices, in non-volatile memory such as ROM, EPROM, EEPROM and Flash
memory chips.
• The primary function of firmware is to manage and control the activities of an
individual device.
• Initially firmware was stored on non-volatile chips that could not easily be rewritten;
later it came to be stored on flash chips, which can be updated.
• The BIOS (Basic Input/Output System) is a firmware program used for the booting
process of the system. It first loads the OS into the main memory (RAM) of your system
and then hands control over to the OS. The BIOS is stored on a ROM or flash chip rather
than on disk; hence, it is called firmware.
• Firmware exists inside the device, while a device driver is installed in the operating
system.

5.Utility Software

• Utility software works as an interface between system software and application
software.
• Utility software is designed to reduce maintenance issues and to detect errors in the
computer system. It may be supplied by a third party.
• Some utility software comes with the operating system of your computer.

Here are some specific features of utility software:

• It helps to protect users against threats and viruses.
• It helps to reduce the disk space taken up by files through compression, for example
WinRAR and WinZip.
• It helps users to back up old data and enhances the security of the system.
• It helps to recover lost data.


Simplified Instructional Computer (SIC)

Simplified Instructional Computer (SIC) is a hypothetical computer that includes the hardware
features most often found in real machines. There are two versions of this machine:

• SIC standard model
• SIC/XE (XE stands for "extra equipment" or "extra expensive")

Machine Architecture of SIC Standard Model

1.Memory of SIC

• Memory consists of bytes (8 bits). Any 3 consecutive bytes form a word (24 bits).
• A word is addressed by the location of its lowest numbered byte.
• There are 2^15 = 32,768 bytes in total in the computer memory.

2.Registers of SIC

There are 5 registers in SIC. Every register has a number associated with it, known as the register
number. Each register is 24 bits (3 bytes) long, the same as one word, and so integers are also
24 bits long.

I. A (Accumulator, register number 0): It is used for arithmetic operations.

II. X (Index Register, register number 1): It is used for addressing.

III. L (Linkage Register, register number 2): It stores the return address when a subroutine is
called.

IV. PC (Program Counter, register number 8): It holds the address of the next instruction to be
executed.

V. SW (Status Word, register number 9): It contains a variety of information, including the
condition code (CC).


3.Data Format in SIC

• Integers are represented as 24-bit binary numbers.
• Negative numbers are represented in 2’s complement.
• Characters are represented by their 8-bit ASCII codes.
• No floating point representation is available on the standard SIC model.
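
The data formats above can be illustrated with a short Python sketch. The function names (to_sic_word, from_sic_word, to_sic_chars) are made up for this illustration and are not part of SIC; the sketch only shows how a 24-bit 2's complement word and 8-bit ASCII characters look when written in hexadecimal.

```python
def to_sic_word(value: int) -> str:
    """Encode an integer as a 24-bit 2's complement word (6 hex digits)."""
    if not -(1 << 23) <= value < (1 << 23):
        raise ValueError("value does not fit in 24 bits")
    # Masking with 0xFFFFFF gives the 2's complement bit pattern.
    return f"{value & 0xFFFFFF:06X}"


def from_sic_word(hex_word: str) -> int:
    """Decode 6 hex digits back into a signed integer."""
    raw = int(hex_word, 16)
    # If the top (sign) bit is set, the value is negative.
    return raw - (1 << 24) if raw & (1 << 23) else raw


def to_sic_chars(text: str) -> str:
    """Encode each character as its 8-bit ASCII code (2 hex digits)."""
    return "".join(f"{ord(ch):02X}" for ch in text)


print(to_sic_word(5))        # 000005
print(to_sic_word(-5))       # FFFFFB
print(to_sic_chars("EOF"))   # 454F46
```

The last line gives the same value, 454F46, that these notes later quote as the internal representation of the character constant EOF.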

4.Instruction Format in SIC

• All instructions in SIC have a 24-bit format: an 8-bit opcode, a 1-bit flag x and a 15-bit
address.
• If x=0 it means direct addressing mode.
• If x=1 it means indexed addressing mode.
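
As a rough illustration, the sketch below packs the three fields of a standard SIC instruction (8-bit opcode, 1-bit x flag, 15-bit address) into one 24-bit word. The helper name encode_sic is an assumption made for this example; the opcode values 14 (STL) and 54 (STCH) are the standard SIC opcodes.

```python
def encode_sic(opcode: int, address: int, indexed: bool = False) -> str:
    """Build a 24-bit SIC instruction and return it as 6 hex digits."""
    if not 0 <= opcode <= 0xFF:
        raise ValueError("opcode must fit in 8 bits")
    if not 0 <= address <= 0x7FFF:
        raise ValueError("address must fit in 15 bits")
    # Layout: opcode in bits 23-16, x flag in bit 15, address in bits 14-0.
    word = (opcode << 16) | (int(indexed) << 15) | address
    return f"{word:06X}"


# STL with operand address 1033 (direct addressing, x = 0)
print(encode_sic(0x14, 0x1033))                 # 141033
# STCH with operand address 1039, indexed (x = 1)
print(encode_sic(0x54, 0x1039, indexed=True))   # 549039
```

Setting x = 1 adds 8000 (hex) to the lower half of the word, which is why the indexed instruction ends in 9039 rather than 1039.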

5.Addressing Modes in SIC


The different ways of specifying the location of an operand in an instruction are called
addressing modes. There are two addressing modes in SIC, selected by the x bit of the instruction:
1. Direct addressing mode
2. Indexed addressing mode

1. Direct Addressing Mode:

In direct addressing mode (x = 0), the address field in the instruction contains the effective
address of the operand and no intermediate memory access is required:
Target address = address
Example:
Add the word stored at address 1001 to the accumulator and store the result back in the
accumulator:
ADD 1001
Here 1001 is the address where the operand is stored.


2.Indexed addressing mode:


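In indexed addressing mode (x = 1), the target address of the operand is the sum of the
address field of the instruction and the contents of the index register X:
Target address = address + (X)
Example:
LDA TABLE,X
Here TABLE is an illustrative label. The word loaded into the accumulator is the one stored at
address TABLE plus the contents of X.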
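
For the standard SIC machine, the target address is the address field itself when x = 0 and the address field plus the contents of register X when x = 1. The sketch below decodes the x bit and the address field from a 24-bit instruction word and applies that rule; the function name target_address is an assumption made for this example, and address wrap-around at the end of memory is not modelled.

```python
def target_address(instruction: int, x_register: int) -> int:
    """Compute the target address of a 24-bit standard SIC instruction."""
    address = instruction & 0x7FFF        # low 15 bits: address field
    indexed = (instruction >> 15) & 1     # bit 15: the x flag
    if indexed:
        return address + x_register       # indexed: address + (X)
    return address                        # direct: address only


# 141033: x = 0, so the contents of X are ignored.
print(hex(target_address(0x141033, 5)))   # 0x1033
# 549039: x = 1, so the contents of X (5) are added to 1039.
print(hex(target_address(0x549039, 5)))   # 0x103e
```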

6.Instruction Set in SIC

The following are the types of instructions used in SIC

1. Load and Store Instructions: Used to move data between a register and memory. For
example LDA, STA, LDX, STX etc.
2. Comparison Instruction: COMP compares the value in the accumulator with a word in
memory and sets the condition code (CC) to indicate the result (<, = or >).
3. Arithmetic Instructions: Used to perform operations on the accumulator and a word in
memory and store the result in the accumulator. For example ADD, SUB, MUL, DIV etc.
4. Conditional Jump: Tests the condition code set by an earlier comparison and jumps
accordingly. For example JLT, JEQ, JGT.
5. Subroutine Linkage: JSUB jumps to a subroutine and places the return address in
register L; RSUB returns by jumping to the address contained in register L.

7.Input and Output in SIC

• Input and output are performed by transferring 1 byte at a time to or from the rightmost
8 bits of the accumulator. Each device has a unique 8-bit code.
• There are 3 I/O instructions:
   1. Test Device (TD) tests whether the device is ready or not. The condition code in the
   Status Word register is used for this purpose. If CC is < then the device is ready,
   otherwise the device is busy.
   2. Read Data (RD) reads a byte from the device and stores it in register A.
   3. Write Data (WD) writes a byte from register A to the device.


Basic assembler functions

An assembler converts an assembly language program into machine language. One of the main
functions of an assembler is to assign addresses to labels. The output of the assembler is the
object program, a machine language translation of the source program.

The basic functions of an assembler are:

• Convert mnemonic operation codes to machine language equivalents
• Convert symbolic operands to machine addresses (the addresses are assigned to labels in
Pass 1)
• Build machine instructions
• Convert data constants to internal representations
• Write the object program and assembly listing files

The following assembler directives are used:

1. START: Specify name and starting address for the program.
2. END: Indicate the end of the source program and specify the first executable instruction
in the program.
3. BYTE: Generate character or hexadecimal constant, occupying as many bytes as needed
to represent the constant.
4. WORD: Generate one-word integer constant.
5. RESB: Reserve the indicated number of bytes for a data area.
6. RESW: Reserve the indicated number of words for a data area.


A Simple SIC Assembler


The translation of a source program to object code requires the following functions:
1. Convert mnemonic operation codes to their machine language equivalents. Eg:
Translate STL to 14 (line 10).
2. Convert symbolic operands to their equivalent machine addresses. Eg: Translate RETADR
to 1033 (line 10).
3. Build the machine instructions in the proper format.
4. Convert the data constants specified in the source program into their internal machine
representations. Eg: Translate EOF to 454F46 (line 80).
5. Write the object program and the assembly listing.

All functions except function 2 can be accomplished by sequential processing of the source
program, one line at a time. Function 2 is harder because an operand may be a label that is
defined later in the program (a forward reference), so its address is not yet known when the
statement is read. This is why the assembler makes two passes over the source program.

The assembler must write the generated object code onto some output device. This object
program will later be loaded into memory for execution.

Object program format contains three types of records:

1. Header record: Contains the program name, starting address and length.
2. Text record: Contains the machine code and data of the program.
3. End record: Marks the end of the object program and specifies the address in the
program where execution is to begin.
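
The three record types can be sketched in Python as follows. The notes above give only the contents of each record, so the exact layout used here is an assumption: it follows the layout usually given for SIC object programs (a 6-character program name, 6 hex digits for each address and for the program length, 2 hex digits for the byte count of a Text record, and at most 30 bytes of object code per Text record). The function names are made up for this example, and the object code is assumed to be contiguous in memory.

```python
def header_record(name: str, start: int, length: int) -> str:
    """H + program name (padded to 6 chars) + start address + length."""
    return f"H{name:<6.6}{start:06X}{length:06X}"


def text_records(start: int, object_codes: list[str], max_bytes: int = 30) -> list[str]:
    """Pack hex object codes into T records of at most max_bytes bytes each."""
    records = []
    current = ""            # hex digits collected for the current record
    record_start = start    # address of the first byte in the current record
    address = start
    for code in object_codes:
        size = len(code) // 2                      # 2 hex digits per byte
        if current and len(current) // 2 + size > max_bytes:
            records.append(f"T{record_start:06X}{len(current) // 2:02X}{current}")
            current, record_start = "", address
        current += code
        address += size
    if current:
        records.append(f"T{record_start:06X}{len(current) // 2:02X}{current}")
    return records


def end_record(first_instruction: int) -> str:
    """E + address of the first executable instruction."""
    return f"E{first_instruction:06X}"


print(header_record("COPY", 0x1000, 0x107A))
for record in text_records(0x1000, ["141033", "482039", "001036"]):
    print(record)
print(end_record(0x1000))
```

A real assembler would also start a new Text record whenever RESB or RESW leaves a gap in the object code; that case is left out of this sketch.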


Assembler Algorithm and Data Structures

Assembler uses three major internal data structures: OPTAB,SYMTAB and LOCCTR

1.Operation Code Table (OPTAB) :


• Contains each mnemonic operation code and its machine language equivalent.
• Also contains information about instruction format and length.
• In Pass 1, OPTAB is used to look up and validate operation codes in the source program.
• In Pass 2, it is used to translate the operation codes to machine language.

2.Symbol Table (SYMTAB) :

• Includes the name and value for each label in the source program and flags to indicate
error conditions.
• During Pass 1 of the assembler, labels are entered into SYMTAB as they are encountered
in the source program along with their assigned addresses.
• During Pass 2, symbols used as operands are looked up in SYMTAB to obtain the
addresses to be inserted in the assembled instructions.

3.Location Counter (LOCCTR) :

• It is initialized to the beginning address specified in the START statement.


• After each source statement is processed, the length of the assembled instruction or
data area is added to LOCCTR.
• Whenever a label is reached in the source program, the current value of LOCCTR gives
the address to be associated with that label.
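
The way OPTAB, SYMTAB and LOCCTR work together in Pass 1 can be sketched in Python. This is only a minimal illustration under stated assumptions: each source statement is already split into a (label, opcode, operand) tuple, OPTAB holds just a few standard SIC opcodes, every instruction is 3 bytes long, and the sample program and the names pass1 and statement_length are made up for the example.

```python
# A small subset of OPTAB: mnemonic -> machine opcode (standard SIC values).
OPTAB = {"LDA": 0x00, "STA": 0x0C, "ADD": 0x18, "RSUB": 0x4C}


def statement_length(opcode, operand):
    """Return the number of bytes the statement occupies in memory."""
    if opcode in OPTAB or opcode == "WORD":
        return 3
    if opcode == "RESW":
        return 3 * int(operand)
    if opcode == "RESB":
        return int(operand)
    if opcode == "BYTE":
        body = operand[2:-1]                 # text between the quotes
        return len(body) if operand[0] == "C" else len(body) // 2
    raise ValueError(f"invalid operation code: {opcode}")


def pass1(statements):
    """Assign addresses to labels; return (SYMTAB, program length)."""
    symtab = {}
    locctr = start = 0
    for label, opcode, operand in statements:
        if opcode == "START":
            locctr = start = int(operand, 16)   # initialize LOCCTR
            continue
        if opcode == "END":
            break
        if label:
            if label in symtab:
                raise ValueError(f"duplicate symbol: {label}")
            symtab[label] = locctr              # label gets current LOCCTR
        locctr += statement_length(opcode, operand)
    return symtab, locctr - start


PROGRAM = [
    ("COPY", "START", "1000"),
    ("FIRST", "LDA", "FIVE"),
    (None, "ADD", "FIVE"),
    (None, "STA", "RESULT"),
    (None, "RSUB", None),
    ("FIVE", "WORD", "5"),
    ("RESULT", "RESW", "1"),
    (None, "END", "FIRST"),
]

symtab, length = pass1(PROGRAM)
for name, address in symtab.items():
    print(f"{name:<8}{address:04X}")
print("program length:", length, "bytes")
```

Note that FIVE and RESULT are used as operands before they are defined; Pass 1 does not need their addresses yet, it only records them in SYMTAB for Pass 2.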


Algorithm

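Pass 1 (define symbols):

Step1: Read the source program and initialize LOCCTR to the address given in the START
statement.

Step2: For each statement, enter its label (if any) into SYMTAB together with the current value
of LOCCTR.

Step3: Look up the operation code in OPTAB to validate it, and add the length of the instruction
or data area to LOCCTR.

Pass 2 (assemble instructions):

Step4: Translate each operation code using OPTAB and each symbolic operand using SYMTAB,
and build the object code.

Step5: Write the object program and the assembly listing.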
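
The second pass of a two-pass assembler can be sketched in Python as below. It is a simplified illustration, not a complete assembler: it assumes that a SYMTAB has already been built by a first pass, that statements are (label, opcode, operand) tuples, that OPTAB holds only a few standard SIC opcodes, and that indexed addressing is written as "SYMBOL,X". The name pass2 and the sample data are made up for the example.

```python
# A small subset of OPTAB: mnemonic -> machine opcode (standard SIC values).
OPTAB = {"LDA": 0x00, "STA": 0x0C, "ADD": 0x18, "STCH": 0x54, "RSUB": 0x4C}


def pass2(statements, symtab):
    """Translate statements into a list of hex object codes."""
    object_codes = []
    for _label, opcode, operand in statements:
        if opcode in OPTAB:
            address, indexed = 0, False
            if operand:
                indexed = operand.endswith(",X")
                symbol = operand[:-2] if indexed else operand
                if symbol not in symtab:
                    raise ValueError(f"undefined symbol: {symbol}")
                address = symtab[symbol]        # look up the operand in SYMTAB
            # opcode (8 bits) | x flag (1 bit) | address (15 bits)
            word = (OPTAB[opcode] << 16) | (int(indexed) << 15) | address
            object_codes.append(f"{word:06X}")
        elif opcode == "WORD":
            object_codes.append(f"{int(operand) & 0xFFFFFF:06X}")
        elif opcode == "BYTE":
            body = operand[2:-1]                # text between the quotes
            if operand[0] == "C":
                object_codes.append("".join(f"{ord(c):02X}" for c in body))
            else:
                object_codes.append(body.upper())
        # START, END, RESB and RESW generate no object code.
    return object_codes


SYMTAB = {"FIVE": 0x100C, "RESULT": 0x100F, "BUFFER": 0x1039}
STATEMENTS = [
    (None, "LDA", "FIVE"),
    (None, "STCH", "BUFFER,X"),
    (None, "RSUB", None),
    ("FIVE", "WORD", "5"),
    ("EOF", "BYTE", "C'EOF'"),
    ("RESULT", "RESW", "1"),
]
print(pass2(STATEMENTS, SYMTAB))
```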

************************END OF MODULE 1**************************


Module II
Assemblers & Macro-processor
Machine dependent assembler features – Instruction formats and addressing modes – Program
relocation - Machine independent assembler features -Literals – Symbol-defining statements –
Expressions - One pass assemblers and Multi pass assemblers. Basic macro processor functions
- Macro Definition and Expansion – Macro Processor system software tools


Assembler
• An assembler is a program that converts assembly language into machine code.
• It takes the basic commands and operations from assembly code and converts them into
binary code that can be recognized by a processor.
• Assemblers are similar to compilers in that they translate source code into machine
code. However, assemblers are simpler, since they only convert low-level code (assembly
language) to machine code.
• Each assembly language is designed for a specific processor. Assembling a program is
largely a one-to-one mapping from assembly statements to machine instructions.

Machine Dependent Assembler Feature


• Instruction formats and Addressing Modes
• Program relocation.

1.Instruction formats

• An instruction format defines the layout of an instruction.
• Each assembly language statement is split into an opcode and an operand. The opcode
specifies the operation that is executed by the CPU and the operand is the data or memory
location used by that operation.
• The format must, implicitly or explicitly, indicate the addressing mode for each operand.
• For most instruction sets, more than one instruction format is used.
• General Instruction Format
Opcode_Field Address_Field
• Example
MOV B, A
The examples in this section use Intel 8085 instructions. The 8085 supports three different
instruction lengths:
1 byte instruction
2 byte instruction
3 byte instruction


1. One-byte instructions –
In 1-byte instruction, the opcode and the operand of an instruction are represented in one
byte.
Example: Copy the contents of accumulator in register B.
MOV B, A
Opcode- MOV
Operand- B, A

2. Two-byte instructions –
A two-byte instruction is the type of instruction in which the first 8 bits indicate the opcode and
the next 8 bits indicate the operand.
Example: Load the hexadecimal data 32H in the accumulator.
MVI A, 32H
Opcode- MVI
Operand- A, 32H

3. Three-byte instructions –
Three-byte instruction is the type of instruction in which the first 8 bits indicates the opcode
and the next two bytes specify the 16-bit address.
Example-1: Load contents of memory 2050H in the accumulator.
LDA 2050H
Opcode- LDA
Operand- 2050H

Addressing Modes

The different ways in which the location of an operand is specified in an instruction are referred
to as “Addressing Modes”. The three basic modes of addressing are −
1. Register addressing
2. Immediate addressing
3. Indirect (memory) addressing


1.Register Addressing Mode


• Register addressing mode involves the use of registers to hold the data to be
manipulated.
• In this addressing mode, a register contains the operand.
• Depending upon the instruction, the register may be the first operand, the second
operand or both.
Eg: MOV A,R0
    MOV R1,R0

2.Immediate Addressing Mode

• In this addressing mode, the source operand is a constant.


• In immediate addressing mode, as the name indicates when the instruction is
assembled, the operand comes immediately after the opcode.
• An immediate operand has a constant value or an expression.
• When an instruction with two operands uses immediate addressing, the first operand
may be a register or memory location, and the second operand is an immediate
constant.
Eg: ADD A, 65 ; An immediate operand 65 is added with A

3.Indirect Addressing Mode


• In this addressing mode, the instruction does not give the operand or its address
explicitly.
• Instead, it provides information from which the memory address of the operand can be
determined. This address is the effective address of the operand.
• The instruction names a register or memory location that contains the address of the
operand. That register or memory location acts as a pointer to the operand.
Eg: Move N,R1      ; load the contents of location N into R1
    Move #Num1,R2  ; load the address Num1 into R2
    Add (R2),R0    ; indirect: R2 is used as a pointer to the operand


Program Relocation
Relocation means moving a program from one place in memory to another. Suppose a program
contains some absolute addresses, which are correct only if the program is loaded at a certain
address A. If the program is loaded at a different address B, all of these addresses must be
updated by adding B−A to each of them. This is address relocation. A loader (a program that
loads another program into memory) which does this is called a relocating loader.
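
The B−A adjustment described above can be shown with a small Python sketch. The representation is an assumption made for illustration: the program is a list of values, and a separate list gives the positions of the values that hold absolute addresses (a real object program would carry this information in relocation or modification records). The function name relocate is also made up.

```python
def relocate(values, address_positions, assembled_at, loaded_at):
    """Return a copy of the program with its absolute addresses adjusted."""
    offset = loaded_at - assembled_at           # this is B - A
    relocated = list(values)
    for position in address_positions:
        relocated[position] = values[position] + offset
    return relocated


# Program assembled for address A = 1000 (hex), loaded at B = 5000 (hex).
# Positions 0 and 2 hold addresses; position 1 holds the constant 5.
program = [0x1033, 5, 0x2039]
moved = relocate(program, [0, 2], assembled_at=0x1000, loaded_at=0x5000)
print([hex(v) for v in moved])   # ['0x5033', '0x5', '0x6039']
```

Only the values marked as addresses change; the constant 5 is left alone, which is why the loader must be told which parts of the program are address-sensitive.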
Absolute Program

• An absolute program is a program whose starting address is specified at assembly time.
• Its addresses may be invalid if the program is loaded somewhere else.

Machine Independent Assembler Features


1.Literals
Assembly language source code can contain numeric, string, Boolean, and single character
literals.
Literals can be expressed as:
• Decimal numbers, for example 123.
• Hexadecimal numbers, for example 0x7B.
• Numbers in any base from 2 to 9, for example 5_204 is a number in base 5.
• Floating point numbers, for example 123.4.
• Boolean values {TRUE} or {FALSE}.
• Single character values enclosed by single quotes, for example 'w'.
• Strings enclosed in double quotes, for example "This is a string".
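
As a rough sketch, the literal forms listed above could be recognised as follows. The function name parse_literal is made up for this illustration, the checks are deliberately simple, and malformed input is not handled carefully.

```python
def parse_literal(text: str):
    """Convert the text of a literal into a Python value."""
    if text in ("{TRUE}", "{FALSE}"):
        return text == "{TRUE}"                      # Boolean
    if len(text) == 3 and text[0] == text[2] == "'":
        return text[1]                               # single character
    if len(text) >= 2 and text[0] == text[-1] == '"':
        return text[1:-1]                            # string
    if text.lower().startswith("0x"):
        return int(text[2:], 16)                     # hexadecimal
    if "_" in text:
        base, digits = text.split("_", 1)            # base_digits, e.g. 5_204
        if not 2 <= int(base) <= 9:
            raise ValueError("base must be between 2 and 9")
        return int(digits, int(base))
    if "." in text:
        return float(text)                           # floating point
    return int(text)                                 # decimal


for literal in ["123", "0x7B", "5_204", "123.4", "{TRUE}", "'w'"]:
    print(literal, "->", parse_literal(literal))
```

For example, 5_204 is read as 2×25 + 0×5 + 4 = 54 in decimal.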

2.Symbol
• A symbol in almost all assemblers is a combination of letters and digits which begins with
a letter. It usually has a mnemonic name.
• A numeric symbol answers the question "how many" and an address symbol answers the
question "where".
• An assembler symbol is a human-readable representation of a number, or of a position in
the program.


• A symbol that represents a number is called a numeric symbol. A symbol that represents
a position in the program is called an address symbol.
• Example
ADD A,100

3.Symbol-Defining Statements
The symbol-defining statements are EQU and ORG.

1.EQU : The EQU directive gives a symbolic name to a numeric constant, a register-relative
value or a PC-relative value. It is used to assign a value or expression to a symbol.

Syntax
name EQU expr{, type}
• name is the symbolic name to assign to the value.
• expr is a value or expression.
Example
• abc EQU 2
Assigns the value 2 to the symbol abc.

2.ORG: This directive is used at the time of assigning starting address for a module or segment.

Syntax
ORG memory_location
Example
ORG 1050H

By this instruction, the assembler gets to know that the statements following this instruction,
must be stored in the memory location beginning with address 1050H.
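
The effect of EQU and ORG on the symbol table and the location counter can be sketched in Python. The sketch makes several simplifying assumptions: statements are (label, opcode, operand) tuples, every ordinary instruction is 3 bytes long, and an operand is either a symbol that is already defined, a decimal number, or a hexadecimal number ending in H. The function names are made up for the example.

```python
def evaluate(operand, symtab):
    """Return the value of a symbol or numeric operand."""
    if operand in symtab:
        return symtab[operand]
    if operand.upper().endswith("H"):
        return int(operand[:-1], 16)     # hexadecimal such as 1050H
    return int(operand)                  # decimal


def assign_addresses(statements, start=0):
    """Build a symbol table, handling EQU and ORG."""
    symtab = {}
    locctr = start
    for label, opcode, operand in statements:
        if opcode == "EQU":
            symtab[label] = evaluate(operand, symtab)   # name gets the value
        elif opcode == "ORG":
            locctr = evaluate(operand, symtab)          # reset location counter
        else:
            if label:
                symtab[label] = locctr
            locctr += 3                  # assumption: fixed 3-byte instructions
    return symtab


table = assign_addresses([
    ("abc", "EQU", "2"),
    (None, "ORG", "1050H"),
    ("LOOP", "LDA", "abc"),
    ("NEXT", "ADD", "abc"),
])
for name, value in table.items():
    print(f"{name:<6}{value:04X}")
```

The symbol abc receives the value 2 directly, while LOOP and NEXT receive addresses counted from 1050H because of the ORG statement.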


Expressions
Expressions consist of one or more integer literals or symbol references, combined using
operators.
There are two types of expressions: simple and complex.

1.Simple Expression

A simple expression takes the form:

var1 := term1 op term2;

var1 is a variable; term1 and term2 are variables or constants.
op is the operator, which corresponds to a mnemonic for the specified operation (e.g., "+" =
add, "-" = sub, etc.).
Example
Var1 := A + B; (the operation is add)

2.Complex Expressions

A complex expression is any arithmetic expression involving more than two terms and more
than one operator.

A complex expression takes the form:

var1 := term1 op term2 op term3;

A complex expression that is easy to convert to assembly language is one that involves three
terms and two operators, for example:

X := W * Y + Z;
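
To show how such an expression maps onto instructions, the Python sketch below generates SIC-style assembly for an assignment of the form var1 := term1 op term2 op term3. It is an illustration under stated assumptions: the terms are memory locations, the operators are applied strictly from left to right (so operator precedence is not handled), and the function name translate is made up.

```python
# Operator -> SIC arithmetic mnemonic.
MNEMONIC = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}


def translate(target, terms, operators):
    """Generate accumulator-based code, evaluating left to right."""
    if len(operators) != len(terms) - 1:
        raise ValueError("need exactly one operator between each pair of terms")
    code = [f"LDA {terms[0]}"]                     # load the first term into A
    for operator, term in zip(operators, terms[1:]):
        code.append(f"{MNEMONIC[operator]} {term}")
    code.append(f"STA {target}")                   # store the result
    return code


# X := W * Y + Z;
for line in translate("X", ["W", "Y", "Z"], ["*", "+"]):
    print(line)
```

This works for W * Y + Z because the multiplication happens to come first; an expression such as W + Y * Z would need a temporary location or a reordering of the terms.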


One Pass Assemblers and Multi Pass Assemblers


An assembler is a program that converts assembly language into machine code. It takes the
basic commands and operations from assembly code and converts them into binary code that
can be recognized by a specific type of processor.

The following assembler directives are used:

• START: Specify name and starting address for the program.
• END: Indicate the end of the source program.
• BYTE: Generate a character or hexadecimal constant, occupying as many bytes as needed
to represent the constant.
• WORD: Generate one-word integer constant.
• RESB: Reserve the indicated number of bytes for a data area.
• RESW: Reserve the indicated number of words for a data area.

Types of assembler:
1. Single Pass Assembler /One Pass Assembler
2. Two Pass Assembler

1.Single Pass Assembler /One Pass Assembler


A single pass assembler scans the program only once and creates the equivalent binary
program.
The assembler substitutes all of the symbolic instructions with machine code in one pass.
Single pass assemblers are used when
• It is necessary or desirable to avoid a second pass over the source program
• The external storage for the intermediate file between two passes is slow or is
inconvenient to use


Forward References Problem in One Pass Assembler


• The rules of assembly language state that every symbol used must be defined somewhere
in the program.
• But in some cases a symbol may be used prior to its definition. Such a reference is called
a forward reference.
• Because the address of the symbol is not yet known, a one pass assembler cannot
immediately assemble the instruction. This is called the forward reference problem.
Eliminating forward references:
• Forward references can be eliminated by requiring that all labels be defined in the source
program before they are referenced.
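
A forward reference can be detected mechanically by checking, line by line, whether each operand symbol has already been defined. The Python sketch below does this under simple assumptions: statements are (label, opcode, operand) tuples, an operand is a single symbol or a constant, and the function name find_forward_references is made up for the example.

```python
def find_forward_references(statements):
    """Return (line number, symbol) for each symbol used before its definition."""
    all_labels = {label for label, _, _ in statements if label}
    defined = set()
    forward = []
    for number, (label, _opcode, operand) in enumerate(statements, start=1):
        if label:
            defined.add(label)            # the label is defined on this line
        if operand in all_labels and operand not in defined:
            forward.append((number, operand))
    return forward


program = [
    ("FIRST", "LDA", "FIVE"),      # FIVE is defined only on line 3
    (None, "STA", "RESULT"),       # RESULT is defined only on line 4
    ("FIVE", "WORD", "5"),
    ("RESULT", "RESW", "1"),
]
print(find_forward_references(program))   # [(1, 'FIVE'), (2, 'RESULT')]
```

If the two data definitions were moved to the top of the program, the function would return an empty list, which is the situation described in the rule above.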

2.Two Pass Assembler

• In a two pass assembler, the source program is processed in two passes.
• Some internal tables and subroutines are used only during Pass 1.
• The main problem in assembling a program in one pass involves forward references. A two
pass assembler eliminates it by doing the conversion in two separate passes: the addresses
of all labels are collected in Pass 1, so they are all known when the instructions are
translated in Pass 2.
PASS 1 Operations
• Assign addresses to all statements in the program.
• Store the addresses assigned to all symbolic labels for use in Pass 2.
PASS 2 Operations
• Translate opcodes and symbolic operands.
• Generate data values defined by BYTE, WORD etc.
• Process the assembler directives not handled in Pass 1.
• Write the object program and assembly listing.


Basic macro processor functions


• A macro processor is system software that replaces each macro instruction with the
corresponding group of source language statements. This is called expanding the
macro.
• In essence, a macro processor is a program that copies a stream of text from one place to
another, making systematic replacements as it does so.
• Macro processors are often embedded in other programs, such as assemblers and
compilers. Sometimes they are standalone programs that can be used to process any
kind of text.
• A macro represents a commonly used group of statements in the source
programming language.
• Macro instructions allow the programmer to write a shorthand version of a program.
• Macros can be defined and used in many programming languages, like C, C++ etc.
Example of Macro
• #define max(a, b) a>b ? a : b
Defines the macro max, taking two arguments a and b. This macro may be called like
any C function, using identical syntax. Therefore, after preprocessing
z = max(x, y);
becomes z = x>y ? x : y;

Macro Definition
A macro definition consists of a name, a set of formal parameters and a body of code.
Two assembler directives (MACRO and MEND) are used in macro definitions.
The statements between these directives are called the macro body, and they are associated
with the macro name.
• MACRO: identifies the beginning of a macro definition
• MEND: identifies the end of a macro definition
• Each parameter begins with ‘&’

The structure of a macro definition is

Macroname MACRO Parameters
          Statement 1
          Statement 2
          MEND


Macro Expansion.
A macro call leads to macro expansion. During macro expansion, the macro call statement is
replaced by the sequence of statements in the macro body, with the arguments of the call
substituted for the formal parameters.
The entire body of the macro is inserted wherever the macro is called.
#define max(a,b) a>b?a:b
int main(void)
{
    int x, y, z;
    x = 4; y = 6;
    z = max(x, y);   /* expands to: z = x>y?x:y; */
    return 0;
}
The above program is written in C. It defines the macro max, taking two arguments a and b.
This macro may be called like any C function, using identical syntax. Therefore, after
preprocessing,
z = max(x, y);
becomes z = x>y ? x : y;
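
Macro definition and expansion with the MACRO and MEND directives can be sketched in Python. This is a minimal illustration under stated assumptions, not a full macro processor: a definition has the form "name MACRO &P1,&P2", a call has the form "name arg1,arg2" with no label, every definition is closed by a MEND line, and nested definitions and conditional expansion are not handled. The macro SUM and the function name expand_macros are made up for the example.

```python
def expand_macros(lines):
    """Store macro definitions and replace each macro call by its body."""
    macros = {}      # macro name -> (formal parameters, body statements)
    output = []
    i = 0
    while i < len(lines):
        fields = lines[i].split()
        if len(fields) >= 2 and fields[1] == "MACRO":
            # Definition: collect the body up to the matching MEND.
            name = fields[0]
            params = fields[2].split(",") if len(fields) > 2 else []
            body = []
            i += 1
            while lines[i].split() != ["MEND"]:
                body.append(lines[i])
                i += 1
            macros[name] = (params, body)
        elif fields and fields[0] in macros:
            # Call: copy the body, substituting arguments for parameters.
            params, body = macros[fields[0]]
            args = fields[1].split(",") if len(fields) > 1 else []
            # Replace longer parameter names first so &AB is not hit by &A.
            pairs = sorted(zip(params, args), key=lambda p: -len(p[0]))
            for statement in body:
                for param, arg in pairs:
                    statement = statement.replace(param, arg)
                output.append(statement)
        else:
            output.append(lines[i])      # ordinary statement: copy unchanged
        i += 1
    return output


source = [
    "SUM MACRO &A,&B",
    "LDA &A",
    "ADD &B",
    "MEND",
    "SUM X,Y",
    "STA Z",
]
for line in expand_macros(source):
    print(line)
```

The definition itself produces no output; only the call SUM X,Y is replaced, by LDA X and ADD Y.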

Macro Processor System Software Tools


System software tools are used to process programs inside the computer system and to make
the execution of application software efficient. The following are the basic tools:
1. Translator
2. Assembler
3. Compiler
4. Interpreter
5. Pre processor
6. Linker
7. Loader


1. Translator
• A Translator is a system program that converts a program in one language to a program
in another language.
• A Translator can be denoted by a symbol built from the following three languages:
S = Source Language
T = Target Language
B = Base Language in which the translator is written
2. Assembler

• An Assembler is a language translator which takes as input a program in the assembly
language of machine A and generates its equivalent machine code for machine A.

3. Compiler
• A Compiler is a language translator that takes as input a source program in some HLL
and converts it into a lower-level language (i.e. machine or assembly language).

• So, an HLL program is first compiled to generate an object file with machine-level
instructions (i.e. compile time) and then instructions in object file are executed (i.e. run
time).

4. Interpreters

• An Interpreter is similar to a compiler, but one big difference is that it executes each line
of source code as soon as its equivalent machine code is generated. (This approach is
different from a compiler, which compiles the entire source code into an object file that
is executed separately).
• If there are any errors during interpretation, they are notified immediately to the
programmer and remaining source code lines are not processed.
• The main advantage of Interpreters is that, since they immediately notify the
programmer of an error, debugging becomes a lot easier.


5. Pre-processors

• A Pre-processor takes a program in an HLL as input and produces another HLL program
as output.
• Typically pre-processors are seen as system software used to perform some additional
functions (such as removal of white spaces and comments) before the actual translation
process can begin.
• So, the input and output of a pre-processor are generally in the same HLL; only some
additional functions have been performed on the source.

6. Linker

• A Linker (or a Linkage Editor) takes the object file, combines it with the external
subroutines from the library and resolves their external references in the main program.
• A Compiler generates an object file after compiling the source code. But this object file
cannot be executed immediately after it gets generated.
• This is because the main program may use separate subroutines in its code (locally
defined in the program or available globally as language subroutines). The external
subroutines have not been compiled with the main program and therefore their addresses
are not known in the program.

7. Loaders
• A Loader does the job of coordinating with the OS to get the initial loading address for
the program, prepares the program for execution and loads it at that address.

• Also, during the course of its execution, a program may be relocated to a different area
of main memory by the OS (when memory is needed for other programs).

• An important job of the Loader is to modify the address-sensitive instructions of the
program, so that they run correctly after relocation.

************************END OF MODULE 2**************************


Module III
Loaders And Linkers

Basic loader functions - Design of an Absolute Loader - Machine dependent loader features -
Relocation – Program Linking – Algorithm and Data Structures for Linking Loader - Machine-
independent loader features – Automatic Library Search – Loader Options- Loader design
options - Linkage Editors – Dynamic Linking – Bootstrap Loaders


Basic Loader Functions


• A loader is a major component of an operating system that ensures all necessary
programs and libraries are loaded, which is essential during the startup phase of running
a program. It places the libraries and programs into the main memory in order to
prepare them for execution.
• Loading involves reading the contents of the executable file that contains the
instructions of the program and then carrying out the other preparatory tasks listed below.

Loader Function: The loader performs the following functions:


1) Allocation
2) Linking
3) Relocation
4) Loading

1.Allocation:
• Allocates the space in the memory where the object program would be loaded for
Execution.
• It allocates the space for program in the memory, by calculating the size of the program.
This activity is called allocation.


• In an absolute loading scheme, allocation is done by the programmer, and hence it is the
duty of the programmer to ensure that the programs do not overlap.

2.Linking:
• It links two or more object codes and provides the information needed to allow
references between them.
• It resolves the symbolic references (code/data) between the object modules by
assigning all the user subroutine and library subroutine addresses. This activity is called
linking.
• In an absolute loader, linking is done by the programmer, as the programmer is aware of
the runtime addresses of the symbols.
• In other loading schemes, linking is done by the loader.
3.Relocation:
• It modifies the object program by changing certain instructions so that it can be
loaded at an address different from the location originally specified.
• There are some address-dependent locations in the program. Such address constants
must be adjusted according to the allocated space. This activity, done by the loader, is
called relocation.
• In an absolute loading scheme, relocation is done by the assembler, as the assembler is
aware of the starting address of the program.

4.Loading:
• It brings the object program into the memory for execution.
• It places all the machine instructions and data of the corresponding programs into
memory. The program is then ready for execution. This activity is called loading.
• Loading is done by the loader and hence the assembler must supply to the loader the
object program.


Design of an Absolute Loader


In an absolute loader, the object code is loaded at the locations in memory that are specified
in the object program. At the end, the loader jumps to the specified address to begin execution
of the loaded program. The role of absolute loader is as shown in the figure

The advantage of an absolute loader is that it is simple and efficient.

In the object program, each byte of assembled code can be given using its hexadecimal
representation in character form. This is easy for human beings to read, but it takes two
characters for every byte.
Alternatively, each byte of object code can be stored as a single byte, which is more compact.
Begin
    read Header record
    verify program name and length
    read first Text record
    while record type is <> 'E' do begin
        move object code to specified location in memory
        read next object program record
    end
    jump to address specified in End record
End
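
The algorithm above can be sketched in Python. This is only an illustration under stated assumptions: the record layout (a Header record with a 6-character name, 6 hex digits of start address and 6 hex digits of length; Text records with a 6-digit address, a 2-digit byte count and the object code; an End record with a 6-digit execution address) is the SIC-style layout and is assumed here, not taken from these notes. The function name absolute_load and the sample program are hypothetical.

```python
def absolute_load(records, memory):
    """Copy the object code of Text records into memory; return the execution address."""
    header = records[0]
    if header[0] != "H":
        raise ValueError("first record must be a Header record")
    name = header[1:7].strip()
    start = int(header[7:13], 16)
    length = int(header[13:19], 16)
    # "verify program name and length": here we only check that the program fits
    if start + length > len(memory):
        raise ValueError(f"program {name} does not fit in memory")
    for record in records[1:]:
        if record[0] == "E":
            # End record: address at which execution should begin
            return int(record[1:7], 16)
        if record[0] == "T":
            address = int(record[1:7], 16)
            count = int(record[7:9], 16)
            # two hexadecimal characters in the record become one byte in memory
            code = bytes.fromhex(record[9:9 + 2 * count])
            memory[address:address + count] = code
    raise ValueError("missing End record")


# Hypothetical 12-byte program to be loaded at address 1000 (hex)
object_code = "141033482039001036281030"
program = [
    "H" + "COPY".ljust(6) + "001000" + "00000C",
    "T" + "001000" + "0C" + object_code,
    "E" + "001000",
]
memory = bytearray(0x2000)
entry_point = absolute_load(program, memory)
```

A real loader would transfer control to entry_point; the sketch simply returns it.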


Machine-Dependent Loader Features


• The main problem with an absolute loader is that the programmer needs to specify the
actual address at which the program will be loaded into memory.
• On a simple computer with a small memory the actual address at which the program
will be loaded can be specified easily.
• On a larger and more advanced machine, we do not know in advance where a program
will be loaded. Hence we write relocatable programs instead of absolute ones.
• Loaders that allow for program relocation are called relocating loaders or relative
loaders.

1.Relocation
• In general, the user does not know a priori where the program will reside in memory. A
relocating loader is capable of loading a program to begin anywhere in memory:
• The addresses produced by the compiler run from 0 to L–1. After the program has been
loaded, the addresses must run from N to N +L–1.
• Therefore, the relocating loader adjusts, or relocates, each address in the program.
• Fields that are relocated are called relative; those which are not relocated are
called absolute.

• Here L is the length of the program, so its addresses before loading run from 0 to L–1.
• N is the address in memory at which the program is loaded.
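
As a small sketch of this adjustment, the Python fragment below adds the load address N to every relative field and leaves absolute fields alone. The representation is an assumption made for illustration: the program is a list of words, and the assembler is assumed to supply the positions of the relative fields. The names relocate, code and relative_fields are hypothetical.

```python
def relocate(code, relative_fields, load_address):
    """Return a copy of code with load_address added to each relative field."""
    relocated = list(code)
    for index in relative_fields:
        relocated[index] += load_address
    return relocated


# Program of length L = 4 assembled for addresses 0 .. L-1.
# Words 0 and 2 hold addresses (relative); words 1 and 3 hold constants (absolute).
code = [0x0003, 0x00FF, 0x0001, 0x0005]
relative_fields = [0, 2]
N = 0x4000
loaded = relocate(code, relative_fields, N)
```

After relocation the two address fields point into the range N to N+L–1, while the constants are unchanged.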


2.Program Linking
A whole program usually is not written in a single file. Apart from code and data definitions in
multiple files, a user code often makes references to code and data defined in some
"libraries".
Linking is the process in which references to "externally" defined objects (code and data) are resolved.
Traditionally linking used to be performed as a task after basic translation of the user program
files.
Two important aspects in linking are - locating the individual object modules in the combined
executable program image, and adjusting the addresses used for external references in the
various places in the program.
Linking is performed at the last step in compiling a program.
Source code -> compiler -> Assembler -> Object code -> Linker -> Executable file -> Loader
Linking is of two types:
1. Static Linking – It is performed before the program is run, as the last step of building it. It
takes a collection of relocatable object files and command-line arguments and generates a fully
linked object file that can be loaded and run.
2. Dynamic linking – Dynamic linking is performed during the run time. This linking is
accomplished by placing the name of a shareable library in the executable image.


Algorithm and Data Structures for Linking Loader


A linking loader usually makes two passes over its input, just as an assembler does.
In terms of general function, the two passes of a linking loader are quite similar to the two
passes of an assembler:
• Pass 1 assigns addresses to all external symbols.
• Pass 2 performs the actual loading, relocation, and linking.
Data Structure used in Linking Loader
1. ESTAB
2. PROGADDR
3. CSADDR
1.ESTAB
• It is used to store the name and address of each external symbol in the set of control
sections being loaded.
2.PROGADDR
• It is the beginning address in memory where the linked program is to be loaded. Its
value is supplied to the loader by the OS.
3.CSADDR
• It contains the starting address assigned to the control section currently being scanned
by the loader. This value is added to all relative addresses within the control section to
convert them to actual addresses.


Algorithm for Linking Loader


There are two passes in a linking loader.
During Pass 1, the loader is concerned only with Header and Define record types in the control
sections.
Pass 2 performs the actual loading, relocation, and linking of the program.

Algorithm for Pass 1 of a Linking loader


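Step 1) The beginning load address for the linked program (PROGADDR) is obtained from the
OS. This becomes the starting address (CSADDR) of the first control section.
Step 2) The control section name from the Header record is entered into ESTAB, with value given
by CSADDR. Each external symbol in a Define record is also entered into ESTAB, with value
equal to its relative address plus CSADDR.
Step 3) When the End record is read, the control section length CSLTH (which was saved from
the Header record) is added to CSADDR, giving the starting address of the next control section.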
• At the end of Pass 1, ESTAB contains all external symbols defined in the set of control
sections together with the address assigned to each.

Algorithm for Pass 2 of a Linking loader


Step 1) As each Text record is read, the object code is moved to the specified address
Step 2) When a Modification record is encountered, the symbol whose value is to be used for
modification is looked up in ESTAB.
Step 3) This value is then added to or subtracted from the indicated location in memory.
Step 4) The last step performed by the loader is usually the transferring of control to the loaded
program to begin execution.
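
The two passes can be sketched as follows. This is a simplified illustration, not the full algorithm: instead of parsing Header, Define, Text and Modification records, each control section is assumed to be already available as a Python dictionary, and a modification is assumed to be a tuple (relative address, size in bytes, +1 or -1, symbol). The function names pass1 and pass2 and the sample sections PROGA and PROGB are hypothetical.

```python
def pass1(sections, progaddr):
    """Assign an address to every control section and external symbol (build ESTAB)."""
    estab = {}
    csaddr = progaddr
    for section in sections:
        names = [(section["name"], 0)] + list(section["defines"].items())
        for symbol, offset in names:
            if symbol in estab:
                raise ValueError("duplicate external symbol " + symbol)
            estab[symbol] = csaddr + offset
        csaddr += section["length"]      # start of the next control section
    return estab


def pass2(sections, estab, memory):
    """Load the object code, then apply the modifications using ESTAB."""
    for section in sections:
        csaddr = estab[section["name"]]
        for offset, data in section["text"]:
            memory[csaddr + offset:csaddr + offset + len(data)] = data
        for offset, size, sign, symbol in section["mods"]:
            if symbol not in estab:
                raise LookupError("undefined external symbol " + symbol)
            location = csaddr + offset
            value = int.from_bytes(memory[location:location + size], "big")
            value = (value + sign * estab[symbol]) % (1 << (8 * size))
            memory[location:location + size] = value.to_bytes(size, "big")


proga = {"name": "PROGA", "length": 0x10, "defines": {"LISTA": 0x4},
         "text": [(0, bytes.fromhex("000000"))], "mods": [(0, 3, +1, "LISTB")]}
progb = {"name": "PROGB", "length": 0x08, "defines": {"LISTB": 0x2},
         "text": [(0, bytes.fromhex("000005"))], "mods": [(0, 3, +1, "LISTA")]}

memory = bytearray(0x5000)
estab = pass1([proga, progb], progaddr=0x4000)
pass2([proga, progb], estab, memory)
```

Pass 1 never touches memory, which is why a reference from PROGA to a symbol defined later in PROGB can still be resolved in Pass 2.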


Machine-Independent Loader Features


Loading and linking are OS service functions. They include the use of an automatic library
search process for handling external references and some common options that can be selected
at the time of loading and linking. The following are machine-independent loader features.

1.Automatic Library Search


• Many linking loaders can automatically incorporate routines from a subprogram library
into the program being loaded.
• Linking loaders that support automatic library search must keep track of external
symbols that are referred to, but not defined, in the primary input to the loader.
• At the end of Pass 1, the symbols in ESTAB that remain undefined represent unresolved
external references.
• The loader searches the library to find programs (subroutines) that contain the
definitions of these symbols, and processes them as if they were part of the primary input.
• The subroutines fetched from a library in this way may themselves contain
external references.
• It is therefore necessary to repeat the library search process until all references are
resolved.
• If unresolved external references remain after the library search is completed, these
must be treated as errors.
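
The repeated search can be sketched as a loop that keeps fetching library routines until no unresolved reference can be satisfied. The data layout is an assumption for illustration: each module is a dictionary with the sets of symbols it defines and refers to, and the library is a dictionary from symbol name to the module that defines it. The names resolve_with_library, SQRT and ABS are hypothetical.

```python
def resolve_with_library(modules, library):
    """Return the primary modules plus every library module needed to resolve references."""
    loaded = list(modules)
    defined, referred = set(), set()
    for module in loaded:
        defined |= module["defines"]
        referred |= module["refers"]
    while True:
        unresolved = referred - defined
        found = [symbol for symbol in sorted(unresolved) if symbol in library]
        if not found:
            break
        for symbol in found:
            if symbol in defined:
                continue                     # already brought in by an earlier fetch
            module = library[symbol]
            loaded.append(module)
            defined |= module["defines"]
            referred |= module["refers"]     # fetched routines may add new references
    unresolved = referred - defined
    if unresolved:
        raise LookupError("unresolved external references: " + ", ".join(sorted(unresolved)))
    return loaded


sqrt_module = {"name": "SQRT", "defines": {"SQRT"}, "refers": {"ABS"}}
abs_module = {"name": "ABS", "defines": {"ABS"}, "refers": set()}
library = {"SQRT": sqrt_module, "ABS": abs_module}

main_module = {"name": "MAIN", "defines": {"MAIN"}, "refers": {"SQRT"}}
loaded = resolve_with_library([main_module], library)
loaded_names = [module["name"] for module in loaded]
```

MAIN refers only to SQRT, but SQRT in turn refers to ABS, so a second round of the search is needed before all references are resolved.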


2.Loader Options
Many loaders allow the user to specify options that modify the standard processing.
Loader option 1: Allows the selection of alternative sources of input.
• Ex : INCLUDE program-name (library-name) might direct the loader to read the
designated object program from a library and treat it as input.
Loader option 2: Allows the user to delete or rename external symbols or entire control sections.
• Ex : CHANGE name1, name2 might cause the external symbol name1 to be changed to
name2 wherever it appears in the object programs.
Loader option 3: Involves the automatic inclusion of library routines to satisfy external
references.
• Ex. : LIBRARY MYLIB
Such user-specified libraries are normally searched before the standard system libraries.
This allows the user to use special versions of the standard routines.


Loader Design Options


Linking loaders perform all linking and relocation at load time. There are two alternatives:
1. Linkage editors, which perform linking prior to load time.
2. Dynamic linking, in which the linking function is performed at execution time.

1.Linkage Editors
• The linkage editor performs relocation of all control sections relative to the start of the
linked program.
• Thus, all items that need to be modified at load time have values that are relative to the
start of the linked program.
• This means that the loading can be accomplished in one pass with no external symbol
table required.
• Linkage editors can perform many useful functions besides simply preparing an object
program for execution.
• Linkage editors can also be used to build packages of subroutines or other control
sections that are generally used together.
• This can be useful when dealing with subroutine libraries that support high-level
programming languages


2.Dynamic Linking Loader:


• Dynamic Linking Loader is a general relocatable loader.
• It allows the programmer to use multiple program segments and multiple data
segments.
• It gives the programmer complete freedom in referencing data or instructions
contained in other segments.
• Dynamic linking defers much of the linking process until a program starts running.
• It provides a variety of benefits that are hard to get otherwise.
• Dynamically linked shared libraries are easier to create than statically linked shared
libraries.
• Dynamically linked shared libraries are easier to update than statically linked shared
libraries.
• Dynamic linking permits a program to load and unload routines at runtime, a facility that
can otherwise be very difficult to provide.


Bootstrap loader
• A bootstrap loader is a program that resides in the computer's EPROM, ROM, or another
non-volatile memory.
• It is automatically executed by the processor when turning on the computer.
• The bootstrap loader reads the hard drive's boot sector to continue to load the
computer's operating system.
• When the computer is turned on or restarted, the bootstrap loader first performs the
power-on self-test, also known as POST. If the POST is successful and no issues are
found, the bootstrap loader loads the operating system for the computer into memory.
The computer can then access, load, and run the operating system.
• The term bootstrap comes from the old phrase "Pull yourself up by your bootstraps."
• A bootstrap loader is alternatively referred to as a bootloader or boot program.

************************END OF MODULE 3**************************


Module IV
Compilers
Basic Compiler Functions: Grammars, Lexical Analysis, Syntactic Analysis, Code Generation.
Machine Dependent Compiler Features – Intermediate Form of the program, Machine
Dependent Code optimization. Machine Independent Compiler features – Structured variables,
machine-independent code optimization, Storage allocation. Compiler design options – Division
into passes


Compiler
• A compiler is a translator that converts the high-level language into the machine
language.
• High-level language is written by a developer and machine language can be understood
by the processor.
• The compiler also reports errors in the source program to the programmer.
• The main purpose of a compiler is to translate code written in one language into another
language without changing the meaning of the program.
• A program written in an HLL is processed in two parts before it can be executed.
• In the first part, the source program is compiled and translated into the object program
(low level language).
• In the second part, the object program is translated into the target program through the
assembler.


Basic Compiler Functions


The compilation process contains the sequence of various phases. Each phase takes source
program in one representation and produces output in another representation. Each phase
takes input from its previous stage.


1.Lexical Analysis:

• Lexical analyzer phase is the first phase of compilation process.


• It takes source code as input.
• It reads the source program one character at a time and groups the characters into
meaningful lexemes, which are passed on to the next phase as tokens.
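
A minimal sketch of such a scanner is given below, using Python's re module. The token classes (NUMBER, ID, ASSIGN, OP) and their patterns are assumptions chosen for a tiny Pascal-like language; they are not defined in these notes.

```python
import re

# Each pair is (token class, regular expression); SKIP covers white space.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("ID", r"[A-Za-z_]\w*"),
    ("ASSIGN", r":="),
    ("OP", r"[+\-*/()]"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Scan source from left to right and return a list of (class, lexeme) pairs."""
    tokens = []
    position = 0
    while position < len(source):
        match = TOKEN_RE.match(source, position)
        if match is None:
            raise SyntaxError(f"illegal character {source[position]!r} at position {position}")
        if match.lastgroup != "SKIP":        # white space is discarded
            tokens.append((match.lastgroup, match.group()))
        position = match.end()
    return tokens


tokens = tokenize("SUM := SUM + 42")
```

Each pair holds the token class and the lexeme that was matched; the parser in the next phase works on this list rather than on individual characters.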

2.Syntax Analysis

• Syntax analysis is the second phase of compilation process.


• It takes tokens as input and generates a parse tree as output.
• In the syntax analysis phase, the parser checks whether the expression made by the tokens
is syntactically correct.

3.Semantic Analysis

• Semantic analysis is the third phase of compilation process.


• It checks whether the parse tree follows the rules of language.
• Semantic analyzer keeps track of identifiers, their types and expressions.
• The output of the semantic analysis phase is the annotated syntax tree.

4.Intermediate Code Generation

• In intermediate code generation, the compiler translates the source program into
intermediate code.
• Intermediate code lies between the high-level language and the machine
language.
• The intermediate code should be generated in such a way that you can easily translate it
into the target machine code.

5.Code Optimization

• Code optimization is an optional phase. It is used to improve the intermediate code so
that the output of the program could run faster and take less space.
• It removes the unnecessary lines of the code and arranges the sequence of statements
in order to speed up the program execution.

6.Code Generation

• Code generation is the final stage of the compilation process.
• It takes the optimized intermediate code as input and maps it to the target machine
language.
• Code generator translates the intermediate code into the machine code of the specified
computer.

Machine Dependent Compiler Features


• Translating programs from the high-level language to the machine language is the
most prominent purpose of a compiler. The design of high-level programming
languages is relatively independent of the machine being used.
• At the lowest level, however, the generated code is machine dependent, because
generating code requires knowledge of the instruction set of the target computer.
• The following are machine dependent compiler features

1.Intermediate form
It is the form of the program on which code optimization is carried out. At this point, the
syntax and the semantics of the source statements have already been analysed. The
intermediate form is a sequence of quadruples of the form
Operation op1, op2, result
Here, the operation is the function to be performed by the object code. The
operands for this operation are op1 and op2. The resulting value is placed in result.


Intermediate code-generation routines create the quadruples, which are then used for
machine-dependent code generation.
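
As an illustration, the sketch below produces quadruples for the assignment SUM := A + B * C. The expression is assumed to be already parsed into a nested tuple (operator, left, right), and the temporary names i1, i2 and the function name gen_quads are hypothetical conventions chosen for this example.

```python
def gen_quads(node, quads):
    """Append quadruples for the expression tree node; return the name holding its value."""
    if isinstance(node, str):
        return node                          # a variable or constant needs no code
    operation, left, right = node
    op1 = gen_quads(left, quads)
    op2 = gen_quads(right, quads)
    result = f"i{len(quads) + 1}"            # new temporary for the intermediate result
    quads.append((operation, op1, op2, result))
    return result


# SUM := A + B * C, with the expression parsed as A + (B * C)
quads = []
value = gen_quads(("+", "A", ("*", "B", "C")), quads)
quads.append((":=", value, "", "SUM"))
```

The multiplication is emitted first because its result is needed as an operand of the addition.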

2.Machine-Dependent Code Optimization


• Machine-dependent code optimization begins with the assignment and use of registers.
• There are several general-purpose registers which are used for holding the constants as
well as the values of the variables.
• Some of these registers can also be used for addressing.

Machine-Independent Compiler Features


1.Structured Variables
• Structured variables include arrays, records, sets and strings.
• The allocation of memory to these structured variables is an important task of the compiler.
• Arrays are discussed here in some detail.
Example
A: ARRAY [1..10] OF INTEGER
Here we have ten elements in the array. If each integer in the memory uses one word,
then we need to declare the allocation of ten words in the memory for array storage.
We can consider a two-dimensional array.
B: ARRAY [0..3, 1..6] OF INTEGER
The first subscript can take four different values, from 0 to 3. The second subscript can
take six different values, from 1 to 6. So, the total amount of memory needed for storing
the array is 4*6=24 words.
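
The same calculation can be sketched for any number of dimensions. The second function also shows how the position of one element might be computed; it assumes row-major order (the last subscript varies fastest), which is an assumption made here and is not stated in these notes. The function names are hypothetical.

```python
def array_words(bounds, words_per_element=1):
    """Number of words needed for an array whose subscripts have the given (low, high) bounds."""
    total = words_per_element
    for low, high in bounds:
        total *= high - low + 1
    return total


def element_offset(bounds, subscripts, words_per_element=1):
    """Offset in words of one element from the start of the array, assuming row-major order."""
    offset = 0
    for (low, high), index in zip(bounds, subscripts):
        if not low <= index <= high:
            raise IndexError(f"subscript {index} outside {low}..{high}")
        offset = offset * (high - low + 1) + (index - low)
    return offset * words_per_element


a_bounds = [(1, 10)]            # A: ARRAY [1..10] OF INTEGER
b_bounds = [(0, 3), (1, 6)]     # B: ARRAY [0..3, 1..6] OF INTEGER
a_words = array_words(a_bounds)
b_words = array_words(b_bounds)
first = element_offset(b_bounds, (0, 1))
last = element_offset(b_bounds, (3, 6))
```

The first element B[0,1] is at offset 0 and the last element B[3,6] is at offset 23, which agrees with the total of 24 words.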

2.Machine-Independent Code Optimization


One source of code optimization is the elimination of common subexpressions. These are
subexpressions that appear at more than one point in the program and compute the same value.
Common subexpressions are detected by analysing the program's intermediate form.
Another source of code optimization is the removal of loop invariants. These are
subexpressions within a loop whose values do not change from one iteration of the loop to the
next, so they can be computed once before the loop is entered.
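
Elimination of common subexpressions can be sketched on a list of quadruples. The sketch works only under these assumptions, made for illustration: the quadruples form straight-line code, no name is assigned more than once, and the results of arithmetic quadruples are compiler temporaries that may safely be dropped. The function and variable names are hypothetical.

```python
ARITHMETIC = {"+", "-", "*", "/"}


def eliminate_common_subexpressions(quads):
    """Drop arithmetic quadruples that recompute a value which is already available."""
    available = {}      # (operation, op1, op2) -> temporary that already holds the value
    alias = {}          # dropped temporary -> temporary to use in its place
    optimized = []
    for operation, op1, op2, result in quads:
        op1 = alias.get(op1, op1)
        op2 = alias.get(op2, op2)
        key = (operation, op1, op2)
        if operation in ARITHMETIC and key in available:
            alias[result] = available[key]   # reuse the earlier result, emit nothing
            continue
        if operation in ARITHMETIC:
            available[key] = result
        optimized.append((operation, op1, op2, result))
    return optimized


# X := (2*J + K) - (2*J + K), written out as quadruples
quads = [
    ("*", "2", "J", "i1"),
    ("+", "i1", "K", "i2"),
    ("*", "2", "J", "i3"),
    ("+", "i3", "K", "i4"),
    ("-", "i2", "i4", "i5"),
    (":=", "i5", "", "X"),
]
optimized = eliminate_common_subexpressions(quads)
```

The second computation of 2*J and of 2*J + K is removed, and the later quadruples are rewritten to use i1 and i2 instead of i3 and i4.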

3.Storage Allocation Techniques


It is the process of allocating storage space for the program. There are three types of storage
allocation
1. Static Storage Allocation
2. Stack Storage Allocation
3. Heap Storage Allocation
I. Static Storage Allocation
• If the memory for a program is allocated at compile time, it is allocated in the
static area.
• Memory that is allocated at compile time is allocated only once.
• It does not support dynamic data structures, i.e. memory is allocated at compile time and
deallocated only after program completion.
• The drawbacks are that the size of the data must be known at compile time and that
recursion is not supported.
• Eg- FORTRAN was designed to permit static storage allocation.
II. Stack Storage Allocation
• Storage is organised as a stack and activation records are pushed and popped.
• Locals are contained in activation records so they are bound to fresh storage in each
activation.
• Recursion is supported in stack allocation
III. Heap Storage Allocation
• Memory allocation and deallocation can be done at any time and at any place
depending on the requirement of the user.
• Heap allocation is used to dynamically allocate memory to the variables and claim it
back when the variables are no longer required.
• Recursion is supported.

Compiler Design Options


A compiler can broadly be divided into two phases based on the way they compile.

1.Analysis Phase
• Known as the front-end of the compiler, the analysis phase of the compiler reads the
source program, divides it into core parts and then checks for lexical, grammar and
syntax errors.
• The analysis phase generates an intermediate representation of the source program
and symbol table, which should be fed to the Synthesis phase as input.

2.Synthesis Phase
Known as the back-end of the compiler, the synthesis phase generates the target program
with the help of intermediate source code representation and symbol table.
A compiler can have many phases and passes.
• Pass : A pass refers to the traversal of a compiler through the entire program.
• Phase : A phase of a compiler is a distinguishable stage, which takes input from the
previous stage, processes and yields output that can be used as input for the next
stage. A pass can have more than one phase.

************************END OF MODULE 4**************************
