CD Assignment Question Bank

Assignment 1

1. Explain the various phases of a compiler with a diagram


Ans.
1. A compiler is a software program that converts the high-level source code written in a
programming language into low-level machine code that can be executed by the computer
hardware.
2. The process of converting the source code into machine code involves several phases or
stages, which are collectively known as the phases of a compiler.
3. Broadly, a compiler has two phases, namely the analysis phase and the synthesis phase.
4. The analysis phase creates an intermediate representation from the given source code and
the synthesis phase creates an equivalent target program from the intermediate representation.
5. The typical phases of a compiler are:
a. Lexical Analysis: The first phase of a compiler is lexical analysis, also known as scanning. It
takes source code as input. It reads the source program one character at a time and converts it
into meaningful lexemes. Lexical analyzer represents these lexemes in the form of tokens.
b. Syntax Analysis: The second phase of a compiler is syntax analysis, also known as parsing. It
takes tokens as input and generates a parse tree as output. In this phase, the parser checks
whether the expression formed by the tokens is syntactically correct. The output of this
phase is usually an Abstract Syntax Tree (AST).
c. Semantic Analysis: The third phase of a compiler is semantic analysis. This phase checks
whether the code is semantically correct, i.e., whether it conforms to the language’s type
system and other semantic rules.
d. Intermediate Code Generation: The fourth phase of a compiler is intermediate code
generation. This phase generates an intermediate representation of the source code that can be
easily translated into machine code.
e. Optimization: The fifth phase of a compiler is optimization. This phase applies various
optimization techniques to the intermediate code to improve the performance of the generated
machine code.
f. Code Generation: The final phase of a compiler is code generation. This phase takes the
optimized intermediate code and generates the actual machine code that can be executed by
the target hardware.
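As a small illustration (the temporaries, register names, and pseudo-assembly below are our own, not fixed by any standard), a single statement passes through the phases roughly as follows:

Source statement:      a = b + c * 60
Tokens:                id(a)  =  id(b)  +  id(c)  *  num(60)
Intermediate code:     t1 = c * 60
                       t2 = b + t1
                       a = t2
Target code (pseudo):  MOV c, R0
                       MUL #60, R0
                       ADD b, R0
                       MOV R0, a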

2. Explain Token, Pattern, Lexeme with example


Ans.
1. A compiler is system software that translates the source program written in a high-level
language into a low-level language. The compilation process of source code is divided into
several phases in order to ease the process of development and designing. The phases work in
sequence as the output of the previous phase is utilized in the next phase.
2. Lexical Analysis Phase: In this phase, input is the source program that is to be read from left to
right and the output we get is a sequence of tokens that will be analyzed by the next Syntax
Analysis phase. During scanning the source code, white space characters, comments, carriage
return characters, preprocessor directives, macros, line feed characters, blank spaces, tabs, etc.
are removed. The lexical analyzer (scanner) also helps in error detection; for example, invalid
constants and misspelled keywords are caught in this phase. Regular expressions are used as a
standard notation for specifying the tokens of a programming language.
3. Token: A token is a sequence of characters treated as a single unit that cannot be broken
down further. In a language like C, keywords (int, char, float, const, goto, continue, etc.),
identifiers (user-defined names), operators (+, -, *, /), delimiters/punctuators such as the
comma (,), semicolon (;), and braces ({ }), and strings can all be tokens. This phase
recognizes three types of tokens: Terminal Symbols (TRM) - keywords and operators, Literals
(LIT), and Identifiers (IDN).
Example: int a = 10; //Input source code
Tokens:
int (keyword), a (identifier), = (operator), 10 (constant) and ; (punctuation-semicolon)
4. Lexeme: A lexeme is a sequence of characters in the source code that matches the pattern of
some token, i.e., an instance of a valid token.
Example:
main is a lexeme of type identifier (token)
(, ), {, } are lexemes of type punctuation (token)
5. Pattern: A pattern specifies the set of rules a scanner follows to recognize a lexeme as a
valid token.
Example of Programming Language (C, C++):
i. For a keyword to be identified as a valid token, the pattern is the exact sequence of
characters that forms the keyword.
ii. For an identifier to be identified as a valid token, the pattern is the predefined rule that it
must start with a letter, followed by letters or digits.
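The three notions fit together as in the small table below (the regular expressions are conventional illustrations of our own, not mandated by any one compiler):

Token        Pattern (regular expression)       Sample lexemes
keyword      if                                 if
identifier   [a-zA-Z][a-zA-Z0-9]*               main, count, x1
number       [0-9]+                             10, 42
operator     \+|\-|\*|/                         +, -, *, /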

3. Explain error recovery strategies in a parser


Ans.
The error may occur at various levels of compilation, so error handling is important for the
correct execution of code. There are mainly five error recovery strategies, which are as follows:
1. Panic mode
2. Phrase level recovery
3. Error production
4. Global correction
5. Symbol table
Panic Mode:

This strategy is used by most parsing methods. On discovering an error, the parser discards
input symbols one at a time until one of a designated set of synchronizing tokens (such as a
semicolon or closing brace) is found.

Phrase Level Recovery:

In this strategy, on discovering an error, the parser performs local correction on the remaining
input: it may replace a prefix of the remaining input with some string that allows parsing to
continue (for example, replacing a comma with a semicolon).

Error Production:

This strategy requires good knowledge of the common errors that might be encountered; the
grammar of the language is then augmented with error productions that generate the erroneous
constructs, so the parser can recognize them and issue appropriate diagnostics.

Global Correction:

Ideally, we want a compiler that makes as few changes as possible when transforming an
incorrect input string into a correct one. Algorithms exist that choose a minimal sequence of
insertions, deletions, and changes, but they are generally too costly to use in practice.

Symbol Table:

Semantic errors are recovered by consulting the symbol table for the corresponding identifier;
for example, if the data types of two operands are not compatible, automatic type conversion is
done by the compiler where the language allows it.
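As a small sketch of the error-production idea (using the predefined error token of YACC; the rule below is an illustrative fragment we have added, meant to sit inside a larger YACC grammar):

stmt : expr ';'
     | error ';'    { yyerrok; }   /* on a syntax error inside a statement,
                                      discard input up to the next ';',
                                      reduce by this error production, resume */
     ;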

4. Explain Lexical analyzer in detail


Ans.
1. Lex is a program that generates a lexical analyzer. It is commonly used together with the
YACC parser generator.
2. The lexical analyzer is a program that transforms an input stream into a sequence of tokens.
3. The Lex compiler reads a Lex specification and produces C source code that implements the
lexical analyzer.
4. The workflow of Lex is as follows:
a. First, the programmer writes a specification lex.l in the Lex language. The Lex compiler
then processes lex.l and produces a C program lex.yy.c.
b. Next, the C compiler compiles lex.yy.c and produces an object program a.out.
c. a.out is the lexical analyzer that transforms an input stream into a sequence of tokens.
5. Lex file format: A Lex program is separated into three sections by %% delimiters. The format
of a Lex source file is as follows:
{ definitions }
%%
{ rules }
%%
{ user subroutines }
6. Definitions: include declarations of constants, variables, and regular definitions.
• Rules: statements of the form p1 {action1} p2 {action2} ... pn {actionn},
where each pi is a regular expression and each actioni describes what the lexical analyzer
should do when pattern pi matches a lexeme.
• User subroutines: auxiliary procedures needed by the actions. The subroutines can be
compiled separately and loaded with the lexical analyzer.
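A minimal Lex specification in this format might look as follows (a sketch we have added for illustration; the token names and messages are our own):

%{
#include <stdio.h>
%}
digit    [0-9]
letter   [a-zA-Z]
%%
{digit}+                    { printf("NUMBER: %s\n", yytext); }
{letter}({letter}|{digit})* { printf("IDENTIFIER: %s\n", yytext); }
[ \t\n]+                    ; /* skip whitespace */
.                           { printf("OTHER: %s\n", yytext); }
%%
int yywrap(void) { return 1; }
int main(void) { yylex(); return 0; }

Running lex on this file produces lex.yy.c, which the C compiler turns into the scanner a.out.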

5. Explain Lexical Analyzer tool LEX


Ans.
Refer to the answer to Question 4 above: Lex is itself the lexical analyzer tool, and its workflow
and file format are described there.

6. Explain applications of compiler technology.


Ans.
1. A compiler is a piece of software that translates high-level programming language source
code into machine code. It translates code written in one programming language into another
without changing its meaning.
2. Applications of compiler technology:

i. Artificial Intelligence: Compilers are used in the field of artificial intelligence (AI) to
optimize and generate code for deep learning models, computer vision, natural
language processing, and other AI applications. AI compilers can optimize code for
specific hardware architectures and can generate highly efficient code for AI
workloads.
ii. Gaming: Game development often involves the use of compilers to generate code that
runs on game consoles and PCs. Gaming compilers are optimized for performance,
allowing game developers to create immersive, high-performance games.
iii. Security: Compilers are used in security applications to create code that is resistant to
various forms of attacks, including buffer overflows, code injections, and other
security vulnerabilities. Security compilers can generate code that is highly resistant
to reverse engineering and tampering.
iv. Embedded Systems: Embedded systems are computer systems that are designed to
perform specific functions in various devices such as automobiles, medical
equipment, and consumer electronics. Compilers are used to generate machine
code that runs on these devices, ensuring efficient use of resources and optimal
performance.

Assignment No 2

1. Explain Recursive Descent Parser with example


Ans.
1. Recursive Descent Parser (RDP) works by recursively descending through the grammar rules
of a language to analyze and process the syntax of the input text.

2. This process begins at the top level of the grammar, where the parser identifies the non-
terminal symbol that corresponds to the starting rule of the language.

3. The parser then calls the corresponding parsing function for that non-terminal symbol, which
recursively calls parsing functions for each of its sub-rules.

4. During this process, the parser compares the current input token to the expected token for the
current rule, using lookahead to determine which rule to follow next.

5. If the current input token matches the expected token, the parser moves on to the next token
and continues the parsing process.

6. If the input token does not match the expected token, the parser generates a syntax error and
stops processing the input text.

7. As the parser descends through the grammar rules, it builds a parse tree that represents the
structure of the input text according to the language’s grammar rules.

8. This parse tree can then be used to further analyze and process the input text, such as by
generating code or performing semantic analysis.

9. Overall, the RDP method provides a simple and efficient way to analyze and process the
syntax of a language.

10. By recursively descending through the grammar rules, the parser can efficiently handle a
wide variety of context-free grammars, making recursive descent a popular choice in
programming language design.
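Since the question asks for an example, here is a minimal recursive-descent sketch in C for the toy grammar E -> T ('+' T)*, T -> digit (the grammar, names, and input string are our own illustration):

#include <stdio.h>
#include <stdlib.h>

static const char *input;              /* current position in the input */

static void error(void) { printf("syntax error at '%c'\n", *input); exit(1); }

static void match(char expected) {
    if (*input == expected) input++;   /* consume the expected symbol */
    else error();
}

static void T(void) {                  /* T -> digit */
    if (*input >= '0' && *input <= '9') input++;
    else error();
}

static void E(void) {                  /* E -> T ('+' T)* */
    T();
    while (*input == '+') { match('+'); T(); }
}

int main(void) {
    input = "1+2+3";
    E();                               /* start from the start symbol */
    if (*input == '\0') printf("accepted\n"); else error();
    return 0;
}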
2. Explain non-recursive predictive parsing with example
Ans.
1. A non-recursive predictive parser can be built by maintaining a stack explicitly,
rather than implicitly via recursive calls. The parser mimics a leftmost derivation. If w is
the input that has been matched so far, then the stack holds a sequence of grammar
symbols α such that
2. S ⇒* wα
3. The parser used here is called a table-driven parser, with the following arrangement:
• an input buffer that contains the string to be parsed, with $ as the end-marker
symbol,
• a stack containing a sequence of grammar symbols, and
• a parsing table.
4. The output is a parse tree. The bottom of the stack holds the end-marker symbol $.
Initially, the symbol on top of $ in the stack is the start symbol of the grammar.
5. The non-recursive predictive parser constructs a top-down parse-tree.
6. The parser is controlled by a program that reads X, the symbol on top of the stack, and a,
the current input symbol. If X is a non-terminal, the parser chooses an X-production by
consulting entry M[X, a] in the parsing table M; code can also be executed here,
say, code for constructing a node of the parse tree. If X is a terminal symbol, the parser
checks for a match between the terminal X and the current input symbol a: if they
match, the terminal is popped from the stack, the input pointer is advanced to the next
symbol, and the process repeats for the next symbol on the stack. If X is a terminal that
does not match the input symbol, it is a case of error. The behaviour of the parser can be
described in terms of configurations, which give the stack contents and the remaining
input.
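A minimal table-driven sketch in C (our own toy LL(1) grammar S -> a S b | c, with the parsing table M hard-coded into the if/else chain for brevity):

#include <stdio.h>

int main(void) {
    const char *input = "aacbb";          /* string to parse; derivable as a a c b b */
    char stack[64]; int top = 0;
    stack[top++] = '$';                   /* end marker at the bottom */
    stack[top++] = 'S';                   /* start symbol on top */
    int ip = 0;
    while (top > 0) {
        char X = stack[--top];            /* pop top of stack */
        char a = input[ip] ? input[ip] : '$';
        if (X == '$') {                   /* accept only if input is also exhausted */
            puts(a == '$' ? "accepted" : "rejected"); return 0;
        }
        if (X == 'S') {                   /* non-terminal: consult M[S, a] */
            if (a == 'a') {               /* S -> a S b: push reversed */
                stack[top++] = 'b'; stack[top++] = 'S'; stack[top++] = 'a';
            } else if (a == 'c') {        /* S -> c */
                stack[top++] = 'c';
            } else { puts("rejected"); return 1; }
        } else {                          /* terminal: must match the input */
            if (X == a) ip++;
            else { puts("rejected"); return 1; }
        }
    }
    puts("rejected");
    return 0;
}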

3. Explain syntax analyzer in detail


Ans.
1. Syntax Analysis or Parsing is the second phase, i.e. after lexical analysis.
2. It checks the syntactical structure of the given input, i.e. whether the given input is in the
correct syntax (of the language in which the input has been written) or not. It does so by
building a data structure, called a Parse tree or Syntax tree.
3. The parse tree is constructed by using the pre-defined Grammar of the language and the
input string.
4. If the given input string can be derived with the help of the syntax tree (in the derivation
process), the input string is in the correct syntax; if not, an error is reported by the syntax
analyzer.
5. Syntax analysis, also known as parsing, is a process in compiler design where the compiler
checks if the source code follows the grammatical rules of the programming language.
6. This is typically the second stage of the compilation process, following lexical analysis.
7. The main goal of syntax analysis is to create a parse tree or abstract syntax tree (AST) of
the source code, which is a hierarchical representation of the source code that reflects the
grammatical structure of the program.
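For instance (a toy grammar of our own choosing), with the grammar E -> E + T | T, T -> id, the input id + id yields the parse tree:

        E
      / | \
     E  +  T
     |     |
     T     id
     |
     id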

8. Advantages of using syntax analysis in compiler design include:

• Structural validation: Syntax analysis allows the compiler to check whether the source code
follows the grammatical rules of the programming language, which helps to detect and
report errors in the source code.

9. Disadvantages of using syntax analysis in compiler design include:

• Overhead: Syntax analysis adds an extra step to the compilation process, which can reduce
the performance of the compiler.

4. Explain shift reduce parser with suitable example


Ans.
1. A shift-reduce parser attempts to construct the parse in the bottom-up manner, i.e., the
parse tree is constructed from the leaves (bottom) to the root (up).
2. A more general form of the shift-reduce parser is the LR parser.
3. This parser requires some data structures: a. an input buffer for storing the input string; b.
a stack for storing and accessing grammar symbols during the application of production rules.
4. Shift-reduce parsing is a process of reducing a string to the start symbol of a grammar.
5. Shift-reduce parsing uses a stack to hold grammar symbols and an input tape to hold the string.
6. Shift-reduce parsing performs two actions: shift and reduce. That is why it is known as
shift-reduce parsing.
7. In a shift action, the current symbol of the input string is pushed onto the stack.
8. In each reduction, the symbols matching the right side of a production are replaced by the
non-terminal on the left side of that production.
9. There are two main categories of shift-reduce parsing as follows:
1. Operator-Precedence Parsing
2. LR Parsing
10. Advantages:
• Shift-reduce parsing is efficient and can handle a wide range of context-free grammars.
• It can parse a large variety of programming languages and is widely used in practice.
11. Disadvantages:
• Shift-reduce parsing has limited lookahead, which means that it may miss some syntax
errors that require a larger lookahead.
• In some cases, the parse tree generated by shift-reduce parsing may be more complex than
with other parsing techniques.
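Since the question asks for an example, here is a small worked trace (our own): with grammar E -> E + E | id and input id + id, the parser moves as follows:

Stack        Input        Action
$            id + id $    shift
$ id         + id $       reduce by E -> id
$ E          + id $       shift
$ E +        id $         shift
$ E + id     $            reduce by E -> id
$ E + E      $            reduce by E -> E + E
$ E          $            accept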

5. Explain LR parser with suitable example


Ans.
1. The LR parser is a bottom-up parser for context-free grammars that is widely used by
compilers for programming languages and by other associated tools.
2. An LR parser reads its input from left to right and produces a rightmost derivation in
reverse. It is called a bottom-up parser because it attempts to reduce toward the top-level
grammar productions by building up from the leaves.
3. LR parsers are the most powerful of all deterministic parsers used in practice.
4. Description of the LR parser:
o LR parsing is one type of bottom-up parsing. It is used to parse a large class of
grammars.
o In LR(k) parsing, "L" stands for left-to-right scanning of the input,
o "R" stands for constructing a rightmost derivation in reverse, and
o "k" is the number of lookahead input symbols used to make parsing
decisions.
5. LR parsing is divided into four variants: LR(0) parsing, SLR parsing, CLR parsing and
LALR parsing.
6. Rules for constructing the sets of LR items are as follows:
o The closure of the first item of the augmented grammar forms the first item set.
o If an item of the form A → α · B γ is present in a closure, where the symbol B after
the dot is a non-terminal, add every production of B with the dot preceding its first
symbol.
o Repeat the previous step for the newly added items until no more items can be added.
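A brief example of LR parser moves (our own toy grammar E -> E + n | n, input n + n; parser states are omitted and only grammar symbols are shown on the stack, for readability):

Stack       Input      Action
$           n + n $    shift n
$ n         + n $      reduce by E -> n
$ E         + n $      shift +
$ E +       n $        shift n
$ E + n     $          reduce by E -> E + n
$ E         $          accept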

6. Explain Parser generator tool YACC


Ans.
1. YACC stands for Yet Another Compiler Compiler.
2. YACC provides a tool to produce a parser for a given grammar.
3. YACC is a program designed to process an LALR(1) grammar.
4. It is used to produce the source code of the syntactic analyzer of the language defined
by an LALR(1) grammar.
5. The input of YACC is a set of rules (a grammar) and the output is a C program.
6. Input: a CFG in a file file.y. The YACC input file is divided into three parts:
/* definitions */
....
%%
/* rules */
....
%%
/* auxiliary routines */
7. a. Input File, Definition Part: The definition part includes information about the tokens
used in the syntax definition.
b. Input File, Rule Part: The rules part contains grammar definitions in a modified BNF
form. Actions are C code in { } and can be embedded inside the rules (translation schemes).
c. Input File, Auxiliary Routines Part: The auxiliary routines part is plain C code.
8. Output: a parser y.tab.c (yacc).
o The output file "file.output" contains the parsing tables.
o The file "file.tab.h" contains declarations.
o The generated parser is the function yyparse().
o The parser expects to use a function called yylex() to get tokens.
9. The basic operational sequence is as follows:
o gram.y : this file contains the desired grammar in YACC format.
o yacc : the YACC tool processes gram.y.
o y.tab.c : the C source program created by YACC.
o cc or gcc : the C compiler compiles y.tab.c.
o a.out : the executable file that will parse the grammar given in gram.y.
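A minimal YACC file in this format might look as follows (our own sketch: a one-line adder over single-digit numbers; names and messages are illustrative):

%{
#include <stdio.h>
int yylex(void);
void yyerror(const char *s) { fprintf(stderr, "error: %s\n", s); }
%}
%token NUMBER
%left '+'
%%
line : expr               { printf("= %d\n", $1); }
     ;
expr : expr '+' expr      { $$ = $1 + $3; }
     | NUMBER             { $$ = $1; }
     ;
%%
int yylex(void) {                     /* hand-written scanner: digits and '+' only */
    int c = getchar();
    while (c == ' ') c = getchar();   /* skip blanks */
    if (c >= '0' && c <= '9') { yylval = c - '0'; return NUMBER; }
    if (c == '\n' || c == EOF) return 0;   /* end of input */
    return c;                         /* '+' is returned as itself */
}
int main(void) { return yyparse(); }

Building it follows the sequence above: yacc calc.y produces y.tab.c, and cc y.tab.c -o calc produces the parser.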

7. Explain Canonical LR parsing table with suitable example


Ans.
1. The CLR parser stands for canonical LR parser.
2. It is a more powerful LR parser.
3. It makes use of lookahead symbols.
4. This method uses a large set of items called LR(1) items.
5. The main difference between LR(0) and LR(1) items is that LR(1) items carry more
information in a state, which rules out useless reduction states.
6. This extra information is incorporated into the state by the lookahead symbol.
7. The general syntax of an LR(1) item is [A → α·β, a], where A → α·β is a production with a dot
marking the position and a is a terminal or the right end marker $. Thus
LR(1) item = LR(0) item + lookahead.
8. Steps for constructing the CLR parsing table:
o Write the augmented grammar.
o Compute the LR(1) collection of item sets.
o Fill in the two functions of the table: action (indexed by terminals) and goto
(indexed by non-terminals).
9. CLR parsing uses the canonical collection of LR(1) items to build the CLR(1) parsing table.
10. The CLR(1) parsing table produces more states as compared to SLR(1) parsing.
11. In CLR(1), a reduce entry is placed only under the lookahead symbols of the item.
12. Various steps involved in CLR(1) parsing:
o For the given input string, write a context-free grammar.
o Check the ambiguity of the grammar.
o Add the augment production to the given grammar.
o Create the canonical collection of LR(1) items.
o Draw the DFA (deterministic finite automaton) of item sets.
o Construct the CLR(1) parsing table.
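For example (a standard textbook-style grammar, shown here for illustration), for the augmented grammar S' -> S, S -> C C, C -> c C | d, the initial LR(1) item set I0 with lookaheads is:

I0:  S' -> . S,    $
     S  -> . C C,  $
     C  -> . c C,  c/d
     C  -> . d,    c/d

Here the lookaheads c/d on the C-items arise because C can be followed by the first symbols of another C.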

Assignment No 3
1. Explain backpatching in detail
Ans.
1. Backpatching is basically the process of filling in unspecified information; here the
information is the targets of labels.
2. It uses appropriate semantic actions during code generation, to fill in the addresses
of labels in goto statements while producing three-address code (TAC) for the given
expressions.
3. Without backpatching, two passes are needed, because assigning the positions of these
label targets in one pass is quite challenging: the addresses are left unidentified in the
first pass and then populated in the second pass.
4. Backpatching instead keeps lists of jump instructions whose targets are not yet known
and fills all of them in as soon as the target becomes known, so the translation can be
done in a single pass.
5. In short, backpatching is the process of filling up gaps of incomplete information in the
generated code.
6. Backpatching is mainly used for two purposes:
a. Boolean expression:
Boolean expressions are statements whose results can be either true or
false.
b. Flow of control statements:
The flow of control statements needs to be controlled during the execution
of statements in a program.
7. Applications of Backpatching:
i. Backpatching is used to translate flow-of-control statements in one pass
itself.
ii. Backpatching is used for producing quadruples for boolean expressions
during bottom-up parsing.
iii. It is the activity of filling up unspecified information of labels during the
code generation process.
iv. It helps to resolve forward branches that have been planted in the code.
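A short worked illustration (the instruction numbers and list names are ours): translating the boolean expression a < b || c < d with backpatching might produce:

100: if a < b goto ___        B1.truelist  = {100}
101: goto ___                 B1.falselist = {101}
102: if c < d goto ___        B2.truelist  = {102}
103: goto ___                 B2.falselist = {103}

When the first instruction of B2 (102) becomes known, backpatch(B1.falselist, 102)
fills instruction 101 so that it reads: goto 102. Then, for B = B1 || B2:
B.truelist  = merge(B1.truelist, B2.truelist) = {100, 102}
B.falselist = B2.falselist = {103}
The remaining blanks are filled later, once the true and false targets of B are known.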

2. Explain various ways of representing three-address instructions (quadruples, triples and indirect triples)


Ans.
1. Quadruple – It is a structure which consists of 4 fields namely op, arg1, arg2 and result. op
denotes the operator and arg1 and arg2 denotes the two operands and result is used to store the
result of the expression.
Advantages –
• It is easy to rearrange code for global optimization.
• One can quickly access the value of temporary variables using the symbol table.
Disadvantages –
• It contains a lot of temporaries.
• Temporary variable creation increases time and space complexity.
2. Triples – This representation doesn't make use of an extra temporary variable to represent a
single operation; instead, when a reference to another triple's value is needed, a pointer to that
triple is used. So it consists of only three fields, namely op, arg1 and arg2, and with the help of
pointers one can directly access a symbol table entry.
Disadvantages –
• Temporaries are implicit, and it is difficult to rearrange code.
• It is difficult to optimize, because optimization involves moving intermediate code.
When a triple is moved, any other triple referring to it must be updated as well.
3. Indirect Triples – This representation makes use of pointers: a separate list of pointers to all
the triples is maintained, and the computation order is given by that list. It is similar in utility
to the quadruple representation but requires less space. Temporaries are implicit, and it is
easier to rearrange code, since only the pointer list needs to be reordered.
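A standard illustration (dragon-book style; temporaries t1-t5 are conventional names): for the statement a = b * -c + b * -c, the three-address code and its representations are:

Three-address code:
t1 = minus c
t2 = b * t1
t3 = minus c
t4 = b * t3
t5 = t2 + t4
a  = t5

Quadruples:                          Triples:
      op      arg1   arg2   result        op      arg1   arg2
(0)   minus   c             t1       (0)  minus   c
(1)   *       b      t1     t2       (1)  *       b      (0)
(2)   minus   c             t3       (2)  minus   c
(3)   *       b      t3     t4       (3)  *       b      (2)
(4)   +       t2     t4     t5       (4)  +       (1)    (3)
(5)   =       t5            a        (5)  =       a      (4)

Indirect triples store the same triples, plus a separate instruction list
(say 35 -> (0), 36 -> (1), ...) whose entries can be reordered freely.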

3. Explain Inherited and Synthesized Attributes with example


Ans.
Synthesized Attributes vs. Inherited Attributes:

1. An attribute is said to be a synthesized attribute if its parse-tree node value is determined
by the attribute values at the child nodes. An attribute is said to be an inherited attribute if
its parse-tree node value is determined by the attribute values at the parent and/or sibling
nodes.
2. For a synthesized attribute, the production must have the non-terminal as its head; for an
inherited attribute, the production must have the non-terminal as a symbol in its body.
3. A synthesized attribute at node n is defined only in terms of attribute values at the children
of n and at n itself. An inherited attribute at node n is defined only in terms of attribute
values of n's parent, n itself, and n's siblings.
4. Synthesized attributes can be evaluated during a single bottom-up traversal of the parse
tree. Inherited attributes can be evaluated during a single top-down and sideways traversal
of the parse tree.
5. Synthesized attributes can be contained by both terminals and non-terminals. Inherited
attributes cannot be contained by both; they are only contained by non-terminals.
6. Synthesized attributes are used by both S-attributed SDTs and L-attributed SDTs. Inherited
attributes are used only by L-attributed SDTs.
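A small example (a conventional expressions-and-declarations grammar; the snippet is our own illustration):

Synthesized:  E -> E1 + T   { E.val = E1.val + T.val }   (E.val is computed from the children)
Inherited:    D -> T L      { L.inh = T.type }           (L receives its type from its sibling T)
              T -> int      { T.type = integer }
              L -> L1 , id  { L1.inh = L.inh; addtype(id.entry, L.inh) }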

4. Explain applications of Syntax Directed Translation


Ans.
Syntax Directed Translation:
SDT is used for semantic analysis. It is basically used to construct the parse tree with a
grammar and semantic actions: the grammar decides which construct has the highest priority
and is processed first, and the semantic actions decide what action is performed for each
production.
Example:
SDT = Grammar + Semantic Action
Grammar: E -> E1 + E2
Semantic action: if (E1.type != E2.type) then print "type
mismatching"
Applications of Syntax Directed Translation:
• SDT is used for executing arithmetic expressions.
• In the conversion from infix to postfix expression.
• In the conversion from infix to prefix expression.
• It is also used for binary to decimal conversion.
• In counting the number of reductions.
• In creating a syntax tree.
• SDT is used to generate intermediate code.
• In storing information into the symbol table.
• SDT is commonly used for type checking as well.

Assignment No 4

1. Explain various Storage Organization.


Ans.
1. When the target program executes, it runs in its own logical address
space, in which each program value has a location.
2. The logical address space is shared among the compiler, operating system
and target machine for management and organization. The operating system
maps the logical addresses into physical addresses, which are usually
spread throughout memory.
3. Run-time storage comes in blocks, where a byte is the smallest unit of
addressable memory; four bytes form a machine word. Multibyte objects are
stored in consecutive bytes and are addressed by their first byte.
4. Run-time storage can be subdivided to hold the different components of an
executing program:

• Generated executable code
• Static data objects
• Dynamic data objects - heap
• Automatic data objects - stack
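The usual layout of these components in memory looks roughly as follows (a conventional picture; the exact placement is platform-dependent):

+--------------------+  low address
|       Code         |
+--------------------+
|      Static        |
+--------------------+
|       Heap         |
|         |          |
|         v          |
|   (free memory)    |
|         ^          |
|         |          |
|       Stack        |
+--------------------+  high address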

2. Explain Partition Algorithm for Basic Blocks


Ans.

Basic Block

A basic block contains a sequence of statements. The flow of control enters at the
beginning of the block and leaves at the end without any halt (except possibly at the last
instruction of the block). A worked example with three-address statements follows the
algorithm below.

Basic block construction:

Algorithm: Partition into basic blocks

Input: a sequence of three-address statements

Output: a list of basic blocks, with each three-address statement in exactly one
block

Method: First identify the leaders in the code. The rules for finding leaders are as follows:

o The first statement is a leader.

o The target statement L of any conditional or unconditional goto (if ... goto L,
or goto L) is a leader.
o Any statement that immediately follows a conditional or unconditional goto
statement is a leader.

For each leader, its basic block consists of the leader and all statements up to, but not
including, the next leader or the end of the program.

Consider the following source code for dot product of two vectors a and b of length 10:

1. begin
2. prod :=0;
3. i:=1;
4. do begin
5. prod :=prod+ a[i] * b[i];
6. i :=i+1;
7. end
8. while i <= 10
9. end
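The corresponding three-address code and its partition into blocks would look roughly as follows (a conventional translation, assuming 4-byte array elements; the temporaries t1-t7 are our own names):

(1)  prod := 0
(2)  i := 1
(3)  t1 := 4 * i
(4)  t2 := a[t1]
(5)  t3 := 4 * i
(6)  t4 := b[t3]
(7)  t5 := t2 * t4
(8)  t6 := prod + t5
(9)  prod := t6
(10) t7 := i + 1
(11) i := t7
(12) if i <= 10 goto (3)

The leaders are statement (1) (the first statement) and statement (3) (the target of the
goto in (12)); hence the basic blocks are B1 = (1)-(2) and B2 = (3)-(12).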

3. What is garbage collection? Explain Design Goals for Garbage Collectors


Ans.
1. Garbage collection (GC) is a dynamic technique for memory management and
heap allocation that examines and identifies dead memory blocks before
reallocating storage for reuse. Garbage collection's primary goal is to reduce
memory leaks.
2. Garbage collection frees the programmer from having to deallocate and return
objects to the memory system manually. Garbage collection can account for a
considerable amount of a program's total processing time, and as a result, can
have a significant impact on performance.
3. Stack allocation, region inference, memory ownership, and combinations of
various techniques are examples of related techniques.
4. The basic principles of garbage collection are finding data objects in a program
that cannot be accessed in the future and reclaiming the resources used by those
objects. Garbage collection does not often handle resources other than memory,
such as network sockets, database handles, user interaction windows, files, and
device descriptors. Methods for managing such resources, especially destructors,
may be sufficient to manage them without the need for GC. In some GC systems,
other resources can be associated with a memory region, so that collecting the
region triggers the work of reclaiming those resources.
5. Many programming languages, such as RPL, Java, C#, Go, and most scripting
languages, require garbage collection either as part of the language specification
or effectively for practical implementation (for example, formal languages like
lambda calculus); these are referred to as garbage-collected languages.
6. Other languages, such as C and C++, were designed for use with manual memory
management but included garbage-collected implementations.
7. Some languages, such as Ada, Modula-3, and C++/CLI, allow for both garbage
collection and manual memory management in the same application by using
separate heaps for collected and manually managed objects; others, such as D,
are garbage-collected but allow the user to delete objects manually and
completely disable garbage collection when speed is required.

4. Explain Heap allocation


Ans.
1. The memory is allocated during the execution of instructions written by programmers.
2. Note that the name heap has nothing to do with the heap data structure. It is called a
heap because it is a pile of memory space available to programmers to allocate and de-
allocate.
3. Every time we create an object, it is allocated in heap space, while the
referencing information for these objects is stored in stack memory. Heap
memory allocation isn't as safe as stack memory allocation, because the data stored in
this space is accessible and visible to all threads.
4. If a programmer does not handle this memory well, a memory leak can happen in the
program.
5. The Heap-memory allocation is further divided into three categories:- These three
categories help us to prioritize the data(Objects) to be stored in the Heap-memory or in
the Garbage collection.
• Young Generation – the portion of the memory where all new data (objects)
are allocated; whenever this region fills up completely, a garbage
collection runs and the surviving data are moved onward.
• Old or Tenured Generation – the part of heap memory where older data objects
that are not in frequent use, or not in use at all, are placed.
• Permanent Generation – the portion of heap memory that contains the
JVM's metadata for the runtime classes and application methods.
6. This memory allocation scheme differs from stack-space allocation: no
automatic de-allocation feature is provided here. We need to use a garbage collector to
remove old unused objects in order to use the memory efficiently.
7. The processing (access) time of this memory is quite slow as compared to
stack memory. Heap memory is also not as thread-safe as stack memory, because
data stored in heap memory are visible to all threads.
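For contrast, in a language without garbage collection such as C, heap allocation and de-allocation are fully manual (a minimal sketch of our own):

#include <stdlib.h>
#include <stdio.h>

int main(void) {
    int *arr = malloc(10 * sizeof *arr);   /* allocated on the heap */
    if (arr == NULL) return 1;             /* heap allocation may fail */
    arr[0] = 42;
    printf("%d\n", arr[0]);
    free(arr);                             /* manual de-allocation: C has no GC */
    return 0;
}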

5. Explain a Basic Mark-and-Sweep Collector


Ans.
1. There are many garbage collection algorithms that run in the background, of which one
of them is mark and sweep.
2. Any garbage collection algorithm must perform 2 basic operations. One, it should be
able to detect all the unreachable objects and secondly, it must reclaim the heap space
used by the garbage objects and make the space available again to the program. The
above operations are performed by Mark and Sweep Algorithm in two phases as listed
and described further as follows:
 Mark phase
 Sweep phase

Phase 1: Mark Phase

When an object is created, its mark bit is set to 0 (false). In the mark phase, we set the marked
bit for all the reachable objects (i.e., the objects which the program can still refer to) to 1 (true).
To perform this operation we simply need to do a graph traversal; a depth-first search
approach works well here. Every object is considered a node, all the nodes (objects) reachable
from that node are visited, and this continues till we have visited all the reachable nodes.
Algorithm: Mark phase
Mark(root)
    If markedBit(root) = false then
        markedBit(root) = true
        For each v referenced by root
            Mark(v)

Phase 2: Sweep Phase


As the name suggests, this phase "sweeps" the unreachable objects, i.e., it clears the heap
memory of all the unreachable objects. All those objects whose marked bit is set to false are
cleared from the heap memory; for all other (reachable) objects, the marked bit is reset to
false, so that the algorithm can run through the mark phase again the next time it is invoked.
Algorithm: Sweep phase
Sweep()
    For each object p in heap
        If markedBit(p) = true then
            markedBit(p) = false
        else
            heap.release(p)
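The two phases translate into C roughly as follows (a toy sketch of our own: a fixed-size heap of objects, each holding its outgoing references; not a production collector):

#include <stdio.h>
#include <stdbool.h>

#define MAX_REFS  4
#define HEAP_SIZE 8

typedef struct Object {
    bool marked;                      /* the mark bit */
    bool in_use;                      /* slot is allocated */
    int n_refs;
    struct Object *refs[MAX_REFS];    /* outgoing references */
} Object;

Object heap[HEAP_SIZE];

void mark(Object *o) {                /* depth-first traversal from a root */
    if (o == NULL || o->marked) return;
    o->marked = true;
    for (int i = 0; i < o->n_refs; i++) mark(o->refs[i]);
}

void sweep(void) {
    for (int i = 0; i < HEAP_SIZE; i++) {
        if (!heap[i].in_use) continue;
        if (heap[i].marked) heap[i].marked = false;  /* reset for next cycle */
        else { heap[i].in_use = false; printf("reclaimed object %d\n", i); }
    }
}

int main(void) {
    /* live chain 0 -> 1 -> 2; object 3 is allocated but unreachable */
    for (int i = 0; i < 4; i++) heap[i].in_use = true;
    heap[0].refs[0] = &heap[1]; heap[0].n_refs = 1;
    heap[1].refs[0] = &heap[2]; heap[1].n_refs = 1;
    mark(&heap[0]);                   /* mark phase from the root */
    sweep();                          /* reclaims object 3 only */
    return 0;
}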

Assignment No 5

1. Explain Peephole Optimization


Ans.
1. Peephole optimization is a type of code optimization performed on a small part of the
code, i.e., on a very small set of instructions in a segment of code.
2. The small set of instructions or small part of code on which peephole optimization is
performed is known as peephole or window.
3. It basically works on the principle of replacement, in which a part of the code is replaced by
shorter and faster code without a change in output. Peephole optimization is typically
machine-dependent.
4. Objectives of Peephole Optimization:
The objectives of peephole optimization are as follows:
• To improve performance
• To reduce memory footprint
• To reduce code size
5.Peephole Optimization Techniques
A. Redundant load and store elimination: In this technique, redundant loads and stores are
eliminated.
B. Constant folding: Expressions whose values can be computed at compile time are evaluated
during compilation, and the simplified result replaces the run-time computation.
C. Strength reduction: Operators that consume more execution time are replaced by
operators that consume less execution time.
D. Null sequences / simplifying algebraic expressions: useless operations are deleted.
E. Combining operations: several operations are replaced by a single equivalent operation.
F. Deadcode Elimination:- Dead code refers to portions of the program that are never
executed or do not affect the program’s observable behavior. Eliminating dead code helps
improve the efficiency and performance of the compiled program by reducing unnecessary
computations and memory usage.
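Typical instances of these techniques look as follows (illustrative; the register names and syntax are ours):

Redundant load/store:   MOV a, R0
                        MOV R0, a       -> second instruction deleted
Constant folding:       x = 2 * 3       -> x = 6
Strength reduction:     y = x * 2       -> y = x + x   (or a left shift)
Null sequence:          x = x + 0       -> deleted
Combining operations:   i = i + 1
                        i = i + 1       -> i = i + 2
Dead code:              code after an unconditional return/goto -> deleted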
2. Explain Issues in the Design of a Code Generator
Ans.
The following issues arise during the code generation phase:
Input to code generator – The input to the code generator is the intermediate code generated
by the front end, along with information in the symbol table that determines the run-time
addresses of the data objects denoted by the names in the intermediate representation.
Intermediate codes may be represented mostly in quadruples, triples, indirect triples, Postfix
notation, syntax trees, DAGs, etc. The code generation phase just proceeds on an assumption
that the input is free from all syntactic and state semantic errors, the necessary type checking
has taken place and the type-conversion operators have been inserted wherever necessary.
• Target program: The target program is the output of the code generator. The
output may be absolute machine language, relocatable machine language, or
assembly language.
• Absolute machine language as output has the advantage that it can be
placed in a fixed memory location and can be immediately executed.
For example, WATFIV is a compiler that produces absolute
machine code as output.
• Relocatable machine language as output allows subprograms and
subroutines to be compiled separately. Relocatable object modules can
be linked together and loaded by a linking loader, but there is the added
expense of linking and loading.
• Assembly language as output makes code generation easier. We can
generate symbolic instructions and use the macro facilities of
assemblers in generating code, but an additional assembly
step is needed after code generation.
• Memory management – Mapping the names in the source program to the
addresses of data objects is done jointly by the front end and the code generator. A name
in a three-address statement refers to the symbol-table entry for that name, and
from the symbol-table entry a relative address can be determined for it.
• Instruction selection – Selecting the best instructions improves the efficiency of the
program. The instruction set should be complete and uniform; instruction speeds
and machine idioms also play a major role when efficiency is considered. If we do not care
about the efficiency of the target program, instruction selection is straightforward.

3. Explain Dead Code Elimination


Ans.
Dead code refers to sections of code within a program that are never executed during runtime
or have no impact on the program's output or behavior. Identifying and removing dead code is
essential for improving program efficiency, reducing complexity, and enhancing
maintainability.
Benefits of Dead Code Elimination
• Enhanced program efficiency: By removing dead code, unnecessary
computations and memory usage are eliminated, resulting in faster and more
efficient program execution.
• Improved maintainability: Dead code complicates the understanding and
maintenance of software systems. By eliminating it, developers can focus on
relevant code, improving code readability and facilitating future updates and bug
fixes.
• Reduced program size: Dead code elimination significantly reduces the size of
executable files, optimizing resource usage and improving software distribution.
Process of Dead Code Elimination
Dead code elimination is primarily performed by compilers or interpreters during
the compilation or interpretation process. Here’s an overview of the process:
• Static analysis: The compiler or interpreter analyzes the program's source code or
intermediate representation using various techniques, including control-flow
analysis and data-flow analysis.
• Identification of dead code: Through static analysis, the compiler identifies
sections of code that are provably unreachable or have no impact on the program's
output.
• Removal of dead code: The identified dead-code segments are eliminated from
the final generated executable, resulting in a more streamlined and efficient
program.
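A small before/after illustration in C (our own example):

#include <stdio.h>

int f(int x) {
    int y = x * 2;           /* dead: y is never used          */
    if (0) {                 /* unreachable branch             */
        printf("never\n");
    }
    return x + 1;
}

/* After dead code elimination, the function reduces to: */
int g(int x) {
    return x + 1;
}

int main(void) { printf("%d %d\n", f(41), g(41)); return 0; }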

4. Explain code generation


Ans.

The code generator is used to produce the target code for three-address statements. It uses
registers to store the operands of the three-address statements.

Example:

Consider the three-address statement x := y + z. It can be translated into the following
sequence of code:

MOV y, R0      (load y into register R0)
ADD z, R0      (R0 := y + z)
MOV R0, x      (store the result into x)

Register and Address Descriptors:


o A register descriptor contains the track of what is currently in each register. The
register descriptors show that all the registers are initially empty.
o An address descriptor is used to store the location where current value of the
name can be found at run time.
A code-generation algorithm:

The algorithm takes a sequence of three-address statements as input. For each three-
address statement of the form x := y op z, it performs the following actions:
1. Invoke a function getreg to find out the location L where the result of the
computation y op z should be stored.
2. Consult the address descriptor for y to determine y', the current location of y. If the
value of y is currently in both memory and a register, prefer the register as y'. If the
value of y is not already in L, then generate the instruction MOV y', L to place a copy
of y in L.
3. Generate the instruction OP z', L, where z' is the current location of z (again, if z is in
both a register and memory, prefer the register). Update the address descriptor of x to
indicate that x is in location L. If L is a register, update its descriptor to indicate that it
contains the value of x, and remove x from all other descriptors.
4. If the current values of y or z have no next uses, are not live on exit from the block,
and are in registers, alter the register descriptors to indicate that, after execution of
x := y op z, those registers will no longer contain y or z.
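Applying this algorithm to the statement sequence for d := (a - b) + (a - c) + (a - c) gives the following trace (a standard textbook-style example; the register assignments shown are one possible outcome of getreg):

Statement      Code generated     Register descriptor     Address descriptor
t := a - b     MOV a, R0          R0 contains t           t in R0
               SUB b, R0
u := a - c     MOV a, R1          R0 contains t           t in R0
               SUB c, R1          R1 contains u           u in R1
v := t + u     ADD R1, R0         R0 contains v           v in R0
                                  R1 contains u           u in R1
d := v + u     ADD R1, R0         R0 contains d           d in R0
               MOV R0, d                                  d in R0 and memory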
