CD Imp Ques 1

The document outlines key concepts in compiler design, including the differences between compilers and interpreters, the role of lexical analyzers, and error handling techniques like Panic Mode Recovery. It details the six phases of a compiler, including lexical analysis, syntax analysis, semantic analysis, intermediate code generation, code optimization, and code generation. Additionally, it covers parsing techniques such as shift-reduce and predictive parsing, along with input buffering methods and operator precedence parsing.

Uploaded by

Kiruthika GS

CD IMP QUES 1

UNIT 1

2M

1.DIFF BW COMPILER AND INTERPRETER

2.DEFINE PATTERN AND LEXEME


3.DEFINE SYMBOL TABLE MANAGEMENT

4.WHAT IS ERROR HANDLING

5.WHAT IS ROLE OF LEXA

6.DEFINE PANIC MODE RECOVERY

Panic Mode Recovery is a common error recovery technique used in compilers to handle syntax
errors efficiently. It allows the parser to recover from errors and continue parsing the rest of the
input instead of stopping abruptly.

How Panic Mode Recovery Works

➢ When a syntax error is detected, the parser discards input symbols until it finds a
synchronizing token (such as a semicolon ; or a closing bracket }).
➢ The parser then resumes normal parsing from this point, preventing an infinite loop of
errors.
➢ The synchronizing tokens are chosen based on the language’s grammar (e.g., keywords,
delimiters).
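The recovery loop above can be sketched as follows (a minimal illustration; the SYNC set and the list-of-tokens representation are assumptions of this sketch, not part of any particular compiler):

```python
# Minimal sketch of panic-mode recovery: on a syntax error, discard
# input symbols until a synchronizing token is found, then resume after it.
SYNC = {";", "}"}                 # synchronizing tokens chosen from the grammar

def recover(tokens, pos):
    """Return the position from which normal parsing can resume."""
    while pos < len(tokens) and tokens[pos] not in SYNC:
        pos += 1                  # discard erroneous input symbols
    return pos + 1 if pos < len(tokens) else pos  # skip past the sync token
```

For the token stream x = @ ! ; y with the error detected at @, the parser skips to just after the semicolon and resumes at y.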
7.DEFINE FA

8.DEFINE RE
12M

1.EXPLAIN THE 6 PHASES OF A COMPILER

LEXICAL ANALYSER

Reads the stream of characters making up the source program & groups the characters into meaningful
sequences called lexemes. For each lexeme, LEXA produces as output a token of the form <token-name,
attribute-value>, where token-name represents an abstract symbol & attribute-value points to an
entry in the symbol table.
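As a rough illustration (a sketch only; the token classes and regular expressions are assumptions), a lexer emitting <token-name, attribute-value> pairs with identifiers entered into a symbol table might look like:

```python
import re

# Token classes for the sketch; SKIP swallows whitespace.
TOKEN_SPEC = [("NUM", r"\d+"), ("ID", r"[A-Za-z_]\w*"),
              ("OP", r"[+\-*/=]"), ("SKIP", r"\s+")]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(src, symtab):
    """Group characters into lexemes and emit (token-name, attribute) pairs."""
    tokens = []
    for m in MASTER.finditer(src):
        kind, lexeme = m.lastgroup, m.group()
        if kind == "SKIP":
            continue
        if kind == "ID":      # attribute-value: index of the symbol-table entry
            attr = symtab.setdefault(lexeme, len(symtab))
        else:
            attr = lexeme
        tokens.append((kind, attr))
    return tokens
```

For the input pos = init + rate * 60 this yields <ID, 0>, <OP, =>, <ID, 1>, <OP, +>, <ID, 2>, <OP, *>, <NUM, 60>, with pos, init, rate entered into the symbol table.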

SYNTAX ANALYSER

Also called a parser. This phase groups the tokens produced by LEXA into grammatical phrases.

☆ It generates a tree-like representation called a parse tree.

☆ The parse tree describes the syntactic structure of the input.

SEMANTIC ANALYSIS

Checks for semantic errors. Concentrates on type checking, i.e., whether operands are type compatible.

☆ It further checks for

i) array bound checking

ii) logical analysis of the parse tree

iii) type mismatch errors

iv) undeclared variables

v) misuse of reserved words

vi) multiple declarations of a variable in a single scope

vii) accessing out-of-scope variables

viii) mismatch between actual & formal parameters

INTERMEDIATE CODE GENERATION

After semantic analysis, some compilers generate an explicit intermediate representation of the
source program. This representation should be easy to produce and easy to translate into the target
program. There are a variety of forms.

• Three address code

The most commonly used representation is the three-address format. It consists of a sequence of
instructions, each of which has at most three operands.
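For example, translating a = b + c * d into three-address code can be sketched as follows (the tuple-based expression tree and the temporary-naming scheme are assumptions of this sketch):

```python
def gen_tac(node, code, tmp):
    """Emit three-address instructions for a tuple tree; return the result address."""
    if isinstance(node, str):               # leaf: a variable name
        return node
    op, left, right = node
    l = gen_tac(left, code, tmp)
    r = gen_tac(right, code, tmp)
    tmp[0] += 1
    t = f"t{tmp[0]}"                        # fresh temporary
    code.append(f"{t} = {l} {op} {r}")      # at most three operands per instruction
    return t

code, tmp = [], [0]
code.append(f"a = {gen_tac(('+', 'b', ('*', 'c', 'd')), code, tmp)}")
# code: ['t1 = c * d', 't2 = b + t1', 'a = t2']
```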
CODE OPTIMISATION

This phase attempts to improve the intermediate code, so that faster running machine code will
result. There are various techniques used by most of the optimizing compilers, such as:

1. Common sub-expression elimination

2. Dead Code elimination

3. Constant folding

4. Copy propagation

5. Induction variable elimination

6. Code motion

7. Reduction in strength
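As a small illustration of technique 3, constant folding evaluates constant sub-expressions at compile time (the tuple-based AST shape here is an assumption of the sketch):

```python
def fold(node):
    """Replace constant sub-expressions in a tuple AST by their values."""
    if not isinstance(node, tuple):
        return node                          # leaf: constant or variable name
    op, l, r = node
    l, r = fold(l), fold(r)
    if isinstance(l, int) and isinstance(r, int):
        return {"+": l + r, "-": l - r, "*": l * r}[op]  # fold at compile time
    return (op, l, r)
```

Here fold(('+', 'x', ('*', 2, 30))) returns ('+', 'x', 60): the multiplication disappears from the generated code.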

CODE GENERATION

The final phase of the compiler is the generation of target code, consisting of relocatable machine
code or assembly code. The intermediate instructions are each translated into a sequence of machine
instructions that perform the same task. A crucial aspect is the assignment of variables to registers.
2.CONVERT NFA TO M-DFA
3.EXPLAIN INPUT BUFFERING 6M

Input Buffering in Compiler Design

Input buffering is a technique used in lexical analysis to efficiently read and process input characters
from a source file while minimizing the overhead of frequent system calls. Since reading characters
one at a time from disk or memory is slow, buffering improves efficiency by reading large blocks of
data at once.

Need for Input Buffering

➢ Scanning characters one at a time is slow due to repeated function calls.


➢ Lookahead is often required to distinguish between tokens (e.g., == vs. = in C).
➢ Reduces the number of input operations by reading a block of characters at a time.

Buffering Techniques

1. Single Buffering

A single buffer holds a chunk of the input file.

➢ The lexical analyzer processes characters from this buffer.


➢ Limitation: When the buffer is exhausted, a new block must be read, causing delays.

2. Double Buffering (Two-Buffer Scheme)

➢ The input is divided into two N-character buffers (typically 1024 or 4096 bytes).

Working Mechanism:

➢ Initially, the first buffer is filled with input.


➢ As characters are processed, a forward pointer moves.
➢ When the forward pointer reaches the middle (or end) of one buffer, the other buffer is
loaded with new input.
➢ This allows continuous processing without frequent delays.

Advantages:

➢ Reduces the number of I/O operations.


➢ Supports lookahead without additional overhead.

3. Sentinels for Buffer End Detection

➢ Instead of checking for the end of the buffer in every character read (which adds extra
comparisons), a special sentinel character (EOF) is placed at the end of each buffer.
➢ When the forward pointer encounters EOF, it triggers buffer reloading.
➢ Benefit: Eliminates extra condition checks and speeds up processing.

Pointers Used in Input Buffering

➢ Lexeme Beginning Pointer (lexeme_beginning): Marks the start of the current lexeme
(token).
➢ Forward Pointer (forward): Moves ahead to identify tokens.
➢ End-of-buffer Handling: If the forward pointer reaches the end of a buffer, the other buffer
is loaded, and scanning continues seamlessly.
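The two-buffer scheme with sentinels can be simulated as follows (a sketch only; a tiny N makes the buffer switches visible, real lexers use 1024 or 4096, and the NUL sentinel assumes it never occurs in the source):

```python
import io

N = 8                     # buffer size for the sketch
EOF = "\0"                # sentinel placed at the end of each buffer

def scan(source):
    """Read every character of `source` through two sentinel-terminated buffers."""
    buffers, current = [None, None], 0

    def reload(idx):
        block = source.read(N)
        buffers[idx] = block + EOF        # sentinel marks end of valid data
        return len(block)

    reload(current)
    forward, out = 0, []
    while True:
        ch = buffers[current][forward]    # no end-of-buffer test per character
        forward += 1
        if ch != EOF:
            out.append(ch)
        elif forward <= N:                # sentinel before buffer end: true EOF
            return "".join(out)
        else:                             # buffer exhausted: switch and reload
            current = 1 - current
            if reload(current) == 0:
                return "".join(out)
            forward = 0
```

scan(io.StringIO(src)) returns src unchanged while exercising a buffer switch every N characters; only the EOF comparison is paid per character.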
UNIT 2

1.DEFINE PARSER AND ROLE OF PARSER

PARSER

A parser is a component of a compiler or interpreter that analyses the syntax of a given input
(usually a program) according to the rules of a formal grammar. It ensures that the structure of the
code is correct before further processing.

ROLE OF THE PARSER

A parser for a grammar is a program that takes as input a string w (a stream of tokens obtained from
the lexical analyser) and produces as output either a parse tree for w, if w is a valid sentence of
the grammar, or an error message indicating that w is not a valid sentence of the given grammar.

2.DEFINE CFG

A Context-Free Grammar (CFG) is a formal grammar used to define the syntax of programming
languages and natural languages. It consists of a set of rules (productions) that describe how strings
in a language can be generated.

Components of a CFG

A CFG is defined as a 4-tuple:

G = (V, T, P, S)

where

V (Non-Terminals) – A finite set of syntactic variables representing different components of the
language.

T (Terminals) – A finite set of symbols (tokens) that make up the actual language.

P (Productions) – A finite set of rules of the form A→α

where

A is a non-terminal

α is a string of terminals and/or non-terminals (can include ε, the empty string).

S (Start Symbol) – A special non-terminal from which derivation begins.
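For instance, the balanced-parentheses language fits this 4-tuple definition (the Python encoding below is purely illustrative):

```python
# G = (V, T, P, S) for balanced parentheses; [] encodes the ε-production.
G = {
    "V": {"S"},                                     # non-terminals
    "T": {"(", ")"},                                # terminals
    "P": {"S": [["(", "S", ")"], ["S", "S"], []]},  # S -> (S) | SS | ε
    "S": "S",                                       # start symbol
}
```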

3. Consider the grammar

E → E + E | E * E | (E) | - E | id

Derive the string id + id * id using leftmost derivation and rightmost derivation.
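One possible answer (the grammar is ambiguous, so other derivations of the same string exist):

```latex
\begin{align*}
\text{Leftmost: } E &\Rightarrow E + E \Rightarrow id + E \Rightarrow id + E * E
  \Rightarrow id + id * E \Rightarrow id + id * id \\
\text{Rightmost: } E &\Rightarrow E + E \Rightarrow E + E * E \Rightarrow E + E * id
  \Rightarrow E + id * id \Rightarrow id + id * id
\end{align*}
```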
4.DIFF BW AMBIGUOUS AND UNAMBIGUOUS GRAMMAR

5. Statement Mode recovery

➢ In this method, when a parser encounters an error, it performs the necessary correction on
the remaining input so that the rest of the input statement allows the parser to parse ahead.
➢ The correction can be deletion of extra semicolons, replacing the comma with semicolons, or
inserting a missing semicolon.
➢ While performing correction, utmost care should be taken for not going in an infinite loop.
➢ A disadvantage is that it finds it difficult to handle situations where the actual error occurred
before the point of detection.
12M

1.EXPLAIN SHIFT REDUCE PARSER WITH AN EXAMPLE

Shift-Reduce Parser

A Shift-Reduce Parser is a bottom-up parsing technique that reduces an input string to the start
symbol of a grammar using shifting and reducing operations. It is widely used in bottom-up parsers
like LR, SLR, LALR, and CLR parsers.

Basic Operations of Shift-Reduce Parsing

Shift

Move the next input symbol onto the stack.

Reduce

Replace a sequence of symbols on the stack with a corresponding non-terminal based on a grammar
rule.

Accept

If the stack contains only the start symbol and input is fully parsed, the string is accepted.

Error

If no valid shift or reduce operation is possible, an error is reported.


Key Features of Shift-Reduce Parsing

➢ Handles Bottom-Up Parsing Efficiently


➢ Used in LR Parsers
➢ Efficient for Programming Language Parsing
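The four operations above can be traced with a small sketch for the grammar E → E + E | id on the input id + id (a hypothetical greedy-reduction implementation; real LR parsers consult a parsing table to decide between shift and reduce):

```python
def shift_reduce(tokens):
    """Return the sequence of shift/reduce moves for E -> E + E | id."""
    stack, moves = [], []
    tokens = tokens + ["$"]          # $ marks the end of input
    i = 0
    while True:
        if stack[-1:] == ["id"]:                     # reduce E -> id
            stack[-1] = "E"
            moves.append("reduce E -> id")
        elif stack[-3:] == ["E", "+", "E"]:          # reduce E -> E + E
            del stack[-3:]
            stack.append("E")
            moves.append("reduce E -> E + E")
        elif tokens[i] == "$" and stack == ["E"]:    # accept
            moves.append("accept")
            return moves
        elif tokens[i] != "$":                       # shift next input symbol
            stack.append(tokens[i])
            i += 1
            moves.append("shift " + stack[-1])
        else:                                        # no move possible: error
            moves.append("error")
            return moves
```

For id + id the moves are: shift id, reduce E -> id, shift +, shift id, reduce E -> id, reduce E -> E + E, accept.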
2.EXPLAIN PREDICTIVE PARSER WITH AN EXAMPLE

Predictive Parser

A grammar, after eliminating left recursion and left factoring, can be parsed by a recursive descent
parser that needs no backtracking; such a parser is called a predictive parser. Let us understand how
to eliminate left recursion and left factoring.

i)Eliminating Left Recursion

A grammar is said to be left recursive if it has a non-terminal A such that there is a derivation A=>Aα
for some string α. Top-down parsing methods cannot handle left-recursive grammars. Hence, left
recursion can be eliminated as follows:

If there is a production A → Aα | β it can be replaced with a sequence of two productions

A → βA'

A' → αA' | ε

Without changing the set of strings derivable from A.

Example: Consider the following grammar for arithmetic expressions:

E → E+T | T

T → T*F | F

F → (E) | id

First eliminate the left recursion for E as

E → TE'

E' → +TE' | ε

Then eliminate for T as

T → FT '

T'→ *FT ' | ε

Thus, the obtained grammar after eliminating left recursion is

E → TE'

E' → +TE' | ε

T → FT '

T'→ *FT ' | ε

F → (E) | id
ii)Eliminating Left factoring

Left factoring is a grammar transformation that is useful for producing a grammar suitable for
predictive parsing. When it is not clear which of two alternative productions to use to expand a non-
terminal A, we can rewrite the A-productions to defer the decision until we have seen enough of the
input to make the right choice.

If there is any production A → αβ1 | αβ2, it can be rewritten as

A → αA'

A' → β1 | β2

Consider the grammar,

S → iEtS | iEtSeS | a

E→b

Here, i, t, and e stand for if, then, and else; E and S stand for “expression” and “statement”.

After Left factored, the grammar becomes

S → iEtSS' | a

S' → eS | ε

E→b

Algorithm for construction of predictive parsing table


Input: Grammar G
Output: Parsing table M
1. For each production A → α of the grammar, do steps 2 and 3.
2. For each terminal a in FIRST(α), add A → α to M[A, a].
3. If ε is in FIRST(α), add A → α to M[A, b] for each terminal b in FOLLOW(A). If ε is in FIRST(α) and $
is in FOLLOW(A) , add A → α to M[A, $].
4. Make each undefined entry of M be error.
EXAMPLE

Step 5: Parsing the given string

With input id + id * id, the predictive parser makes a sequence of moves (matching terminals and
expanding non-terminals) driven by the parsing table.
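These moves can be reproduced with a table-driven sketch for the grammar obtained earlier (an illustration, not a full parser; E1 and T1 stand in for E' and T', and the table entries follow from FIRST and FOLLOW):

```python
# Hypothetical LL(1) table for E -> TE', E' -> +TE' | ε, T -> FT',
# T' -> *FT' | ε, F -> (E) | id; [] encodes an ε-production.
TABLE = {
    ("E", "id"): ["T", "E1"],  ("E", "("): ["T", "E1"],
    ("E1", "+"): ["+", "T", "E1"], ("E1", ")"): [], ("E1", "$"): [],
    ("T", "id"): ["F", "T1"],  ("T", "("): ["F", "T1"],
    ("T1", "+"): [], ("T1", "*"): ["*", "F", "T1"],
    ("T1", ")"): [], ("T1", "$"): [],
    ("F", "id"): ["id"], ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E1", "T", "T1", "F"}

def parse(tokens):
    """Return True if the token list is a sentence of the grammar."""
    stack = ["$", "E"]                      # start symbol on top of the stack
    tokens = tokens + ["$"]
    i = 0
    while stack:
        top = stack.pop()
        if top in NONTERMINALS:
            rule = TABLE.get((top, tokens[i]))
            if rule is None:
                return False                # empty table entry: syntax error
            stack.extend(reversed(rule))    # push RHS, leftmost symbol on top
        elif top == tokens[i]:
            i += 1                          # terminal matched: advance input
        else:
            return False                    # terminal mismatch
    return i == len(tokens)
```

parse(["id", "+", "id", "*", "id"]) succeeds, while an incomplete input such as id + is rejected at the empty table entry M[T, $].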
3.OPERATOR PRECEDENCE PARSING

Operator precedence parser – An operator precedence parser is a bottom-up parser that interprets
an operator grammar. This parser is only used for operator grammars. Ambiguous grammars are not
allowed in any parser except operator precedence parser.

There are two methods for determining what precedence relations should hold between a pair of
terminals:

1. Use the conventional associativity and precedence of the operators.

2. First construct an unambiguous grammar for the language, one that reflects the correct
associativity and precedence in its parse trees.

This parser relies on the following three precedence relations: ⋖, ≐, ⋗

a ⋖ b means a “yields precedence to” b.

a ≐ b means a “has the same precedence as” b.

a ⋗ b means a “takes precedence over” b.

Figure – Operator precedence relation table for the grammar E → E+E | E*E | id:

       id    +    *    $
 id     –    ⋗    ⋗    ⋗
 +      ⋖    ⋗    ⋖    ⋗
 *      ⋖    ⋗    ⋗    ⋗
 $      ⋖    ⋖    ⋖    –

No relation is given between id and id, since id is never compared with id and two variables cannot
appear side by side. A disadvantage of this table is that if we have n operators, the size of the
table will be n × n and the space complexity will be O(n²).

In order to decrease the size of table, we use operator function table. Operator precedence parsers
usually do not store the precedence table with the relations; rather they are implemented in a
special way.

Operator precedence parsers use precedence functions that map terminal symbols to integers, and
the precedence relations between the symbols are implemented by numerical comparison. The
parsing table can be encoded by two precedence functions f and g that map terminal symbols to
integers.

We select f and g such that:

f(a) < g(b) whenever a yields precedence to b

f(a) = g(b) whenever a and b have the same precedence

f(a) > g(b) whenever a takes precedence over b


Ex. 1:

Use the stack implementation of an operator precedence parser to check the sentence id + id with
this grammar: E → E+E | E*E | id

Sol:
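A hedged sketch of the stack implementation (non-terminals are left off the stack for brevity, and the f/g values are the standard precedence functions for this grammar):

```python
# Precedence functions for E -> E+E | E*E | id (f: stack side, g: input side).
f = {"+": 2, "*": 4, "id": 4, "$": 0}
g = {"+": 1, "*": 3, "id": 5, "$": 0}

def op_parse(tokens):
    """Return the shift/reduce moves for a valid sentence, ending in 'accept'."""
    stack, moves = ["$"], []
    tokens = tokens + ["$"]
    i = 0
    while True:
        a, b = stack[-1], tokens[i]
        if a == "$" and b == "$":         # both ends reached: accept
            moves.append("accept")
            return moves
        if f[a] <= g[b]:                  # a yields precedence: shift b
            stack.append(b)
            i += 1
            moves.append("shift " + b)
        else:                             # a takes precedence: reduce the handle
            moves.append("reduce " + stack.pop())
```

For id + id the moves are: shift id, reduce id, shift +, shift id, reduce id, reduce +, which correspond to the reductions E → id, E → id, E → E + E, followed by accept.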
