
EX.NO: 01(a)
Implementation of Lexical Analyzer
DATE:

Aim:
To implement lexical analysis for a given example text file using Python.

Algorithm:
Lexical analysis is the first phase of the compiler, also known as the scanner. It converts the high-level
input program into a sequence of tokens. This sequence of tokens is sent to the parser for syntax
analysis. Lexical analysis can be implemented with a deterministic finite automaton.

A lexical token is a sequence of characters that can be treated as a unit in the grammar of the
programming language. The types of tokens are identifiers, numbers, operators, keywords, special
symbols, etc.

Following are examples of tokens:

Keywords: for, while, if, etc.
Identifiers: variable names, function names, etc.
Operators: '+', '++', '-', etc.
Separators: ',', ';', etc.

The algorithm for lexical analysis is as follows:


1. Read the input expression.
2. If the input is a keyword, store it as a keyword.
3. If the input is an operator, store it as an operator.
4. If the input is a delimiter, store it as a delimiter.
5. If the input is a sequence of letters and/or digits, store it as an identifier.
6. If the input is a sequence of digits, store it as a number.
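
An added illustration (not part of the original record) of the technique the program below uses for these steps: each token class becomes a named regular-expression group, re.finditer classifies every lexeme through match.lastgroup, and the order of the groups breaks ties, which is why KEYWORD must be listed before ID. Names and patterns here are illustrative only.

import re

# Illustrative only: two token classes combined into one pattern of named groups.
demo = r'(?P<KEYWORD>int|if|for)|(?P<ID>[a-zA-Z_][a-zA-Z0-9_]*)'
for m in re.finditer(demo, 'int count'):
    print(m.lastgroup, m.group())   # prints: KEYWORD int, then ID count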

Program:

import re

# Token classes and their patterns; order matters, since earlier groups win ties
# (e.g. KEYWORD must be tried before ID).
patterns = {
    'IMPORTS'  : r'<stdio\.h>|<conio\.h>|<stdlib\.h>',
    'STRING'   : r'\".*\"',
    'KEYWORD'  : r'#include|if|else|for|break|int|float|void|String|char|double',
    'FUNCTION' : r'printf|scanf|clrscr|getch',
    'FLOAT'    : r'\d+\.\d+',
    'INT'      : r'\d+',
    'OPERATOR' : r'\+\+|\+|-|\*|/|==|=|<|>',
    'ID'       : r'[a-zA-Z_][a-zA-Z0-9_]*',
    'LPARAN'   : r'\(',
    'RPARAN'   : r'\)',
    'SEPRATOR' : r'[;:,]',
    'LBRACE'   : r'\{',
    'RBRACE'   : r'\}',
}

def lex_anz(text):
    tokens = []
    # Combine all patterns into a single regex of named groups.
    regex_patt = '|'.join(f'(?P<{tok}>{patterns[tok]})' for tok in patterns)
    for match in re.finditer(regex_patt, text):
        tok_type = match.lastgroup     # name of the group that matched
        tok_val = match.group()        # the lexeme itself
        tokens.append((tok_type, tok_val))
    return tokens

code = open('text.cpp').read()
result = lex_anz(code)
for t, v in result: print(f'{v} -> {t}')

Input:

#include <stdio.h>

void main(){
int x = 3;
if ( x < 10 ) {
printf("hello world!");
}
}

Output:

#include -> KEYWORD


<stdio.h> -> IMPORTS
void -> KEYWORD
main -> ID
( -> LPARAN
) -> RPARAN
{ -> LBRACE
int -> KEYWORD
x -> ID
= -> OPERATOR
3 -> INT
; -> SEPRATOR
if -> KEYWORD
( -> LPARAN
x -> ID
< -> OPERATOR
10 -> INT
) -> RPARAN
{ -> LBRACE
printf -> FUNCTION
( -> LPARAN
"hello world!" -> STRING
) -> RPARAN
; -> SEPRATOR
} -> RBRACE
} -> RBRACE

EX.NO: 01(b)
Implementation of Lexical Tool
DATE:
Aim:
To implement a Lexical Analyzer using the lexical tool PLY in Python.

Algorithm:
1. Define the set of tokens or lexemes for the programming language you want to process. These can be
keywords, identifiers, operators, literals, etc. Store them in a dictionary or a list for easy access.
2. Read the input source code file to be processed.
3. Initialize a cursor or pointer to the beginning of the input source code.
4. Create a loop that iterates through the source code, character by character.
5. Implement a finite state machine (FSM) to recognize tokens. The FSM can have states representing
different types of tokens, such as "keyword", "identifier", "operator", etc.
6. For each character encountered in the source code, update the FSM state based on the current
character and the current state. If the current state is a final state, store the recognized token and reset the
FSM state to the initial state. If the current state is not a final state and the current character does not
transition to any valid state, then raise a lexical error.
7. Continue the loop until the end of the input source code is reached.
8. Output the recognized tokens along with their corresponding lexeme value or token type.
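
A minimal hand-rolled sketch of steps 3-7 (added for illustration; the record's own program below uses the PLY tool instead), recognising just identifiers and integers:

def fsm_lex(src):
    # Illustrative only: two recognizer states (identifier, number); blanks skipped.
    tokens, i = [], 0
    while i < len(src):
        ch = src[i]
        if ch.isspace():
            i += 1
        elif ch.isalpha() or ch == '_':
            start = i
            while i < len(src) and (src[i].isalnum() or src[i] == '_'):
                i += 1
            tokens.append(('ID', src[start:i]))
        elif ch.isdigit():
            start = i
            while i < len(src) and src[i].isdigit():
                i += 1
            tokens.append(('INT', src[start:i]))
        else:
            raise ValueError(f"Illegal character {ch!r}")   # lexical error, step 6
    return tokens

print(fsm_lex('count 42'))   # [('ID', 'count'), ('INT', '42')]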
Program:

import ply.lex as lex

# Token names required by PLY.
tokens = (
    'IMPORTS',
    'STRING',
    'KEYWORD',
    'FUNCTION',
    'FLOAT',
    'INT',
    'OPERATOR',
    'ID',
    'LPARAN',
    'RPARAN',
    'SEPRATOR',
    'LBRACE',
    'RBRACE',
)

# Simple tokens defined as regular expressions; PLY tries longer patterns first.
t_IMPORTS  = r'<stdio\.h>|<conio\.h>|<stdlib\.h>'
t_STRING   = r'\".*\"'
t_KEYWORD  = r'\#include|if|else|for|break|int|float|void|String|char|double|while|do'
t_FUNCTION = r'printf|scanf|clrscr|getch'
t_FLOAT    = r'\d+\.\d+'
t_INT      = r'\d+'
t_OPERATOR = r'==|<=|>=|\+|-|\*|/|=|<|>'
t_ID       = r'[a-zA-Z_][a-zA-Z0-9_]*'
t_LPARAN   = r'\('
t_RPARAN   = r'\)'
t_SEPRATOR = r'[;:,]'
t_LBRACE   = r'\{'
t_RBRACE   = r'\}'
t_ignore   = ' \t'        # characters to skip (a plain string, not a regex)

def t_NEWLINE(t):
    r'\n+'
    t.lexer.lineno += len(t.value)

def t_error(t):
    print("Illegal character '%s'" % t.value[0])
    t.lexer.skip(1)

lexer = lex.lex()

code = open('text.cpp').read()
lexer.input(code)

while True:
    tok = lexer.token()
    if not tok:
        break
    print(tok)
Input:

#include <stdio.h>

void main(){
int x = 3;
if ( x < 10 ) {
printf("hello world!");
}
}

Output:
LexToken(KEYWORD,'#include',1,0)
LexToken(IMPORTS,'<stdio.h>',1,9)
LexToken(KEYWORD,'void',3,20)
LexToken(ID,'main',3,25)
LexToken(LPARAN,'(',3,29)
LexToken(RPARAN,')',3,30)
LexToken(LBRACE,'{',3,31)
LexToken(KEYWORD,'int',4,35)
LexToken(ID,'x',4,39)
LexToken(OPERATOR,'=',4,41)
LexToken(INT,'3',4,43)
LexToken(SEPRATOR,';',4,44)
LexToken(KEYWORD,'if',5,48)
LexToken(LPARAN,'(',5,51)
LexToken(ID,'x',5,53)
LexToken(OPERATOR,'<',5,55)
LexToken(INT,'10',5,57)
LexToken(RPARAN,')',5,60)
LexToken(LBRACE,'{',5,62)
LexToken(FUNCTION,'printf',6,69)
LexToken(LPARAN,'(',6,75)
LexToken(STRING,'"hello world!"',6,76)
LexToken(RPARAN,')',6,90)
LexToken(SEPRATOR,';',6,91)
LexToken(RBRACE,'}',7,95)
LexToken(RBRACE,'}',8,97)

Result:
Thus, the python program to implement the Lexical Analyzer using Lexical Tool is executed
successfully and verified.

EX.NO: 02
Regular Expression to NFA
DATE:

Aim:
To write a python program to convert the given Regular Expression to NFA.

Algorithm:
Thompson’s Construction of an NFA from a Regular Expression:

Input: A regular expression r over an alphabet Σ.

Output: An NFA N accepting L(r).

METHOD: Begin by parsing r into its constituent subexpressions. The rules for constructing an NFA
consist of basis rules for subexpressions with no operators and inductive rules for building the NFA of
an expression from the NFAs of its immediate subexpressions.

BASIS: For the expression ε, construct an NFA with a new start state i and a new accepting state f,
joined by an ε-transition. Here, i is the start state of this NFA and f is its accepting state.
For any symbol a of the alphabet, construct an NFA with a new start state i and a new accepting state f,
joined by a transition labelled a.

INDUCTION: Suppose N(s) and N(t) are NFAs for regular expressions s and t, respectively.

a) For the regular expression s|t, add a new start state i with ε-transitions to the start states of N(s) and
N(t), and ε-transitions from the accepting states of N(s) and N(t) to a new accepting state f.

b) For the regular expression st, merge the accepting state of N(s) with the start state of N(t); the start
state of N(s) becomes the start state of N(st) and the accepting state of N(t) becomes its accepting state.

c) For the regular expression s*, add a new start state i and a new accepting state f, with ε-transitions
from i to the start state of N(s) and to f, and from the accepting state of N(s) back to the start state of
N(s) and to f.

d) Finally, suppose r = (s). Then L(r) = L(s), and we can use the NFA N(s) as N(r).
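
As an added illustration (matching the transition format the program below prints), applying the basis and union rules to the expression a/b (the program uses '/' for union) with states numbered from 0 gives the transitions (start, end, symbol):

(0, 1, ε), (0, 3, ε)   ε-branches out of the new start state 0
(1, 2, a)              the NFA for a
(3, 4, b)              the NFA for b
(2, 5, ε), (4, 5, ε)   both branches join at the new accepting state 5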

Program:

import re

t = 0   # highest state number allocated so far
f = 1   # flag: set to 0 if the input contains an unsupported expression

def nodret(ip):
    # Return the transitions (start, end, symbol) for one basic expression,
    # numbering new states from the current value of t.
    global t, f
    e = u'\u03b5'
    nodes = []
    if re.match(r'^[a-z]$', ip):                # single symbol a
        nodes = [
            (t, t + 1, ip)
        ]
        t += 1
    elif re.match(r'^[a-z]\*$', ip):            # closure a*
        nodes = [
            (t, t + 1, e),
            (t, t + 3, e),
            (t + 1, t + 2, ip[0]),
            (t + 2, t + 1, e),
            (t + 2, t + 3, e)
        ]
        t += 3
    elif re.match(r'^[a-z]\/[a-z]$', ip):       # union a/b
        nodes = [
            (t, t + 1, e),
            (t, t + 3, e),
            (t + 1, t + 2, ip[0]),
            (t + 3, t + 4, ip[2]),
            (t + 2, t + 5, e),
            (t + 4, t + 5, e),
        ]
        t += 5
    else:
        print("please enter basic expressions (linear combination of a, a*, a/b, a b)")
        f = 0
    return nodes

def tab_gen(v):
    # Build and print the state transition table from the list of transitions v.
    ips = list(set([e for e1, e2, e in v]))
    ips.sort()

    a = [[[] for j in range(len(ips))] for i in range(t)]
    for s, d, i in v:
        a[s][ips.index(i)].append(d)

    print('state', end="")
    for x in ips:
        print(f'\t{x}', end='')
    print('\n', '-' * (len(ips) * 10))

    for i in range(t):
        print(f'{i}', end='')
        for j in range(len(ips)):
            print(f'\t{a[i][j]}', end='')
        print()
    print(f'State {t} is the final state')

ip = input("enter regex(leave space between characters): ")

nodes = []
for ch in ip.split():
    nodes += nodret(ch)
if f:
    tab_gen(nodes)
Input and output:
enter regex(leave space between characters): a* b* c/d
state a b c d ε
--------------------------------------------------
0 [] [] [] [] [1, 3]
1 [2] [] [] [] []
2 [] [] [] [] [1, 3]
3 [] [] [] [] [4, 6]
4 [] [5] [] [] []
5 [] [] [] [] [4, 6]
6 [] [] [] [] [7, 9]
7 [] [] [8] [] []
8 [] [] [] [] [11]
9 [] [] [] [10] []
10 [] [] [] [] [11]
State 11 is the final state

EX.NO: 03
Elimination of Left Recursion
DATE:

Aim:
To write a python program to implement Elimination of Left Recursion for a given sample grammar.
Algorithm:
A grammar is left recursive if it has a nonterminal A such that there is a derivation A => Aα for some
string α. Top-down parsing methods cannot handle left-recursive grammars, so a transformation is needed
to eliminate left recursion. Immediate left recursion of the form A -> Aα | β is removed by rewriting the
productions as A -> βA' and A' -> αA' | ε.
For each production rule `x` in the list of productions `p`:


• Initialize empty lists `alpha` and `beta`.
• Separate the productions of `x` into `alpha` (left-recursive alternatives with the leading nonterminal
  stripped) and `beta` (non-left-recursive alternatives), based on whether the production starts with the
  nonterminal's own name.
• If `alpha` is not empty:
  • Append the nonterminal's prime symbol to every production in `beta`.
  • Append the nonterminal's prime symbol to every production in `alpha`, then add ε as a final alternative.
  • Replace the productions of `x` with the modified `beta` productions.
  • Add a new production rule for the nonterminal's prime symbol, with the modified `alpha` productions,
    to the list of productions `p`.
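
As a small added illustration of these steps: for S -> Sa | Sb | c, `alpha` becomes ['a', 'b'] and `beta` becomes ['c'], so the transformed grammar is

S -> cS'
S' -> aS' | bS' | ε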
Program:

e = u'\u03b5'
p = []

class Prod:
    def __init__(self, name, products):
        self.name = name
        self.products = products

    def print(self):
        s = f'{self.name} -> '
        for prod in self.products:
            s += f' {prod} |'
        s = s.rstrip('|')
        print(s)

def trans():
    # Eliminate immediate left recursion: A -> Aα | β  becomes  A -> βA', A' -> αA' | ε
    for x in list(p):                      # iterate over a snapshot; new A' rules are appended
        alpha = []; beta = []
        for product in x.products:
            if x.name == product[0]:
                alpha.append(product[1:])  # left-recursive alternative, α
            else:
                beta.append(product)       # non-left-recursive alternative, β
        if alpha:
            for i in range(len(beta)):
                beta[i] = f"{beta[i]}{x.name}'"
            for i in range(len(alpha)):
                alpha[i] = f"{alpha[i]}{x.name}'"
            alpha.append(e)
            x.products = beta
            p.append(Prod(f"{x.name}'", alpha))

n = int(input("No of production: "))

for i in range(n):
    ip = input(f"Production {i+1}: ")
    name, prods = ip.split(' -> ')
    products = prods.split(' | ')
    p.append(Prod(name, products))

print('Productions:')
for x in p: x.print()
print('Transforming...')
trans()
for x in p: x.print()

Input and Output:


No of production: 3
Production 1: E -> E+T | T
Production 2: T -> T*F | F
Production 3: F -> ( E ) | id
Productions:
E -> E+T | T
T -> T*F | F
F -> (E) | id
Transforming...
E -> TE'
T -> FT'
F -> (E) | id
E' -> +TE' | ε
T' -> *FT' | ε

EX.NO: 04
Left Factoring
DATE:

Aim:
To write a python program to implement Left Factoring for given sample data.
Algorithm:
Left factoring is a grammar transformation that is useful for producing a grammar suitable for
predictive, or top-down, parsing. When the choice between two alternative A-productions is not clear, we
may be able to rewrite the productions to defer the decision until enough of the input has been seen to
make the right choice.

1. For each nonterminal A, find the longest prefix α common to two or more of its alternatives.

2. If α ≠ ε, there is a nontrivial common prefix; replace all of the A-productions
A → αβ1 | αβ2 | ... | αβn | γ, where γ represents all alternatives that do not begin with α,
by A → αA' | γ and A' → β1 | β2 | ... | βn.

3. Here, A' is a new nonterminal. Repeatedly apply this transformation until no two alternatives for a
nonterminal have a common prefix.
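
A small added illustration: for A -> abc | abd | e, the longest common prefix of the first two alternatives is ab, so the productions become

A -> abA' | e
A' -> c | d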

Program:

e = u'\u03b5'
p = []

class Prod:
    def __init__(self, name, products):
        self.name = name
        self.products = products

    def print(self):
        s = f'{self.name} -> '
        for prod in self.products:
            s += f' {prod} |'
        s = s.rstrip('|')
        print(s)

def trans():
    # Left-factor the single production p[0]: group alternatives by their first symbol,
    # pull out the longest common prefix alpha, and emit a new primed nonterminal for
    # the remaining suffixes.
    a = p[0]
    temp = a.products
    temp.sort()
    a.products = []
    while temp:
        group = []
        alpha = ''
        beta = []
        # collect all alternatives sharing temp[0]'s first symbol
        for i in range(1, len(temp)):
            if temp[0][0] == temp[i][0]:
                group.append(temp[i])
        if group:
            group.insert(0, temp[0])
            temp = [j for j in temp if j not in group]
            for j in range(len(group)):
                group[j] += e                      # mark the end of every alternative
            # extend alpha while every alternative still starts with the same symbol
            for c in group[0]:
                f1 = 0
                for j in group:
                    if c != j[0]:
                        f1 = 1
                if f1:
                    beta = group                   # the leftover suffixes
                    break
                else:
                    alpha += group[0][0]
                    for j in range(len(group)):
                        group[j] = group[j][1:]
            for j in range(len(beta)):
                if beta[j][0] != e:
                    beta[j] = beta[j][:-1]         # drop the end marker again
            a.products.append(alpha + alpha[0] + "'")
            p.append(Prod(alpha[0] + "'", beta))
        else:
            a.products.append(temp[0])             # unique first symbol: keep as is
            temp.pop(0)

ip = input("Enter production: ")

name, prods = ip.split(' -> ')
products = prods.split(' | ')
p.append(Prod(name, products))

print('Productions:')
for x in p:
    x.print()
print('Transforming...')
trans()
for x in p:
    x.print()

Input and Output:
Enter production: A -> ABs | AB | Sed | Swa | p
Productions:
A -> ABs | AB | Sed | Swa | p
Transforming...
A -> ABA' | SS' | p
A' -> ε | s
S' -> ed | wa

EX.NO: 05
Computation of First and Follow Sets
DATE:

Aim:
To write a python program to Compute First and Follow Sets for given sample data.

Algorithm:
1. Initialize an empty list `p` to store production rules.
2. Define the `Prod` class with attributes `name`, `products`, `first`, and `follow`.
3. Define a function `is_terminal` to check if a symbol is a terminal.
4. Define a function `find_prod` to find a production by name.
5. Define a function `calc_first` to calculate the `first` set for each non-terminal symbol.
6. Define a function `calc_follow` to calculate the `follow` set for each non-terminal symbol.
7. Define a function `find_follow` to find the `follow` set for a given non-terminal symbol.
8. Define a function `first` to retrieve the `first` set for a given non-terminal symbol.
9. Define a function `follow` to retrieve the `follow` set for a given non-terminal symbol.
10. Accept input for the number of productions `n`. For each production:
    a. Input the production rule in the format `A -> B1 | B2 | ... | Bn`.
    b. Split the input to extract the non-terminal symbol `name` and the list of productions `prods`.
    c. Split `prods` into individual productions and create a `Prod` object for the non-terminal symbol.
11. Calculate the `first` and `follow` sets for each non-terminal symbol using `calc_first` and
`calc_follow` functions.
12. Print the `first` and `follow` sets for each non-terminal symbol.
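
A short added illustration using the sample grammar from the output below: FIRST(X) for X -> +TX | e is {+, e}, since each alternative contributes its leading symbol; FOLLOW(E) is {$, )} because E is the start symbol and appears inside F -> (E); and FOLLOW(X) equals FOLLOW(E) because X occurs only at the right end of a production body.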
Program:

import re

p = []

class Prod:
    def __init__(self, name, products):
        self.name = name
        self.products = products
        self.first = []
        self.follow = []

def is_terminal(s):
    # single upper-case letters are non-terminals; everything else is a terminal
    if re.match('^[A-Z]$', s):
        return False
    else:
        return True

# find production by name:
def find_prod(name):
    for x in p:
        if x.name == name:
            return x

def first(name):
    for x in p:
        if name == x.name:
            return x.first

def follow(name):
    for x in p:
        if name == x.name:
            return x.follow

def calc_first():
    # process productions bottom-up so that FIRST of later non-terminals is ready first
    for i in reversed(range(len(p))):
        for x in p[i].products:
            if is_terminal(x[0]):
                p[i].first.append(x[0])
            else:
                f = find_prod(x[0]).first
                p[i].first.extend(f)
                c = 1
                while 'e' in f:          # keep going while the symbol can derive epsilon
                    if is_terminal(x[c]):
                        f = x[c]
                    else:
                        f = find_prod(x[c]).first
                    p[i].first.extend(f)
                    c += 1
                    if c == len(x): break
        p[i].first = list(set(p[i].first))

def calc_follow():
    p[0].follow.append('$')              # $ follows the start symbol
    for x in p:
        find_follow(x)

def find_follow(x):
    for y in p:
        for pr in y.products:
            for c in range(len(pr)):
                if pr[c] == x.name:
                    if c + 1 >= len(pr):                       # x is at the right end
                        x.follow.extend(y.follow)
                    elif is_terminal(pr[c + 1]):
                        x.follow.append(pr[c + 1])
                    elif 'e' not in first(pr[c + 1]):
                        x.follow.extend(first(pr[c + 1]))
                    elif follow(pr[c + 1]):
                        x.follow.extend(first(pr[c + 1]) + follow(pr[c + 1]))
                    else:
                        x.follow.extend(first(pr[c + 1]) + find_follow(find_prod(pr[c + 1])))
    x.follow = list(set(x.follow) - {'e'})
    return x.follow

n = int(input("No of production: "))

print("Epsilon = e")
for i in range(n):
    ip = input(f"Production {i+1}: ")
    name, prods = ip.split(' -> ')
    products = prods.split(' | ')
    p.append(Prod(name, products))

calc_first()
calc_follow()

# print first and follow
for x in p: print(f'first({x.name}) = {x.first}')
for x in p: print(f'follow({x.name}) = {x.follow}')

Input and Output:

No of production: 5
Epsilon = e
Production 1: E -> TX
Production 2: X -> +TX | e
Production 3: T -> FY
Production 4: Y -> *FY | e
Production 5: F -> (E) | i
first(E) = ['(', 'i']
first(X) = ['e', '+']
first(T) = ['(', 'i']
first(Y) = ['e', '*']
first(F) = ['(', 'i']
follow(E) = [')', '$']
follow(X) = [')', '$']
follow(T) = ['+', ')', '$']
follow(Y) = ['+', ')', '$']
follow(F) = ['+', '*', ')', '$']

EX.NO: 06
Recursive Descent Parser
DATE:
Aim:
To write a Python program that uses a Recursive Descent Parser to check if a given string is valid
according to a specified grammar.
Algorithm:
Recursive descent parsing is a top-down parsing technique for context-free grammars that uses recursive
functions to parse the input string. Each non-terminal symbol in the grammar has a corresponding
parsing function. The parsing functions are called recursively to parse the input string, and they check if
the current input character matches the expected symbol. Recursive descent parsing is simple and
widely used, but left recursion in the grammar can cause infinite recursion and must be eliminated.

Input grammar:
E -> TE'
E' -> +TE' | ε
T -> FT'
T' -> *FT' | ε
F -> (E) | i

1. Define functions `match(a)`, `F()`, `Tx()`, `T()`, `Ex()`, and `E()` to handle the grammar rules.
2. Initialize `s` as the input string converted to a list and `i` as the current index.
3. In `match(a)`, check if the current character matches `a` and increment `i` if it does.
4. In `F()`, check if the current character is '(' and if so, recursively check E and match ')'; if not, check
if the current character is 'i'.
5. In `Tx()`, if the current character is '*', recursively check F and then Tx(); otherwise, return True.
6. In `T()`, recursively check F and then Tx().
7. In `Ex()`, if the current character is '+', recursively check T and then Ex(); otherwise, return True.
8. In `E()`, recursively check T and then Ex().
9. Call `E()` to check if the input string satisfies the grammar rules.
10. If `E()` returns True and `i` reaches the end of the input string, print "String is accepted".
11. If `E()` returns True but `i` does not reach the end of the input string, print "String is not accepted".
12. If `E()` returns False, print "String is not accepted".
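
A short added trace for the input i+i: E() calls T(), which calls F(); F() matches the first 'i'. Tx() sees '+' rather than '*' and returns True, so T() succeeds. Ex() then matches '+', calls T() again to consume the second 'i', and its nested Ex() finds no further '+' and returns True. E() therefore returns True with i == len(s), so the string is accepted.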
Program:

print("Recursive Descent Parsing for the following grammar\n")
print("E->TE'\nE'->+TE'/@\nT->FT'\nT'->*FT'/@\nF->(E)/i\n")
print("Enter the string to be checked\n")

s = list(input())
i = 0

def match(a):
    # consume the current character if it equals a
    global s, i
    if i >= len(s):
        return False
    elif s[i] == a:
        i += 1
        return True
    else:
        return False

def F():
    # F -> (E) | i
    if match("("):
        if E(): return match(")")
        else: return False
    else:
        return match("i")

def Tx():
    # T' -> *FT' | @
    if match("*"):
        if F(): return Tx()
        else: return False
    else:
        return True

def T():
    # T -> FT'
    if F(): return Tx()
    else: return False

def Ex():
    # E' -> +TE' | @
    if match("+"):
        if T(): return Ex()
        else: return False
    else:
        return True

def E():
    # E -> TE'
    if T(): return Ex()
    else: return False

if E():
    if i == len(s): print("String is accepted")
    else: print("String is not accepted")
else:
    print("String is not accepted")
Output:
Recursive Descent Parsing for the following grammar
E->TE'
E'->+TE'/@
T->FT'
T'->*FT'/@
F->(E)/i
Enter the string to be checked
i+i*(i)
String is accepted

Result:
Thus, the python program to check if a given string is valid using Recursive Descent Parser is
executed successfully and tested with various samples.

EX.NO: 07
Implementation of Shift Reduce Parsing Algorithm
DATE:

Aim:
To write a python program to implement the shift-reduce parsing algorithm.

Algorithm:

Shift-reduce parsing is a bottom-up parsing technique for context-free grammars that involves shifting
input symbols onto a stack and reducing stack symbols using production rules. It is used in LR parsing,
SLR parsing, and LALR parsing algorithms. The parsing process continues until the entire input string
has been parsed or an error is detected. If the parsing process ends with the stack containing only the
start symbol and the input buffer being empty, the input string is accepted; otherwise, it is rejected.

While the input buffer is not empty:


1. For each production in the start symbol's right-hand side: if the production (a handle) appears in the
   stack, replace it with the start symbol and print a reduction action.
2. If the input buffer has more than one character, add its first character to the stack, remove it from
   the input buffer, and print a shift action.
3. If the stack equals the end-of-string symbol followed by the start symbol, check the input buffer:
   If the input buffer is empty (only '$' remains), print "Accepted".
   Otherwise, print "Rejected" and break the loop.
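
An added note on the program below: the stack is kept as a string, so step 1's reduction is a single replace call; for example '$(S+S'.replace('S+S', 'S') yields '$(S', reducing the handle S+S to the start symbol S.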

Program:
gram = {
    "S": ["S+S", "S*S", "S-S", "(S)", "id"]
}
start = "S"
inp = "(id+id)$"
stack = "$"

print(f'{"Stack": <15}' + "|" + f'{"Input Buffer": <15}' + "|" + 'Parsing Action')
print(f'{"-":-<50}')

while True:
    # Reduce: replace any handle found in the stack by the start symbol.
    for i in range(len(gram[start])):
        if gram[start][i] in stack:
            stack = stack.replace(gram[start][i], start)
            print(f'{stack: <15}' + "|" + f'{inp: <15}' + "|" + f'Reduce > {gram[start][i]}')
    # Shift: move the next input symbol onto the stack.
    if len(inp) > 1:
        stack += inp[0]
        inp = inp[1:]
        print(f'{stack: <15}' + "|" + f'{inp: <15}' + "|" + 'Shift')
    # Accept or reject once only the start symbol remains on the stack.
    if stack == ("$" + start):
        if inp == '$':
            print(f'{stack: <15}' + "|" + f'{inp: <15}' + "|" + 'Accepted')
        else:
            print(f'{stack: <15}' + "|" + f'{inp: <15}' + "|" + 'Rejected')
        break
Output:

Stack |Input Buffer |Parsing Action


--------------------------------------------------
$( |id+id)$ |Shift
$(i |d+id)$ |Shift
$(id |+id)$ |Shift
$(S |+id)$ |Reduce > id
$(S+ |id)$ |Shift
$(S+i |d)$ |Shift
$(S+id |)$ |Shift
$(S+S |)$ |Reduce > id
$(S+S) |$ |Shift
$(S) |$ |Reduce > S+S
$S |$ |Reduce > (S)
$S |$ |Accepted

Result:
Thus, the python program to implement the shift-reduce parsing algorithm is executed
successfully and tested with various samples.

EX.NO: 08
Implementation of Intermediate Code Generator
DATE:

Aim:
To write a python program to implement an Intermediate Code Generator.

Algorithm:
Intermediate code generation is the compiler phase that translates high-level source code into an
intermediate representation for further analysis and optimization. This intermediate representation,
often in the form of assembly-like three-address instructions, is easier to analyze and optimize than
high-level source code.

1. Define the set of operators `OPERATORS` and the precedence dictionary `PRI`.
2. Implement the `infix_to_postfix` function that takes a string formula as input and returns a postfix
string.
3. Initialize an empty stack `stack` and an empty output string `output`.
4. Iterate through each character `ch` in the formula.
4.1. If `ch` is not an operator, append it to the output string.
4.2. If `ch` is an opening parenthesis, push it onto the stack.
4.3. If `ch` is a closing parenthesis, pop and output stack elements until an opening parenthesis is
reached.
4.4. If `ch` is an operator, pop and output stack elements with higher or equal precedence until an
operator with lower precedence or an opening parenthesis is reached.
5. After the loop, output any remaining elements in the stack.
6. Implement the `generate3AC` function that takes a postfix string `pos` as input and generates three-
address code.
6.1. Initialize an empty expression stack `exp_stack` and a temporary variable counter `t`.
6.2. Iterate through each character `i` in the postfix string.
6.2.1. If `i` is not an operator, push it onto the expression stack.
6.2.2. If `i` is an operator, pop the top two elements from the expression stack, emit a three-address
instruction for the operation, and push the resulting temporary onto the expression stack.
6.3. After the loop, the expression stack should contain only the final result.
7. Get the input expression from the user and convert it to postfix form using `infix_to_postfix`.
8. Generate three-address code using `generate3AC`.
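
A small added illustration of steps 2-6: for the input a*(b+c), infix_to_postfix returns abc+*, and generate3AC then emits t1 := b + c followed by t2 := a * t1.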
Program:

OPERATORS = set(['+', '-', '*', '/', '(', ')'])

PRI = {'+': 1, '-': 1, '*': 2, '/': 2}

def infix_to_postfix(formula):
    stack = []
    output = ''
    for ch in formula:
        if ch not in OPERATORS:
            output += ch                       # operands go straight to the output
        elif ch == '(':
            stack.append('(')
        elif ch == ')':
            while stack and stack[-1] != '(':
                output += stack.pop()
            stack.pop()                        # discard the '('
        else:
            # pop operators of higher or equal precedence before pushing ch
            while stack and stack[-1] != '(' and PRI[ch] <= PRI[stack[-1]]:
                output += stack.pop()
            stack.append(ch)
    # leftover operators
    while stack:
        output += stack.pop()
    print(f'POSTFIX: {output}')
    return output

def generate3AC(pos):
    print("### THREE ADDRESS CODE GENERATION ###")
    exp_stack = []
    t = 1
    for i in pos:
        if i not in OPERATORS:
            exp_stack.append(i)
        else:
            # pop the two topmost operands and emit a three-address instruction
            print(f't{t} := {exp_stack[-2]} {i} {exp_stack[-1]}')
            exp_stack = exp_stack[:-2]
            exp_stack.append(f't{t}')
            t += 1

expres = input("INPUT THE EXPRESSION: ")

pos = infix_to_postfix(expres)
generate3AC(pos)

Input and Output:


INPUT THE EXPRESSION: a=3+4*(8-7*(c-b)+2)
POSTFIX: a=3487cb-*-2+*+
### THREE ADDRESS CODE GENERATION ###
t1 := c - b
t2 := 7 * t1
t3 := 8 - t2
t4 := t3 + 2
t5 := 4 * t4
t6 := 3 + t5
