CSC 461 Final

Uploaded by

lapu

CHAPTER 4
LEXICAL AND SYNTAX ANALYSIS

SLIDES COURTESY OF:
"CONCEPTS OF PROGRAMMING LANGUAGES" BY ROBERT W. SEBESTA.
PUBLISHED BY PEARSON EDUCATION, INC., USA. ELEVENTH EDITION, 2016.

Md. Rawnak Saif Adib
Lecturer
Department of Computer Science and Engineering
LEXICAL ANALYSIS
❖ Lexical analysis is the first phase of a compiler or
interpreter in the context of programming
language processing. Its primary goal is to read
the source code and break it down into smaller,
meaningful units called tokens.
Key Steps in Lexical Analysis:
❖ Input: The source code (plain text) written in a
programming language.
❖ Tokenization: The process of identifying and
grouping characters into meaningful units like
keywords, identifiers, operators, literals, etc.
LEXICAL ANALYSIS
❖ For example, in the statement int a = 5; the
tokens are:
❖ int (keyword)
❖ a (identifier)
❖ = (assignment operator)
❖ 5 (integer literal)
❖ ; (delimiter)
❖ Pattern Matching: Lexical analyzers use
regular expressions or finite automata to match
patterns for different token types.
❖ Symbol Table Creation: Relevant tokens, such
as identifiers and constants, are often stored in a
symbol table for use in later phases.
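
The tokenization and pattern-matching steps can be sketched in Python with regular expressions. This is a minimal illustration rather than a production lexer: the token names, the small keyword set, and the function name tokenize are assumptions made for this example.

```python
import re

# Each token type is described by a regular expression (pattern matching).
# Order matters: keywords must be tried before general identifiers.
TOKEN_SPEC = [
    ("KEYWORD", r"\b(?:int|if|else|while|return)\b"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("INT_LITERAL", r"\d+"),
    ("OPERATOR", r"[=+\-*/<>]"),
    ("DELIMITER", r"[;(){}]"),
    ("SKIP", r"\s+"),       # whitespace is discarded
    ("MISMATCH", r"."),     # anything else is a lexical error
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Return a list of (token_type, lexeme) pairs for the source text."""
    tokens = []
    for match in MASTER.finditer(source):
        kind = match.lastgroup
        if kind == "SKIP":
            continue
        if kind == "MISMATCH":
            raise SyntaxError(f"unexpected character {match.group()!r}")
        tokens.append((kind, match.group()))
    return tokens
```

Calling tokenize("int a = 5;") yields the five tokens listed on this slide: a keyword, an identifier, an operator, an integer literal, and a delimiter.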
SYNTAX ANALYSIS
❖ Syntax analysis, also known as parsing, is the
second phase of a compiler or interpreter. Its goal
is to analyze the sequence of tokens produced by
the lexical analyzer and determine if they follow
the syntax rules (grammar) of the programming
language.
Key Characteristics of Syntax Analysis
❖ Input: A sequence of tokens generated by the
lexical analyzer.
❖ Output: A parse tree or abstract syntax tree
(AST) that represents the hierarchical structure
of the program. If there are errors in the syntax,
the parser generates error messages.
SYNTAX ANALYSIS
❖ Purpose: To check the grammatical structure of
the code, to ensure that the code conforms to the
rules of the programming language (e.g., proper
use of operators, balanced parentheses), and to
give the code a structure that later phases, such
as semantic analysis, can work from.
❖ Grammatical Rules: Syntax analysis uses formal
grammars such as context-free grammars (CFG),
typically written in Backus-Naur Form (BNF) or
Extended Backus-Naur Form (EBNF).

PROCESS OF SYNTAX ANALYSIS
❖ Parsing: The process of deriving a parse tree or
AST by applying production rules to tokens.
❖ Parser Types:
❖ Top-Down Parsers: Start parsing from the root of the
parse tree and proceed to the leaves. Example:
Recursive Descent Parser, LL Parser.
❖ Bottom-Up Parsers: Start parsing from the leaves of
the parse tree and build up to the root. Example: LR
Parser, SLR Parser, LALR Parser.
❖ Error Handling: If the token sequence does not
conform to the grammar, the syntax analyzer
flags a syntax error (e.g., missing semicolon,
unbalanced parentheses).
PROCESS OF SYNTAX ANALYSIS
❖ Code:
if (x > 0) {
x = x + 1;
}
❖ Tokens (from lexical analysis):
❖ if, (, x, >, 0, ), {, x, =, x, +, 1, ;, }
❖ Parse Tree: The syntax analyzer generates a tree
structure that shows how these tokens relate
based on the grammar:

PROCESS OF SYNTAX ANALYSIS
              if_stmt
            /    |    \
          if   cond   stmt_block
                 |         |
               x > 0   x = x + 1
RECURSIVE-DESCENT PARSER
❖ A Recursive-Descent Parser is a type of top-down
parser used in the syntax analysis phase of
compilers and interpreters. It employs a set of
recursive procedures to process the input tokens
and construct a parse tree or abstract syntax tree
(AST) based on the grammar of the programming
language.

RECURSIVE-DESCENT PARSER
❖ Key Characteristics
1. Top-Down Parsing:
1. Begins parsing from the highest-level rule (start symbol) and works
its way down to the individual tokens.
2. Recursive Procedures:
1. Each non-terminal symbol in the grammar is typically represented
by a separate recursive function or procedure.
3. Predictive Parsing:
1. Often implemented as a predictive parser, especially for grammars
that are LL(1) (Left-to-right scanning of input and Leftmost
derivation with 1 lookahead token).
4. Ease of Implementation:
1. Simple to implement manually, making it suitable for smaller or
simpler languages.
5. Grammar Requirements:
1. Cannot handle left-recursive grammars directly; left recursion must
be eliminated.
2. Works best with grammars that are free of ambiguity and left
recursion.
RECURSIVE-DESCENT PARSER
❖ How It Works
1. Start Symbol:
1. The parser begins with the start symbol of the grammar.
2. Recursive Calls:
1. For each non-terminal, the corresponding function
attempts to match the input tokens according to the
production rules.
3. Token Matching:
1. Terminal symbols (tokens) are matched directly against
the input. If a match is found, the parser consumes the
token and proceeds; otherwise, it triggers a syntax error.
4. Backtracking:
1. Basic recursive-descent parsers may require backtracking
if multiple production rules exist for a non-terminal.
However, predictive parsers eliminate the need for
backtracking by using lookahead tokens to decide which
production to use.
RECURSIVE-DESCENT PARSER
❖ Expr -> Term Expr'
❖ Expr' -> + Term Expr' | ε
❖ Term -> Factor Term'
❖ Term' -> * Factor Term' | ε
❖ Factor -> ( Expr ) | number
❖ Here, ε represents an empty string (i.e., no input).
❖ Grammar Explanation
❖ Expr: An expression consists of a term followed by
zero or more occurrences of + and another term.
❖ Term: A term consists of a factor followed by
zero or more occurrences of * and another factor.
❖ Factor: A factor is either a parenthesized expression
or a number.
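
A recursive-descent parser for this grammar can be sketched in Python with one method per non-terminal. This is only an illustration under stated assumptions: the input is taken to be an already tokenized list in which numbers are Python ints and the other tokens are the strings "+", "*", "(" and ")"; the class and method names are hypothetical; and, to keep the sketch short, each method returns the value of the expression instead of building a parse tree.

```python
class Parser:
    """Predictive recursive-descent parser: one method per non-terminal."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        # One token of lookahead; None marks the end of the input.
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def match(self, expected):
        # Consume a terminal symbol or report a syntax error.
        if self.peek() != expected:
            raise SyntaxError(f"expected {expected!r}, found {self.peek()!r}")
        self.pos += 1

    def expr(self):                 # Expr -> Term Expr'
        return self.expr_rest(self.term())

    def expr_rest(self, left):      # Expr' -> + Term Expr' | ε
        if self.peek() == "+":
            self.match("+")
            return self.expr_rest(left + self.term())
        return left                 # ε: consume nothing

    def term(self):                 # Term -> Factor Term'
        return self.term_rest(self.factor())

    def term_rest(self, left):      # Term' -> * Factor Term' | ε
        if self.peek() == "*":
            self.match("*")
            return self.term_rest(left * self.factor())
        return left                 # ε: consume nothing

    def factor(self):               # Factor -> ( Expr ) | number
        token = self.peek()
        if token == "(":
            self.match("(")
            value = self.expr()
            self.match(")")
            return value
        if isinstance(token, int):
            self.pos += 1
            return token
        raise SyntaxError(f"unexpected token {token!r}")


def parse(tokens):
    """Parse a complete token list starting from the start symbol Expr."""
    parser = Parser(tokens)
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected trailing token {parser.peek()!r}")
    return value
```

Because the lookahead token alone decides between each production and its ε alternative, no backtracking is needed. For example, parse([2, "+", 3, "*", "(", 4, "+", 1, ")"]) evaluates to 17.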
SHIFT-REDUCE PARSING
Shift-reduce parsing is a bottom-up parsing
technique that builds a parse tree from the leaves
(input tokens) to the root (start symbol). It uses a
stack and an input buffer, performing shift and
reduce operations according to the grammar's
production rules.
Consider the following grammar:
S → S + S | S - S | (S) | a
Input string: a1-(a2+a3)

Stack        Input          Action
$            a1-(a2+a3)$    Shift a1
$a1          -(a2+a3)$      Reduce by S → a
$S           -(a2+a3)$      Shift -
$S-          (a2+a3)$       Shift (
$S-(         a2+a3)$        Shift a2
$S-(a2       +a3)$          Reduce by S → a
$S-(S        +a3)$          Shift +
$S-(S+       a3)$           Shift a3
$S-(S+a3     )$             Reduce by S → a
$S-(S+S      )$             Reduce by S → S + S
$S-(S        )$             Shift )
$S-(S)       $              Reduce by S → (S)
$S-S         $              Reduce by S → S - S
$S           $              Accept
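
The shift and reduce operations can be sketched in Python as follows. This is a simplified illustration, not a table-driven LR parser: it assumes a strategy of reducing whenever the top of the stack matches the right-hand side of a production and shifting otherwise, which is sufficient for this small grammar. The names RULES and shift_reduce are hypothetical, and the subscripts on a1, a2, a3 are dropped because they all stand for the terminal a.

```python
# Productions of the grammar S -> S + S | S - S | ( S ) | a
RULES = [
    (("S", "+", "S"), "S"),
    (("S", "-", "S"), "S"),
    (("(", "S", ")"), "S"),
    (("a",), "S"),
]


def shift_reduce(tokens):
    """Return (accepted, trace) for a list of terminal symbols."""
    stack, trace, position = [], [], 0
    while True:
        for rhs, lhs in RULES:
            # Reduce when the top of the stack matches a right-hand side.
            if tuple(stack[-len(rhs):]) == rhs:
                stack[-len(rhs):] = [lhs]
                trace.append(f"reduce {lhs} -> {' '.join(rhs)}")
                break
        else:
            # No reduction applies: shift the next input symbol, if any.
            if position == len(tokens):
                break
            stack.append(tokens[position])
            trace.append(f"shift {tokens[position]}")
            position += 1
    # Accept only if the whole input was reduced to the start symbol.
    return stack == ["S"], trace
```

Running shift_reduce(list("a-(a+a)")) accepts the input after 13 shift and reduce actions, ending with the reduction by S → S - S.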
SHIFT-REDUCE PARSING
Parse Tree:
            S
          / | \
         S  -  S
         |   / | \
        a1  (  S  )
             / | \
            S  +  S
            |     |
           a2    a3
CHAPTER 5
NAMES, BINDINGS, AND SCOPES
INTRODUCTION
❖ Imperative languages are abstractions of von
Neumann architecture
❖ Memory
❖ Processor

❖ The architecture's two primary components are
its memory, which stores both instructions and
data, and its processor, which provides operations
for modifying the contents of the memory.
INTRODUCTION
❖ The abstractions in a language for the memory
cells of the machine are variables.
❖ A variable can be characterized by a collection of
properties or attributes.
❖ Among the most important of these issues are the
scope and lifetime of variables.

NAMES
❖ A name is a string of characters to identify some entity in a
program.
❖ Variables, functions, labels, keywords, …
❖ Design issues for names:
❖ Maximum length?
❖ Are connector characters allowed?
❖ Are names case sensitive?
❖ Are special words reserved words or keywords?
❖ Length
❖ If too short, they cannot be connotative (that is, suggest something in
addition to their main meaning)
❖ Language examples:
❖ FORTRAN I: maximum 6
❖ COBOL: maximum 30
❖ FORTRAN 90 and ANSI C: maximum 31

❖ Ada and Java: no limit and all are significant
❖ C++: no limit, but implementers often impose one
NAMES
❖ Case sensitivity
❖ Disadvantage: readability (names that look alike are
different)
❖ Worse in C++ and Java because predefined names are
mixed case (e.g., IndexOutOfBoundsException)
❖ C, C++, and Java names are case sensitive
❖ The names in many other languages are not

NAMES
❖ Special words
❖ An aid to readability; used to delimit or
separate statement clauses
❖ A keyword is a word that is special only in certain
contexts
❖ e.g., in Fortran
❖ Real VarName (Real is a data type followed by a
name, therefore Real is a keyword)
❖ Real = 3.4 (Real is a variable)
❖ A reserved word is a special word that cannot
be used as a user-defined name
❖ Better to keep keywords reserved
NAMES
❖ There is one potential problem with reserved
words: If the language includes a large number of
reserved words, the user may have difficulty
making up names that are not reserved.
❖ COBOL has 300 reserved words.
❖ Unfortunately, some of the most commonly
chosen names by programmers are in the list of
reserved words, for example, LENGTH,
BOTTOM, DESTINATION, and COUNT.
VARIABLES
❖ A variable is an abstraction of a memory cell
❖ Abstract memory cell - the physical cell or collection
of cells associated with a variable.
❖ Variables can be characterized as a set of six
attributes:
❖ Name
❖ Address
❖ Value
❖ Type
❖ Lifetime
❖ Scope
VARIABLES ATTRIBUTES
❖ Name
❖ Address - the memory address with which
it is associated
❖ This association is not as simple as it may
appear. In many languages, it is possible for
the same variable to be associated with
different addresses at different times during
the execution of the program
❖ If two variable names can be used to access the
same memory location, they are called aliases
❖ Aliases are created via pointers, reference variables,
and C and C++ unions
❖ Aliases are harmful to readability (program readers
must remember all of them)
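
Aliasing can be illustrated with a small Python sketch. Python has no pointers or unions, so this example assumes aliasing through references: two names bound to the same list object. The function and variable names are hypothetical.

```python
def demonstrate_alias():
    """Show that a change made through one name is visible through its alias."""
    scores = [70, 80]
    alias = scores           # no copy is made: both names refer to one object
    alias.append(90)         # the change is visible through `scores` as well

    copy = list(scores)      # a real copy is a separate object
    copy.append(100)         # this change does not affect `scores`

    return scores, alias is scores, copy is scores
```

The function returns ([70, 80, 90], True, False): the list was modified through the alias, which is exactly why a reader must keep every alias in mind.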
VARIABLES ATTRIBUTES
Scope
❖ The scope of a variable is the range of statements
over which it is visible.
❖ A variable is visible in a statement if it can be
referenced in that statement.
❖ The scope rules of a language determine how
references to names are associated with variables
❖ Local and nonlocal variables
❖ A nonlocal variable of a program unit
❖ visible but not declared there
❖ A local variable
❖ declared in the same program unit
VARIABLES ATTRIBUTES
❖ Type - determines the range of values of
variables and the set of operations that are
defined for values of that type; in the case of
floating point, type also determines the precision
❖ Value - the contents of the location with which
the variable is associated
❖ The l-value of a variable is its address
❖ The r-value of a variable is its value

BINDING
❖ A binding is an association between an attribute
and an entity, such as between a variable and its
type or value, or between an operation and a
symbol.
❖ The time at which a binding takes place is called
binding time.

POSSIBLE BINDING TIMES
❖ Language design time -- bind operator symbols to
operations (for example, the asterisk symbol (*) is usually bound to the
multiplication operation at language design time)
❖ Language implementation time -- bind the floating-point type to
a representation (a data type, such as int in C, is bound to
a range of possible values at language implementation
time)
❖ Compile time -- a variable in a Java program is bound to a
particular data type at compile time
❖ Link time -- a call to a library subprogram is bound to the
subprogram code at link time
❖ Load time -- bind a FORTRAN 77 variable (or a C static
variable) to a memory cell
❖ Run time -- bind a non-static local variable to a memory
cell; in some cases, binding does not happen until run time,
as with variables declared in Java methods, in Pascal
subprograms, or in C functions
POSSIBLE BINDING TIMES
❖ Consider the following Java assignment statement:
count = count + 5;
❖ Some of the bindings and their binding times for the parts
of this assignment statement are as follows:
❖ The type of count is bound at compile time.
❖ The set of possible count values is bound at compiler
design time.
❖ The meaning of the operator symbol + is bound at
compile time when the types of its operands have been
determined.
❖ The internal representation of the literal 5 is bound at
compiler design time.
❖ The value of count is bound at execution time with this
statement.
❖ Understanding the binding times for the attributes of
program entities is a prerequisite for understanding the
semantics of a programming language.
TYPES OF BINDING
Bindings can occur at different times in a program's
lifecycle:
1. Static Binding: A binding is static if it first occurs
before run time and remains unchanged throughout
program execution
1. Also known as early binding.
2. Occurs before run time, most often at compile time.
3. Example: Type binding in statically typed languages.
2. Dynamic Binding: A binding is dynamic if it first
occurs during execution or can change during
execution of the program
1. Also known as late binding.
2. Occurs at runtime.
3. Example: Dynamic method dispatch in object-oriented
programming (e.g., method overriding in Java).
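
Dynamic method dispatch can be sketched in Python (the slide's own example is Java; Python is used here only for illustration, and the class names are hypothetical). The call self.area() inside describe is not bound to a particular method body until run time, when the class of the actual object is known.

```python
class Shape:
    def area(self):
        return 0.0

    def describe(self):
        # Which area() runs is decided at run time from the object's class.
        return f"{type(self).__name__} with area {self.area():.1f}"


class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):          # overrides Shape.area
        return float(self.side ** 2)
```

Here Square(2).describe() gives "Square with area 4.0", while Shape().describe() gives "Shape with area 0.0", even though both calls run the same describe code.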
TYPE BINDING
❖ Binding a variable to a data type
❖ How is a type specified?
❖ When does the binding take place?
❖ Types can be specified statically through some form of
explicit or implicit declaration.
❖ In type binding, explicit and implicit declarations
refer to how a variable's type is specified in a
programming language.

TYPE BINDING
❖ An explicit declaration occurs when the
programmer directly specifies the type, attributes, or
other properties of a variable, function, or other
entities in code. This approach is clear and leaves no
ambiguity about the nature of the entity being
defined.
❖ Characteristics:
❖ The type or property is explicitly stated in the code.
❖ Common in statically typed languages where type
safety is enforced.
❖ Reduces errors by making the code more self-explanatory.
❖ Requires more verbosity compared to implicit
declarations.
TYPE BINDING
❖ An implicit declaration occurs when the
programming language automatically determines
the type or properties of a variable or function
based on its usage or context, without requiring
explicit specification by the programmer.
❖ Characteristics:
❖ The type or properties are inferred by the
compiler or interpreter.
❖ Common in dynamically typed languages.
❖ Can lead to ambiguity or runtime errors if not
handled carefully.
❖ Allows for concise and less verbose code.
DYNAMIC TYPE BINDING
❖ Dynamic Type Binding refers to the process where the type
of a variable is determined at runtime rather than at
compile time. This type of binding is common in
dynamically typed languages such as Python, JavaScript,
Ruby, and Perl.
❖ Dynamic Type Binding (JavaScript and PHP)
❖ Specified through an assignment statement
❖ e.g., JavaScript
❖ list = 17.3;
❖ Advantage: Flexibility
❖ Disadvantages:
❖ High cost (dynamic type checking and interpretation)
❖ Type error detection by the compiler is difficult
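
Both the flexibility and the cost of dynamic type binding can be seen in a short Python sketch (Python is one of the dynamically typed languages named above; the function and variable names are hypothetical). The same name is bound to a new type by each assignment, and a type error is detected only when the offending expression is executed.

```python
def type_history():
    """Rebind one name to values of different types and record each type."""
    history = []
    value = [10.2, 3.5]                    # bound to a list
    history.append(type(value).__name__)
    value = 17.3                           # the same name is now bound to a float
    history.append(type(value).__name__)
    value = "seventeen"                    # and now to a string
    history.append(type(value).__name__)
    return history


def late_type_error():
    """Return True if the type error surfaces only when the line runs."""
    value = 17.3
    try:
        "total: " + value                  # str + float is checked at run time
    except TypeError:
        return True
    return False
```

type_history() returns ["list", "float", "str"], and late_type_error() returns True because nothing rejects the bad expression before execution.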


TYPE INFERENCE
❖ Type inference is a feature of some programming languages
where the compiler automatically determines the type of a
variable or expression based on the value assigned to it or
its usage. This eliminates the need for explicit type
declarations while maintaining strong typing.
❖ The compiler uses rules of type inference to deduce the type
based on:
❖ The initial value assigned to a variable.
❖ The function’s return type.
❖ The types of arguments passed to a function.

LIFETIME AND STORAGE BINDINGS
Storage bindings and lifetime are fundamental concepts
in programming languages related to how variables are
stored and managed in memory during a program's
execution.
❖ Storage Bindings: Storage binding refers to the
association of a variable with a memory location. This
association determines where and how the variable's
value is stored and how long it exists during the
program's execution.
❖ Allocation - getting a cell from some pool of available cells
❖ Deallocation - putting a cell back into the pool.
❖ Lifetime: Lifetime refers to the duration during which
a variable exists in memory and retains its value. It is
directly influenced by the type of storage binding.
CATEGORIES OF VARIABLES BY LIFETIMES
❖ Variables can be categorized by their lifetime based on how
long they exist in memory during a program's execution.
❖ Static
❖ Lifetime: The entire duration of the program.
❖ Storage Area: Allocated in the data segment of memory.
❖ Binding: Static storage binding, determined at compile
time.
❖ Initialization: It is initialized only once, typically at
program start. If not explicitly initialized, it defaults to
zero.
❖ Scope: Can have global or local scope.
❖ Examples: Global Variables: Visible throughout the
program. Static Local Variables: Retain their value
between function calls.
static int counter = 0; // Retains value between function calls
CATEGORIES OF VARIABLES BY LIFETIMES
❖ Automatic (Stack) Variables
❖ Lifetime: Exists only during the execution of the block
or function in which they are declared.
❖ Storage Area: Allocated on the stack.
❖ Binding: Stack storage binding, determined at runtime
when the function or block is entered.
❖ Initialization: Must be explicitly initialized
(uninitialized variables contain garbage values in most
languages).
❖ Scope: Limited to the block or function where declared.
❖ Examples: Local variables in a function.
void foo() {
    int x = 5; // Lifetime ends when the function exits
}
CATEGORIES OF VARIABLES BY LIFETIMES
❖ Dynamic Variables
❖ Lifetime: Exists until explicitly deallocated (or garbage
collected in languages that support automatic memory
management).
❖ Storage Area: Allocated on the heap.
❖ Binding: Dynamic storage binding, determined at
runtime when memory allocation occurs.
❖ Initialization: Not automatically initialized unless
explicitly done during allocation.
❖ Scope: The variable itself may have limited scope, but
the memory it points to remains valid until deallocated.
❖ Examples: Dynamically allocated objects or arrays.
int* ptr = new int(10); // Lifetime extends until `delete ptr`
delete ptr;             // Deallocate; the heap object's lifetime ends here
CATEGORIES OF VARIABLES BY LIFETIMES
❖ The Heap
❖ The heap is a region of your computer's memory
that is not managed automatically for you: unlike
the stack, its allocation and deallocation are not
tied to function entry and exit. It is a more
free-floating (and typically larger) region of memory.
To allocate memory on the heap in C, you must
use malloc() or calloc(), which are C standard
library functions. Once you have allocated memory
on the heap, you are responsible for using free()
to de-allocate it once you don't need it anymore.
❖ Heap memory is slightly slower to be read from
and written to because one has to use pointers to
access memory on the heap.
CATEGORIES OF VARIABLES BY LIFETIMES
❖ Explicit heap-dynamic
❖ Allocated and deallocated by explicit directives specified
by the programmer, which take effect during execution
❖ Referenced only through pointers or references, e.g.,
dynamic objects in C++ (via new and delete), all objects in
Java
❖ Advantage:
❖ provides for dynamic storage management
❖ Disadvantage:
❖ inefficient and unreliable
C++ example
int *intNode;        // Create a pointer
intNode = new int;   // Create the heap-dynamic variable
…
delete intNode;      // Deallocate the heap-dynamic variable
CATEGORIES OF VARIABLES BY LIFETIMES
❖ Implicit heap-dynamic
❖ Allocation and de-allocation caused by assignment
statements.
❖ Implicit heap-dynamic variables are bound to heap storage
only when assigned values. All their attributes are bound
every time they are assigned. For example, consider the
following JavaScript assignment statement:
❖ highs = [74, 84, 86, 90, 71];
❖ Regardless of whether the variable named highs was
previously used in the program or what it was used for,
it is now an array of five numeric values.
❖ Examples: all variables in APL; all strings and arrays in Perl and
JavaScript
❖ Advantage:
❖ flexibility
❖ Disadvantages:
❖ Inefficient because all attributes are dynamic.
STATIC SCOPE
❖ ALGOL 60 introduced the method of binding
names to nonlocal variables called static scoping,
which has been copied by many subsequent
imperative and non-imperative languages as
well. Static scoping is so named because the
scope of a variable can be statically determined—
that is, before execution—based on the program
text.
❖ To connect a name reference to a variable, you (or
the compiler) must find the declaration.
❖ Search process: search declarations, first locally,
then in increasingly larger enclosing scopes, until
one is found for the given name.
STATIC SCOPE
Consider the following JavaScript function, big, in which the two functions
sub1 and sub2 are nested:

function big() {
  function sub1() {
    var x = 7;
    sub2();
  }
  function sub2() {
    var y = x;
  }
  var x = 3;
  sub1();
}

Under static scoping, the reference to the variable x in sub2 is to the x
declared in the procedure big. This is true because the search for x begins in
the procedure in which the reference occurs, sub2, but no declaration for x is
found there. The search continues in the static parent of sub2, big, where the
declaration of x is found. The x declared in sub1 is ignored because it is not
in the static ancestry of sub2.

Result: sub2 finds no local x, so it uses the x = 3 of the outer scope, and y = 3.
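
The same structure can be tried out in Python, which, like JavaScript, uses static scoping for nested functions. This is only a sketch that mirrors the names on the slide; the return statements are an addition so that the value bound to y can be observed.

```python
def big():
    def sub1():
        x = 7            # local to sub1; not in the static ancestry of sub2
        return sub2()

    def sub2():
        y = x            # x resolves to the x declared in big
        return y

    x = 3
    return sub1()
```

Calling big() returns 3, not 7: the x declared in sub1 is ignored even though sub1 is the caller of sub2.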
DYNAMIC SCOPE
❖ Based on calling sequences of program units, not
their textual layout (temporal versus spatial).
❖ References to variables are connected to
declarations by searching back through the chain
of subprogram calls that forced execution to this
point.

DYNAMIC SCOPE
Consider again the function big from the previous section, which is reproduced
here, minus the function calls:

function big() {
  function sub1() {
    var x = 7;
  }
  function sub2() {
    var y = x;
    var z = 3;
  }
  var x = 3;
}

Note: when sub2 references x, it gets x = 7 if sub2 was called from sub1 and
x = 3 if it was called from big.

Assume that dynamic-scoping rules apply to nonlocal references. The meaning of
the identifier x referenced in sub2 is dynamic: it cannot be determined at
compile time. It may reference the variable from either declaration of x,
depending on the calling sequence. One way the correct meaning of x can be
determined during execution is to begin the search with the local declarations.
This is also how the process begins with static scoping, but that is where the
similarity between the two techniques ends. When the search of local
declarations fails, the declarations of the dynamic parent, or calling function,
are searched. If a declaration for x is not found there, the search continues in
that function's dynamic parent, and so forth, until a declaration for x is
found. It is a run-time error if there is none in any dynamic ancestor.
SCOPE EXAMPLE
❖ Consider the two different call sequences for sub2
in the earlier example.
❖ First, big calls sub1, which calls sub2. In this
case, the search proceeds from the local
procedure, sub2, to its caller, sub1, where a
declaration for x is found. So, in this case, the
reference to x in sub2 is to the x declared in sub1.
❖ Next, sub2 is called directly from big. In this
case, the dynamic parent of sub2 is big, and the
reference is to the x declared in big.
❖ Note that if static scoping were used in either
calling sequence discussed, the reference to x in
sub2 would be to big's x.
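
Python itself is statically scoped, so the dynamic-scoping search can only be simulated. The sketch below assumes a simple model in which each active program unit is a (name, local declarations) pair and the call chain is a list with the most recent call last; the names resolve, BIG, SUB1 and SUB2 are hypothetical.

```python
def resolve(name, call_chain):
    """Search the local unit first, then each dynamic parent in turn."""
    for unit, declarations in reversed(call_chain):
        if name in declarations:
            return unit, declarations[name]
    # No dynamic ancestor declares the name: a run-time error.
    raise NameError(f"{name} is not declared in any dynamic ancestor")


# Local declarations of each program unit from the example.
BIG = ("big", {"x": 3})
SUB1 = ("sub1", {"x": 7})
SUB2 = ("sub2", {"y": None, "z": 3})
```

With the chain big, sub1, sub2 the call resolve("x", [BIG, SUB1, SUB2]) finds sub1's x (7); with sub2 called directly from big, resolve("x", [BIG, SUB2]) finds big's x (3).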
SCOPE EXAMPLE
❖ Static scoping
❖ Reference to x is to big's x
❖ Dynamic scoping
❖ Reference to x is to sub1's x when sub2 is called from sub1, and to big's x when sub2 is called directly from big
❖ Evaluation of Dynamic Scoping:
❖ Advantage: convenience
❖ Disadvantage: poor readability

NAMED CONSTANTS
❖ A named constant is a variable that is bound to a
value only when it is bound to storage
❖ Advantages: readability and modifiability can be
improved, for example, by using the name pi
instead of the constant 3.14159265.
❖ The binding of values to named constants can be
either static or dynamic
❖ Languages:
❖ FORTRAN 90: constant-valued expressions
❖ Ada, C++, and Java: expressions of any kind.

VARIABLE INITIALIZATION
❖ The binding of a variable to a value at the time it
is bound to storage is called initialization
❖ Initialization is often done on the declaration
statement, e.g., in Java
int sum = 0;

CHAPTER 6
DATA TYPES
INTRODUCTION
❖ A data type = a collection of data objects + a set of
predefined operations on those objects
❖ The design issues
❖ How to represent data objects?
❖ What operations are defined, and how are they
specified?

DESCRIPTOR
❖ A descriptor is the collection of the attributes of a
variable. In an implementation, a
descriptor is an area of memory that stores the
attributes of a variable.
❖ If the attributes are all static, descriptors are
required only at compile time. The compiler builds
these descriptors, usually as a part of the symbol
table, and uses them during compilation.
❖ For dynamic attributes, however, part or all of the
descriptor must be maintained during execution. In
this case, the run-time system uses the descriptor.
❖ In all cases, descriptors are used for type checking
and for building the code for the allocation and
deallocation operations.
DESCRIPTOR
❖ Python uses the same word for a language-level protocol: an object
stored as a class attribute is a descriptor if it defines any of the
following methods.
❖ __get__(self, instance, owner) - Defines behavior
when the attribute is accessed.
❖ __set__(self, instance, value) - Defines behavior
when the attribute is set.
❖ __delete__(self, instance) - Defines behavior when
the attribute is deleted.
❖ Descriptors are often used to implement
properties, methods, and attribute validation.
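
The three methods above belong to Python's descriptor protocol, and attribute validation with them can be sketched as follows. The class names Positive and Account and the rule that a balance must be positive are assumptions made for this example.

```python
class Positive:
    """Descriptor that only accepts values greater than zero."""

    def __set_name__(self, owner, name):
        # Called when the owning class is created; remember where to store data.
        self.private_name = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:          # accessed on the class, not an instance
            return self
        return getattr(instance, self.private_name)

    def __set__(self, instance, value):
        if value <= 0:
            raise ValueError("value must be positive")
        setattr(instance, self.private_name, value)

    def __delete__(self, instance):
        delattr(instance, self.private_name)


class Account:
    balance = Positive()              # attribute access now goes through Positive

    def __init__(self, balance):
        self.balance = balance        # triggers Positive.__set__
```

With this in place, Account(10).balance reads back 10 through __get__, while assigning a non-positive balance raises ValueError before anything is stored.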

PRIMITIVE DATA TYPES
❖ Primitive data types: the most basic data types
provided by a programming language. They are
used to represent simple values like numbers,
characters, or truth values and typically serve as
the building blocks for more complex data
structures.
❖ Those not defined in terms of other data types.
❖ Almost all programming languages provide a set of
primitive data types.
❖ Some primitive data types are merely reflections of
the hardware; others require only a little non-hardware
support.
PRIMITIVE DATA TYPES: INTEGER
❖ Almost always an exact reflection of the hardware, so the mapping is
trivial
❖ Example (Java integer types):
❖ byte (an 8-bit signed two's complement integer, with a minimum
value of -128 and a maximum value of 127, inclusive),
❖ short (a 16-bit signed two's complement integer, with a minimum
value of -32,768 and a maximum value of 32,767, inclusive),
❖ int (a 32-bit signed two's complement integer, with a minimum
value of -2^31 and a maximum value of 2^31 - 1),
❖ long (a 64-bit signed two's complement integer, with a minimum
value of -2^63 and a maximum value of 2^63 - 1).
❖ A signed integer value is represented in a computer by a string of
bits, with one of the bits (typically the leftmost) representing the sign.
Most integer types are supported directly by the hardware.
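
The ranges quoted above follow directly from the two's complement representation: an n-bit signed integer spans -2^(n-1) to 2^(n-1) - 1. A small Python sketch (the function name signed_range is hypothetical) reproduces them:

```python
def signed_range(bits):
    """Return (minimum, maximum) of a two's complement integer of this width."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
```

For example, signed_range(8) gives (-128, 127) and signed_range(16) gives (-32768, 32767), matching the byte and short ranges.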
PRIMITIVE DATA TYPES: FLOATING POINT
❖ Model real numbers, but only as approximations
❖ Languages for scientific use support at least two
floating-point types
❖ float and double
❖ IEEE Floating-Point Standard

PRIMITIVE DATA TYPES: DECIMAL
❖ For business applications (money)
❖ Essential to COBOL
❖ C# offers a decimal data type

❖ The decimal keyword indicates a 128-bit data
type. The decimal type is more appropriate than
floating-point types for financial and monetary
calculations.
❖ Evaluation:
❖ Advantage: accurate computation within its range,
compared to floating-point numbers
❖ Disadvantages: limited range, wastes memory
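
The accuracy advantage can be seen in a short Python sketch. It uses Python's decimal module as a stand-in for the C# decimal type discussed above, which is an assumption of this example; the function name is hypothetical.

```python
from decimal import Decimal


def compare_sums():
    """Add 0.1 and 0.2 as binary floating point and as decimal values."""
    float_sum = 0.1 + 0.2                           # binary approximation
    decimal_sum = Decimal("0.1") + Decimal("0.2")   # exact decimal arithmetic
    return float_sum == 0.3, decimal_sum == Decimal("0.3")
```

compare_sums() returns (False, True): the floating-point sum is only an approximation of 0.3, while the decimal sum is exact.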
PRIMITIVE DATA TYPES: BOOLEAN
❖ Simplest of all
❖ Range of values: two elements
❖ one for “true” and one for “false”
❖ Could be implemented as bits, but often as bytes
❖ Advantage: readability

PRIMITIVE DATA TYPES: CHARACTER
❖ Stored as numeric coding.
❖ Most commonly used coding: ASCII (values 0 to 127 code 128
different characters, usually stored one per 8-bit byte)
❖ American Standard Code for Information Interchange
❖ An alternative, 16-bit coding: Unicode
❖ Includes characters from most natural languages
❖ Java was the first widely used language to adopt it
❖ C# and JavaScript also support Unicode

CHARACTER STRING TYPES
❖ A character string type is one in which the values
consist of sequences of characters. Character
string constants are used to label output, and the
input and output of all kinds of data are often
done in terms of strings. Of course, character
strings also are an essential type for all programs
that do character manipulation.
❖ Design issues:
❖ Is it a primitive type or just a special kind of array?
❖ Should the length of strings be static or dynamic?

CHARACTER STRING TYPES OPERATIONS
❖ Typical operations:
❖ Assignment and copying
❖ Comparison (=, >, etc.)
❖ Catenation
❖ Substring reference (A substring reference refers to a
substring of a given string.)
❖ Pattern matching (Pattern matching is another
fundamental character string operation. In some
languages, pattern matching is supported directly
in the language. In others, it is provided by a function
or class library.)
CHARACTER STRING LENGTH OPTIONS
❖ Three options:
❖ Static: (the length can be static and set when the
string is created. Such a string is called a static
length string.) COBOL, Java’s String class
❖ Limited Dynamic Length: allow strings to have
varying lengths up to a declared and fixed maximum
set by the variable’s definition, as exemplified by the
strings in C and the C-style strings of C++. These are
called limited dynamic length strings.
❖ In C-based languages, a special character is used to indicate
the end of a string's characters rather than maintaining the
length
❖ Dynamic (no maximum): SNOBOL4, Perl, JavaScript
❖ Ada supports all three string length options
CHARACTER STRING IMPLEMENTATION
❖ Static length: compile-time descriptor
❖ Limited dynamic length: may need a run-time
descriptor for length (but not in C and C++)
❖ Dynamic length: need run-time descriptor;
allocation/de-allocation is the biggest
implementation problem

[Figures: a compile-time descriptor for static strings; a run-time descriptor for limited dynamic strings]
USER-DEFINED ORDINAL TYPES
❖ An ordinal type is one in which the range of
possible values can be easily associated with the
set of positive integers. Put another way, an ordinal
data type is a data type whose values can be counted:
they can be put in a one-to-one correspondence with
the positive integers. For example, characters are
ordinal because we can call 'A' the first character,
'B' the second, etc.
❖ Examples: primitive ordinal types in Java
❖ integer
❖ char
❖ boolean
❖ Programming languages have supported two user-defined ordinal
types: enumeration and subrange.
ENUMERATION TYPES
❖ All possible values, which are named constants,
are provided in the definition.
❖ Enumeration types provide a way of defining and
grouping collections of named constants called
enumeration constants.
❖ C# example
❖ enum days {mon, tue, wed, thu, fri, sat, sun};
❖ No arithmetic operations are legal on enumeration
types
❖ days d1, d2; d1 + d2 = ?
❖ Enumeration type variables are not coerced
into integer types
❖ days d1; d1 = 4; ?
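The same two restrictions can be sketched in Python, assuming the standard enum module as a stand-in for the C# enum above. The class name Days and the integer values are illustrative choices, not part of the slides.

```python
from enum import Enum

class Days(Enum):
    MON = 0
    TUE = 1
    WED = 2

d1, d2 = Days.MON, Days.TUE

# Arithmetic on enumeration values is rejected at run time.
try:
    d1 + d2
    arithmetic_allowed = True
except TypeError:
    arithmetic_allowed = False

# Members are not coerced to integers: Days.TUE does not compare equal to 1.
equals_int = (Days.TUE == 1)

# Conversions in either direction must be requested explicitly.
from_int = Days(2)        # integer -> enumeration member
to_int = Days.WED.value   # enumeration member -> integer
```

Unlike C#, Python detects these problems at run time rather than at compile time.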
ARRAY TYPES
❖ An array is a data structure used in
programming to store a collection of elements,
typically of the same type, in a contiguous block
of memory. Each element in an array is identified
by an index or a key, allowing for quick access
and modification of its contents.
❖ Example: int a[100];
❖ Design issues
❖ What types are legal for subscripts?
❖ When are subscript ranges bound?
❖ When does allocation take place?
❖ What is the maximum number of subscripts?
ARRAY INDEXING
❖ Array indexing refers to the process of accessing elements
of an array using their position (index) within the array.
Indexing is a fundamental concept in array manipulation,
enabling efficient retrieval and modification of elements.
❖ Indexing or sub-scripting
❖ Mapping from indices to elements
❖ a(index_value) → an element
❖ Syntax:
❖ FORTRAN, PL/I, Ada use parentheses, others use brackets
❖ Index Types
❖ Integer type only: Fortran, C, Java
❖ Any ordinal type: Pascal
❖ Integer or enum: Ada
❖ Index range check
❖ No: C, C++, Perl, Fortran
❖ Yes: Java, ML, C#
ARRAY CATEGORIES
❖ Determined by when the subscript (index) ranges are bound, when
storage is allocated, and where that storage is allocated from.
❖ int a[?];
❖ Static: subscript ranges are statically bound, and
storage allocation is static (during compile time)
❖ Advantage: efficiency (no dynamic allocation)
❖ func1()
{
static int a[100]; …
}

ARRAY CATEGORIES
❖ Fixed stack-dynamic:
❖ Subscript ranges are statically bound, but
❖ The allocation is done at declaration elaboration time during
execution
❖ Advantage: space efficiency
❖ Stack-dynamic:
❖ Subscript ranges are dynamically bound, and
❖ The storage allocation is dynamic (done at run-time)
❖ Advantage: flexibility (the size of an array need not be known until
the array is to be used)
❖ Fixed heap-dynamic:
❖ storage binding is dynamic but fixed after allocation (i.e., binding is
done when requested, and storage is allocated from the heap, not
stack)
❖ Heap-dynamic:
❖ binding of subscript ranges and storage allocation is dynamic and can
change any number of times
❖ Advantage: flexibility (arrays can grow or shrink during program
execution)
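As a rough illustration of the heap-dynamic category, a Python list can change its length any number of times while the program runs. The variable names here are illustrative only.

```python
scores = [4, 5, 7]            # storage is allocated at run time
scores.append(83)             # the array grows ...
scores.extend([1, 2])
length_after_growth = len(scores)

scores.pop()                  # ... and shrinks during execution
del scores[0]
length_after_shrink = len(scores)
```

The other four categories fix the subscript range, the storage, or both at some earlier point, trading this flexibility for efficiency.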
ARRAY INITIALIZATION
❖ Some languages allow initialization at the time of
storage allocation
❖ C, C++, Java, C# example
int list [] = {4, 5, 7, 83};
❖ Character strings in C and C++
char name [] = "Freddie";
❖ Arrays of strings in C and C++
char *names [] = {"Bob", "Jake", "Joe"};
❖ Java initialization of String objects
String[] names = {"Bob", "Jake", "Joe"};

RECORD TYPES
❖ Record types are composite data structures used
in programming to group related data items,
often of different types, into a single logical unit.
They are analogous to "rows" in a database table
or "structs" in certain programming languages.
❖ Introduced by COBOL in the 1960s

❖ Syntax (COBOL)
01 EMP-REC.
   02 EMP-NAME.
      05 FIRST PIC X(20).
      05 MID PIC X(10).
      05 LAST PIC X(20).
   02 HOURLY-RATE PIC 99V99.
UNION TYPES
❖ A union type is a data structure used in
programming to store variables of different types
in the same memory location. However, only one
value is stored at any given time, as all fields
share the same memory space. Union types are
particularly useful in low-level programming for
memory optimization or working with data that
can take on multiple forms, such as interfacing
with hardware or managing polymorphic data.
❖ Design issues
❖ Should type-checking be required?
❖ Should unions be embedded in records?
POINTER AND REFERENCE TYPES
❖ A pointer type variable has a range of values that
consists of memory addresses and a special value,
nil
❖ One address value
❖ An address tuple (segment, offset)
❖ The use of pointers
❖ Support indirect addressing
❖ Manage dynamic memory
❖ Access a location in the heap

❖ The variable that stores the address of another
variable is what in C++ is called a pointer
POINTER OPERATIONS
❖ Two fundamental operations
❖ Assignment
❖ Assigning a value to a pointer variable, typically the
memory address of another variable or object.
❖ set a pointer variable’s value to some useful address

❖ Dereferencing
❖ Accessing or modifying the value stored at the
memory location pointed to by a pointer.
❖ Yields the value stored at the location represented by
the pointer’s value
❖ Dereferencing can be explicit or implicit
❖ C/C++ use an explicit operator “*”
j = *ptr;
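Python has no pointer type of its own, but its standard ctypes module exposes C-style pointers, which is enough to sketch the two operations. The names x, ptr and j are illustrative.

```python
import ctypes

x = ctypes.c_int(42)        # a C int object
ptr = ctypes.pointer(x)     # assignment: ptr now holds the address of x

j = ptr.contents.value      # dereferencing to read (like j = *ptr in C)
ptr[0] = 7                  # dereferencing to write (like *ptr = 7 in C)
```

After the last line, x itself holds 7, because ptr refers to the same memory location as x.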
POINTERS IN C AND C++
❖ Extremely flexible but must be used with care
❖ Pointers can point at any variable
❖ Pointer arithmetic is possible
❖ Explicit dereferencing and address-of operators
❖ Support dynamic storage management and addressing

CHAPTER 7
EXPRESSIONS AND
ASSIGNMENT STATEMENTS
INTRODUCTION
❖ Expressions are the fundamental means of
specifying computations in a programming
language. It is crucial for a programmer to
understand both the syntax and semantics of
expressions of the language he or she uses.
❖ To understand expression evaluation, it is
necessary to be familiar with the orders of
operator and operand evaluation.
❖ The operator evaluation order of expressions is
dictated by the associativity and precedence rules
of the language.
ARITHMETIC EXPRESSIONS
❖ In programming languages, arithmetic
expressions consist of operators, operands,
parentheses, and function calls.
❖ An operator can be unary, meaning it has a
single operand, binary, meaning it has two
operands, or ternary, meaning it has three
operands.
❖ The purpose of an arithmetic expression is to
specify an arithmetic computation.
❖ An implementation of such a computation must
cause two actions: fetching the operands, usually
from memory, and executing arithmetic
operations on those operands.
ARITHMETIC EXPRESSIONS
❖ Example
(3 + b) * a - func(14)

❖ operators, operands, parentheses, and function calls

OPERATORS
❖ Different operators
❖ A unary operator has one operand
--a, a++, …
❖ A binary operator has two operands
a + b, a - b, …
❖ A ternary operator has three operands
(a>b) ? a : b // result is a if (a>b), or b if not
❖ Exercise:
❖ What is the value? a=10; a++; ++a;
❖ legal or not? ++a, a++, ++ a ++

MODULO OPERATION
❖ a modulo b
❖ Returns the remainder of the division of a by b
❖ a = b × q + r, return r
❖ Straightforward if both a and b are positive
values
❖ What if a or b is negative?
❖ Language-dependent, implementation-dependent
❖ Ada:
mod: result has the same sign as the divisor
(-10) mod 3 = 2
rem: result has the same sign as the dividend
(-10) rem 3 = -1
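Python happens to offer both conventions, so the difference can be checked directly. In this sketch the % operator plays the role of Ada's mod and math.fmod plays the role of rem; the variable names are illustrative.

```python
import math

mod_like = -10 % 3              # % takes the sign of the divisor
rem_like = math.fmod(-10, 3)    # math.fmod takes the sign of the dividend

# Both results satisfy a = b * q + r, but with different quotients q.
q_mod = -10 // 3                # floor division
q_rem = math.trunc(-10 / 3)     # truncation toward zero
```

The two remainders differ because the quotient is rounded down in one case and toward zero in the other.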
ORDER OF COMPUTATION
❖ Operator precedence rule
❖ Define the order in which “adjacent” operators of
different precedence levels are evaluated
❖ Typical precedence levels
❖ parentheses
❖ unary operators
❖ ** (if the language supports it)
❖ *, /
❖ +, -

ORDER OF COMPUTATION
❖ Operator associativity rule
❖ define the order in which adjacent operators with the
same precedence level are evaluated
❖ Typical associativity rules
❖ Left to right
❖ a + b + c, a * b * d
❖ Exponentiation **
❖ Fortran: from right to left: 2 ** 3 ** 4 = 2^81 ≠ 8^4
❖ Ada: must use parentheses: (2 ** 3) ** 4
❖ Visual Basic: from left to right: 2 ** 3 ** 4 = 8^4 ≠ 2^81
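Python follows the Fortran convention for exponentiation and the usual left-to-right rule for the other arithmetic operators, so both cases can be checked in a few lines (variable names are illustrative):

```python
right_assoc = 2 ** 3 ** 4        # parsed as 2 ** (3 ** 4)
left_grouped = (2 ** 3) ** 4     # parentheses force the other grouping

left_to_right = 10 - 4 + 3       # parsed as (10 - 4) + 3
right_to_left = 10 - (4 + 3)     # what right associativity would give
```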

PRECEDENCE
❖ The value of an expression depends at least in
part on the order of evaluation of the operators in
the expression. Consider the following
expression:
a+b*c
❖ Suppose the variables a, b, and c have the values
3, 4, and 5, respectively.
❖ If evaluated left to right (the addition first and
then the multiplication), the result is 35.
❖ If evaluated right to left (the multiplication first and
then the addition), the result is 23.
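The two results can be reproduced with the values given above. Python, like most languages, gives multiplication the higher precedence, so the unparenthesized form yields the second result.

```python
a, b, c = 3, 4, 5

by_precedence = a + b * c       # multiplication first
addition_first = (a + b) * c    # addition forced first by parentheses
```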

PRECEDENCE
❖ The operator precedence rules for expression
evaluation partially define the order in which the
operators of different precedence levels are
evaluated.
❖ The operator precedence rules for expressions are
based on the hierarchy of operator priorities, as
seen by the language designer.

ASSOCIATIVITY
❖ Consider the following expression:
a - b + c - d
❖ If the addition and subtraction operators have the same
level of precedence as they do in programming languages,
the precedence rules say nothing about the order of
evaluation of the operators in this expression.
❖ When an expression contains two adjacent occurrences of
operators with the same level of precedence, the question of
which operator is evaluated first is answered by the
associativity rules of the language.
❖ An operator can have either left or right associativity,
meaning that when there are two adjacent operators with
the same precedence, the left operator is evaluated first, or
the right operator is evaluated first, respectively.
ASSOCIATIVITY
❖ Associativity in common languages is left to
right, except that the exponentiation operator
(when provided) sometimes associates right to
left.
❖ In the Java expression a - b + c, the left operator
is evaluated first.

PARENTHESES
❖ A parenthesized part of an expression has
precedence over its adjacent un-parenthesized
parts. For example, although multiplication has
precedence over addition, in the expression
(A + B) * C
the addition will be evaluated first.
Mathematically, this is perfectly natural.
❖ In this expression, the first operand of the
multiplication operator is not available until the
addition in the parenthesized subexpression is
evaluated.
CONDITIONAL EXPRESSIONS
❖ if-then-else statements can be used to perform a
conditional expression assignment. For example,
consider
if (count == 0)
average = 0;
else
average = sum / count;
❖ In the C-based languages, this code can be specified
more conveniently in an assignment statement using
a conditional expression, which has the following
form:
expression_1 ? expression_2 : expression_3
❖ where expression_1 is interpreted as a Boolean
expression. If expression_1 evaluates to true, the
value of the whole expression is the value of
expression_2; otherwise, it is the value of
expression_3.
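Python writes the same construct with keywords instead of ? and :, in the order expression_2 if expression_1 else expression_3. A minimal sketch of the averaging example follows; the function name average and the parameter name total (used instead of sum, which is a Python built-in) are illustrative.

```python
def average(total, count):
    # Conditional expression: the value is 0 when count is 0,
    # otherwise total / count.
    return 0 if count == 0 else total / count

empty_case = average(0, 0)
normal_case = average(30, 4)
```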
OPERAND EVALUATION ORDER
❖ Variables in expressions are evaluated by
fetching their values from memory.
❖ Constants are sometimes evaluated the same
way.
❖ In other cases, a constant may be part of the
machine language instruction and not require a
memory fetch.
❖ If an operand is a parenthesized expression, all of
the operators it contains must be evaluated
before its value can be used as an operand.
SIDE EFFECTS
❖ A side effect of a function, naturally called a
functional side effect, occurs when the function
changes either one of its parameters or a global
variable. (A global variable is declared outside the
function but is accessible in the function.)
❖ Consider the following expression:
❖ a + fun(a)
❖ If fun does not have the side effect of changing a, then
the order of evaluation of the two operands, a and
fun(a), has no effect on the value of the expression.
However, if fun changes a, there is an effect.
❖ Consider the following situation: fun returns 10 and
changes the value of its parameter to 20. Suppose we
have the following: a = 10; b = a + fun(a);
SIDE EFFECTS
❖ Then, if the value of a is fetched first (in the
expression evaluation process), its value is 10
and the value of the expression is 20. But if the
second operand is evaluated first, then the value
of the first operand is 20 and the value of the
expression is 30.
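The two outcomes can be reproduced in Python, which evaluates operands left to right. This sketch makes one assumption that differs from the slide: fun changes a global variable a rather than a parameter, because a Python function cannot change an integer argument in the caller.

```python
a = 10

def fun():
    """Return 10 and, as a side effect, set the global a to 20."""
    global a
    a = 20
    return 10

a = 10
first = a + fun()     # a is fetched first: 10 + 10

a = 10
second = fun() + a    # fun is called first: 10 + 20
```

Swapping the operands stands in for a language that evaluates the right operand first.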

SOLUTIONS TO FUNCTION SIDE
EFFECTS
❖ Two possible solutions to the problem when
designing a language
❖ Disallow functional side effects
❖ No two-way parameters in functions
❖ No non-local references in functions
❖ Advantage: it works!
❖ Disadvantage: inflexibility of two-way parameters and
non-local references
❖ Fix the operand evaluation order
❖ Disadvantage: limits some compiler optimizations

OVERLOADED OPERATORS
❖ Arithmetic operators are often used for more
than one purpose.
❖ For example, + is usually used to specify integer
addition and floating-point addition.
❖ Some languages—Java, for example—also use it
for string catenation.
❖ This multiple use of an operator is called
operator overloading and is generally thought to
be acceptable as long as neither readability nor
reliability suffers.
OVERLOADED OPERATORS
❖ As an example of the possible dangers of
overloading, consider the use of the ampersand
(&) in C++.
❖ As a binary operator, it specifies a bitwise logical
AND operation.
❖ As a unary operator with a variable as its
operand, the expression value is the address of
that variable.
❖ In this case, the ampersand is called the address-of
operator. For example, the execution of x = &y;
causes the address of y to be placed in x.
OVERLOADED OPERATORS
❖ There are two problems with this multiple use of
the ampersand.
❖ First, using the same symbol for two completely
unrelated operations is detrimental to
readability.
❖ Second, the simple keying error of leaving out the
first operand for a bitwise AND operation can go
undetected by the compiler because it is
interpreted as an address-of operator. Such an
error may be difficult to diagnose.

TYPE CONVERSIONS
❖ A narrowing conversion converts a value to a
type that cannot store even approximations of all
of the values of the original type.
❖ For example, converting a double to a float in
Java is a narrowing conversion because the range
of a double is much larger than that of a float.
❖ A widening conversion converts a value to a type
that can include at least approximations of all of
the values of the original type.

TYPE CONVERSIONS
❖ For example, converting an int to a float in Java
is a widening conversion.
❖ Widening conversions are nearly always safe,
meaning that the approximate magnitude of the
converted value is maintained.
❖ Narrowing conversions are not always safe—
sometimes the magnitude of the converted value
is changed in the process
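A small Python sketch of both directions, assuming the usual 64-bit floating-point representation for float. The specific values are illustrative: the widening keeps the magnitude but loses the last digit, while the narrowing discards the fractional part.

```python
big = 2 ** 53 + 1
widened = float(big)      # int -> float: approximate value is preserved
narrowed = int(3.99)      # float -> int: the fraction is discarded
```

Here widened equals 2 ** 53 exactly, one less than the original integer, which is the kind of approximation the definition of a widening conversion allows.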

COERCION IN EXPRESSIONS
❖ One of the design decisions concerning arithmetic
expressions is whether an operator can have operands
of different types.
❖ Languages that allow such expressions, which are
called mixed-mode expressions, must define
conventions for implicit operand type conversions
because computers do not have binary operations that
take operands of different types.
❖ Implicit type conversions are those where the type
conversion is initiated by the compiler or runtime
system.
❖ Type conversions explicitly requested by the
programmer are referred to as explicit conversions or
casts, not coercions.
EXPLICIT TYPE CONVERSION
❖ Most languages provide some capability for doing
explicit conversions, both widening and narrowing.
❖ In some cases, warning messages are produced when
an explicit narrowing conversion results in a
significant change to the value of the object being
converted.
❖ Definition: The programmer manually specifies the
type conversion.
❖ How it’s done: By using functions or syntax provided
by the programming language.
❖ Control: Full control over the conversion process; the
programmer decides when and how the conversion
happens.
❖ Use case: When you know the specific type conversion
you need and want to avoid unintended behaviors.
IMPLICIT TYPE CONVERSION
❖ Definition: The compiler or interpreter automatically
performs the type conversion.
❖ How it’s done: It happens behind the scenes without
explicit instructions from the programmer.
❖ Control: Less control, as the conversion is based on
predefined rules of the language.
❖ Use case: When combining different data types where
the conversion rules are predictable and consistent.
❖ Use explicit type conversion when precision, clarity,
or error prevention is critical.
❖ Rely on implicit type conversion for simpler,
straightforward operations where you trust the
language's type conversion rules.
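The contrast between the two kinds of conversion can be sketched in Python, which coerces int to float in mixed-mode arithmetic but has no coercion rule between strings and numbers. The variable names are illustrative.

```python
mixed = 3 + 4.5               # implicit: the int 3 is coerced to 3.0

try:
    "3" + 4                   # no implicit conversion between str and int
    str_coerced = True
except TypeError:
    str_coerced = False

explicit = int("3") + 4       # explicit conversion requested by the programmer
truncated = int(7.9)          # explicit narrowing conversion
```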
RELATIONAL AND BOOLEAN
EXPRESSIONS
❖ A relational operator is an operator that
compares the values of its two operands.
❖ A relational expression has two operands and one
relational operator.
❖ The value of a relational expression is Boolean,
except when Boolean is not a type included in the
language.
❖ They are fundamental in programming and are
used in decision-making, loops, and conditional
structures.
USE CASES OF RELATIONAL
EXPRESSIONS
❖ Control Flow: To determine the flow of
execution in if, while, for.
if age >= 18:
    print("Eligible to vote")
❖ Validation: To check user inputs, data integrity,
or constraints.
if password == "secure123":
    print("Access granted")
❖ Loops: To control iteration based on conditions.
i = 0
while i < 5:
    print(i)
    i += 1
BOOLEAN EXPRESSIONS
❖ Boolean expressions consist of Boolean variables,
Boolean constants, relational expressions, and
Boolean operators.
❖ The operators usually include those for the AND,
OR, and NOT operations, and sometimes for
exclusive OR and equivalence. Boolean operators
usually take only Boolean operands (Boolean
variables, Boolean literals, or relational
expressions) and produce Boolean values.

SHORT-CIRCUIT EVALUATION
❖ A short-circuit evaluation of an expression is one
in which the result is determined without
evaluating all of the operands and/or operators.
For example, the value of the arithmetic
expression
(13 * a) * (b / 12)
❖ is independent of the value of (b / 12) if a is 0,
because 0 * x = 0 for any x. So, when a is 0, there
is no need to evaluate (b / 12) or perform the
second multiplication.
❖ However, in arithmetic expressions, this shortcut
is not easily detected during execution, so it is
never taken.
SHORT-CIRCUIT EVALUATION
❖ The value of the Boolean expression (a >= 0) &&
(b < 10) is independent of the second relational
expression if a < 0, because the expression
(FALSE && (b < 10)) is FALSE for all values of b.
❖ So, when a is less than zero, there is no need to
evaluate b, the constant 10, the second relational
expression, or the && operation.
❖ Unlike the case of arithmetic expressions, this
shortcut easily can be discovered during
execution.

SHORT-CIRCUIT EVALUATION
❖ Sometimes, we can have problems without short-
circuit evaluation
index = 0;
while ((index < length) && (LIST[index] != value))
    index++;
❖ Without short-circuit evaluation, when index = length,
LIST[index] is still evaluated and causes an indexing
problem (assuming LIST has length elements, indexed
from 0 to length - 1)
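The search loop can be tried both ways in Python, where and short-circuits but & applied to two Boolean values evaluates both operands. The function names below are illustrative.

```python
def find_short_circuit(items, value):
    index = 0
    # `and` skips items[index] once index < len(items) is False.
    while index < len(items) and items[index] != value:
        index += 1
    return index

def find_eager(items, value):
    index = 0
    # `&` evaluates both operands every time, so items[index] is
    # evaluated even when index is already out of range.
    while (index < len(items)) & (items[index] != value):
        index += 1
    return index
```

Both versions agree when the value is present; when it is absent, find_short_circuit returns len(items) while find_eager raises IndexError.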

ASSIGNMENT STATEMENTS
❖ The assignment statement is one of the central
constructs in imperative languages.
❖ It provides the mechanism by which the user can
dynamically change the bindings of values to
variables.
❖ The different varieties of assignment statements
are as follows:

ASSIGNMENT STATEMENTS
Simple Assignments
❖ Nearly all programming languages currently
being used use the equal sign for the assignment
operator.
❖ All of these must use something different from an
equal sign for the equality relational operator to
avoid confusion with their assignment operator.

ASSIGNMENT STATEMENTS
Conditional Targets
❖ Perl allows conditional targets on assignment
statements.
❖ For example, consider
($flag ? $count1 : $count2) = 0;
which is equivalent to
if ($flag)
{ $count1 = 0; }
else
{ $count2 = 0; }

ASSIGNMENT STATEMENTS
Compound Assignment Operators
❖ A compound assignment operator is a shorthand
method of specifying a commonly needed form of
assignment.
❖ The form of assignment that can be abbreviated
with this technique has the destination variable
also appearing as the first operand in the
expression on the right side, as in a = a + b.
❖ The syntax of these assignment operators is the
catenation of the desired binary operator to the =
operator.
❖ For example, sum += value; is equivalent to sum
= sum + value;
ASSIGNMENT STATEMENTS
Unary Assignment Operators
❖ The C-based languages, Perl, and JavaScript include
two special unary arithmetic operators that are
actually abbreviated assignments.
❖ They combine increment and decrement operations
with assignment.
❖ The operators ++ for increment and -- for decrement
can be used either in expressions or to form stand-
alone single-operator assignment statements.
❖ In the assignment statement, sum = ++count;
❖ the value of count is incremented by 1 and then
assigned to sum. This operation could also be stated
as
count = count + 1;
sum = count;
ASSIGNMENT STATEMENTS
Assignment as an Expression
❖ In the C-based languages, Perl, and JavaScript,
the assignment statement produces a result,
which is the same as the value assigned to the
target.
❖ It can, therefore, be used as an expression and as
an operand in other expressions.
❖ This design treats the assignment operator much
like any other binary operator, except that it has
the side effect of changing its left operand.
❖ For example, in C, it is common to write
statements such as
while ((ch = getchar()) != EOF) { ... }
ASSIGNMENT STATEMENTS
Assignment as an Expression
❖ In this statement, the next character from the
standard input file, usually the keyboard, is
gotten with getchar and assigned to the variable
ch.
❖ The result, or value assigned, is then compared
with the constant EOF.
❖ If ch is not equal to EOF, the compound
statement {...} is executed.
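Python's ordinary assignment is a statement, but since version 3.8 the := operator provides assignment as an expression, so the same loop shape can be sketched. Here an in-memory stream stands in for standard input and an empty string plays the role of EOF; both are assumptions of this example.

```python
import io

stream = io.StringIO("abc")      # stands in for the standard input file
chars = []

# (ch := ...) assigns to ch and yields the assigned value,
# which is then compared with the end-of-input marker.
while (ch := stream.read(1)) != "":
    chars.append(ch)
```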

ASSIGNMENT STATEMENTS
Multiple Assignments
❖ Several recent programming languages,
including Perl and Ruby, provide multiple-target,
multiple-source assignment statements.
❖ For example, in Perl one can write
($first, $second, $third) = (20, 40, 60);
❖ The semantics is that 20 is assigned to $first, 40
is assigned to $second, and 60 is assigned to
$third.
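Python supports the same form. Because the whole right side is evaluated before any target is assigned, the construct also gives a one-line swap. This sketch reuses the names from the Perl example without the $ sigils.

```python
first, second, third = 20, 40, 60

# The right side is evaluated completely before assignment, so this swaps.
first, second = second, first
```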

MIXED-MODE ASSIGNMENT
❖ Assignment statements can also be mixed-mode,
for example
int a, b;
float c;
c = a / b;
❖ In Pascal, integer variables can be assigned to
real variables, but real variables cannot be
assigned to integers
❖ In Java, only widening assignment coercions are
done
❖ In Ada, there is no assignment coercion
CHAPTER 8
STATEMENT-LEVEL CONTROL
STRUCTURES
INTRODUCTION
❖ Imperative programs rely on expressions and
variable assignments but require more to be
effective. Control statements provide essential
features for flexibility and power, enabling
selection among execution paths and supporting
repeated execution of statements.
❖ Functional programming computes by evaluating
expressions and applying functions to
parameters, with execution flow managed by
expressions and functions, some resembling
imperative control statements.
❖ A control structure is a control statement and the
collection of statements whose execution it
controls.
SELECTION STATEMENTS
❖ A selection statement provides the means of
choosing between two or more execution paths in
a program.
❖ Selection statements fall into two general
categories:
❖ Two-way. E.g.: if-statement.
❖ Multiple selection. E.g.: case statement.

TWO-WAY SELECTION STATEMENTS
❖ Two-way selection statements are control
structures in programming languages that allow
the program to choose between two alternative
paths of execution based on a specified condition.
❖ These statements evaluate a Boolean expression
and execute one block of code if the condition is
true and another block (or nothing) if it is false.
❖ The design issues for two-way selectors can be
summarized as follows:
❖ What is the form and type of the expression that
controls the selection?
❖ How are the then and else clauses specified?
❖ How should the meaning of nested selectors be
specified?
TWO-WAY SELECTION STATEMENTS
❖ Common Characteristics:
❖ Condition: A Boolean expression that determines which
path is taken.
❖ True Branch: A block of code executed if the condition
evaluates to true.
❖ False Branch: A block of code executed if the condition
evaluates to false. This branch is often optional.
❖ In many programming languages, two-way selection
statements follow this format.
❖ Example in C:
if (condition) {
    // True branch: code executed when the condition is true
} else {
    // False branch: code executed when the condition is false
}
MULTIPLE-SELECTION STATEMENTS
❖ Multiple-selection statements are control structures
in programming that allow a program to choose from
multiple alternative paths of execution based on the
evaluation of a single condition or expression.
❖ These statements extend the functionality of two-way
selection by handling more than two options.
❖ Key Characteristics:
❖ Single Condition or Expression: The decision is based on
the value of a condition or expression.
❖ Multiple Alternatives: Each alternative corresponds to a
specific value or range of values of the condition or
expression.
❖ Default Option: A fallback alternative that executes if no
other conditions match (optional in some languages).
MULTIPLE-SELECTION STATEMENTS
❖ The design issues for multiple-selection statements can be
summarized as follows:
❖ What is the form and type of the expression that
controls the selection?
❖ How are the selectable segments specified?
❖ Is execution flow through the structure restricted to
include just a single selectable segment?
❖ How are the case values specified?
❖ How should unrepresented selector expression values
be handled, if at all?

MULTIPLE-SELECTION STATEMENTS
❖ The syntax for multiple-selection statements varies
by language, but they often take the form of switch-
case statements or equivalent structures.
❖ Example: switch Statement in C, C++, Java:
switch (expression) {
    case value1:
        // Code block for value1
        break;
    case value2:
        // Code block for value2
        break;
    default:
        // Code block for no matches
}
MULTIPLE-SELECTION STATEMENTS
❖ Example: elif chain in Python (C uses else if chains):
if condition1:
    # Code for condition1
elif condition2:
    # Code for condition2
elif condition3:
    # Code for condition3
else:
    # Default code

MULTIPLE-SELECTION STATEMENTS
❖ Advantages:
❖ Makes code for multiple decisions easier to write and
understand.
❖ Often more efficient than a series of if-else
conditions, especially with many alternatives.
❖ Disadvantages:
❖ Limited flexibility in some implementations (e.g., no
range support in switch in C).
❖ Requires careful handling of fall-through behavior in
languages like C or C++.

ITERATIVE STATEMENTS
❖ An iterative statement is one that causes a
statement or collection of statements to be
executed zero, one, or more times. An iterative
statement is often called a loop.
❖ Several categories of iteration control statements
have been developed. The primary categories are
defined by how designers answered two basic
design questions:
❖ How is the iteration controlled?
❖ Where should the control mechanism appear in the
loop statement?
ITERATIVE STATEMENTS
❖ The body of an iterative statement is the collection of
statements whose execution is controlled by the
iteration statement.
❖ We use the term pretest to mean that the test for loop
completion occurs before the loop body is executed
and posttest to mean that it occurs after the loop body
is executed.
❖ The iteration control statement and the associated loop body
together form an iteration construct.
❖ Key Characteristics of Iterative Statements:
❖ Repetition: Executes a block of code multiple times.
❖ Condition: Controls when the iteration starts, continues, or
stops.
❖ Iteration Variable: Often used to track the current
iteration or control the loop's progress.
COUNTER-CONTROLLED LOOPS
❖ A counting iterative control statement has a
variable called the loop variable, in which the
count value is maintained.
❖ It also includes some means of specifying the
initial and terminal values of the loop variable
and the difference between sequential loop
variable values, often called the step size.
❖ The initial, terminal, and step size specifications
of a loop are called the loop parameters.

COUNTER-CONTROLLED LOOPS
❖ The following are the design issues:
❖ What are the type and scope of the loop variable?
❖ Should it be legal for the loop variable or loop
parameters to be changed in the loop, and if so, does
the change affect loop control?
❖ Should the loop parameters be evaluated only once or
once for every iteration?
❖ What is the value of the loop variable after loop
termination?
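Each language answers these questions in its own way. As one data point, the sketch below probes how Python's for loop over range behaves; the variable names are illustrative, and other languages make different choices.

```python
stop = 10
visited = []
for i in range(0, stop, 3):   # initial 0, terminal 10 (exclusive), step 3
    visited.append(i)
    stop = 4                  # no effect: range(0, 10, 3) was evaluated once
last_value = i                # the loop variable keeps its final value

seen = []
for k in range(3):
    seen.append(k)
    k = 100                   # overwritten by the next value from range
```

In Python the loop parameters are evaluated once, changes to them or to the loop variable inside the body do not affect loop control, and the loop variable remains accessible after the loop ends.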

CHAPTER 9
SUBPROGRAMS
FUNDAMENTALS OF SUBPROGRAMS
All subprograms have the following characteristics:
❖ Each subprogram has a single entry point.
❖ The calling program unit is suspended during the
execution of the called subprogram, which
implies that there is only one subprogram in
execution at any given time.
❖ Control always returns to the caller when the
subprogram execution terminates.

FUNDAMENTALS OF SUBPROGRAMS
❖ A subprogram definition describes the interface
to and the actions of the subprogram abstraction.
❖ A subprogram call is the explicit request that a
specific subprogram be executed.
❖ A subprogram is said to be active if, after having
been called, it has begun execution but has not
yet completed that execution.

FUNDAMENTALS OF SUBPROGRAMS
❖ A subprogram header, which is the first part of
the definition, serves several purposes.
❖ First, it specifies that the following syntactic unit
is a subprogram definition of some particular
kind.
❖ In languages that have more than one kind of
subprogram, the kind of subprogram is usually
specified with a special word.
❖ Second, if the subprogram is not anonymous, the
header provides a name for the subprogram.
❖ Third, it may specify a list of parameters.
FUNDAMENTALS OF SUBPROGRAMS
❖ Consider the following header example:
def adder(parameters):
❖ This is the header of a Python subprogram
named adder. Ruby subprogram headers also
begin with def. The header of a JavaScript
subprogram begins with function.
❖ Procedure: A subprogram designed to perform a
task or set of actions. It typically does not return
a value to the caller. Its main purpose is to
execute code.
❖ Function: A subprogram designed to compute and
return a value to the caller. Its main purpose is
to produce a result.
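Python has a single kind of subprogram, introduced by def, so the procedure and function distinction shows up only in how a subprogram is written and used. In this sketch the parameter list of adder and the helper record are illustrative assumptions.

```python
def adder(a, b):
    """Function: computes a value and returns it to the caller."""
    return a + b

log = []

def record(message):
    """Procedure-like: performs an action and returns nothing useful."""
    log.append(message)

total = adder(2, 3)
result = record("added")    # a call without a return value yields None
```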
PARAMETERS
❖ Subprograms typically describe computations.
❖ There are two ways that a non-method
subprogram can gain access to the data that it is
to process: through direct access to nonlocal
variables (declared elsewhere but visible in the
subprogram) or through parameter passing.
❖ Data passed through parameters are accessed
using names that are local to the subprogram.
❖ Parameter passing is more flexible than direct
access to nonlocal variables.
DESIGN ISSUES FOR SUBPROGRAMS
❖ Subprograms are complex structures, and it
follows from this that a lengthy list of issues is
involved in their design. One obvious issue is the
choice of one or more parameter-passing methods
that will be used.
❖ The nature of the local environment of a
subprogram dictates, to some degree, the nature
of the subprogram. The most important question
here is whether local variables are statically or
dynamically allocated.
❖ Another issue is whether subprogram definitions can be nested.
❖ A further issue is whether subprogram names can be passed as
parameters.
DESIGN ISSUES FOR SUBPROGRAMS
❖ Finally, there are the questions of whether
subprograms can be overloaded or generic.
❖ An overloaded subprogram is one that has the
same name as another subprogram in the same
referencing environment.
❖ A generic subprogram is one whose computation
can be done on data of different types in different
calls.
❖ A closure is a nested subprogram and its
referencing environment, which together allow
the subprogram to be called from anywhere in a
program.
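The closure and generic-subprogram ideas can be sketched in Python. The names make_counter and largest are hypothetical; Python gets generic behavior from dynamic typing rather than from a dedicated generics feature.

```python
def make_counter(start):
    count = start  # local to make_counter, captured by the nested subprogram

    def increment():
        nonlocal count  # refers to the variable in the enclosing environment
        count += 1
        return count

    return increment  # nested subprogram plus its environment: a closure


def largest(a, b):
    # Generic in the sense above: works for any type that supports >=.
    return a if a >= b else b


counter = make_counter(10)
first = counter()   # the captured count survives after make_counter returned
second = counter()
```

Each call to make_counter creates a fresh environment, so two counters do not interfere with each other.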
PARAMETER-PASSING METHODS
❖ Parameter-passing methods are the ways in
which parameters are transmitted to and/or from
called subprograms.
❖ Formal parameters are characterized by one of
three distinct semantics models: (1) They can
receive data from the corresponding actual
parameter; (2) they can transmit data to the
actual parameter; or (3) they can do both.
❖ These models are called in mode, out mode, and
inout mode, respectively.
PARAMETER-PASSING METHODS
❖ Consider a subprogram that takes two arrays of int
values as parameters—list1 and list2.
❖ The subprogram must add list1 to list2 and return the
result as a revised version of list2.
❖ Furthermore, the subprogram must create a new
array from the two given arrays and return it.
❖ For this subprogram, list1 should be in mode because
it is not to be changed by the subprogram.
❖ list2 must be inout mode because the subprogram
needs the given value of the array and must return its
new value.
❖ The third array should be out mode because there is
no initial value for this array, and its computed value
must be returned to the caller.
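The three modes in this example can be approximated in Python. This is only a sketch: Python does not enforce parameter modes, the name add_lists is hypothetical, both lists are assumed to have the same length, and "create a new array from the two given arrays" is assumed here to mean concatenation.

```python
def add_lists(list1, list2):
    # Assumes both lists have the same length.
    for i, value in enumerate(list1):  # list1 is only read (in mode)
        list2[i] += value              # list2 is read and updated in place (inout mode)
    return list1 + list2               # a new list is produced for the caller (out mode)


a = [1, 2, 3]
b = [10, 20, 30]
c = add_lists(a, b)
```

After the call, a is unchanged, b holds the element-wise sums, and c is a new list that did not exist before the call.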
PARAMETER-PASSING METHODS
❖ There are two conceptual models of how data
transfers take place in parameter transmission.
❖ In the first model, an actual value is copied (to
the caller, to the called subprogram, or both ways).
Pass-by-value, pass-by-result, and pass-by-value-result
follow this model. The copying can occur:
❖ To the Called: The caller sends a copy of the value to
the called subprogram.
❖ To the Caller: After the called subprogram finishes
execution, a copy of the modified value (if applicable)
is sent back to the caller.
PARAMETER-PASSING METHODS
❖ In the second model, an access path is transmitted.
Instead of copying the actual data, a reference or access
path (such as a memory address or pointer) is passed.
❖ This allows both the caller and the called subprogram
to access and modify the same memory location.
Pass-by-reference and pass-by-name follow this model.
❖ Value Transmission (Copying Data): Creates a
duplicate of the value for the called subprogram to
work on. It is safe but may involve overhead.
❖ Access Path Transmission (Passing a Reference):
Provides direct access to the caller's data, allowing
modifications but potentially introducing side effects.
PASS-BY-VALUE
❖ Pass by value is a method of parameter
transmission where a copy of the actual
argument's value is passed to the subprogram
(function or procedure).
❖ The subprogram works with this copy, and any
changes made to the parameter inside the
subprogram do not affect the original value in the
calling program.
PASS-BY-VALUE
#include <stdio.h>

void increment(int num) {
    num = num + 1; // Modifies the local copy
    printf("Inside function: %d\n", num);
}

int main() {
    int x = 5;
    increment(x); // Pass a copy of x to the function
    printf("Outside function: %d\n", x); // Original value remains unchanged
    return 0;
}

Output:
Inside function: 6
Outside function: 5
PASS-BY-RESULT
❖ Pass-by-Result, also known as Call-by-Result, is
a parameter transmission mechanism where:
❖ The subprogram is called, and the parameter
variable inside the subprogram is initially
uninitialized.
❖ The subprogram computes a value for the
parameter during execution.
❖ At the end of the subprogram's execution, the
computed value is copied back to the caller's
variable, replacing its previous value.
PASS-BY-RESULT
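(C-like pseudocode; a and b are assumed to be passed by result)
int x = 7;
int y = 10;
void func1(int a, int b)
{
    a = 15;
    b = 18;
}
void main()
{
    func1(x, y);
    printf("x = %d y = %d\n", x, y);
}

Before the call: x = 7, y = 10 (global data); a and b in func1 have no initial value.
After the call: a = 15 and b = 18 are copied back, so x = 15 and y = 18.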
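Python has no pass-by-result parameters, but the copy-out step can be simulated by returning the final values and assigning them in the caller. The sketch below mirrors the func1 example under that assumption.

```python
x = 7
y = 10


def func1():
    # No copy-in: the formals a and b receive nothing from the caller.
    a = 15
    b = 18
    return a, b  # copy-out: final values of the formals


x, y = func1()   # the caller's variables are overwritten when the call ends
```

The old values 7 and 10 are never visible inside func1, which is what distinguishes pass-by-result from pass-by-value-result.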
PASS-BY-VALUE-RESULT
❖ Pass-by-value-result is a parameter-passing
mechanism where:
❖ A copy of the actual argument's value is
passed to the subprogram at the start of
execution (copy-in).
❖ Once the subprogram completes, the final value
of the parameter in the subprogram is copied
back to the caller's argument (copy-out).
❖ This method combines pass-by-value (copy-in)
and pass-by-result (copy-out).
PASS-BY-VALUE-RESULT
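(C-like pseudocode; a and b are assumed to be passed by value-result)
int x = 7;
int y = 10;
void func1(int a, int b)
{
    a = 15;
    b = a + b;
}
void main()
{
    func1(x, y);
    printf("x = %d y = %d\n", x, y);
}

Before the call: x = 7, y = 10 (global data); copy-in gives a = 7, b = 10.
After the call: a = 15 and b = 25 are copied back, so x = 15 and y = 25.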
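The copy-in/copy-out sequence can be simulated in Python by passing the values in and assigning the returned values back. This is a sketch of the semantics, not a built-in Python mechanism.

```python
def func1(a, b):
    # Copy-in: a and b start with the caller's values. Rebinding them below
    # does not touch the caller's variables, because ints are immutable.
    a = 15
    b = a + b
    return a, b  # copy-out: final values of the formals


x, y = 7, 10
x, y = func1(x, y)  # the caller's variables are updated only when the call ends
```

Unlike pass-by-reference, the caller's x and y keep their old values for the whole duration of the call.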
PASS-BY-REFERENCE
❖ Pass by reference is a parameter transmission
method where a reference (memory address or
pointer) to the actual argument is passed to the
subprogram (function or procedure).
❖ The subprogram directly accesses and
manipulates the original variable or data,
meaning any changes made to the parameter
inside the subprogram affect the original value in
the calling program.
PASS-BY-REFERENCE
#include <iostream>
using namespace std;

void increment(int &num) { // Reference parameter
    num += 1; // Modifies the original variable
    cout << "Inside function: " << num << endl;
}

int main() {
    int x = 5;
    increment(x); // Passes a reference to x
    cout << "Outside function: " << x << endl; // Original value is updated
    return 0;
}

Output:
Inside function: 6
Outside function: 6
PASS-BY-NAME
❖ Pass by name is a parameter-passing mechanism
in which the expression or variable name
provided as an argument is passed to the
subprogram.
❖ The subprogram does not evaluate the argument
immediately; instead, the argument is re-
evaluated every time it is accessed in the
subprogram.
❖ This can be thought of as a textual substitution
of the argument expression wherever the
parameter is used in the subprogram.
❖ Pass by name was famously used in Algol 60 and
is rarely seen in modern programming languages.
PASS-BY-NAME
procedure example(a);
begin
    print(a); // Evaluates a for the first time
    print(a); // Re-evaluates a
end;

x := 5;
example(x + 1);

Output: 6 6
Here, x + 1 is re-evaluated each time a is accessed. If x changes
between the evaluations, the results would differ.
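Pass-by-name can be approximated in Python by passing a zero-argument function (a thunk) that is called every time the parameter is used. The names print_twice, change_between, and env are hypothetical; the dictionary stands in for the variable x of the example.

```python
def print_twice(a):
    """a is a thunk standing in for a by-name parameter."""
    first = a()   # evaluates the argument expression
    second = a()  # evaluates it again
    return first, second


def change_between(a, env):
    first = a()
    env["x"] = 10  # the variable used in the argument expression changes
    second = a()   # re-evaluation now sees the new value
    return first, second


env = {"x": 5}
same = print_twice(lambda: env["x"] + 1)                # x is unchanged between uses
env["x"] = 5
different = change_between(lambda: env["x"] + 1, env)   # x changes between uses
```

The first call yields the same value twice; the second yields two different values because the expression x + 1 is evaluated again after x has changed.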
OVERLOADED SUBPROGRAMS
❖ Overloading refers to the ability to define
multiple subprograms (functions or methods)
with the same name but different parameter
types or number of parameters.
❖ This allows a programmer to use the same name
for different operations, improving code
readability and making the code more intuitive.
❖ Overloading is common in languages such as C++
and Java. Python does not support overloading by
signature; similar effects are achieved with default
arguments or variable-length argument lists.
OVERLOADED SUBPROGRAMS
#include <iostream>
#include <string>
using namespace std;

void print(int x) {
    cout << "Integer: " << x << endl;
}
void print(double x) {
    cout << "Double: " << x << endl;
}
void print(string x) {
    cout << "String: " << x << endl;
}

int main() {
    print(5);       // Calls print(int)
    print(3.14);    // Calls print(double)
    print("Hello"); // Calls print(string)
    return 0;
}

Output:
Integer: 5
Double: 3.14
String: Hello

Here, the print function is overloaded based on the type of the
argument (int, double, string).
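For comparison, one way to get type-based dispatch in Python is functools.singledispatch from the standard library, which selects an implementation from the type of the first argument. This is a sketch of a similar effect, not true overloading; the name describe is hypothetical.

```python
from functools import singledispatch


@singledispatch
def describe(x):
    # Fallback used when no registered type matches.
    return f"Unknown: {x}"


@describe.register
def _(x: int):
    return f"Integer: {x}"


@describe.register
def _(x: float):
    return f"Double: {x}"


@describe.register
def _(x: str):
    return f"String: {x}"


results = [describe(5), describe(3.14), describe("Hello")]
```

Dispatch happens at run time on the first argument only, whereas C++ resolves overloads at compile time using all parameter types.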
COROUTINES
❖ A coroutine is a generalization of a subroutine (or
function) that allows multiple entry points for
suspending and resuming execution.
❖ Unlike regular functions, which start execution when
called and return when they finish, coroutines can
pause execution and later resume from where they
left off, preserving their state.
❖ This makes coroutines especially useful for tasks that
involve concurrency, asynchronous programming, or
handling long-running operations without blocking
the main thread.
❖ Coroutines are a key feature in many modern
programming languages, particularly in contexts like
asynchronous I/O, event-driven programming, and
parallel processing.
COROUTINES
❖ A coroutine is designed to yield control back to
the caller at specific points in its execution.
❖ When control is yielded, the state of the coroutine
is preserved, and when the coroutine is resumed,
it continues from the point where it yielded
control.
❖ Suspension Point: This is where the coroutine
suspends its execution, and it could wait for an
event (e.g., I/O completion).
❖ Resumption: When the event occurs, the
coroutine resumes from where it was suspended.
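Suspension and resumption can be sketched with a Python generator, which behaves as a simple coroutine: each yield is a suspension point, and send() resumes execution with the local state intact. The name running_average is hypothetical.

```python
def running_average():
    """Generator-based coroutine that keeps total and count between resumptions."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average  # suspension point: control returns to the caller
        total += value         # execution resumes here with the sent value
        count += 1
        average = total / count


avg = running_average()
next(avg)           # run to the first yield (primes the coroutine)
a1 = avg.send(10)   # resumes, updates state, suspends again
a2 = avg.send(20)
a3 = avg.send(30)
```

The variables total and count are never reset between calls, which is the state preservation described above.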
CHAPTER 12
SUPPORT FOR OBJECT-ORIENTED PROGRAMMING
SLIDES COURTESY OF:
"CONCEPTS OF PROGRAMMING LANGUAGES" BY ROBERT W. SEBESTA.
PUBLISHED BY PEARSON EDUCATION, INC. USA. ELEVENTH
EDITION. 2016

Md. Rawnak Saif Adib
Lecturer
Department of Computer
Science and Engineering
FUNDAMENTALS OF OOP
❖ Object-Oriented Programming (OOP) is a
programming paradigm centered around the
concept of objects.
❖ Objects are instances of classes, which
encapsulate data (attributes or properties) and
behavior (methods or functions) related to that
data.
❖ OOP is widely used in software development
because it promotes modularity, reusability, and
scalability.
FUNDAMENTALS OF OOP
❖ Key Concepts of OOP:
❖ Class: A blueprint or template for creating objects.
    class Car:
        def __init__(self, make, model):
            self.make = make
            self.model = model

        def start(self):
            print(f"{self.make} {self.model} is starting.")
❖ Object: An instance of a class.
    my_car = Car("Toyota", "Corolla")
    my_car.start()  # Output: Toyota Corolla is starting.
FUNDAMENTALS OF OOP
❖ Encapsulation: Bundling data and methods together
in a single unit (class). Access to data is often
controlled using access modifiers like private or
protected. Python has no such keywords; by convention,
a single leading _ marks an attribute as internal and a
double leading __ triggers name mangling.
❖ Inheritance: A mechanism to create a new class based
on an existing class. The new class (child class)
inherits attributes and methods from the existing
class (parent class).
❖ Polymorphism: The ability to use the same interface
or method name for different types of objects.
❖ Abstraction: Hiding complex implementation details
and exposing only the essential features.
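The four concepts can be seen together in one short Python sketch. The Shape, Rectangle, Square, and Circle classes are hypothetical and chosen only for illustration.

```python
import math
from abc import ABC, abstractmethod


class Shape(ABC):
    """Abstraction: callers rely on area() without knowing how it is computed."""

    @abstractmethod
    def area(self):
        ...


class Rectangle(Shape):
    def __init__(self, width, height):
        self._width = width    # Encapsulation: leading underscore marks internal data
        self._height = height

    def area(self):
        return self._width * self._height


class Square(Rectangle):       # Inheritance: Square reuses Rectangle's area()
    def __init__(self, side):
        super().__init__(side, side)


class Circle(Shape):
    def __init__(self, radius):
        self._radius = radius

    def area(self):
        return math.pi * self._radius ** 2


shapes = [Rectangle(2, 3), Square(4), Circle(1)]
areas = [s.area() for s in shapes]  # Polymorphism: one call, different behavior
```

Shape itself cannot be instantiated because area() is abstract, so every concrete shape is forced to supply its own implementation.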
CONSTRUCTOR
❖ A constructor is a special method in object-
oriented programming that is automatically
called when an object of a class is created. Its
main purpose is to initialize the attributes of the
object.
CONSTRUCTOR
class Car {
    String make;
    String model;

    // Constructor
    Car(String make, String model) {
        this.make = make;
        this.model = model;
    }

    void displayInfo() {
        System.out.println("This car is a " + make + " " + model);
    }
}

public class Main {
    public static void main(String[] args) {
        Car myCar = new Car("Toyota", "Corolla");
        myCar.displayInfo(); // Output: This car is a Toyota Corolla
    }
}
INHERITANCE IN JAVA
❖ Inheritance in Java is a mechanism where one class
acquires the properties (fields) and behaviors
(methods) of another class. It promotes code reuse
and establishes a relationship between classes.
❖ Types of Inheritance in Java:
❖ Single Inheritance: A subclass inherits from one
superclass.
❖ Multilevel Inheritance: A subclass inherits from a
class, which itself inherits from another class.
❖ Hierarchical Inheritance: Multiple subclasses
inherit from a single superclass.
❖ Java does NOT support multiple inheritance
with classes to avoid ambiguity (but it is allowed
with interfaces).
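The three supported forms describe relationships between classes, so they can be sketched in any class-based language. The sketch below uses Python with hypothetical class names; in Java each relationship would be written with extends.

```python
class Vehicle:              # superclass
    def describe(self):
        return "vehicle"


class Car(Vehicle):         # single inheritance: Car -> Vehicle
    pass


class SportsCar(Car):       # multilevel inheritance: SportsCar -> Car -> Vehicle
    pass


class Truck(Vehicle):       # hierarchical inheritance: Car and Truck share Vehicle
    pass


inherited = SportsCar().describe()  # method found two levels up the chain
```

In every form each class has exactly one direct superclass, which is the restriction Java places on classes.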
WHY IS MULTIPLE INHERITANCE NOT
SUPPORTED IN JAVA?
❖ Multiple inheritance is not supported in Java
through classes to avoid the ambiguity and
complexity that can arise when two parent
classes have methods with the same name or
signature. This ambiguity is often referred to as
the "Diamond Problem" in inheritance.
WHY IS MULTIPLE INHERITANCE NOT
SUPPORTED IN JAVA?
❖ What is the Diamond Problem?
❖ Consider this scenario in a language that allows
multiple inheritance:
❖ Class A defines a method display().
❖ Class B and Class C both inherit from Class A
and override the display() method.
❖ Class D inherits from both Class B and Class C.
❖ Now, if you create an object of Class D and call
the display() method, it is unclear which display()
method should be executed: the one from Class
B or the one from Class C.
WHY IS MULTIPLE INHERITANCE NOT
SUPPORTED IN JAVA?
class A {
    void display() {
        System.out.println("Display from A");
    }
}

class B extends A {
    void display() {
        System.out.println("Display from B");
    }
}

class C extends A {
    void display() {
        System.out.println("Display from C");
    }
}

// Hypothetical scenario where D inherits from both B and C
class D extends B, C { // Not allowed in Java
    // Ambiguity: Should D inherit B's display() or C's display()?
}

public class Main {
    public static void main(String[] args) {
        D obj = new D();
        obj.display(); // Which display()? B's or C's?
    }
}
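For contrast, Python is a language that does allow this diamond, and it resolves the ambiguity with a fixed method resolution order (MRO). The sketch below recreates the same four classes in Python to show which display() is chosen.

```python
class A:
    def display(self):
        return "Display from A"


class B(A):
    def display(self):
        return "Display from B"


class C(A):
    def display(self):
        return "Display from C"


class D(B, C):  # allowed in Python; the order of the base classes matters
    pass


result = D().display()                        # resolved by the MRO
order = [cls.__name__ for cls in D.__mro__]   # the search order Python uses
```

Because B is listed before C in the class header, the MRO is D, B, C, A, object, and B's display() is the one that runs. Java avoids the need for such a rule by forbidding the class declaration outright.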
