22_3014_033004_Unit 2 Part 1 Elementary Data Types

The document outlines the key concepts of programming languages, focusing on data types, expressions, control statements, subprograms, and abstract data types. It details various data types including primitive, character string, user-defined ordinal types, and their implementations. Additionally, it discusses descriptors, memory allocation strategies, and the design issues related to arrays and enumeration types.

SE Computer

Course Name :Principles of Programming Language


Course Code: 210256
Unit 2
Structuring the Data, Computations and Program
● Elementary Data Types : Primitive data Types, Character String types, User Defined Ordinal Types,
Array types, Associative Arrays, Record Types, Union Types, Pointer and reference Type.
● Expression and Assignment Statements: Arithmetic expression, Overloaded Operators, Type
conversions, Relational and Boolean Expressions, Short Circuit Evaluation, Assignment Statements,
Mixed mode Assignment.
● Statement level Control Statements: Selection Statements, Iterative Statements, Unconditional
Branching.
● Subprograms: Fundamentals of Sub Programs, Design Issues for Subprograms, Local referencing
Environments, Parameter passing methods. Abstract Data Types and Encapsulation Construct: Design
issues for Abstraction, Parameterized Abstract Data types, Encapsulation Constructs, Naming
Encapsulations
Department of Computer Engineering
Topic and Book To Refer
All topics refer to: Sebesta R., "Concepts of Programming Languages", 4th Edition, Pearson Education, ISBN 81-7808-161-X.

● Elementary Data Types: Primitive data Types, Character String types, User Defined Ordinal Types, Array types, Associative Arrays, Record Types, Union Types, Pointer and reference Type. Page No: 267-323
● Expression and Assignment Statements: Arithmetic expression, Overloaded Operators, Type conversions, Relational and Boolean Expressions, Short Circuit Evaluation, Assignment Statements, Mixed mode Assignment. Page No: 330-355
● Statement level Control Statements: Selection Statements, Iterative Statements, Unconditional Branching. Page No: 362-390
● Subprograms: Fundamentals of Sub Programs, Design Issues for Subprograms, Local referencing Environments, Parameter passing methods. Page No: 402-436
● Abstract Data Types and Encapsulation Construct: Design issues for Abstraction, Parameterized Abstract Data types, Encapsulation Constructs, Naming Encapsulations. Page No: 492, 508-521



Data type: A data type defines a collection of data values and a set of predefined operations on those values.
An object represents an instance of a user-defined (abstract data) type.
Descriptor

● A descriptor contains two or more long words that describe the standard
data type, size, and address of the data specified by the argument
● Different types of descriptors
○ Fixed-Length Descriptor
○ Dynamic String Descriptor
● The Fixed-Length Descriptor applies to scalar data (non-structured data
type) and fixed-length strings.
● The Dynamic-String Descriptor applies to dynamically allocated strings.
Descriptor

● A descriptor is the collection of the attributes of a variable.


● In an implementation, a descriptor is a collection of memory cells that store
variable attributes.
● If the attributes are static, descriptors are required only at compile time.
● For dynamic attributes, part or all of the descriptor must be maintained during
execution.
● Descriptors are used for type checking and by allocation and deallocation
operations.
Primitive Data Types
● Primitive data types are those that are not defined in terms of other data
types

● Early programming languages had only numeric primitive types; numeric types still play a central role among the collections of types supported by contemporary languages.
Numerical Data Types …Integer
Numerical Data Types …Floating Point
Newer Machines Use IEEE (The Institute of Electrical and Electronics Engineers)
floating-point formats: (a) Single precision, (b) Double precision
Numerical Data Types …Decimal

● Advantage
○ Accuracy
● Disadvantages:
○ Limited range
○ Wastes memory
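
The accuracy advantage of decimal types can be illustrated with a minimal sketch. It assumes Python's standard decimal module as a stand-in for a language-level decimal type; the variable names are illustrative only.

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1 exactly, so the sum drifts.
float_sum = 0.1 + 0.2
float_exact = (float_sum == 0.3)

# A decimal type stores decimal digits, so the same sum is exact.
decimal_sum = Decimal("0.1") + Decimal("0.2")
decimal_exact = (decimal_sum == Decimal("0.3"))

print(float_exact, decimal_exact)
```

The float comparison fails while the decimal comparison succeeds, which is the accuracy benefit the slide refers to; the cost is the extra storage and limited range listed above.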
Boolean Types

Advantage:
Readability
Character Types
Character String Types
● Character string type is one in which the values consist of sequences of characters

● Design issues with the string types

○ Should strings be simply a special kind of character array or a

primitive type?

○ Should strings have static or dynamic length?


Character String Types …. String Operations
Character String Types.. String Length Options
Character String Types
● C and C++
○ not primitive
○ use char arrays and a library of functions that provide operations
● Java : String class (not arrays of char)
○ objects are immutable
○ StringBuffer is a class for changeable string objects
● String length options
○ Static – Python, Java's String class, C++ standard class library, Ruby's built-in String class, and the .NET class library in C# and F#.
○ Limited dynamic length – C and C++ (the length can vary up to a declared maximum; the end of the string is marked by a null character rather than by a stored length)
○ Dynamic – Perl, JavaScript
Character String Types …Implementation
● Static length - compile-time descriptor
● Limited dynamic length - may need a run-time descriptor for length (but not in C and C++, because the end of a string is marked with the null character)
● Dynamic length - needs a simpler run-time descriptor, because only the current length is stored; allocation/deallocation is the biggest implementation problem
Character String Types …Implementation

Dynamic length allocation by three approaches:

● First approach :
○ Using linked list
○ Disadvantage - Extra storage for links and complexity of operations
● Second approach :
○ Store as array of pointers to individual characters allocated in a heap.
○ Disadvantage- Still uses extra memory
Character String Types …Implementation

● Third approach:
○ To store complete strings in adjacent storage cells
○ When a string grows and adjacent storage is not available, a new area of memory
is found that can store the complete new string and the old part is moved to this
area, and the memory cells for the old string are deallocated.
○ This results in faster string operations and requires less storage
○ Disadvantage: the allocation/deallocation process is slower.
User-Defined Ordinal Types
An ordinal type is one in which the range of possible values can be easily associated with the set of positive integers.
Enumeration Types
● All possible values, which are named constants, are provided in the definition
● C# example
enum days {mon, tue, wed, thu, fri, sat, sun};
● The enumeration constants are typically implicitly assigned the integer values, 0, 1, …, but
can be explicitly assigned any integer literal in the type’s definition
Design issues

● Is an enumeration constant allowed to appear in more than one type definition, and if so,
how is the type of an occurrence of that constant checked?
● Are enumeration values coerced to integer?
● Is any other type coerced to an enumeration type?
Enumeration Types
● In languages that do not have enumeration types, programmers usually simulate them with
integer values.
● E.g. in Fortran 77, use 0 to represent red and 1 to represent blue:

INTEGER RED, BLUE

DATA RED, BLUE/0,1/

● Problem:
○ There is no type checking when they are used.
○ It would be legal to add two together. Or they can be assigned any integer value thus
destroying the relationship with the colors.
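
The problem described above can be sketched in Python, assuming its standard enum module as an example of a checked enumeration type. The names Color and try_add_colors are hypothetical and exist only for this illustration.

```python
from enum import Enum

# Simulated enumeration: plain integers, as in the Fortran 77 example.
RED, BLUE = 0, 1
mixed = RED + BLUE          # legal, but meaningless for colors


class Color(Enum):
    """A real enumeration type with the same internal values."""
    RED = 0
    BLUE = 1


def try_add_colors():
    """Return True if adding two enumeration values is rejected."""
    try:
        Color.RED + Color.BLUE
    except TypeError:
        return True
    return False


print(mixed, try_add_colors())
```

With plain integers the addition silently succeeds; with the enumeration type the same operation is rejected, which is the type checking the simulated version lacks.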
Enumeration Types-Design
● In C++, we could have

enum colors {red, blue, green, yellow, black};

○ colors myColor = blue, yourColor = red;


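● The enumeration values are coerced to int when they are put in integer context.

E.g. if myColor is blue, myColor + 1 yields 2, the internal value of green. (C also allows myColor++, which would assign green to myColor; standard C++ rejects ++ on an enumeration unless operator++ is overloaded for it.)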

● In Java, all enumeration types are implicitly subclasses of the predefined class Enum. They
can have instance data fields, constructors and methods.
Enumeration Types-Design
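Java Example (note: java.util.Enumeration is an iterator interface, not an enumeration type)
Enumeration<String> days;
Vector<String> dayNames = new Vector<>();
dayNames.add("Monday");
// ... other days ...
dayNames.add("Friday");
days = dayNames.elements();   // iterator over the vector's elements
while (days.hasMoreElements())
    System.out.println(days.nextElement());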
Enumeration Types-Design
● C# enumeration types are like those of C++ except that they are never coerced to integer.
● Operations are restricted to those that make sense.
● The range of values is restricted to that of the particular enumeration type.
Enumeration Types- Evaluation

● Aid to readability, e.g., no need to code a color as a number


● Aid to reliability, e.g., compiler can check: operations (don’t allow colors to be added)
● No enumeration variable can be assigned a value outside its defined range,
○ e.g. if the colors type has 10 enumeration constants and uses 0 .. 9 as its internal
values, no number greater than 9 can be assigned to a colors type variable.
● Ada, C#, and Java 5.0 provide better support for enumeration than C++ because
enumeration type variables in these languages are not coerced into integer types
Enumeration Types- Evaluation
● C treats enumeration variables like integer variables; it does not provide either of the two
advantages.
● C++ is better. Numeric values can be assigned to enumeration type variables only if they are cast to the type of the assigned variable. Numeric values are checked to determine whether they are in the range of the internal values. However, if the user uses a wide range of explicitly assigned values, this checking is not effective.
○ E.g. enum colors {red = 1, blue = 100, green = 100000};
○ A value assigned to a variable of colors type will only be checked to determine whether it is in the range of 1..100000.
● Java 5.0, C# and Ada are better, as variables are never coerced to integer types
Subrange Type
● An ordered contiguous subsequence of an ordinal type
● Not a new type, but a restricted existing type
● Example: 12..18 is a subrange of integer type
● Introduced by Pascal and included in Ada
Ada's design
type Days is (Mon, Tue, Wed, Thu, Fri, Sat, Sun);
subtype Weekdays is Days range Mon..Fri;
subtype Index is Integer range 1..100;
Day1: Days;
Day2: Weekdays;
Day2 := Day1; -- legal if Day1 is not set to Sat or Sun
Subrange Type -Evaluation
● Aid to readability
○ Make it clear to the readers that variables of subrange can store only
certain range of values
● Reliability
○ Assigning a value to a subrange variable that is outside the specified
range is detected as an error.

Implementation of User-Defined Ordinal Types


● Enumeration types are implemented as integers
● Subrange types are implemented like the parent types with code inserted (by the
compiler) to restrict assignments to subrange variables
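
The inserted range check can be sketched as follows. This is a minimal illustration in Python, not how any particular compiler does it; the Subrange class and its assign method are hypothetical names, and the bounds 1..100 mirror the Ada Index subtype above.

```python
class Subrange:
    """Hypothetical helper: an integer variable restricted to low..high."""

    def __init__(self, low, high, value):
        self.low, self.high = low, high
        self.assign(value)

    def assign(self, value):
        # This is the check a compiler would insert before each assignment.
        if not (self.low <= value <= self.high):
            raise ValueError(f"{value} is outside {self.low}..{self.high}")
        self.value = value


index = Subrange(1, 100, 50)
index.assign(100)           # allowed: 100 is inside 1..100
print(index.value)
```

Calling index.assign(101) would raise an error instead of storing the value, which is the reliability benefit listed in the evaluation of subrange types.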
Array Type
An array is an aggregate of homogeneous data elements in which an individual
element is identified by its position in the aggregate, relative to the first
element.
Array Type- Design Issues
● What types are legal for subscripts?
● Are subscripting expressions in element references range checked?
● When are subscript ranges bound?
● When does allocation take place?
● Are ragged or rectangular multi dimensioned arrays allowed, or both?
● Can arrays be initialized when they have their storage allocated?
● Are any kind of slices allowed?
Array and Indexes
● Indexing (or subscripting) is a mapping from indices to elements
○ array_name (index_value_list) -> element
● Index Syntax
○ FORTRAN, PL/I, Ada use parentheses, e.g. Sum := Sum + B(I);
○ Ada explicitly uses parentheses to show uniformity between array references and function calls, because both are mappings
○ Most other languages use brackets
○ E.g. List(27) is a direct reference to the List array's element with the subscript 27.
Array and Indexes

Language            Index Type
FORTRAN, C, Java    Integer
PASCAL              Any ordinal type (Integer, Boolean, Enumeration, Character)
Ada                 Integer or Enumeration


Array and Indexes
● C, C++, Perl, and Fortran do not specify range checking
● Java, ML, C# specify range checking
Subscript Bindings and Array Categories
● The binding of the subscript type to an array variable is usually static
● Subscript value ranges are sometimes dynamically bound
● In some languages, the lower bound of the subscript range is implicit
● In C-based languages, the lower bound of all index ranges is fixed at 0
● Fortran 95 defaults to 1
● In some other languages, ranges must be completely specified by the programmer
● Arrays fall into five categories
Flip Classroom Activity (Gap Analysis): Memory Allocation, Stack and Heap
Memory Allocation: using Stack (Gap Analysis)
Memory Allocation: using Heap (Gap Analysis)
Categories of Arrays

● Static: subscript ranges are statically bound and storage allocation is static
(before run‐ time)
● Advantage: efficiency (no dynamic allocation/deallocation required)
● Example: In C and C++ arrays that include the static modifier are static
● static int myarray[3] = {2, 3, 4};
● int static_array[7];
Subscript Bindings and Array Categories: fixed stack-dynamic
● Fixed stack-dynamic: subscript ranges are statically bound, but the allocation is done at declaration elaboration time, during execution
● Advantage: space efficiency
● Example: in C and C++, arrays declared in functions without the static modifier are fixed stack-dynamic
● int array[3] = {2, 3, 4};
● E.g. void foo()
{
int fixed_stack_dynamic_array[7];
/* ... */
}
Subscript Bindings and Array Categories: Stack-dynamic
● Stack-dynamic: subscript ranges are dynamically bound and the storage allocation is dynamic (done at run-time)
● Advantage: flexibility (the size of an array need not be known until the array is to be used)
● Example: In Ada, you can use stack-dynamic arrays as
Get(List_Len);
declare
List: array (1..List_Len) of Integer;
begin ... end;
● Example in C:
void foo(int n)
{
int stack_dynamic_array[n];
/* ... */
}

Subscript Bindings and Array Categories: fixed heap-dynamic
● Similar to fixed stack-dynamic: storage binding is dynamic but fixed after allocation (i.e., binding is done when requested, and storage is allocated from the heap, not the stack)
● Example: In C/C++, using malloc/free to allocate/deallocate memory from the heap
int * fixed_heap_dynamic_array = malloc(7 * sizeof(int));
● Java has fixed heap-dynamic arrays
● C# includes a second array class, ArrayList, that provides fixed heap-dynamic arrays
Subscript Bindings and Array Categories heap_dynamic_array
● Binding of subscript ranges and storage allocation is dynamic and can change
any number of times
● Advantage: flexibility (arrays can grow or shrink during program execution)
● Examples: Perl, JavaScript, Python, and Ruby support heap‐dynamic arrays
● Perl: @states = ("Idaho", "Washington", "Oregon");
● Python: a = [1.25, 233, 3.141519, 0, -1]

void foo(int n) {
int * heap_dynamic_array = malloc(n * sizeof(int));
}
Summary of the five categories (C examples)

Static:
static int static_array[7];

Fixed stack-dynamic:
void foo()
{ int fixed_stack_dynamic_array[7]; }

Stack-dynamic:
void foo(int n)
{ int stack_dynamic_array[n]; }

Fixed heap-dynamic:
int * fixed_heap_dynamic_array = malloc(7 * sizeof(int));

Heap-dynamic:
void foo(int n)
{ int * heap_dynamic_array = malloc(n * sizeof(int)); }
Subscript Bindings and Array Categories
Array Initialization
Some languages allow initialization at the time of storage allocation

● Fortran
Integer List (3)
Data List /0, 5, 5/ (List is initialized to the values 0, 5, 5)
● C, C++, Java, C# example
int list [] = {4, 5, 7, 83};
● Character strings in C and C++
char name [] = "freddie"; // eight elements, including the terminating null character
● Arrays of strings in C and C++
char *names [] = {"Bob", "Jake", "Joe"};
Array Initialization
● Java initialization of String objects
String[] names = {"Bob", "Jake", "Joe"};
● Ada positions for the values can be specified:
List : array (1..5) of Integer := (1, 3, 5, 7, 9);
Bunch : array (1..5) of Integer:= (1 => 3, 3 => 4, others => 0);
Note: the array value is (3, 0, 4, 0, 0)
● Concatenation. One array can be appended to another array using the & operator provided that they are of the same type.
Array Operations
Language               Array Operation
C-based                No operations, only through methods
C#, Perl               Array assignments
Ada                    Assignment, Concatenation (&), Comparison
Python                 Concatenation (+), Element Membership (in), Comparison (==, is)
FORTRAN                Matrix Multiplication, Transpose
APL (most powerful)    +.*


Rectangular and Jagged Arrays

Rectangular arrays, languages supported:
● Fortran
● Ada
● C#

Jagged arrays, languages supported:
● Java
● C
● C++
● C#
Slice

● A slice is some substructure of an array; nothing more than a referencing mechanism
● Slices are only useful in languages that have array operations
Slice

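● Fortran examples:
○ Vector(3:6): four elements, from the third to the sixth
○ Mat(1:3, 2): second column of Mat
○ Mat(3, 1:3): third row of Mat
○ Vector(2:10:2): slice with a stride of 2 (second, fourth, sixth, eighth, tenth elements)
○ Vector((/3, 2, 1, 8/)): vector subscript selecting the third, second, first, and eighth elements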
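
The Fortran slices above have close counterparts in Python, sketched here under two stated differences: Python subscripts start at 0 rather than 1, and slicing a Python list produces a copy rather than a reference into the original array. The data values are made up so that each value shows its own Fortran subscript.

```python
vector = list(range(1, 11))          # values 1..10: value == Fortran subscript
mat = [[11, 12, 13],
       [21, 22, 23],
       [31, 32, 33]]                 # mat[i-1][j-1] holds the value 10*i + j

third_to_sixth = vector[2:6]                      # Vector(3:6)
second_column = [row[1] for row in mat[0:3]]      # Mat(1:3, 2)
third_row = mat[2][0:3]                           # Mat(3, 1:3)
every_second = vector[1:10:2]                     # Vector(2:10:2)
picked = [vector[i - 1] for i in (3, 2, 1, 8)]    # vector subscript

print(third_to_sixth, second_column, third_row, every_second, picked)
```

Each comment names the Fortran slice that the Python expression is meant to mirror.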
Slice
Implementation of array Types
Address Calculation in One Dimensional Array:

Address of A [ I ] = B + W * ( I – LB )

Where,
B = Base address
W = Storage Size of one element stored in the array (in byte)
I = Subscript of element whose address is to be found
LB = Lower limit / Lower Bound of subscript, if not specified assume 0 (zero)
Implementation of array Types
Address of A [ I ] = B + W * ( I – LB )
● I=3 , B=1100 W=4 LB=0
● A[3]= 1100+4*(3-0)
● A[3]= 1100+12 =1112
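
The one-dimensional formula can be written as a small function. This is a sketch only; address_1d is a hypothetical name, and its parameters correspond to B, W, I, and LB in the formula above.

```python
def address_1d(base, width, index, lower_bound=0):
    """Address of A[index] = B + W * (I - LB)."""
    return base + width * (index - lower_bound)


# The worked example from the slide: I = 3, B = 1100, W = 4, LB = 0.
example = address_1d(base=1100, width=4, index=3)

# The same element position with a lower bound of 1 (as Fortran defaults to).
example_lb1 = address_1d(base=1100, width=4, index=3, lower_bound=1)

print(example, example_lb1)
```

With a lower bound of 1, subscript 3 refers to the third element, so the address is one element width lower than in the zero-based case.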
Implementation of array Types
Address Calculation in Double (Two) Dimensional Array:

● B = Base address
● I = Row subscript of element whose address is to be found
● J = Column subscript of element whose address is to be found
● W = Storage Size of one element stored in the array (in byte)
● Lr = Lower limit of row/start row index of matrix, if not given assume 0 (zero)
● Lc = Lower limit of column/start column index of matrix, if not given assume 0 (zero)
● M = Number of row of the given matrix
● N = Number of column of the given matrix
Implementation of array Types
Address Calculation in Double (Two) Dimensional Array:

Row Major System:

Address of A [ I ][ J ] = B + W * [ N * ( I – Lr ) + ( J – Lc ) ]

Column Major System:

Address of A [ I ][ J ] Column Major Wise = B + W * [( I – Lr ) + M * ( J – Lc )]


Implementation of array Types
Usually the number of rows and columns of a matrix is given (like A[20][30] or A[40][60]), but if the matrix is given as A[Lr..Ur, Lc..Uc], the number of rows and columns is calculated as follows:

Number of rows (M) = (Ur – Lr) + 1
Number of columns (N) = (Uc – Lc) + 1
Implementation of array Types

Each element of an array X[-15..10, 15..40] requires one byte of storage. If the beginning location is 1500, determine the location of X[15][20].
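
A sketch of how the row-major and column-major formulas apply to this exercise is given below. The function names row_major and column_major are hypothetical; their parameters follow the symbols B, W, I, J, Lr, Lc, M, and N defined earlier. Note that the row subscript 15 lies above the declared upper bound 10, so the sketch simply applies the formulas as the exercise states them.

```python
def row_major(base, width, i, j, lr, lc, n_cols):
    """B + W * [N * (I - Lr) + (J - Lc)]"""
    return base + width * (n_cols * (i - lr) + (j - lc))


def column_major(base, width, i, j, lr, lc, n_rows):
    """B + W * [(I - Lr) + M * (J - Lc)]"""
    return base + width * ((i - lr) + n_rows * (j - lc))


lr, ur, lc, uc = -15, 10, 15, 40
m = (ur - lr) + 1        # number of rows
n = (uc - lc) + 1        # number of columns

rm = row_major(1500, 1, 15, 20, lr, lc, n)       # 1500 + 26*30 + 5
cm = column_major(1500, 1, 15, 20, lr, lc, m)    # 1500 + 30 + 26*5

print(m, n, rm, cm)
```

Both M and N come out as 26; the row-major address is 2285 and the column-major address is 1660.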
Compile-Time Descriptors
Associative Array

● An associative array is an unordered collection of data elements that are indexed by an equal number of values called keys
● Also known as Hash tables
○ Index by key (part of data) rather than value
○ Store both key and value (take more space)
○ Best when access is by data rather than index
Associative Array

Design Issues

● What is the form of references to elements?


● Is the size static or dynamic?
Associative Array - Perl
%lookup = ("dave", 1234,
           "peter", 3456,
           "andrew", 6789);

To reference a particular value:
$lookup{"dave"}

New elements are added by assignment to new keys:
$lookup{"adam"} = 3845;

Assignments to existing keys replace the stored value:
# change dave's code
$lookup{"dave"} = 7634;
Associative Array

● In the case of non-associative arrays, the indices never need to be stored. However, in an associative array the user-defined keys must be stored in the structure. So each element of an associative array is in fact a pair of entities, a key and a value.
● Associative arrays are supported by Perl, Python, Ruby, and by the
standard class libraries of Java, C++, and C#.
Associative Array

● In Perl, associative arrays are often called hashes, because in the implementation their elements are stored and retrieved with hash functions.
● The name space for Perl hashes is distinct.
● Each hash variable must begin with a percent sign (%).
Associative Array

● Hashes can be set to literal values with the assignment statement

%salaries = ("Gagan" => 7500, "Pavan" => 5700, "Meenu" => 5575, "Chetan" => 4785);

● A new element can be added: $salaries{"rina"} = 8500;
● An element can be removed: delete $salaries{"Gagan"};
● An entire hash can be emptied: %salaries = ();
Associative Array
● The size of a Perl hash is dynamic.
● That is, it grows when a new element is added, and shrinks when an element is deleted, and also when it is emptied by assignment of the empty literal.
● The exists operator returns true or false, depending on whether its operand key is an element in the hash.
● For example,

if (exists $salaries{"Shalu"}) …
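
For comparison, the same operations can be sketched with a Python dictionary, which is Python's built-in associative array. The data values repeat the salaries example; the variable names has_shalu and size_before_clear are illustrative.

```python
salaries = {"Gagan": 7500, "Pavan": 5700, "Meenu": 5575, "Chetan": 4785}

salaries["rina"] = 8500            # add a new element
del salaries["Gagan"]              # remove an element
has_shalu = "Shalu" in salaries    # counterpart of Perl's exists
size_before_clear = len(salaries)  # the size changed as elements came and went
salaries.clear()                   # empty the whole table

print(has_shalu, size_before_clear, len(salaries))
```

As with the Perl hash, the size is dynamic: it grows on insertion, shrinks on deletion, and drops to zero when the table is emptied.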


Implementing Associative Array
● Perl uses a hash function for fast lookups but is optimized for fast
reorganization
○ Uses a 32 bit hash value for each entry but only a few bits are used for small
arrays
○ To double the array size use one more bit and move half the existing
entries
● PHP also uses a hash function but stores arrays in a linked list for easy traversal
○ An array with both associative and numeric indices can develop gaps in the
numeric sequence
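
The idea of using only a few bits of the hash value, and one more bit after doubling, can be sketched as follows. This is an illustration of the general technique only, not Perl's actual implementation; the function names and the use of small integers as stand-in hash values are assumptions made for the example.

```python
def bucket_index(hash_value, bits):
    """Use only the low `bits` bits of the hash value."""
    return hash_value & ((1 << bits) - 1)


def build(hashes, bits):
    """Place each hash value in one of 2**bits buckets."""
    buckets = [[] for _ in range(1 << bits)]
    for h in hashes:
        buckets[bucket_index(h, bits)].append(h)
    return buckets


def double(buckets, bits):
    """Grow from 2**bits to 2**(bits + 1) buckets by consulting one more bit."""
    new = [[] for _ in range(1 << (bits + 1))]
    moved = 0
    for old_index, chain in enumerate(buckets):
        for h in chain:
            new_index = bucket_index(h, bits + 1)
            if new_index != old_index:
                moved += 1          # the extra bit was 1, so the entry moves
            new[new_index].append(h)
    return new, moved


hashes = list(range(16))            # stand-in hash values
small = build(hashes, 2)            # 4 buckets
large, moved = double(small, 2)     # 8 buckets
print(moved, large[1], large[5])
```

With evenly spread hash values, half of the entries (8 of 16 here) move to a new bucket and the rest stay where they were, which is why doubling is cheap to reorganize.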
Record Type
Definition of Records in COBOL
COBOL uses level numbers to show nested records; others use recursive definition

01 EMP-REC.
    02 EMP-NAME.
        05 FIRST PIC X(20).
        05 MID PIC X(10).
        05 LAST PIC X(20).
    02 HOURLY-RATE PIC 99V99.


Definition of Records in Ada
Record structures are indicated in an orthogonal way
type Emp_Rec_Type is record
First: String (1..20);
Mid: String (1..10);
Last: String (1..20);
Hourly_Rate: Float;
end record;
Emp_Rec: Emp_Rec_Type;
References to Records
● Record field references

1. COBOL : field_name OF record_name_1 OF ... OF record_name_n

2. Others (dot notation) :record_name_1.record_name_2. ... record_name_n.field_name

● Fully qualified references must include all record names
● Elliptical references allow leaving out record names as long as the reference is unambiguous, for example in COBOL

FIRST, FIRST OF EMP-NAME, and FIRST OF EMP-REC are elliptical references to the employee's first name
References to Records
struct {
    int age;
    struct {
        char *first;
        char *last;
    } name;
} person;

● Fully qualified reference: lists all intermediate names, as in person.name.first
● Elliptical: may omit some of them if unambiguous (COBOL, PL/I), as in first of person
Operations on Records
Implementation of Record Type

An offset address relative to the beginning of the record is associated with each field.
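
A minimal sketch of how field offsets might be assigned is shown below. The field sizes are assumptions made for illustration (one byte per character, a 4-byte Hourly_Rate, and no alignment padding); real compilers may lay records out differently. The helper names field_offsets and field_address are hypothetical.

```python
def field_offsets(fields):
    """fields: list of (name, size_in_bytes). Returns ({name: offset}, total size)."""
    offsets, position = {}, 0
    for name, size in fields:
        offsets[name] = position    # the offset is fixed at compile time
        position += size
    return offsets, position


# Fields of the employee record from the Ada example, with assumed sizes.
emp_rec = [("First", 20), ("Mid", 10), ("Last", 20), ("Hourly_Rate", 4)]
offsets, record_size = field_offsets(emp_rec)


def field_address(record_base, field):
    """Address of a field = address of the record + the field's offset."""
    return record_base + offsets[field]


print(offsets, record_size, field_address(2000, "Last"))
```

Because the offsets are known statically, a field reference costs only one addition at run time, unlike an array reference with a computed subscript.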
Union
● A union is a type whose variables are allowed to store
different type values at different times during
execution
● Design issues
○ Should type checking be required?
○ Should unions be embedded in records?
Discriminated vs. Free Unions
● Fortran, C, and C++ provide union constructs in which
there is no language support for type checking; the union in
these languages is called free union
● Type checking of unions requires that each union include a type indicator called a discriminant; such a union is called a discriminated union
● Discriminated unions are supported by Ada
Ada Union Types
type Shape is (Circle, Triangle, Rectangle);
type Colors is (Red, Green, Blue);
type Figure (Form: Shape) is record
    Filled: Boolean;
    Color: Colors;
    case Form is
        when Circle => Diameter: Float;
        when Triangle =>
            Leftside, Rightside: Integer;
            Angle: Float;
        when Rectangle => Side1, Side2: Integer;
    end case;
end record;

Declare variables:
Figure_1: Figure;
Figure_2: Figure(Form => Triangle);

Figure_1 := (Filled => True,
             Color => Blue,
             Form => Rectangle,
             Side1 => 12,
             Side2 => 3);
Ada Union Type Illustrated
Pointer and reference type
● A pointer type variable has a range of values that consists of
memory addresses and a special value, nil
● Provide the power of indirect addressing
● Provide a way to manage dynamic memory
● A pointer can be used to access a location in the area where
storage is dynamically created (usually called a heap)
Design Issues of Pointers

● What are the scope of and lifetime of a pointer variable?


● What is the lifetime of a heap-dynamic variable?
● Are pointers restricted as to the type of value to which they can
point?
● Are pointers used for dynamic storage management, indirect
addressing, or both?
● Should the language support pointer types, reference types, or both?
Pointer Operations

● Two fundamental operations: assignment and dereferencing


● Assignment is used to set a pointer variable’s value to some useful address
● Dereferencing yields the value stored at the location represented by the
pointer’s value
● Dereferencing can be explicit or implicit
● C++ uses an explicit operation via *
j = *ptr
sets j to the value located at ptr
Pointer Assignment Illustrated
Figure: the assignment operation j = *ptr
Dangling Pointer
Dangling pointers (dangerous)
A pointer pointing to data that does not exist anymore is called a
dangling pointer.

d is a dangling pointer.
Dangling Pointer
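#include <stdio.h>
#include <stdlib.h>   /* malloc and free */
int main()
{
    int *ptr = (int *)malloc(sizeof(int));
    *ptr = 560;   /* store through the pointer; do not re-point it at a local variable */
    free(ptr);    /* ptr still holds the freed address: it is now a dangling pointer */
    return 0;
}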
1st Solution to Dangling Pointer
If we assign the NULL value to 'ptr' after freeing it, then 'ptr' no longer points to the deleted memory. Therefore, we can say that ptr is not a dangling pointer.
Lost Heap dynamic Variable
● An allocated heap-dynamic variable that is no longer
accessible to the user program (often called garbage)
● Pointer p1 is set to point to a newly created heap-dynamic
variable
● Pointer p1 is later set to point to another newly created heap-dynamic variable; the first variable is now inaccessible
● The process of losing heap-dynamic variables is called memory leakage
Lost Heap dynamic Variable
Pointer Arithmetic in C and C++

float stuff[100];
float *p;
p = stuff;

*(p+5) is equivalent to stuff[5] and p[5]


*(p+i) is equivalent to stuff[i] and p[i]
Reference Type
● C++ includes a special kind of pointer type called a reference type
that is used primarily for formal parameters
○ Advantages of both pass-by-reference and pass-by-value
● Java extends C++’s reference variables and allows them to
replace pointers entirely
○ References are references to objects, rather than being
addresses
● C# includes both the references of Java and the pointers of C++
Reference Type

● Dangling pointers and dangling objects are problems, as is heap management
● Pointers are like goto's--they widen the range of cells that can be
accessed by a variable
● Pointers or references are necessary for dynamic data structures--
so we can't design a language without them
int result = 0;
int &ref_result = result;
. . .
ref_result = 100;
cout << result; // 100
result and ref_result are aliases.
Pointer Vs Reference
When to use What : Pointer Vs Reference

Use references

● In function parameters and return types.

Use pointers:

● Use pointers if pointer arithmetic or passing NULL-pointer is needed.

For example for arrays (Note that array access is implemented using pointer arithmetic).

● To implement data structures such as linked lists and trees, and their algorithms, because pointing to different cells over time requires pointers.
Implementation of Pointers and Reference Types
Representations of Pointers

● Large computers use single values


● Intel microprocessors use segment and offset
Dangling Pointer Problem

● Tombstone: extra heap cell that is a pointer to the heap-dynamic variable
● The actual pointer variable points only at tombstones
● When a heap-dynamic variable is de-allocated, the tombstone remains but is set to null
● Costly in time and space
int * pointer1 = new int(5);
int * pointer2 = pointer1;

Figure: pointer1 (at address 111222) and pointer2 (at address 111333) both hold 112233, the address of the tombstone. The tombstone holds 445566, the address of the dynamic heap variable, which holds the value 5.

delete pointer1;

Figure: after the delete, the tombstone at 112233 is set to NULL, while pointer1 and pointer2 still hold 112233. An access through a tombstone that equals null results in a run-time error. The scheme is costly in time and space.
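
The tombstone scheme can be simulated in a few lines. This is a sketch of the idea only: Python has no raw pointers, so the Tombstone and Heap classes and the dereference function are hypothetical stand-ins for what a run-time system would do.

```python
class Tombstone:
    """Extra heap cell that points at the heap-dynamic variable."""

    def __init__(self, target):
        self.target = target


class Heap:
    def allocate(self, value):
        cell = [value]               # the heap-dynamic variable
        return Tombstone(cell)       # pointers receive the tombstone, never the cell

    def deallocate(self, tombstone):
        tombstone.target = None      # the tombstone remains, but is set to null


def dereference(pointer):
    """Every access goes through the tombstone, so a null is always detected."""
    if pointer.target is None:
        raise RuntimeError("dangling pointer: tombstone is null")
    return pointer.target[0]


heap = Heap()
pointer1 = heap.allocate(5)
pointer2 = pointer1                  # both pointers share one tombstone
before = dereference(pointer2)
heap.deallocate(pointer1)
print(before)
```

After the deallocation, dereferencing either pointer1 or pointer2 raises an error instead of reading freed storage. The extra indirection on every access and the tombstones that are never reclaimed are the time and space costs mentioned above.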
Dangling Pointer Problem Solution… Lock and Keys

● When memory is allocated, space for one more cell called the lock cell is
allocated. This cell is assigned a value.
● The pointer which points to this memory is stored as a tuple containing an
address and a key.
● A pointer is allowed to access the memory only if the values of the lock and
keys match.
● Otherwise, it will throw a runtime error.
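Dangling Pointer solution using Locks-and-keys:
• int * x = new int (10);
• int * y = x;
• delete y;

Figure: before the delete, x and y each hold the key 5 and the address 0x12ff66; the lock cell holds 5 and the allocated memory holds 10. After delete y, the lock is changed (to 67 in the figure), while x and y still hold the key 5 and the address 0x12ff66.

On each access the key is compared with the lock. If they match, the access is legal; otherwise the access is treated as a run-time error. This approach is safe because a dangling pointer cannot access other program data, even if the storage of the heap-dynamic variable has been assigned to another use.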
Dangling Pointer solution using Locks-and-keys:

Soon after deallocation, we assign a NULL value to the lock. Now, if we try to access the memory through either pointer, the values of the lock and the keys do not match and a run-time error is thrown. This prevents the dangling pointers from accessing the memory location.
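
The locks-and-keys check can be simulated as follows. This is a sketch under stated assumptions: the Cell and Pointer classes and the helper functions are hypothetical, lock values are drawn from a simple counter, and 0 is used as the illegal lock value written on deallocation.

```python
import itertools

_next_lock = itertools.count(1)      # valid lock values start at 1


class Cell:
    """Allocated memory together with its lock cell."""

    def __init__(self, value):
        self.lock = next(_next_lock)
        self.value = value


class Pointer:
    """A pointer stored as a (key, address) pair."""

    def __init__(self, key, cell):
        self.key, self.cell = key, cell


def new(value):
    cell = Cell(value)
    return Pointer(cell.lock, cell)  # the key is a copy of the lock


def copy_pointer(pointer):
    return Pointer(pointer.key, pointer.cell)


def delete(pointer):
    pointer.cell.lock = 0            # illegal lock value: no key can match it


def dereference(pointer):
    if pointer.key != pointer.cell.lock:
        raise RuntimeError("dangling pointer: key does not match lock")
    return pointer.cell.value


x = new(10)
y = copy_pointer(x)
value_before = dereference(x)
delete(y)
print(value_before)
```

After delete(y), both x and y still carry the old key, so any further dereference through either of them fails the comparison and is reported as a run-time error.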
Heap Management

● A very complex runtime process


● Single-size cells vs. variable-size cells
● Two approaches to reclaim garbage
○ Reference counters (eager approach): reclamation is gradual
○ Mark-sweep (lazy approach): reclamation occurs when the list of available space becomes empty
Heap Management
Reference Counter

● Reference counters: maintain a counter in every cell that stores the number of pointers currently pointing at the cell
● Disadvantages: space required, execution time required,
complications for cells connected circularly
● Advantage: it is intrinsically incremental, so significant delays in the
application execution are avoided
Heap Management
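1-Reference Counter:

char * p1 = new char (111);
char * p2 = p1;
Figure: p1 and p2 both point at the dynamic heap variable holding 111; its count is 2.

p1 = new char (222);
Figure: p1 points at a new cell holding 222 (count 1); p2 still points at the cell holding 111, whose count drops to 1.

p2 = new char (333);
Figure: p1 points at 222 (count 1) and p2 points at a new cell holding 333 (count 1); the cell holding 111 has count 0 and will be moved to avail_List.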
1-Reference Counter:

char * p1 = new char (111);
char * p2 = p1;
Figure: p1 and p2 both point at the dynamic heap variable holding 111; its count is 2.

delete p1;
Figure: the count of the dynamic heap variable holding 111 is shown as 0, and the cell will be moved to avail_List.

If the reference counter reaches zero, it means that no program pointers are pointing at the cell, and it has thus become garbage and can be returned to the list of available space.
1-Reference Counter
// STEP 1
Object a = new Integer (100);
Object b = new Integer (99);
// Step2
a = b;

Possible implementation of the reference counter code for a = b; is

if (a != b)
{
    if (a != null)
        --a.refCount;
    a = b;
    if (a != null)
        ++a.refCount;
}

which, once reclamation is included, becomes

if (a != b)
{
    if (a != null)
        if (--a.refCount == 0)
            heap.release (a);
    a = b;
    if (a != null)
        ++a.refCount;
}
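
The bookkeeping for a = b can be simulated in a runnable form. This is a sketch, not how any particular run-time system is written: the Cell and Heap classes, the avail_list attribute, and the assign function are hypothetical names, and the variables are held in a dictionary so that the counts can be updated explicitly.

```python
class Cell:
    """A heap cell carrying its own reference counter."""

    def __init__(self, value):
        self.value = value
        self.ref_count = 0


class Heap:
    def __init__(self):
        self.avail_list = []         # cells returned to the free list

    def release(self, cell):
        self.avail_list.append(cell)


def assign(heap, variables, name, cell):
    """Perform variables[name] = cell with reference-count bookkeeping."""
    old = variables.get(name)
    if old is cell:
        return                       # same cell: nothing to update
    if old is not None:
        old.ref_count -= 1
        if old.ref_count == 0:
            heap.release(old)        # no pointers left: the cell is garbage
    variables[name] = cell
    if cell is not None:
        cell.ref_count += 1


heap, variables = Heap(), {}
assign(heap, variables, "a", Cell(100))
assign(heap, variables, "b", Cell(99))
first = variables["a"]
assign(heap, variables, "a", variables["b"])    # a = b
print(first.ref_count, variables["b"].ref_count, len(heap.avail_list))
```

After a = b, the cell holding 100 has a count of zero and sits on the available list, while the cell holding 99 is referenced by both a and b.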
1-Reference Counter – problem

public void buildDog() {
    Dog newDog = new Dog();
    Tail newTail = new Tail();
    newDog.tail = newTail;
    newTail.dog = newDog;
}

main()
{
    buildDog();
    ….
}

Figure: the heap after the buildDog method completes, with the Dog and Tail objects still referencing each other.

Reference counting does not detect garbage with cyclic references. One solution is to use the mark-and-sweep algorithm.
1-Reference Counter – problem

Reference counting fails to reclaim circular structures.

Python uses reference counting and offers cycle detection as well.
Heap Management

Two approaches to reclaim garbage:
● Reference counters (eager approach): reclamation is gradual
● Mark-sweep (lazy approach): reclamation occurs when the list of available space becomes empty
Mark-Sweep
● The run-time system allocates storage cells as requested and disconnects pointers
from cells as necessary; mark-sweep then begins
○ Every heap cell has an extra bit used by collection algorithm

○ All cells initially set to garbage


○ All pointers traced into heap, and reachable cells marked as not garbage
○ All garbage cells returned to list of available cells
● Disadvantages: in its original form, it was done too infrequently. When done, it caused
significant delays in application execution. Contemporary mark-sweep algorithms avoid
this by doing it more often—called incremental mark-sweep
Mark-Sweep
● Mark - In this phase, objects which are reachable from the program are marked
as reachable
● Sweep - In this phase, all the objects that were not marked in the Mark phase are cleaned up.
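
The two phases can be sketched on a tiny simulated heap. This is an illustration only: the Cell class, its refs list, and the mark_sweep function are hypothetical, and the heap is just a Python list. The dog and tail cells reference each other in a cycle, echoing the buildDog example, to show that mark-sweep reclaims cycles that reference counting misses.

```python
class Cell:
    def __init__(self, name):
        self.name = name
        self.refs = []               # pointers from this cell to other cells
        self.marked = False          # the extra bit used by the collector


def mark(root):
    """Mark phase: trace pointers from a root and mark every reachable cell."""
    stack = [root]
    while stack:
        cell = stack.pop()
        if not cell.marked:
            cell.marked = True
            stack.extend(cell.refs)


def mark_sweep(heap, roots):
    for cell in heap:
        cell.marked = False          # all cells are initially assumed garbage
    for root in roots:
        mark(root)
    live = [cell for cell in heap if cell.marked]
    garbage = [cell for cell in heap if not cell.marked]   # sweep phase
    return live, garbage


owner, dog, tail = Cell("owner"), Cell("dog"), Cell("tail")
dog.refs.append(tail)
tail.refs.append(dog)                # a cycle that nothing else points to
live, garbage = mark_sweep([owner, dog, tail], roots=[owner])
print([c.name for c in live], [c.name for c in garbage])
```

Only the cell reachable from the root survives; the two cells in the unreachable cycle are returned as garbage even though each still has one incoming reference.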