Datatypes

Each variable in C has an associated data type, which specifies the kind of data the variable can store, such as integer, character, or floating-point values. Each data type requires a particular amount of memory and supports a specific set of operations. A data type thus defines a set of values together with their meaning and characteristics.

Uploaded by

Jeya Perumal
Copyright
© © All Rights Reserved

Data types in C

T.JEYA.,ASSISTANT PROFESSOR.,
DEPARTMENT OF COMPUTER SCIENCE,
SAC WOMEN’S COLLEGE,CUMBUM.
Data types in C
Only really four basic types:
• char
• int (short, long, long long, unsigned)
• float
• double

Size of these types on CLEAR machines:

Type        Size (bytes)
char        1
int         4
short       2
long        8
long long   8
float       4
double      8

Sizes of these types vary from one machine to another!
Characters (char)

Roman alphabet, punctuation, digits, and other symbols:
• Encoded within one byte (256 possible symbols)
• ASCII encoding (man ascii for details)

In C:

char a_char = 'a';
char newline_char = '\n';
char tab_char = '\t';
char backslash_char = '\\';
ASCII

From "man ascii" (codes 0 to 31 and 127 are special control characters):

| 0 NUL| 1 SOH| 2 STX| 3 ETX| 4 EOT| 5 ENQ| 6 ACK| 7 BEL|
| 8 BS | 9 HT | 10 NL | 11 VT | 12 NP | 13 CR | 14 SO | 15 SI |
| 16 DLE| 17 DC1| 18 DC2| 19 DC3| 20 DC4| 21 NAK| 22 SYN| 23 ETB|
| 24 CAN| 25 EM | 26 SUB| 27 ESC| 28 FS | 29 GS | 30 RS | 31 US |
| 32 SP | 33 ! | 34 " | 35 # | 36 $ | 37 % | 38 & | 39 ' |
| 40 ( | 41 ) | 42 * | 43 + | 44 , | 45 - | 46 . | 47 / |
| 48 0 | 49 1 | 50 2 | 51 3 | 52 4 | 53 5 | 54 6 | 55 7 |
| 56 8 | 57 9 | 58 : | 59 ; | 60 < | 61 = | 62 > | 63 ? |
| 64 @ | 65 A | 66 B | 67 C | 68 D | 69 E | 70 F | 71 G |
| 72 H | 73 I | 74 J | 75 K | 76 L | 77 M | 78 N | 79 O |
| 80 P | 81 Q | 82 R | 83 S | 84 T | 85 U | 86 V | 87 W |
| 88 X | 89 Y | 90 Z | 91 [ | 92 \ | 93 ] | 94 ^ | 95 _ |
| 96 ` | 97 a | 98 b | 99 c |100 d |101 e |102 f |103 g |
|104 h |105 i |106 j |107 k |108 l |109 m |110 n |111 o |
|112 p |113 q |114 r |115 s |116 t |117 u |118 v |119 w |
|120 x |121 y |122 z |123 { |124 | |125 } |126 ~ |127 DEL|
Characters are just numbers
What does this function do?

char                /* return type */
fun(char c)         /* procedure name; argument type and name */
{
    char new_c;     /* local variable type and name */

    /* Comparisons with characters! */
    if ((c >= 'A') && (c <= 'Z'))
        new_c = c - 'A' + 'a';    /* Math on characters! */
    else
        new_c = c;

    return (new_c);
}
Integers

Fundamental problem:
• Fixed-size representation can’t encode all numbers

Standard low-level solution:
• Limit number range and precision
  • Usually sufficient
  • Potential source of bugs

Signed and unsigned variants
• unsigned modifier can be used with any sized integer (short, long, or long long)
Integer Representations

Base 2   Base 16   Unsigned   2’s Comp.
0000     0          0          0
0001     1          1          1
0010     2          2          2
0011     3          3          3
0100     4          4          4
0101     5          5          5
0110     6          6          6
0111     7          7          7
1000     8          8         -8
1001     9          9         -7
1010     A         10         -6
1011     B         11         -5
1100     C         12         -4
1101     D         13         -3
1110     E         14         -2
1111     F         15         -1

[Figure: number wheel of the sixteen 4-bit patterns 0000 through 1111, each labeled with its hex digit, its unsigned value (0 to 15), and its 2’s complement value (-8 to 7).]

Why one more negative than positive?


Integer Representations

Math for n bits (same 4-bit table as on the previous slide):

Define $\vec{x} = [x_{n-1}, \ldots, x_0]$

$$B2U(\vec{x}) = \sum_{i=0}^{n-1} 2^i x_i$$

$$B2T(\vec{x}) = -2^{n-1} x_{n-1} + \sum_{i=0}^{n-2} 2^i x_i$$

Sign bit $x_{n-1}$: 0 = non-negative, 1 = negative
Integer Ranges

Unsigned
UMin_n … UMax_n = 0 … 2^n - 1:
32 bits: 0 ... 4,294,967,295                   unsigned int
64 bits: 0 ... 18,446,744,073,709,551,615      unsigned long int

2’s Complement
TMin_n … TMax_n = -2^(n-1) … 2^(n-1) - 1:
32 bits: -2,147,483,648 ... 2,147,483,647      int
64 bits: -9,223,372,036,854,775,808 ... 9,223,372,036,854,775,807      long int

Note: C numeric ranges are platform dependent!

#include <limits.h> to define ULONG_MAX, UINT_MAX, INT_MIN, INT_MAX, …
Bit Shifting as Multiplication

Shift left (x << 1) multiplies by 2:

0011 = 3    →   0110 = 6
1101 = -3   →   1010 = -6

Works for unsigned, 2’s complement

Can overflow

In decimal, same idea multiplies by 10: e.g., 42 → 420


Bit Shifting as Division

Logical shift right (x >> 1) divides by 2 for unsigned:

0111 = 7   →   0011 = 3
1001 = 9   →   0100 = 4

Always rounds down!

Arithmetic shift right (x >> 1) divides by 2 for 2’s complement:

0111 = 7    →   0011 = 3
1001 = -7   →   1100 = -4

Always rounds down!
Bit Shifting for Multiplication/Division

Why useful?
• Simpler, thus faster, than general multiplication & division
• Standard compiler optimization

Can shift multiple positions at once:
• Multiplies or divides by corresponding power-of-2
• a << 5 (multiply by 2^5 = 32), a >> 5 (divide by 2^5 = 32)
A Sampling of Integer Properties

For both unsigned & 2’s complement:

Mostly as usual, e.g.:
• 0 is identity for +, -
• 1 is identity for ×, ÷
• +, -, × are associative
• +, × are commutative
• × distributes over +, -

Some surprises, e.g.:
• ÷ doesn’t distribute over +, -
• ¬(a, b > 0 ⇒ a + b > a)

Why should you care?
– Programmer should be aware of behavior of their programs
– Compiler uses such properties in optimizations
Beware of Sign Conversions in C

Beware implicit or explicit conversions between unsigned and signed representations!

One of many common mistakes:

unsigned int u;
…
if (u > -1) …

What’s wrong?

Always false(!) because -1 is converted to unsigned, yielding UMax_n


Non-Integral Numbers: How?

Fixed-size representations
• Rational numbers (i.e., pairs of integers)
• Fixed-point (use integer, remember where point is)
• Floating-point (scientific notation)

Variable-size representations
• Sums of fractions (e.g., Taylor-series)
• Unbounded-length series of digits/bits
Floating-point

Binary version of scientific notation

1.001101110 × 2^5 = 100110.1110
                  = 32 + 4 + 2 + 1/2 + 1/4 + 1/8
                  = 38.875

-1.011 × 2^-3 = -.001011
              = -(1/8 + 1/32 + 1/64)
              = -.171875

(The "." in each binary number is the binary point.)
FP Overflow & Underflow

Fixed-sized representation leads to limitations

Large positive exponent (overflow):
• Unlike integer arithmetic, overflow → imprecise result (∞), not inaccurate result

Large negative exponent (underflow):
• Round to zero

[Figure: number line, left to right: negative overflow (round to -∞), expressible negative values, negative underflow (round to zero), zero, positive underflow (round to zero), expressible positive values, positive overflow (round to +∞).]
FP Representation

1.001101110 × 2^5
(significand: 1.001101110, exponent: 5)

Fixed-size representation
• Using more significand bits → increased precision
• Using more exponent bits → increased range

Typically, fixed # of bits for each part, for simplicity
FP Representation: IEEE 754

Current standard version of floating-point

Single-precision (float)
One word: 1 sign bit, 23 bit fraction, 8 bit exponent
Positive range: 1.17549435 × 10^-38 … 3.40282347 × 10^+38

Double-precision (double)
Two words: 1 sign bit, 52 bit fraction, 11 bit exponent
Positive range: 2.2250738585072014 × 10^-308 … 1.7976931348623157 × 10^+308

Lots of details in B&O Chapter 2.4


IEEE 754 Special Numbers

+0.0, -0.0
+∞, -∞
NaN: “Not a number”

(+1.0 × 10^+38)^2 = +∞        +0.0 ÷ +0.0 = NaN
+1.0 ÷ +0.0 = +∞              +∞ - +∞ = NaN
+1.0 ÷ -0.0 = -∞              √-1 = NaN
FP vs. Integer Results

int i = 20 / 3;
float f = 20.0 / 3.0;

True mathematical answer: 20 ÷ 3 = 6 2/3

i = ? 6            Integer division ignores remainder
f = ? 6.666667     FP arithmetic rounds result

FP vs. Integer Results

int i = 1000 / 6;
float f = 1000.0 / 6.0;

True mathematical answer: 1000 ÷ 6 = 166 2/3

i = ? 166          Integer division ignores remainder
f = ? 166.666672   FP arithmetic rounds result

Surprise!
Arithmetic in binary, printing in decimal: doesn’t always give expected result
FP ↔ Integer Conversions in C

#include <limits.h>
#include <stdio.h>

int
main(void)
{
    unsigned int ui = UINT_MAX;
    float f = ui;
    printf("ui: %u\nf: %f\n", ui, (double)f);
}

Surprisingly, this program prints the following. Why?

ui: 4294967295
f: 4294967296.000000

(A float has a 23-bit fraction, i.e., 24 significant bits, so 2^32 - 1 cannot be represented exactly and rounds to 2^32.)
FP ↔ Integer Conversions in C

int i = 3.3 * 5;
float f = i;

True mathematical answer: 3.3 × 5 = 16 ½

i = ? 16      Converts 5 → 5.0, truncates result 16 ½ → 16
f = ? 16.0

integer → FP:
• Can lose precision
• Rounds, if necessary
• 32-bit int fits in double-precision FP

FP → integer:
• Truncate fraction
• If out of range, undefined – not error
FP Behavior

Programmer must be aware of accuracy limitations!
Dealing with this is a subject of classes like CAAM 453

(10^10 + 10^30) + -10^30 =? 10^10 + (10^30 + -10^30)
10^30 - 10^30 =? 10^10 + 0
0 ≠ 10^10

Operations not associative!

(1.0 + 6.0) ÷ 640.0 =? (1.0 ÷ 640.0) + (6.0 ÷ 640.0)
7.0 ÷ 640.0 =? .001563 + .009375
.010937 ≠ .010938

×,÷ not distributive across +,-


Booleans

One bit representation
• 0 is false
• 1 is true

One byte or word representation
• Inconvenient to manipulate only one bit
• Two common encodings:

  Encoding 1:
    0000…0000 is false
    0000…0001 is true
    all other words are garbage

  Encoding 2:
    0000…0000 is false
    all other words are true

• Wastes space, but space is usually cheap


Booleans in C

#include <stdbool.h>    /* Important! Compiler needs this or it won't know about "bool"! */

bool bool1 = true;
bool bool2 = false;

bool added to C in 1999

Many programmers had already defined their own Boolean type
• To avoid conflict bool is disabled by default
C’s Common Boolean Operations

C extends definitions to integers
• Booleans are encoded as integers
  • 0 == false
  • non-0 == true
• Logical AND:  (0 && 4) == 0    (3 && 4) == 1    (3 && 0) == 0
• Logical OR:   (0 || 4) == 1    (3 || 4) == 1    (3 || 0) == 1
• Logical NOT:  (!4) == 0        (!0) == 1

&& and || short-circuit
• Evaluate 2nd argument only if necessary
• E.g., (0 && error-producing-code) == 0
Enumerated Types

E.g., a Color = red, blue, black, or yellow
• Small (finite) number of choices
• Booleans & characters are common special cases
• Pick arbitrary bit patterns for each

Not enforced in C
• Actually just integers
• Can assign values outside of the enumeration
• Could cause bugs
Enumerated Types in C

enum Color { RED, WHITE, BLACK, YELLOW };
enum Color my_color = RED;

The new type name is “enum Color”

Alternative style:

enum AColor { COLOR_RED, COLOR_WHITE,
              COLOR_BLACK, COLOR_YELLOW };
typedef enum AColor color_t;
color_t my_color = COLOR_RED;

Pre-C99 Boolean definition:

enum Bool { false = 0, true = 1 };
typedef enum Bool bool;
bool my_bool = true;

Cox Simple Data Types 30
