CSC 121 Computers and Scientific Thinking David Reed Creighton University
Data Representation
analog signals represent data in a way that is analogous to real life
signals can vary continuously across an infinite range of values
e.g., frequencies on an old-fashioned radio with a dial
digital signals utilize only a finite set of values
e.g., frequencies on a modern radio with a digital display
the major tradeoff between analog and digital is variability vs. reproducibility
analog allows for a (potentially) infinite number of unique signals, but they are harder to reproduce
good for storing data that is highly variable but does not need to be reproduced exactly
digital signals limit the number of representable signals, but they are easily remembered and reproduced
good for storing data when reproducibility is paramount
Binary Numbers
modern computers save and manipulate data as discrete (digital) values
the most effective systems use two distinct (binary) states for data representation
in essence, all data is stored as binary numbers
in the binary number system, all values are represented using only the two binary digits 0 and 1, which are called bits
binary representation
Decimal Binary
algorithm for converting from decimal (D) to binary (B):
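One common way to carry out this conversion is repeated division by 2, collecting the remainders from right to left. The sketch below assumes that approach and uses a hypothetical function name, `decimal_to_binary`; it may differ in detail from the steps intended here.

```python
def decimal_to_binary(d):
    """Convert a non-negative integer D to its binary string B."""
    if d < 0:
        raise ValueError("this sketch handles non-negative integers only")
    if d == 0:
        return "0"
    b = ""
    while d > 0:
        b = str(d % 2) + b   # the remainder is the next bit, right to left
        d = d // 2           # drop the bit just recorded
    return b

print(decimal_to_binary(19))  # 10011
```

Reading the result back, 10011 is 16 + 2 + 1 = 19.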
Representing Integers
when an integer value must be saved on a computer, its binary equivalent can be encoded as a bit pattern and stored digitally
usually, a fixed size (e.g., 32 bits) is used for each integer so that the computer knows where one integer ends and another begins
the initial bit in each pattern acts as the sign bit (0=positive, 1=negative)
negative numbers are represented in two's complement notation
the "largest" bit pattern (all 1s) corresponds to the negative number with the smallest absolute value (-1)
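To make the sign bit and two's complement notation concrete, here is a minimal sketch that prints the stored bit pattern for a few integers. The function name `twos_complement` is hypothetical, and 8 bits are assumed (rather than 32) only to keep the patterns short.

```python
def twos_complement(value, bits=8):
    """Return the bit pattern that stores `value` in two's complement."""
    lowest = -(1 << (bits - 1))
    highest = (1 << (bits - 1)) - 1
    if not lowest <= value <= highest:
        raise ValueError("value does not fit in the given number of bits")
    # Masking maps each negative value onto a pattern whose first bit is 1.
    return format(value & ((1 << bits) - 1), "0{}b".format(bits))

for n in (5, 127, -1, -128):
    print(n, twos_complement(n))
```

With 8 bits, 5 is stored as 00000101 and -1 as 11111111, so the all-ones pattern is indeed -1.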
a real number can be stored as a fraction and an exponent, much like scientific notation
real numbers stored in this format are known as floating point numbers, since the decimal point moves (floats) to normalize the fraction
standard formats exist for storing real numbers, using either 32 bits (single precision) or 64 bits (double precision)
most programming languages represent integers and reals differently
JavaScript simplifies things by using IEEE double-precision floating point for all numbers
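As an illustration of the double-precision format, the sketch below splits a 64-bit value into its three fields: 1 sign bit, 11 exponent bits, and 52 fraction bits. The function name `double_fields` is hypothetical; the exponent is stored with a bias of 1023.

```python
import struct

def double_fields(x):
    """Split a 64-bit double into sign, exponent, and fraction bit strings."""
    (raw,) = struct.unpack(">Q", struct.pack(">d", x))  # reinterpret the 8 bytes as an integer
    bits = format(raw, "064b")
    return bits[0], bits[1:12], bits[12:]

sign, exponent, fraction = double_fields(-0.75)
print(sign)                     # 1 (negative)
print(int(exponent, 2) - 1023)  # -1, since 0.75 = 1.5 x 2^-1
print(fraction[:4])             # 1000, the .5 in the normalized 1.5
```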
Representing Characters
characters have no natural correspondence to binary numbers
computer scientists devised an arbitrary system for representing characters as bit patterns
ASCII (American Standard Code for Information Interchange) maps each character to a specific 8-bit pattern (standard ASCII defines 7-bit codes, which are typically stored in one 8-bit byte)
note that all digits are contiguous, as are all lowercase and all uppercase letters
'0' < '1' < ... < '9'    'A' < 'B' < ... < 'Z'    'a' < 'b' < ... < 'z'
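The ordering can be checked directly with Python's built-in `ord`, which returns a character's code. The helper name `ascii_bits` below is hypothetical and is only a small sketch.

```python
def ascii_bits(ch):
    """Return the 8-bit pattern for a single ASCII character."""
    code = ord(ch)
    if code > 127:
        raise ValueError("not an ASCII character")
    return format(code, "08b")

for ch in "09AZaz":
    print(ch, ord(ch), ascii_bits(ch))

# Contiguous digits mean a digit's value is its offset from '0'.
print(ord("7") - ord("0"))  # 7
```

Because the codes are contiguous within each group, comparing characters amounts to comparing their numeric codes.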
Representing Text
strings can be represented as sequences of ASCII codes, one for each character in the string
specific programs may store additional information along with the ASCII codes
e.g., programming languages will often store the number of characters along with the ASCII codes
e.g., word processing programs will insert special character symbols to denote formatting (analogous to HTML tags in a Web page)
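A minimal sketch of the first example: a string stored as its character count followed by one ASCII code per character. The function name `encode_with_length` and the exact layout are assumptions for illustration; real languages differ in how they store the length.

```python
def encode_with_length(text):
    """Return [length, code, code, ...] for an ASCII string."""
    codes = list(text.encode("ascii"))  # one code per character
    return [len(codes)] + codes

print(encode_with_length("Hi!"))  # [3, 72, 105, 33]
```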
Representing Sounds
computers are capable of representing much more than numbers and text
sounds are inherently analog signals with specific amplitudes and frequencies
when sound waves reach your ear, they cause your eardrum to vibrate, and your brain interprets the vibration as sound
e.g., telephones translate a waveform into electrical signals, which are then sent over a wire and converted back to sound
e.g., phonographs interpret waveforms stored in the grooves of a disk (similar to audio cassettes)
analog signals cannot be reproduced exactly, but this is not usually a problem since the human ear is unlikely to notice small inconsistencies
digital recordings can be reproduced exactly without any deterioration in sound quality
analog waveforms must be converted to a sequence of discrete values
digital sampling is the process in which the amplitude of a wave is measured at regular intervals, and stored as discrete measurements
frequent measurements must be taken to ensure high quality (e.g., 44,100 readings per second for a CD)
this results in massive amounts of storage
techniques are used to compress the data and reduce file sizes (e.g., MP3; WAV files, by contrast, usually hold uncompressed samples)
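Digital sampling and its storage cost can be sketched as follows. The names `sample_wave` and `bytes_needed` are hypothetical, the wave is an idealized sine tone, and the defaults of 2 bytes per sample and 2 channels are assumptions reflecting standard CD audio rather than figures given above.

```python
import math

def sample_wave(freq_hz, rate_hz, n_samples, levels=256):
    """Measure a sine wave at regular intervals and round each reading
    to one of `levels` discrete values (0 .. levels - 1)."""
    samples = []
    for i in range(n_samples):
        t = i / rate_hz                                  # time of this reading
        amplitude = math.sin(2 * math.pi * freq_hz * t)  # between -1.0 and 1.0
        samples.append(round((amplitude + 1) / 2 * (levels - 1)))
    return samples

def bytes_needed(seconds, rate_hz=44100, bytes_per_sample=2, channels=2):
    """Uncompressed storage for a recording of the given length."""
    return seconds * rate_hz * bytes_per_sample * channels

print(sample_wave(440, 44100, 8))  # first 8 readings of a 440 Hz tone
print(bytes_needed(60))            # 10584000 bytes for one minute
```

Under these assumptions one minute of audio takes roughly 10 MB, which is why compression matters.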
Representing Images
EXAMPLE: representing images
images are stored using a variety of formats and compression techniques
the simplest representation is a bitmap
bitmaps partition an image into a grid of picture elements, called pixels, and then convert each pixel into a bit pattern
bitmaps that are divided into smaller pixels will yield higher resolution images
the left image is stored using 96 pixels per square inch, and the right image is stored using 48 pixels per square inch
the left image appears sharper, but has twice the storage requirements
the most common system is to translate each pixel into a 24 bit code, known as its RGB value: 8 bits to represent the intensity of each red/green/blue component
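A small sketch of how three 8-bit intensities combine into one 24-bit code. The function names `pack_rgb` and `unpack_rgb` are hypothetical, and placing red in the highest 8 bits is an assumed (though common) ordering.

```python
def pack_rgb(red, green, blue):
    """Combine three 0-255 intensities into a single 24-bit RGB value."""
    for component in (red, green, blue):
        if not 0 <= component <= 255:
            raise ValueError("each component must fit in 8 bits")
    return (red << 16) | (green << 8) | blue

def unpack_rgb(code):
    """Split a 24-bit RGB value back into its three components."""
    return (code >> 16) & 0xFF, (code >> 8) & 0xFF, code & 0xFF

orange = pack_rgb(255, 165, 0)
print(format(orange, "024b"))  # 111111111010010100000000
print(unpack_rgb(orange))      # (255, 165, 0)
```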
common image formats implement various compression techniques to reduce storage size
GIF (Graphics Interchange Format)
a lossless format, meaning no information is lost in the compression
commonly used for precise pictures, such as line drawings
JPEG (Joint Photographic Experts Group)
a lossy format, so the compression is not fully reversible (but more efficient)
commonly used for photographs
how does the computer know what type of data a bit pattern represents?
short answer: it doesn't
when a program stores data in memory, it must store additional information as to what type of data the bit pattern represents
thus, the same bit pattern might represent different values in different contexts
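To see one bit pattern take on different meanings, the sketch below reads the same 4 bytes as ASCII text, as a 32-bit integer, and as a single-precision real. The function name `interpret` and the sample bytes are chosen purely for illustration.

```python
import struct

def interpret(pattern):
    """Read the same 4 bytes as text, as an integer, and as a real number."""
    return {
        "text": pattern.decode("ascii"),
        "integer": struct.unpack(">i", pattern)[0],  # 32-bit signed integer
        "real": struct.unpack(">f", pattern)[0],     # 32-bit floating point
    }

views = interpret(bytes([0x42, 0x48, 0x49, 0x21]))
for kind, value in views.items():
    print(kind, value)
```

The bytes never change; only the type information supplied by the program decides which value they stand for.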