Lecture 4
Lecture’s Outline
The aim of the topic: explore how data can be stored and
understand how data is represented.
Agenda:
Data refers to the symbols that represent people, events, things, and ideas. Data can be a
name, a number, the colors in a photograph, or the notes in a musical composition.
Data Representation refers to the form in which data is stored, processed, and
transmitted.
● The 0s and 1s used to represent digital data are referred to as binary digits – from this
term we get the word bit that stands for binary digit.
● A bit is a 0 or 1 used in the digital representation of data.
● A digital file, usually referred to simply as a file, is a named collection of data that exists
on a storage medium, such as a hard disk, CD, DVD, or flash drive.
Bits and Bytes
All of the data stored and transmitted by digital devices is encoded as bits.
Terminology related to bits and bytes is extensively used to describe storage capacity and
network access speed. The word bit, an abbreviation for binary digit, can be further
abbreviated as a lowercase b. A group of eight bits is called a byte and is usually
abbreviated as an uppercase B.
When reading about digital devices, you’ll frequently encounter references such as 90
kilobits per second, 1.44 megabytes, 2.8 gigahertz, and 2 terabytes.
Kilo, mega, giga, tera, and similar terms are used to quantify digital data.
Use bits for data rates, such as Internet connection speeds, and movie download
speeds.
Use bytes for file sizes and storage capacities.
104 KB: Kilobyte (KB or Kbyte) is often used when referring to the size of small
computer files.
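Because rates are quoted in bits and sizes in bytes, estimating a transfer time needs a factor of eight. The following is a minimal sketch in Python, not part of the original lecture. It assumes decimal prefixes (1 kilobit = 1,000 bits, 1 kilobyte = 1,000 bytes), although many systems treat a kilobyte as 1,024 bytes. The helper name transfer_seconds is hypothetical.

```python
BITS_PER_BYTE = 8


def transfer_seconds(file_size_bytes: int, rate_bits_per_second: int) -> float:
    """Return the ideal time to move a file over a link, ignoring protocol overhead."""
    return file_size_bytes * BITS_PER_BYTE / rate_bits_per_second


# A 104 KB file over a 90 kilobit-per-second link (decimal prefixes assumed).
seconds = transfer_seconds(104 * 1000, 90 * 1000)
print(round(seconds, 2))
```

The result is only a lower bound, since real connections add overhead and rarely sustain their advertised rate.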
Representing Text
● Character data is composed of letters, symbols, and numerals that are not used in
calculations.
● Examples of character data include your name, address, and hair color.
● Character data is commonly referred to as “text.”
● Digital devices employ several types of codes to represent character data, including
ASCII, Unicode, and their variants.
● ASCII (American Standard Code for Information Interchange, pronounced “ASK ee”)
requires seven bits for each character.
● The ASCII code for an uppercase A is 1000001.
● Extended ASCII is a superset of ASCII that uses eight bits for each character.
● For example, Extended ASCII represents the uppercase letter A as 01000001.
● Using eight bits instead of seven bits allows Extended ASCII to provide codes for 256
characters.
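The ASCII codes quoted above can be checked with a short Python sketch. The helper name to_bits is hypothetical; ord and format are Python built-ins. The last line shows why one extra bit doubles the number of available codes.

```python
def to_bits(ch: str, width: int) -> str:
    """Return the numeric code of a single character as a zero-padded binary string."""
    return format(ord(ch), f"0{width}b")


print(to_bits("A", 7))  # 1000001  (seven-bit ASCII)
print(to_bits("A", 8))  # 01000001 (eight-bit Extended ASCII)
print(2 ** 7, 2 ** 8)   # 128 256  (codes available with 7 and 8 bits)
```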
Morse code
Problem: How to store non-English characters?
Early approach: every alphabet used its own encoding.
Problem: How to store text that contains letters from different alphabets?
Different encodings:
Windows-1250 for Central European languages that use Latin script (Polish, Czech, Slovak,
Hungarian, Slovene, Serbian, Croatian, Romanian and Albanian)
Windows-1251 for Cyrillic alphabets
Windows-1252 for Western languages
Windows-1253 for Greek
Windows-1254 for Turkish
Windows-1255 for Hebrew
Windows-… etc.
Using a different encoding for each script does not allow a single text to mix
several scripts.
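The problem with per-script encodings can be illustrated with Python's built-in codecs for the Windows code pages (named cp1250, cp1251, and so on in Python). This is a sketch, not part of the original lecture: the helper can_encode and the sample string are made up for illustration. One and the same byte decodes to a different letter under each code page, and no single code page can hold a text that mixes Cyrillic with accented Latin letters.

```python
import unicodedata

raw = bytes([0xE0])  # a single byte with the value 224

# The same byte stands for a different character in each code page.
for codec in ("cp1250", "cp1251", "cp1252", "cp1253"):
    ch = raw.decode(codec)
    print(codec, f"U+{ord(ch):04X}", unicodedata.name(ch))


def can_encode(text: str, codec: str) -> bool:
    """Report whether every character of text exists in the given code page."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


mixed = "Привет, café"  # Cyrillic and accented Latin in one string
print(can_encode(mixed, "cp1251"), can_encode(mixed, "cp1252"))
```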
Unicode (pronounced “YOU ni code”) originally used sixteen bits and provided codes for about
65,000 characters; the current standard has room for more than 1.1 million code points.
• This is a bonus for representing the alphabets of multiple languages.
• The modern version contains 28 ancient and historic scripts (alphabets) and 72 modern
scripts.
• UTF-8 is a variable-length coding scheme for Unicode that uses one byte (eight bits) for
ASCII characters and two to four bytes for all other characters.
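The variable length of UTF-8 can be observed directly in Python. This is a sketch with arbitrarily chosen sample characters; the helper name utf8_length is hypothetical. Each character is encoded and the number of resulting bytes is counted.

```python
def utf8_length(ch: str) -> int:
    """Return how many bytes UTF-8 needs for a single character."""
    return len(ch.encode("utf-8"))


# Latin letter, accented Latin letter, Cyrillic letter, euro sign, emoji.
samples = ["A", "é", "я", "€", "😀"]
for ch in samples:
    print(f"U+{ord(ch):04X}", utf8_length(ch), ch.encode("utf-8").hex())
```

For these samples the byte counts are 1, 2, 2, 3, and 4, so plain English text costs no more than in ASCII while other scripts remain representable.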
● ASCII codes are used for numerals, such as Social Security numbers and phone numbers.
● Plain, unformatted text is sometimes called ASCII text and is stored in a so-called text file with
a name ending in .txt.
● On Apple devices these files are labeled “Plain Text.” In Windows, these files are labeled “Text
Document”.
● Microsoft Word produces formatted text and creates documents in DOCX format.
● Apple Pages produces documents in PAGES format.
● Adobe Acrobat produces documents in PDF format.
● The HTML markup language, used for Web pages, produces documents in HTML format.
Images and colors
An image is a set of pixels.
A pixel is one cell on the screen; it holds exactly one color.
An image is stored as a sequence of pixels, each represented by its color.
Raster graphics
Raster graphics is a mechanism that represents a two-dimensional image as a
rectangular matrix or grid of square pixels, viewable via a computer display, paper, or
other display medium. A raster is technically characterized by the width and height of
the image in pixels and by the number of bits per pixel. Raster images are stored in
image files with varying dissemination, production, generation, and acquisition formats.
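Since a raster is characterized by its width, height, and bits per pixel, the size of the uncompressed pixel data follows from a single multiplication. This is a rough sketch in Python. The function name raster_size_bytes is hypothetical, and real image files also carry headers and are often compressed, so actual file sizes differ.

```python
def raster_size_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Return the bytes needed for uncompressed pixel data, rounded up to whole bytes."""
    total_bits = width * height * bits_per_pixel
    return (total_bits + 7) // 8


# A 1920 x 1080 image at 24 bits per pixel.
print(raster_size_bytes(1920, 1080, 24))  # 6220800
```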
Vector graphics
Vector graphics are stored as a list of attributes. The attributes are used by the computer
to create the graphic. Rather than storing the data for each pixel, the computer will
generate an object by looking at its attributes.
A vector graphic stores geometric information about the image.
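To make the idea of a vector graphic concrete, the sketch below stores a circle as three attributes and generates pixels from them only when asked. The Circle class, the rasterize function, and the use of coordinates scaled to the range 0 to 1 are illustrative assumptions, not a real graphics format. The same attributes can be rendered at any resolution.

```python
from dataclasses import dataclass


@dataclass
class Circle:
    """A vector shape: only attributes are stored, no pixels."""
    cx: float      # center x, as a fraction of the image width
    cy: float      # center y, as a fraction of the image height
    radius: float  # radius, as a fraction of the image size


def rasterize(circle: Circle, width: int, height: int) -> list[str]:
    """Generate a grid of '#' and '.' pixels from the circle's attributes."""
    rows = []
    for y in range(height):
        row = ""
        for x in range(width):
            # Sample the center of each pixel in normalized coordinates.
            px = (x + 0.5) / width
            py = (y + 0.5) / height
            inside = (px - circle.cx) ** 2 + (py - circle.cy) ** 2 <= circle.radius ** 2
            row += "#" if inside else "."
        rows.append(row)
    return rows


shape = Circle(cx=0.5, cy=0.5, radius=0.4)
small = rasterize(shape, 8, 8)    # coarse rendering
large = rasterize(shape, 16, 16)  # finer rendering from the same attributes
print("\n".join(small))
```

The stored description stays three numbers long regardless of output size, whereas a raster copy grows with the number of pixels.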
Approach #2: Combine three colors: Red, Green, Blue. Used in displays.
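The red, green, and blue components can be combined into a single value, as in the sketch below. It assumes eight bits per channel (24-bit color), which the lecture does not state. The function names rgb_to_hex and pack_rgb are hypothetical.

```python
def rgb_to_hex(red: int, green: int, blue: int) -> str:
    """Format three 0-255 channel values as a '#rrggbb' string."""
    for value in (red, green, blue):
        if not 0 <= value <= 255:
            raise ValueError("each channel must be in the range 0-255")
    return f"#{red:02x}{green:02x}{blue:02x}"


def pack_rgb(red: int, green: int, blue: int) -> int:
    """Pack three 8-bit channels into one 24-bit integer (red in the high byte)."""
    return (red << 16) | (green << 8) | blue


print(rgb_to_hex(255, 0, 0))      # #ff0000 (pure red)
print(rgb_to_hex(255, 255, 0))    # #ffff00 (red plus green gives yellow)
print(pack_rgb(255, 255, 255))    # 16777215 (white, all channels at maximum)
```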