Lecture 4

Uploaded by

kjuzym06
Copyright
© © All Rights Reserved

Data representation

Lecture №4
Lecture’s Outline
Aim of the topic: explore how data can be stored and understand how data is
represented.

Agenda:

● Bits and Bytes
● Representing Text
● Images and Colors
Data
Computers are machines that do stuff with information. They let you view, listen, create,
and edit information in documents, images, videos, sound, spreadsheets and databases.
They let you play games in simulated worlds that don’t really exist except as information
inside the computer’s memory and displayed on the screen.

Data refers to the symbols that represent people, events, things, and ideas. Data can be a
name, a number, the colors in a photograph, or the notes in a musical composition.

Data Representation refers to the form in which data is stored, processed, and
transmitted.
● The 0s and 1s used to represent digital data are referred to as binary digits – from this
term we get the word bit that stands for binary digit.
● A bit is a 0 or 1 used in the digital representation of data.
● A digital file, usually referred to simply as a file, is a named collection of data that exists
on a storage medium, such as a hard disk, CD, DVD, or flash drive.
Bits and Bytes

All of the data stored and transmitted by digital devices is encoded as bits.
Terminology related to bits and bytes is extensively used to describe storage capacity and
network access speed. The word bit, an abbreviation for binary digit, can be further
abbreviated as a lowercase b. A group of eight bits is called a byte and is usually
abbreviated as an uppercase B.
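
As a small illustration of how bits group into bytes, the Python sketch below prints a few bytes as groups of eight bits. The helper name `to_bits` is made up for this example and is not part of the lecture.

```python
def to_bits(data: bytes) -> str:
    """Return each byte of `data` as a group of eight 0s and 1s."""
    return " ".join(format(byte, "08b") for byte in data)

# One byte is eight bits, so it can hold 2**8 different values.
values_per_byte = 2 ** 8

# Three sample bytes: the smallest value, the value 1, and the largest value.
bits = to_bits(bytes([0, 1, 255]))

print(values_per_byte)
print(bits)
```

Each group of eight digits in the output is one byte.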

When reading about digital devices, you’ll frequently encounter references such as 90
kilobits per second, 1.44 megabytes, 2.8 gigahertz, and 2 terabytes.

● Kilo, mega, giga, tera, and similar terms are used to quantify digital data.
● Use bits for data rates, such as Internet connection speeds and movie download
speeds.
● Use bytes for file sizes and storage capacities.
● 104 KB: Kilobyte (KB or Kbyte) is often used when referring to the size of small
computer files.
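
Because data rates are given in bits and file sizes in bytes, comparing them needs a factor of eight. The sketch below estimates how long the 104 KB file from the example would take at 90 kilobits per second. It assumes decimal prefixes (1 KB = 1000 bytes, 1 kilobit = 1000 bits) and ignores any network overhead; the function name is illustrative only.

```python
BITS_PER_BYTE = 8

def transfer_seconds(size_bytes: float, rate_bits_per_second: float) -> float:
    """Time needed to send `size_bytes` at the given bit rate."""
    return size_bytes * BITS_PER_BYTE / rate_bits_per_second

# Assumption: decimal prefixes, so kilo means 1000.
file_size_bytes = 104 * 1000   # 104 KB
rate = 90 * 1000               # 90 kilobits per second

seconds = transfer_seconds(file_size_bytes, rate)
print(round(seconds, 2))
```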
Representing Text
● Character data is composed of letters, symbols, and numerals that are not used in
calculations.
● Examples of character data include your name, address, and hair color.
● Character data is commonly referred to as “text.”
● Digital devices employ several types of codes to represent character data, including
ASCII, Unicode, and their variants.
● ASCII (American Standard Code for Information Interchange, pronounced “ASK ee”)
requires seven bits for each character.
● The ASCII code for an uppercase A is 1000001.
● Extended ASCII is a superset of ASCII that uses eight bits for each character.
● For example, Extended ASCII represents the uppercase letter A as 01000001.
● Using eight bits instead of seven bits allows Extended ASCII to provide codes for 256
characters.
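
The ASCII facts above can be checked with a few lines of Python. This is only a sketch using the built-in `ord` and `format` functions; the variable names are chosen for the example.

```python
letter = "A"
code = ord(letter)  # numeric code of the character

# The same code written with seven bits (ASCII) and eight bits (Extended ASCII).
ascii_7bit = format(code, "07b")
extended_8bit = format(code, "08b")

# Number of different codes available with seven and with eight bits.
ascii_codes = 2 ** 7
extended_codes = 2 ** 8

print(code, ascii_7bit, extended_8bit)
print(ascii_codes, extended_codes)
```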
Representing Text
Morse code
Representing Text
Problem: how to store non-English characters?
Early approach: every alphabet used its own encoding.
Problem: how to store text that contains letters from different alphabets?

Different encodings:
Windows-1250 for Central European languages that use Latin script (Polish, Czech, Slovak,
Hungarian, Slovene, Serbian, Croatian, Romanian and Albanian)
Windows-1251 for Cyrillic alphabets
Windows-1252 for Western languages
Windows-1253 for Greek
Windows-1254 for Turkish
Windows-1255 for Hebrew
Windows-… etc.
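
To see why separate encodings cause trouble, the sketch below decodes one and the same byte with three of the code pages listed above. It uses Python's built-in codecs `cp1251`, `cp1252`, and `cp1253`; the byte value 0xE1 is an arbitrary choice for the example.

```python
# One byte, with no information about which encoding it belongs to.
raw = bytes([0xE1])

code_pages = ["cp1251", "cp1252", "cp1253"]

# Decode the same byte under each code page.
decoded = {name: raw.decode(name) for name in code_pages}

for name, char in decoded.items():
    print(name, char, f"U+{ord(char):04X}")
```

The byte turns into a Cyrillic, a Western, or a Greek letter depending on the code page, so the reader of a file must already know which encoding the writer used.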

How to represent all of them in one document?


Representing Text
Problem: how to write the following text:
‫شخص جيد‬
‫אדם טוב‬
좋은 사람
καλό πρόσωπο
មនុស្សម្នាក់ដ៏ល្អ
நல்ல நபர்

Using a different encoding for each script does not allow you to write text that mixes
several scripts in one document.
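
A minimal sketch of this problem in Python: a short string that mixes Greek and Korean cannot be encoded with any single one of the Windows code pages tried here. The sample string and the list of code pages are picked for the example.

```python
# Text that mixes two scripts (Greek and Korean).
text = "καλό 좋은"

code_pages = ["cp1251", "cp1252", "cp1253", "cp1255"]

failed = []
for name in code_pages:
    try:
        text.encode(name)
    except UnicodeEncodeError:
        # This code page has no code for at least one character.
        failed.append(name)

print(failed)
```

Even the Greek code page fails, because it has no codes for the Korean characters.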
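Representing Text
Unicode (pronounced “YOU ni code”) was originally designed as a sixteen-bit code, which
provides codes for 65,536 characters. The current standard extends the code space to
1,114,112 code points (U+0000 to U+10FFFF).
• This is a bonus for representing the alphabets of multiple languages.
• Unicode covers both ancient and historic scripts and modern scripts; one version listed
28 ancient and historic scripts (alphabets) and 72 modern scripts, and later versions
add more.
• UTF-8 is a variable-length coding scheme that uses one byte (eight bits) for ASCII
characters and two to four bytes for all other characters.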
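
A quick way to see the variable-length behaviour of UTF-8 is to count the bytes Python produces for characters from different scripts. This is only a sketch; the sample characters are chosen for the example.

```python
# Characters from different parts of Unicode.
samples = ["A", "α", "좋", "😀"]

# Number of bytes each character needs in UTF-8.
byte_lengths = {ch: len(ch.encode("utf-8")) for ch in samples}
print(byte_lengths)

# Text mixing several scripts fits into one UTF-8 byte sequence.
mixed = "καλό 좋은 사람"
encoded = mixed.encode("utf-8")
round_trip = encoded.decode("utf-8")
print(len(mixed), "characters ->", len(encoded), "bytes")
```

Decoding the bytes gives back exactly the original mixed-script text.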
Representing Text
● ASCII codes are used for numerals, such as Social Security numbers and phone numbers.
● Plain, unformatted text is sometimes called ASCII text and is stored in a so-called text file with
a name ending in .txt.
● On Apple devices these files are labeled “Plain Text.” In Windows, these files are labeled “Text
Document”.
● Microsoft Word produces formatted text and creates documents in DOCX format.
● Apple Pages produces documents in PAGES format.
● Adobe Acrobat produces documents in PDF format.
● HTML markup language used for Web pages produces documents in HTML format.
Images and colors
An image is a set of pixels.
A pixel is one cell on the screen and holds exactly one color.
An image is stored as a sequence of pixels, each represented by its color.
Images and colors
Raster graphics
Raster graphics is a mechanism that represents a two-dimensional image as a
rectangular matrix or grid of square pixels, viewable via a computer display, paper, or
other display medium. A raster is technically characterized by the width and height of
the image in pixels and by the number of bits per pixel. Raster images are stored in
image files with varying dissemination, production, generation, and acquisition formats.
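
Since a raster is characterized by width, height, and bits per pixel, its uncompressed size follows directly from these three numbers. The sketch below assumes no compression and no file header; the function name and the 1920 x 1080 example are illustrative.

```python
def raster_size_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Uncompressed size of the pixel data, rounded up to whole bytes."""
    total_bits = width * height * bits_per_pixel
    return (total_bits + 7) // 8

# Example: a 1920 x 1080 image with 24 bits per pixel.
size = raster_size_bytes(1920, 1080, 24)
print(size, "bytes")
```

Real image files are usually smaller than this because most formats compress the pixel data.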
Images and colors
Vector graphics
Vector graphics are stored as a list of attributes. The attributes are used by the computer
to create the graphic. Rather than storing the data for each pixel, the computer will
generate an object by looking at its attributes.
A vector file saves geometrical information about the image. In contrast, a raster graphic
stores a rectangular grid of pixels.
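
The idea of storing attributes instead of pixels can be sketched as follows. The `Circle` class and the `rasterize` function are hypothetical names invented for this example; real vector formats describe many more shapes and attributes.

```python
from dataclasses import dataclass

@dataclass
class Circle:
    """A shape stored only by its attributes (coordinates in the range 0..1)."""
    cx: float
    cy: float
    r: float

def rasterize(shape: Circle, size: int) -> list[list[int]]:
    """Generate a size x size pixel grid from the attributes (1 = inside)."""
    grid = []
    for row in range(size):
        line = []
        for col in range(size):
            # Center of the pixel in shape coordinates.
            x = (col + 0.5) / size
            y = (row + 0.5) / size
            inside = (x - shape.cx) ** 2 + (y - shape.cy) ** 2 <= shape.r ** 2
            line.append(1 if inside else 0)
        grid.append(line)
    return grid

circle = Circle(cx=0.5, cy=0.5, r=0.4)
small = rasterize(circle, 8)    # low resolution
large = rasterize(circle, 32)   # same attributes, higher resolution
```

The same three numbers produce a grid of any size, which is why a vector image can be zoomed without loss of quality.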
Images and colors
Raster (Bitmap) vs Vector graphics
Images and colors
Raster vs Vector
● Loads faster: Raster
● Can be zoomed without loss of quality: Vector
● Takes less memory for simple figures: Vector
● Used in typography: Vector
● Best for real-world images: Raster
● Try to understand why.
Images and colors
Approach #1: Combine three colors: Cyan, Magenta, Yellow. Used in printers.

Approach #2: Combine three colors: Red, Green, Blue. Used in displays.

Computers mostly store every color as a combination of red, green, and blue.
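
The two approaches are related: in a simplified model, each printer color is the complement of one display color. The sketch below assumes this simple complement relation with 8-bit components; it is not from the lecture and ignores black ink and the color profiles that real printers use.

```python
def rgb_to_cmy(r: int, g: int, b: int) -> tuple[int, int, int]:
    """Simplified conversion: each CMY component is 255 minus the RGB one."""
    return (255 - r, 255 - g, 255 - b)

def cmy_to_rgb(c: int, m: int, y: int) -> tuple[int, int, int]:
    """Inverse of the simplified conversion above."""
    return (255 - c, 255 - m, 255 - y)

# Pure red on a display: no cyan, full magenta and full yellow in this model.
red_as_cmy = rgb_to_cmy(255, 0, 0)
print(red_as_cmy)
```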


Images and colors
The number of colours that can be represented in a bitmapped image is dictated by
the bit depth.

Bit depth            Available colours
8 bits per pixel     256 (2^8)
16 bits per pixel    65,536 (2^16)
24 bits per pixel    16,777,216 (2^24)
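
The values in the table follow from the rule that n bits give 2^n combinations. The sketch below recomputes them and also shows one common way to fit a colour into 24 bits, with eight bits each for red, green, and blue. The `pack_rgb` helper and its byte order are assumptions made for this example.

```python
# Number of colours for each bit depth: 2 to the power of the depth.
colours = {depth: 2 ** depth for depth in (8, 16, 24)}
for depth, count in colours.items():
    print(f"{depth} bits per pixel: {count:,} colours")

def pack_rgb(r: int, g: int, b: int) -> int:
    """Pack three 8-bit components into one 24-bit number (red in the top bits)."""
    return (r << 16) | (g << 8) | b

packed = pack_rgb(255, 128, 0)
print(format(packed, "024b"))
```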


Images and colors
PBM: file format to represent bitmap (black-and-white) images. In a PBM file, 1 means
black and 0 means white.
PNM: family of file formats that also covers grayscale (PGM) and color (PPM) images.
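
As a sketch of what such a file looks like, the code below builds the text of a tiny image in the plain PBM variant (magic number `P1`) as defined by the Netpbm project, where 1 marks a black pixel and 0 a white pixel. The function name `to_plain_pbm` and the sample bitmap are made up for this example.

```python
def to_plain_pbm(rows: list[list[int]]) -> str:
    """Build the text of a plain PBM (P1) file from rows of 0/1 pixels."""
    height = len(rows)
    width = len(rows[0])
    lines = ["P1", f"{width} {height}"]          # magic number, then size
    lines += [" ".join(str(bit) for bit in row) for row in rows]
    return "\n".join(lines) + "\n"

# A 3 x 3 image with a black diagonal (1 = black in plain PBM).
bitmap = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]
pbm_text = to_plain_pbm(bitmap)
print(pbm_text)
```

Saving this text in a file with a `.pbm` extension should give an image that Netpbm-aware viewers can open.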
Thank you for your attention!
