Multimedia Module Corrected
By:
Computer Science Department
Multimedia Systems
Chapter 1
Introduction to Multimedia Systems
Hypertext
Hypermedia is the application of hypertext principles to a wider variety of media, including audio, animations,
video, and images.
Examples of Hypermedia Applications:
The World Wide Web (WWW) is the best example of hypermedia applications.
PowerPoint
Adobe Acrobat
Macromedia Director
Desirable Features for a Multimedia System
Given the above challenges, the following features are desirable for a Multimedia System:
1. Very high processing power. Why? Because there is a large amount of data to be processed. Multimedia systems deal with large volumes of data, and to process them in real time the hardware must have high processing capacity.
2. It should support different file formats. Why? Because we deal with different data types (media types).
3. Efficient and fast input/output: input and output to the file subsystem must be efficient and fast, and must allow for real-time recording as well as playback of data, e.g. direct-to-disk recording systems.
4. Special Operating System: to allow access to file system and process data efficiently and quickly. It has to
support direct transfers to disk, real-time scheduling, fast interrupt processing, I/O streaming, etc.
5. Storage and Memory: large storage units and large memory are required. Large Caches are also required.
6. Network Support: client-server operation is common, since multimedia systems are often distributed.
7. Software Tools: User-friendly tools needed to handle media, design and develop applications, deliver media.
Challenges of Multimedia Systems
1) Synchronization: in a multimedia application, a variety of media are used at the same instant, and there should be a temporal relationship between them, e.g. between a movie (video) and its sound track. This raises the issue of synchronization.
2) Data conversion: in a multimedia application, data is represented digitally, so analog data such as audio and video must be converted into digital form.
3) Compression and decompression: Why? Because multimedia deals with large amounts of data (e.g. movies, sound, etc.) which take a lot of storage space.
4) Rendering different data at the same time: continuous data (such as audio and video) must be delivered without interruption.
Chapter 2
2. Multimedia Software Tools
2.1. What is an Authoring System?
Authoring is the process of creating multimedia applications.
An authoring system is a program which has pre-programmed elements for the development of interactive
multimedia presentations.
Authoring tools provide an integrated environment for binding together the different elements of a Multimedia
production. Multimedia Authoring Tools provide tools for making a complete multimedia presentation where
users usually have a lot of interactive controls.
Multimedia presentations can be created using:
simple presentation packages such as PowerPoint
powerful RAD tools such as Delphi, .Net, JBuilder;
True Authoring environments, which lie somewhere in between in terms of technical complexity.
Authoring systems vary widely in:
- orientation,
- capabilities, and
- learning curve (how easy it is to learn how to use the application).
Why should you use an authoring system?
Can speed up programming i.e. content development and delivery
Time gains i.e. accelerated prototyping
The content creation (graphics, text, video, audio, animation) is not affected by choice of authoring
system
2.2. Characteristics of Authoring Tools
A good authoring tool should be able to:
integrate text, graphics, video, and audio to create a single multimedia presentation
Control interactivity by the use of menus, buttons, hotspots, hot objects etc.
publish as a presentation or a self-running executable; on CD/DVD, Intranet, WWW
Be extended through the use of pre-built or externally supplied components, plug-ins etc
let you create highly efficient, integrated workflow
Have a large user base.
2.3. Multimedia Authoring Paradigms
The authoring paradigm, or authoring metaphor, is the methodology by which the authoring system accomplishes its task. There are various paradigms:
Scripting Language
Icon-Based Control Authoring Tool
Card and Page Based Authoring Tool
Card and Page Based Authoring Tools
In these systems, elements are organized as pages of a book or cards in a stack, which can be viewed individually. They are extensible via XCMDs (External Commands) and DLLs (Dynamic Link Libraries), and allow all objects (including individual graphic elements) to be scripted. Many entertainment applications are prototyped in a card/scripting system prior to compiled-language coding. Each object may contain a programming script that is activated when an event occurs.
Examples: - HyperCard (Macintosh)
- SuperCard (Macintosh)
- ToolBook (Windows), etc.
Time Based Authoring Tools
In these authoring systems, elements are organized along a timeline. Sequentially organized graphic frames are played back at a speed set by the developer. Other elements, such as audio events, can be triggered at a given time or location in the sequence of events.
They are the most popular multimedia authoring tools.
They are best suited for applications that have a message with a beginning and an end, animation-intensive pages, or synchronized media.
Examples - Macromedia Director
- Macromedia Flash
Macromedia Director
Director is a powerful and complex multimedia authoring tool with a broad set of features for creating multimedia presentations, animations, and interactive applications. You assemble and sequence the elements of a project using the Cast and the Score. Director uses three important things to arrange and synchronize media elements:
Cast: the Cast is a multimedia database containing any media type to be included in the project. Director imports a wide range of data types and multimedia element formats directly into the Cast, and you can also create elements from scratch and add them to it. To include Cast members on the Stage, you drag and drop the media onto the Stage.
Score: this is where the elements in the Cast are arranged. It is the sequence for displaying, animating, and playing Cast members. The Score is made of frames, and frames contain Cast members. You can set the frame rate (frames per second).
Lingo: Lingo is a full-featured object-oriented scripting language used in Director.
It enables interactivity and programmed control of elements.
It enables you to control external sound and video devices.
It also enables you to control Internet operations such as sending mail and reading documents and images.
CHAPTER 3
3. DATA REPRESENTATIONS
3.1. Graphic/Image Data Representation
An image can be described as a two-dimensional array of points, where every point is allocated its own color. Every such point is called a pixel (short for picture element). An image is a collection of these points, colored in such a way that they produce meaningful information. A pixel holds the color or hue and relative brightness of that point in the image. The number of pixels in the image determines the resolution of the image.
A digital image consists of many picture elements, called pixels.
The number of pixels determines the image resolution, and hence the quality of the image.
Higher resolution yields better quality.
Bitmap resolution: most graphics applications let you create bitmaps of up to 300 dots per inch (dpi). Such high resolution is useful for print media, but on screen most of the information is lost, since monitors usually display around 72 to 96 dpi.
A bit-map representation stores the graphic/image data in the same manner that the computer monitor
contents are stored in video memory.
Most graphic/image formats incorporate compression because of the large size of the data.
Fig. 1: Pixels
3.2. Types of images
There are two basic forms of computer graphics: bit-maps and vector graphics. The kind you use determines
the tools you choose. Bitmap formats are the ones used for digital photographs. Vector formats are used only
for line drawings.
1. Bit-map images (also called raster graphics): they are formed from pixels, a matrix of dots with different colors. Bitmap images are defined by their dimensions in pixels as well as by the number of colors they represent. For example, a 640x480 image contains 640 pixels horizontally and 480 pixels vertically. If you enlarge a small area of a bit-mapped image, you can clearly see the pixels used to create it (to check this, open a picture in Flash and change the magnification to 800% via View -> Magnification -> 800%).
Each of the small pixels can be a shade of gray or a color. Using 24-bit color, each pixel can be set to any one
of 16 million colors. All digital photographs and paintings are bitmapped, and any other kind of image can be
saved or exported into a bitmap format. In fact, when you print any kind of image on a laser or ink-jet printer,
it is first converted by either the computer or printer into a bitmap form so it can be printed with the dots the
printer uses.
To edit or modify bitmapped images you use a paint program. Bitmap images are widely used but they suffer
from a few unavoidable problems. They must be printed or displayed at a size determined by the number of
pixels in the image. Bitmap images also have large file sizes that are determined by the image dimensions in
pixels and its color depth. To reduce this problem, some graphic formats such as GIF and JPEG are used to
store images in compressed format.
2. Vector graphics: They are really just a list of graphical objects such as lines, rectangles, ellipses, arcs,
or curves called primitives. Draw programs, also called vector graphics programs, are used to create and edit
these vector graphics. These programs store the primitives as a set of numerical coordinates and mathematical
formulas that specify their shape and position in the image. This format is widely used by computer-aided
design programs to create detailed engineering and design drawings. It is also used in multimedia when 3D
animation is desired. Draw programs have a number of advantages over paint-type programs.
These include:
Precise control over lines and colors.
Ability to skew and rotate objects to see them from different angles or add perspective.
Ability to scale objects to any size to fit the available space. Vector graphics always print at the best
resolution of the printer you use, no matter what size you make them.
Color blends and shadings can be easily changed.
Text can be wrapped around objects.
3. Monochrome/Bit-Map Images
Each pixel is stored as a single bit (0 or 1)
The value of the bit indicates whether it is light or dark
A 640 x 480 monochrome image requires 37.5 KB of storage.
Dithering is often used for displaying monochrome images
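The 37.5 KB figure follows directly from the dimensions and bit depth. A quick sketch to check it (the helper name is ours, not from the text):

```python
def image_storage_bytes(width, height, bits_per_pixel):
    # Uncompressed bitmap size in bytes, ignoring file-header overhead.
    return width * height * bits_per_pixel // 8

# A 640 x 480 monochrome image stores 1 bit per pixel:
mono_kb = image_storage_bytes(640, 480, 1) / 1024    # 37.5 KB, as stated above

# The same image in 24-bit color is 24 times larger:
color_kb = image_storage_bytes(640, 480, 24) / 1024  # 900 KB
```

The same arithmetic explains why most graphic formats incorporate compression.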
3. JPEG
Uses complex lossy compression, which allows the user to set the desired level of quality (compression). A compression setting of about 60% typically gives a good balance of quality and file size. Though JPEGs can be interlaced, they do not support animation or transparency, unlike GIF.
4. TIFF
Tagged Image File Format (TIFF), stores many different types of images (e.g., monochrome,
grayscale, 8-bit & 24-bit RGB, etc.)
Uses tags, keywords defining the characteristics of the image that are included in the file. For example, a picture 320 by 240 pixels would include a 'width' tag followed by the number '320' and a 'depth' tag followed by the number '240'.
TIFF is a lossless format (when not utilizing the new JPEG tag which allows for JPEG compression)
It does not provide any major advantages over JPEG and is not as user-controllable.
Do not use TIFF for web images. They produce big files, and more importantly, most web browsers
will not display TIFFs.
3.4.2. System Dependent Formats
1. Microsoft Windows: BMP
A system standard graphics file format for Microsoft Windows
Used in Many PC Graphics programs
It is capable of storing 24-bit bitmap images
2. Macintosh: PAINT and PICT
PAINT was originally used in the MacPaint program, initially only for 1-bit monochrome images.
PICT is a file format that was developed by Apple Computer in 1984 as the native format for
Macintosh graphics.
The PICT format is a meta-format that can be used for both bitmap and vector images, though it was originally used in MacDraw (a vector-based drawing program) for storing structured graphics.
It is still an underlying Mac format (although PDF is the native format on OS X).
3. X-windows: XBM
Primary graphics format for the X Window system
Stores monochrome (1-bit) bitmaps
Many public domain graphic editors, e.g., xv
Used in X Windows for storing icons, pixmaps, backdrops, etc.
Sound is a wave phenomenon: we perceive changes in air pressure, and the fluctuations in pressure, as sound waves. Sound waves are produced by a vibrating body, be it a guitar string, loudspeaker cone, or jet engine. The vibrating sound source disturbs the surrounding air molecules, causing them to bounce off each other with a force proportional to the disturbance. The back-and-forth oscillation of pressure produces a sound wave.
Source generates sound: acoustic (direct pressure variations) or electrical (a microphone produces an electric signal).
Destination receives sound: electrical (a loudspeaker) or the ears (which respond to pressure and hear sound).
3.5.1. Common Audio Formats
There are two basic types of audio files: the traditional discrete audio file, which you can save to a hard drive or other digital storage medium, and the streaming audio file, which you listen to as it downloads in real time from a network/Internet server to your computer.
1. Discrete Audio File Formats: common discrete audio file formats include WAV, AIF, AU, and MP3. A fifth format, called MIDI, is actually not a format for storing digital audio, but a system of instructions for creating electronic music.
2. WAV: The WAV format is the standard audio file format for Microsoft Windows applications, and is
the default file type produced when conducting digital recording within Windows. It supports a variety
of bit resolutions, sample rates, and channels of audio. This format is very popular on IBM PC (clone) platforms, and is widely used as a basic format for saving and modifying digital audio data.
3. AIF/AIFF: The Audio Interchange File Format (AIFF) is the standard audio format employed by
computers using the Apple Macintosh operating system. Like the WAV format, it supports a variety of
bit resolutions, sample rates, and channels of audio and is widely used in software programs used to
create and modify digital audio.
4. AU: The AU file format is a compressed audio file format developed by Sun Microsystems and
popular in the Unix world. It is also the standard audio file format for the Java programming language.
Only supports 8-bit depth thus cannot provide CD-quality sound.
5. MP3: MP3 stands for MPEG (Moving Picture Experts Group) Audio Layer 3 compression. MP3 files provide near-CD-quality sound. Because MP3 files are small, they can easily be transferred across the Internet and played on any multimedia computer with MP3 player software.
6. MIDI/MID (Musical Instrument Digital Interface): MIDI is not a file format for storing or transmitting recorded sounds, but rather a set of instructions used to play electronic music on devices
such as synthesizers. MIDI files are very small compared to recorded audio file formats. However, the
quality and range of MIDI tones is limited.
7. Streaming Audio File Formats: streaming is a network technique for transferring data from a server to a client in a format that can be continuously read and processed by the client computer. Using this
method, the client computer can start playing the initial elements of large time-based audio or video
files before the entire file is downloaded. As the Internet grows, streaming technologies are becoming
an increasingly important way to deliver time-based audio and video data.
Popular audio file formats are:
o au (Unix)
o aiff (MAC)
o wav (PC)
o mp3
MIDI
MIDI stands for Musical Instrument Digital Interface.
Definition of MIDI:
MIDI is a protocol that enables computers, synthesizers, keyboards, and other musical devices to communicate with each other. This protocol is a language that allows interworking between instruments from different manufacturers by providing a link capable of transmitting and receiving digital data. MIDI transmits only commands; it does not transmit an audio signal.
Sequencer: originally a hardware device that served as a storage server for MIDI data; nowadays it is more often a software music editor on the computer. It has one or more MIDI INs and MIDI OUTs.
Basic MIDI Concepts
Track: a track in a sequencer is used to organize recordings. Tracks can be turned on or off during recording or playback.
Channel: MIDI channels are used to separate information in a MIDI system. There are 16 MIDI channels in
one cable. Channel numbers are coded into each MIDI message.
Timbre: the quality of the sound, e.g., flute sound, cello sound, etc. Multitimbral means capable of playing many different sounds at the same time (e.g., piano, brass, drums, etc.).
Pitch: The Musical note that the instrument plays
Voice: the voice is the portion of the synthesizer that produces sound. Synthesizers can have many (12, 20, 24, 36, etc.) voices. Each voice works independently and simultaneously to produce sounds of different timbre and pitch.
Patch: The control settings that define a particular timbre.
Hardware Aspects of MIDI
MIDI connectors: Three 5-pin ports found on the back of every MIDI unit
MIDI IN: the connector via which the device receives all MIDI data.
MIDI OUT: the connector through which the device transmits all the MIDI data it generates itself.
MIDI THRU: the connector by which the device echoes the data it receives from MIDI IN. (See picture 8 for a diagrammatic view.)
MIDI Messages
MIDI messages are used by MIDI devices to communicate with each other. MIDI messages require very low bandwidth.
Chapter 4
4. COLOR IN IMAGE AND VIDEO
4.1. Color in Image and Video — Basics of Color
The Color of Objects
Here we consider the color of an object illuminated by white light. Color is produced by the absorption of selected wavelengths of light by an object. Objects can be thought of as absorbing all colors except the colors of their appearance, which are reflected back. A blue object illuminated by white light absorbs most of the wavelengths except those corresponding to blue light; these blue wavelengths are reflected by the object.
Fig.: White light, composed of all wavelengths of visible light, incident on a pure blue object. Only blue light is reflected from the surface.
4.2. Color Spaces
Color space specifies how color information is represented. It is also called color model. Any color could be
described in a three dimensional graph, called a color space. Mathematically the axis can be tilted or moved in
different directions to change the way the space is described, without changing the actual colors. The values
along an axis can be linear or non-linear. This gives a variety of ways to describe colors that have an impact
on the way we process a color image. There are different ways of representing color. Some of these are:
RGB color space
YUV color space
YIQ color space
CMY/CMYK color space
CIE color space
HSV color space
HSL color space
YCbCr color space
RGB Color Space: RGB stands for Red, Green, Blue. The RGB color space expresses color as a mixture of the three primaries: all other colors are produced by varying the intensity of these three primaries and mixing them. It is used in self-luminous devices such as TVs, monitors, cameras, and scanners.
CRT Displays
CRT displays have three phosphors (RGB) which produce a combination of wavelengths when excited with electrons.
The gamut of colors is the set of all colors that can be reproduced using the three primaries.
The gamut of a color monitor is smaller than that of perceptual color models, e.g. the CIE LAB model.
CMY and CMYK: a color model used with printers and other peripherals. Three primary colors, cyan (C), magenta (M), and yellow (Y), are used to reproduce all colors.
YIQ Color Model: YIQ is used in color TV broadcasting; it is downward compatible with black-and-white TV. The YIQ color space is commonly used in North American television systems.
YUV Color Model
One neat aspect of YUV is that you can throw away the U and V components and still get a grey-scale image: black-and-white TV receives only the Y (luminance) component and ignores the others, which makes YUV backward compatible with black-and-white TV. Since the human eye is more responsive to brightness than to color, many lossy image compression formats throw away half or more of the samples in the chroma channels (the color part) to reduce the amount of data, without severely degrading image quality.
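The luminance component can be sketched as a weighted sum of R, G, and B. The weights below are the common ITU-R BT.601 luma coefficients; the function name is our own illustration:

```python
def rgb_to_luma(r, g, b):
    # Weighted sum reflecting the eye's greater sensitivity to green,
    # then red, then blue (ITU-R BT.601 coefficients).
    return 0.299 * r + 0.587 * g + 0.114 * b

# Keeping only Y from each pixel gives the grey-scale picture that a
# black-and-white TV would display.
y = rgb_to_luma(200, 120, 40)      # an orange-ish pixel
grey_pixel = (round(y), round(y), round(y))
```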
CMY Color Model: the CMY model is used in printing, where inks subtract colors (e.g., no red light is reflected from cyan ink), and in painting.
CMYK color model: sometimes an alternative CMYK model (K stands for black) is used in color printing (e.g., to produce a darker black than simply mixing CMY), where
K = min(C, M, Y),
C = C - K,
M = M - K,
Y = Y - K.
Colors on self-luminous devices, such as televisions and computer monitors, are produced by adding the three
RGB primary colors in different proportions. However, color reproduction media, such as printed matter and
paintings, produce colors by absorbing some wavelengths and reflecting others. The three RGB primary
colors, when mixed, produce white, but the three CMY primary colors produce black when they are mixed
together. Since actual inks will not produce pure colors, black (K) is included as a separate color, and the
model is called CMYK. With the CMYK model, the range of reproducible colors is narrower than with RGB,
so when RGB data is converted to CMYK data, the colors seem dirtier.
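The K-extraction above can be written directly in code. An illustrative helper (the name is ours), working on C, M, Y ink values in the 0 to 1 range:

```python
def cmy_to_cmyk(c, m, y):
    # Pull the shared grey component out into black ink, as in the
    # formulas above: K = min(C, M, Y), then subtract K from each.
    k = min(c, m, y)
    return c - k, m - k, y - k, k

# A dark, muddy CMY mix: most of it becomes pure black ink.
dark = cmy_to_cmyk(0.9, 0.8, 0.7)
```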
Chapter 5
5. Basics of Digital Audio and Fundamental Concepts in Video
5.1. Digitizing Sound
A microphone produces an analog signal; a computer deals with digital signals.
Sampling Audio
Analog Audio
Most natural phenomena around us are continuous: they make continuous transitions between two different states. Sound is no exception to this rule; it too varies constantly. Continuously varying signals are represented by an analog signal: a continuous function f in the time domain, where for a value y = f(t) the argument t represents time. The graph of f is called a wave (see the following diagram).
When sound is recorded using a microphone, the microphone converts the sound into an analog representation. A computer cannot deal directly with analog signals, so the analog audio must be converted into digital audio. How? Read the next topic.
Analog to Digital Conversion
Converting analog audio to digital audio requires that the analog signal be sampled. Sampling is the process of taking periodic measurements of the continuous signal, at regular time intervals, i.e. every T seconds. How often these samples are taken is the sampling frequency (sampling rate). Digitized audio is sampled audio: many times each second, the analog signal is sampled. The amount of information stored about each sample is referred to as the sample size.
An analog signal is characterized by amplitude and frequency. Converting these waves to digital information is referred to as digitizing; the challenge is to convert the analog waves to numbers (digital information).
In digital form, the measure of amplitude (the 7-point scale, vertically) is represented with binary numbers (bottom of graph). The more levels on the scale, the better the quality of the sample, but more bits are needed to represent each sample. The graph below shows only 3 bits being used for each sample, but in practice either 8 or 16 bits are used to create all the levels of amplitude on a scale (music CDs use 16 bits for each sample).
In digital form, frequency refers to how often a sample is taken. In the graph below, the sample has been taken 7 times (reading across). Sampling frequency is usually expressed in kilohertz (kHz).
Hertz (Hz) = number of cycles per second
1 kHz = 1000 Hz
1 MHz = 1000 kHz
Music CDs use a sampling frequency of 44.1 kHz. A frequency of 22 kHz, for example, would mean that samples are taken less often.
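These figures fix the raw data rate of CD audio. A back-of-the-envelope sketch (the variable names are ours):

```python
SAMPLE_RATE = 44_100       # samples per second (44.1 kHz)
BITS_PER_SAMPLE = 16       # 2**16 = 65,536 amplitude levels
CHANNELS = 2               # stereo

levels = 2 ** BITS_PER_SAMPLE
bytes_per_second = SAMPLE_RATE * (BITS_PER_SAMPLE // 8) * CHANNELS
bytes_per_minute = bytes_per_second * 60   # roughly 10 MB per minute
```

At 176,400 bytes per second, a one-hour recording needs about 600 MB, which is one reason audio compression such as MP3 matters.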
Sampling means measuring the value of the signal at given time intervals. The samples are then quantized. Quantization is rounding the value of each sample to the nearest amplitude number on the scale: for example, if the amplitude of a specific sample is 5.6, it is rounded to the nearest value, here 6. Quantization thus assigns each sample a value from a fixed set. The quantized values are then changed to binary patterns, and the binary patterns are stored in the computer.
Example: the sampling points in the above diagram are A, B, C, D, E, F, G, H, and I. The value of the sample at point A falls between 2 and 3, say 2.6. This value is represented by the nearest number: we round it to 3. This 3 is then converted into binary and stored inside the computer. Similarly, the quantized values of the other sampling points are: B=1, C=3, D=1, E=3, F=1, G=2, H=3, I=1. After quantization, we convert the sample values into binary digits.
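The sample-then-round process in the example can be sketched as code; the sample values and the function name here are our own illustration:

```python
def quantize(samples, bits):
    # Round each measured amplitude to the nearest integer level and
    # express it as a binary pattern of the given width for storage.
    top = 2 ** bits - 1
    quantized = [min(round(s), top) for s in samples]
    patterns = [format(q, f"0{bits}b") for q in quantized]
    return quantized, patterns

# Sample A from the example: a measured value of 2.6 rounds to 3,
# which is stored as the 3-bit pattern 011.
values, patterns = quantize([2.6, 1.2, 3.0, 0.9], bits=3)
```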
5.2. Sample Rate
A sample is a single measurement of amplitude. The sample rate is the number of these measurements
taken every second. In order to accurately represent all of the frequencies in a recording that fall within the
range of human perception, generally accepted as 20 Hz to 20 kHz, we must choose a sample rate high
enough to represent all of these frequencies. At first consideration, one might choose a sample rate of 20
KHz since this is identical to the highest frequency. This will not work, however, because every cycle of a
waveform has both a positive and negative amplitude and it is the rate of alternation between positive and
negative amplitudes that determines frequency. Therefore, we need at least two samples for every cycle
resulting in a sample rate of at least 40 KHz.
5.3. Sampling Theorem
Sampling frequency/rate is very important in order to accurately reproduce a digital version of an analog
waveform.
Nyquist's Theorem:
The sampling frequency for a signal must be at least twice the highest frequency component in the signal:
sample rate = 2 x highest frequency (at minimum)
When the sampling rate is lower than or equal to the Nyquist rate (twice the highest frequency in the signal), the condition is known as under-sampling. According to the sampling theorem, it is then impossible to rebuild the original signal.
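The condition is easy to state as a small check. A minimal sketch (function names ours, not from the text):

```python
def min_sample_rate(highest_freq_hz):
    # Nyquist: sample at more than twice the highest frequency component.
    return 2 * highest_freq_hz

def is_undersampled(sample_rate_hz, highest_freq_hz):
    # At or below the Nyquist rate the original signal cannot be rebuilt.
    return sample_rate_hz <= min_sample_rate(highest_freq_hz)
```

For the 20 kHz upper limit of human hearing this gives a minimum of 40 kHz, which is why music CDs sample at 44.1 kHz.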
Aliasing
What exactly happens to frequencies that lie above the Nyquist frequency? First, we’ll look at a
frequency that was sampled accurately:
In this case, there are more than two samples for every cycle, and the measurement is a good
approximation of the original wave: when we later convert it back to analog, we get the same signal we put in.
Remember: speakers can play only analog sound, so digital audio must be converted back to analog for playback. If we under-sample the signal, though, we get a very different result:
Under-sampling causes frequency components higher than half of the sampling frequency to overlap with the lower frequency components. As a result, the higher frequency components fold into the reconstructed signal and distort it. This type of signal distortion is called aliasing.
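The apparent frequency of an under-sampled tone can be computed with the usual folding rule; this small helper is our illustration, not from the text:

```python
def aliased_frequency(f_hz, sample_rate_hz):
    # After sampling, a tone 'folds' back around multiples of the
    # sampling rate into the 0 .. sample_rate/2 band.
    f = f_hz % sample_rate_hz
    return min(f, sample_rate_hz - f)

# A 26 kHz tone sampled at 40 kHz masquerades as a 14 kHz tone.
alias = aliased_frequency(26_000, 40_000)
```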
A digital copy does not lose any of its original sharpness and clarity: the image is an exact copy of the original. A computer is the most common form of digital technology. The limitations of analog video led to the birth of digital video.
Digital video is simply a digital representation of the analog video signal. Unlike analog video, which degrades in quality from one generation to the next, digital video does not degrade: each generation of digital video is identical to the parent. Even though the data is digital, virtually all digital formats are still stored on sequential tape.
There are two significant advantages of using computers for digital video:
- the ability to randomly access the stored video, and
- the ability to compress the stored video.
Computer-based digital video is defined as a series of individual images and associated audio. These
elements are stored in a format in which both elements (pixel and sound sample) are represented as a
series of binary digits (bits).
Analog vs. Digital Video
An analog copy can be very similar to the original video, but it is not identical; digital copies will always be identical and will not lose their sharpness and clarity over time. However, digital video is limited by the amount of RAM available, which is not a factor with analog video. Digital technology allows for easy editing and enhancing of videos, and storage of analog video tapes is much more cumbersome than digital video CDs. Clearly, with new technology continuously emerging, this debate will keep changing.
Recording Video: a CCD (Charge-Coupled Device) is a chip containing a series of tiny, light-sensitive photo sites; it forms the heart of all electronic and digital cameras. CCDs can be thought of as film for electronic cameras. They consist of thousands or even millions of cells, each of which is light-sensitive and capable of producing varying amounts of charge in response to the amount of light it receives.
A digital camera uses a lens to focus the image onto a Charge-Coupled Device (CCD), which then converts the image into electrical pulses. These pulses are then saved into memory. In short, just as the
film in a conventional camera records an image when light hits it, the CCD records the image
electronically. The photo sites convert light into electrons. The electrons pass through an analog-to-digital
converter, which produces a file of encoded digital information in which bits represent the color and tonal
values of a subject. The performance of a CCD is often measured by its output resolution, which in turn is
a function of the number of photo sites on the CCD's surface.
5.4.2.1. Types of Color Video Signals
1. Component video: each primary is sent as a separate video signal. The primaries can be either RGB or a luminance-chrominance transformation of them (e.g., YIQ or YUV).
HDTV (High-Definition Television): HDTV delivers television signals with a higher resolution than traditional formats (NTSC, SECAM, PAL) allow. Except
for early analog formats in Europe and Japan, HDTV is broadcast digitally, and therefore its introduction sometimes coincides with the introduction of digital television (DTV).
- Modern plasma televisions use this.
- It consists of 720 to 1080 lines and a higher number of pixels per line (as many as 1920).
- A choice between progressive and interlaced scanning is one advantage of HDTV; many people have their preferences.
The amount of space required for video is excessive. The way to reduce this data to a manageable level is to compromise on the quality of the video to some extent. This is done by lossy compression, which discards some of the original data.
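A concrete calculation makes the point; the frame size and rate below are our illustrative choices, not figures from the text:

```python
WIDTH, HEIGHT = 640, 480
BYTES_PER_PIXEL = 3          # 24-bit color
FRAMES_PER_SECOND = 25

frame_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL
bytes_per_second = frame_bytes * FRAMES_PER_SECOND
gb_per_hour = bytes_per_second * 3600 / 1024 ** 3    # roughly 77 GB per hour
```

Uncompressed, that is about 22 MB every second; hence the need for lossy compression.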
7.4. Compression Algorithms
Compression methods use mathematical algorithms to reduce (or compress) data by eliminating, grouping
and/or averaging similar data found in the signal. Although there are various compression methods,
including Motion JPEG, only MPEG-1 and MPEG-2 are internationally recognized standards for the
compression of moving pictures (video). A simple characterization of data compression is that it involves
transforming a string of characters in some representation (such as ASCII) into a new string (of bits, for
example) which contains the same information but whose length is as small as possible. Data compression
has important application in the areas of data transmission and data storage.
The proliferation of computer communication networks is resulting in massive transfer of data over
communication links. Compressing data to be stored or transmitted reduces storage and/or communication
costs. When the amount of data to be transmitted is reduced, the effect is that of increasing the capacity of
the communication channel.
Lossless compression: is a method of reducing the size of computer files without losing any information.
That means when you compress a file, it will take up less space, but when you decompress it, it will still
have the exact same information. The idea is to get rid of any redundancy in the information; this is
exactly what is done in ZIP and GIF files. This differs from lossy compression, such as in JPEG
files, which loses some information that is not very noticeable. Why use lossless compression?
You can use lossless compression whenever space is a concern but the information must stay the same. An
example is sending text files over a modem or the Internet: if the files are smaller, they will arrive
faster. However, the files received at the destination must be identical to the ones sent. Modems apply
LZW compression automatically to speed up transfers.
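As a quick sanity check on the lossless property, the sketch below uses Python's standard zlib module, which implements the same DEFLATE family of compression used in ZIP files. The sample data is an arbitrary assumption chosen for its redundancy:

```python
import zlib

# Repetitive sample data: lossless compressors exploit exactly this redundancy.
data = b"AAAABBBCCD the the the " * 100

packed = zlib.compress(data)

# Decompression recovers the exact same bytes: no information is lost.
assert zlib.decompress(packed) == data

# The compressed form is smaller, so it transmits faster over a modem or network.
assert len(packed) < len(data)
print(len(data), "->", len(packed), "bytes")
```

The first assertion is the definition of lossless compression: compress then decompress, and you get back exactly what you put in.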
Shannon-Fano Coding
Let us assume the source alphabet S = {X1, X2, X3, …, Xn} and the associated probabilities
P = {p1, p2, p3, …, pn}. The steps to encode data using the Shannon-Fano coding algorithm are as
follows: order the source letters into a sequence according to their probability of occurrence in non-increasing (i.e., decreasing) order.
ShannonFano(sequence S)
  if S has two letters
    attach 0 to the codeword of one letter and 1 to the codeword of the other;
  else if S has more than two letters
    divide S into two subsequences S1 and S2 with the minimal difference between
      the total probabilities of the two subsequences;
    extend the codeword of each letter in S1 by attaching 0, and of each letter
      in S2 by attaching 1;
    ShannonFano(S1);
    ShannonFano(S2);
The message is transmitted using the following code (by traversing the tree)
A=00 B=01
C=10 D=110
E=111
Instead of transmitting ABCDE, we transmit 000110110111.
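The recursive procedure above can be sketched in Python. The probabilities below are an assumption (the figure they come from is not reproduced here), chosen so that the splits reproduce the module's codes:

```python
def shannon_fano(symbols):
    """Shannon-Fano codes. `symbols` is a list of (symbol, probability)
    pairs already sorted in non-increasing probability order."""
    codes = {sym: "" for sym, _ in symbols}

    def split(seq):
        if len(seq) == 1:
            codes[seq[0][0]] += "0"  # degenerate one-symbol alphabet
            return
        if len(seq) == 2:
            codes[seq[0][0]] += "0"
            codes[seq[1][0]] += "1"
            return
        # Find the split point minimising the difference between the
        # total probabilities of the two halves.
        total = sum(p for _, p in seq)
        running, best_i, best_diff = 0.0, 1, float("inf")
        for i in range(1, len(seq)):
            running += seq[i - 1][1]
            diff = abs((total - running) - running)
            if diff < best_diff:
                best_diff, best_i = diff, i
        s1, s2 = seq[:best_i], seq[best_i:]
        for sym, _ in s1:
            codes[sym] += "0"  # left half extends with 0
        for sym, _ in s2:
            codes[sym] += "1"  # right half extends with 1
        if len(s1) > 1:
            split(s1)
        if len(s2) > 1:
            split(s2)

    split(symbols)
    return codes

# Assumed probabilities that yield the codes shown in the text.
codes = shannon_fano([("A", 0.3), ("B", 0.2), ("C", 0.2), ("D", 0.15), ("E", 0.15)])
print(codes)  # {'A': '00', 'B': '01', 'C': '10', 'D': '110', 'E': '111'}
print("".join(codes[ch] for ch in "ABCDE"))  # 000110110111
```

With this distribution the first split separates {A, B} (probability 0.5) from {C, D, E} (probability 0.5), and the recursion produces exactly the five codewords and the 12-bit encoding of ABCDE given above.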
7.4.2. Dictionary Encoding
Dictionary coding uses groups of symbols, words, and phrases with corresponding abbreviations. It
transmits the index of the symbol/word instead of the word itself. There are different variations of
dictionary-based coding: LZ77 (published in 1977), LZ78 (published in 1978), LZSS, and LZW (Lempel-Ziv-Welch).
LZW Compression
LZW compression has its roots in the work of Jacob Ziv and Abraham Lempel. In
1977, they published a paper on "sliding-window" compression, and followed it with another paper in
1978 on "dictionary" based compression. These algorithms were named LZ77 and LZ78, respectively.
Then in 1984, Terry Welch made a modification to LZ78 which became very popular and was called
LZW.
The Concept
Many files, especially text files, have certain strings that repeat very often, for example " the ". With the
spaces, the string takes 5 bytes, or 40 bits, to encode. But what if we were to add the whole string to the list
of characters? Then every time we came across " the ", we could send its code instead of
32, 116, 104, 101, 32. This would take fewer bits.
This is exactly the approach that LZW compression takes. It starts with a dictionary of all the single
characters, with indexes 0-255. It then expands the dictionary as information gets sent through.
Redundant strings will then be coded, and compression has occurred.
The Algorithm:
LZWEncoding()
  enter all letters into the dictionary;
  initialize string s to the first letter from the input;
  while any input is left
    read symbol c;
    if s+c exists in the dictionary
      s = s+c;
    else
      output codeword(s); // codeword for s
      enter s+c into the dictionary;
      s = c;
  end loop
  output codeword(s);
Example: encode the string "aababacbaacbaadaaa"
The program reads one character at a time. If the current work string plus the new character is in the
dictionary, it appends the character to the work string and waits for the next one. This occurs on the first
character as well. If the extended work string is not in the dictionary (such as when the second character
comes across), it adds the extended work string to the dictionary and sends over the wire the code of the
work string without the new character. It then sets the work string to the new character.
Example: Encode the message aababacbaacbaadaaa using the above algorithm
Encoding: Create a dictionary of the letters found in the message (a=1, b=2, c=3, d=4).
Read the first letter of the message into s (s=a). Read the next letter into c (c=a). Check if s+c (aa) is
found in the dictionary. It is not found. Then add s+c (aa) to the dictionary and output the codeword for s
(s=a), which is 1. Then initialize s to c (s=c=a).
Read the next letter from the message into c (c=b). Check if s+c (ab) is found in the dictionary. It is not found.
Then, add s+c (ab) to the dictionary and output the codeword for s (s=a), which is 1. Then initialize s to
c (s=c=b).
Encoder Dictionary
Input (s+c)   Output    Index   Entry
                        1       a
                        2       b
                        3       c
                        4       d
aa            1         5       aa
ab            1         6       ab
Read the next letter into c (c=a).
Check if s+c (s+c=ba) is found in the dictionary. It is not found. Then add s+c (s+c=ba) to the dictionary.
Then output the codeword for s (s=b). It is 2. Then initialize s to c (s=c=a).
Read the next letter into c (c=b). Then check if s+c (s+c=ab) is found in the dictionary. It is
there. Then initialize s to s+c (s=s+c=ab).
Read again the next letter into c (c=a). Then check if s+c (s+c=aba) is found in the dictionary. It is
not there. Then transmit the codeword for s (s=ab). The code is 6. Initialize s to c (s=c=a).
Again read the next letter into c and continue the same way till the end of the message. At the end you will
have the following encoding table.
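The trace above can be reproduced with a short Python sketch of the pseudocode. The dictionary is seeded with the four letters of this example, indexed from 1 as in the table:

```python
def lzw_encode(message):
    """LZW encoding following the module's pseudocode. The initial
    dictionary holds the letters a, b, c, d with indexes starting at 1,
    as in the worked example (a real implementation would seed all 256
    single-byte values instead)."""
    dictionary = {"a": 1, "b": 2, "c": 3, "d": 4}
    next_index = 5
    output = []
    s = message[0]                      # work string starts as the first letter
    for c in message[1:]:
        if s + c in dictionary:
            s = s + c                   # extend the work string
        else:
            output.append(dictionary[s])        # emit codeword for s
            dictionary[s + c] = next_index      # learn the new string
            next_index += 1
            s = c                       # restart from the new character
    output.append(dictionary[s])        # flush the final work string
    return output

print(lzw_encode("aababacbaacbaadaaa"))
# [1, 1, 2, 6, 1, 3, 7, 9, 11, 4, 5, 1]
```

The first three outputs (1, 1, 2) and the dictionary entries aa=5, ab=6, ba=7 match the steps worked through above.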
Huffman Compression
When we encode characters in computers, we assign each an 8-bit code based on an ASCII chart.
But in most files, some characters appear more often than others. So wouldn't it make more sense
to assign shorter codes to characters that appear more often and longer codes to characters that
appear less often? D. A. Huffman published a paper in 1952 that improved on this idea, and the
resulting method, appropriately named Huffman coding, soon superseded Shannon-Fano coding.
Find the two nodes with the smallest weights. Create a parent node for these two nodes, and give
this parent node a weight equal to the sum of the weights of the two nodes.
Remove the two nodes from the list, and add the parent node. This way, the nodes with the
highest weight will be near the top of the tree and have shorter codes.
for each letter, create a tree with a single root node, and order all trees according to
  the probability of occurrence of the letter;
while more than one tree is left
  take the two trees t1 and t2 with the lowest probabilities p1 and p2, and create a tree
    with probability p1+p2 in its root and with t1 and t2 as its subtrees;
associate 0 with each left branch and 1 with each right branch;
create a unique codeword for each letter by traversing the tree from the root to the leaf
  corresponding to this letter and putting all encountered 0s and 1s together;
To read the codes from a Huffman tree, start from the root and add a 0 every time you go left to
a child, and add a 1 every time you go right. So in this example, the code for the character b is
01 and the code for d is 110.
As you can see, a has a shorter code than d. Notice that since all the characters are at the leaves of
the tree, there is never a chance that one code will be the prefix of another (e.g., we can never get
a=01 and b=011, since 01 is a prefix of 011). This unique prefix property assures that each code can be uniquely decoded.
The code for each letter is:
a=000 b=001
c=010 d=011
e=1
The original message will be encoded to: abcde=0000010100111
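The merge-the-two-smallest procedure can be sketched in Python with a heap. The probabilities below are an illustrative assumption (the module's tree figure is not reproduced); note that the exact 0/1 labels depend on tie-breaking, but the code lengths and the prefix property always hold:

```python
import heapq
from itertools import count

def huffman_codes(freqs):
    """Build Huffman codes from a {symbol: probability} map.
    Left branches are labelled 0 and right branches 1, as in the text."""
    tiebreak = count()  # unique counter so the heap never compares two trees
    heap = [(p, next(tiebreak), sym) for sym, p in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Repeatedly merge the two lowest-probability trees under a parent
        # whose weight is the sum of its children's weights.
        p1, _, t1 = heapq.heappop(heap)
        p2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (p1 + p2, next(tiebreak), (t1, t2)))
    codes = {}
    def walk(tree, code):
        if isinstance(tree, tuple):
            walk(tree[0], code + "0")  # left branch: 0
            walk(tree[1], code + "1")  # right branch: 1
        else:
            codes[tree] = code or "0"  # single-symbol alphabet edge case
    walk(heap[0][2], "")
    return codes

# Assumed probabilities: e is far more frequent than a, b, c, d.
codes = huffman_codes({"a": 0.125, "b": 0.125, "c": 0.125, "d": 0.125, "e": 0.5})
print(codes)
print(sum(len(codes[ch]) for ch in "abcde"), "bits for abcde")
```

With p(e) = 0.5 and the other four letters at 0.125 each, e gets a 1-bit code and a-d get 3-bit codes, so abcde encodes in 13 bits, matching the example above.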
Arithmetic Coding
Arithmetic coding encodes the whole message as a single subinterval of [0,1):
ArithmeticEncoding(message)
  CurrentInterval = [0,1);
  while the end of the message is not reached
    read letter Xi from the message;
    divide CurrentInterval into subintervals IR_CurrentInterval;
    CurrentInterval = SubInterval_i of CurrentInterval;
  output bits uniquely identifying CurrentInterval;
Assume the source alphabet S = {X1, X2, X3, …, Xn} and the associated probabilities
P = {p1, p2, p3, …, pn}.
To calculate subinterval i of the current interval [L,R), use the following formula:
[L + (R-L)*P(i-1), L + (R-L)*P(i))
Cumulative probabilities are indicated using capital P and single probabilities are indicated
using small p, with P(0) = 0.
Example: take the source alphabet {a, b, c, #} with probabilities p = {0.4, 0.3, 0.1, 0.2}. The
cumulative probabilities are:
P1 = 0.4
P2 = 0.4 + 0.3 = 0.7
P3 = 0.4 + 0.3 + 0.1 = 0.8
P4 = 0.4 + 0.3 + 0.1 + 0.2 = 1.0
Now the question is, which one of the SubIntervals will be the CurrentInterval? To determine
this, read the first letter of the message. It is a. Look where a is found in the source alphabet. It is
found at the beginning. So the next CurrentInterval will be [0,0.4), which is also found at the
beginning of the SubIntervals.
Again let us calculate the SubIntervals of CurrentInterval [0,0.4). The cumulative probabilities
do not change, i.e., they are the same as before.
IR[0,0.4) = {[0,0.16), [0.16,0.28), [0.28,0.32), [0.32,0.4)}.
Which interval will be the next CurrentInterval? Read the next letter from the message. It is b. b
is found in the second place in the source alphabet list, so the next CurrentInterval will be the
second SubInterval, i.e., [0.16,0.28).
Continue like this till there is no letter left in the message. You will get the following result:
IR[0.16,0.28) = {[0.16,0.208), [0.208,0.244), [0.244,0.256), [0.256,0.28)}. Next,
IR[0.208,0.244) = {[0.208,0.2224), [0.2224,0.2332), [0.2332,0.2368), [0.2368,0.244)}. Next,
IR[0.2332,0.2368) = {[0.2332,0.23464), [0.23464,0.23572), [0.23572,0.23608), [0.23608,0.2368)}.
We are done because no more letters remain in the message. The last letter read was #. It is the
fourth letter in the source alphabet, so take the fourth SubInterval as the CurrentInterval, i.e., [0.23608,
0.2368). Now any number within the last CurrentInterval is sent as the message: you can
send 0.23608 as the encoded message, or any number between 0.23608 and 0.2368.
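The interval narrowing above can be checked with a small Python sketch. The message abbc# and the probabilities are taken from the worked example; the source alphabet {a, b, c, #} is deduced from the letters it mentions:

```python
def arithmetic_encode(message, symbols, probs):
    """Narrow [low, high) using the module's formula: subinterval i of
    [L,R) is [L + (R-L)*P(i-1), L + (R-L)*P(i))."""
    cum = [0.0]                         # cumulative probabilities, P(0) = 0
    for p in probs:
        cum.append(cum[-1] + p)
    low, high = 0.0, 1.0                # CurrentInterval starts as [0,1)
    for ch in message:
        i = symbols.index(ch)           # position of the letter in the alphabet
        width = high - low
        low, high = low + width * cum[i], low + width * cum[i + 1]
    return low, high

low, high = arithmetic_encode("abbc#", list("abc#"), [0.4, 0.3, 0.1, 0.2])
print(low, high)  # approximately 0.23608 and 0.2368
```

Running it reproduces the final CurrentInterval [0.23608, 0.2368) derived step by step above, so any number in that range (for instance 0.23608) identifies the message abbc#.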