DCA3142
GRAPHICS AND MULTIMEDIA
Unit 1
Introduction to Computer Graphics and
Graphics System
1. INTRODUCTION
The art of drawing images using computer programming is known as computer graphics. It entails calculations, data creation, and manipulation. In other words, computer graphics can be thought of as a tool for creating and modifying images. Graphics can be two- or three-dimensional, while images can be completely synthetic or produced by manipulating photographs. So far, many powerful tools have been developed to visualize data.
Computer graphics has emerged as a sub-field of computer science which studies methods to digitally synthesize and manipulate visual content. Over the past decade, other specialized fields have evolved, such as information visualization and scientific visualization, the latter primarily focused on the visualization of three-dimensional phenomena, where the emphasis is on realistic renderings of volumes and surfaces.
Computer generated imagery can be categorized into several different types: 2D, 3D, and
animated graphics. With improvements in technology, 3D computer graphics have become
common, but 2D computer graphics too are widely used.
In this unit we will discuss graphics, passive and interactive picture presentation, image processing concepts such as picture analysis and visualization, the RGB colour model, direct coding, and the lookup table.
1.1 Objectives:
The term computer graphics has been broadly used to describe "almost everything on a
computer that is not text or sound". Typically, the term computer graphics refers to several
different things: the representation and manipulation of image data by a computer, the
various technologies used to create and manipulate images, the images so produced, and the
sub-field of computer science which studies methods for digitally synthesizing and
manipulating visual content. Today, computers and computer-generated images touch many
aspects of daily life. Computer imagery is found on television, in newspapers, for example in
weather reports, or in all kinds of medical investigation and surgical procedures. A well-
constructed graph can present complex statistics in a form that is easier to understand and
interpret. In the media, such graphs are used to illustrate papers, reports, theses, and other presentation material.
There are many other areas that involve computer graphics, but whether they are core graphics areas is debatable. These will all be touched upon in the text. Such areas include the following:
User Interaction: It deals with the interface between input devices such as the mouse and tablet, the application, feedback to the user in imagery, and other types of sensory feedback.
Historically, this area is associated with graphics largely because graphics researchers were
among the first to have access to the input/output devices that are now ubiquitous.
Virtual Reality: It attempts to immerse the user into a 3D virtual world. This typically
requires at least stereo graphics and response to head motion. For true virtual reality, sound
and force feedback should be provided as well. Because this area requires advanced 3D
graphics and advanced display technology, it is closely associated with graphics.
Visualization: Today, there is a much greater need for visualization than ever before. Data
visualization, for example, aids in discovering insights from data, and we also need the right
kind of visualization to check and study the behaviour of processes around us. This can be
done by using computer graphics in the right way.
Image Processing: Many kinds of photographs and pictures require editing before they can be used in different settings. One of the many applications of computer graphics is the processing of existing images into more refined ones for the purpose of improved interpretation.
Computational Photography: This deals with the use of computer graphics, computer
vision, and image processing methods to enable new ways of photographically capturing
objects, scenes, and environments.
Graphical User Interface: The use of photos, images, icons, pop-up menus, and graphical objects helps build a user-friendly environment in which working is simple and enjoyable. By employing computer graphics, we can create a setting where tasks can be automated and anyone can easily accomplish the required action.
SELF-ASSESSMENT QUESTIONS - 1
1. ___________ deals with the interface between input devices such as the mouse and tablets.
2. Virtual reality attempts to immerse the user into a 3D virtual world. State
True/False.
3. Visualization attempts to give users insight into complex information via ___________ .
4. ___________ helps in producing more sharp and detailed drawings with superior
specifications.
➢ Image enhancement
➢ Pattern detection and recognition
➢ Scene analysis and computer vision
Image enhancement deals with the improvement of image quality by eliminating noise or by increasing image contrast. Pattern detection and recognition deal with the detection and classification of standard patterns and with finding deviations from these patterns. Optical character recognition (OCR) technology is a practical example of pattern detection and recognition. Scene analysis and computer vision deal with the recognition and construction of 3D models of a scene from several 2D images.
The above three fields of image processing have proved their importance in many areas, such as fingerprint detection and recognition, and the modelling of buildings, ships, automobiles, and so on.
In their early stages, computer graphics and the computer processing of pictures (image processing) were quite separate disciplines. Nowadays they share some common features, the overlap between them is growing, and both use raster displays.
1. Visualization
Most people are familiar with the digital animations produced to present
meteorological data during weather reports on television, though only a few can
distinguish between the models of reality and the satellite photos that are also shown
on such programs. Television offers scientific visualizations when it shows computer
drawn and animated reconstructions of road or airplane accidents. Some of the most
popular examples of scientific visualizations are: computer-generated images that
show a real spacecraft in action, out in the void far beyond Earth, or on other planets.
Dynamic forms of visualization, such as educational animation or timelines, have the
potential to enhance learning about systems that change over time.
Apart from the distinction between interactive visualizations and animation, the most useful categorization is probably between abstract and model-based scientific visualizations.
Management and business graphics: Decision-making systems and graphic displays of data;
Education and instruction: Techniques for enhancing children's and adults' visual thinking
and creative abilities;
Electronic CAD/CAM: Generation of printed wiring board and integrated circuit design symbols and schematics;
Printing and publishing: Text and graphic integration in printed documents, page-layout
software, scanning systems, capability for direct-to-plate printing;
Statistical illustrations: Graphical ways for displaying vast quantities of data to enhance
comprehension of data analysis;
Visual arts and design: Computer graphics for graphic design, industrial design, advertising,
and interior design; standards based on design principles about colour, proportion,
positioning, and orientation of visual elements.
SELF-ASSESSMENT QUESTIONS - 2
Interactive graphics allow users to exert a great deal of influence over the layout and make
substantial modifications. Graphic designs that are interactive facilitate two-way contact
between users and the design. When you click a button on a website, use an app on a smartphone, use an ATM, a photo booth, an airport check-in station, or play a video game, you are interacting with interactive visuals. Even the operating system of a computer uses interactive graphics.
• Today, a high-quality graphics display on a personal computer offers one of the most natural means of communication with a computer.
• It provides tools for producing pictures not only of concrete, “real-world” objects but
also of abstract, synthetic objects such as mathematical surfaces in 4D, and of data that
may have no inherent geometry such as survey results.
• It has an ability to show moving pictures, and thus it is possible to produce animations
with interactive graphics.
• With interactive graphics one can also control the animation by adjusting the speed.
Interactive graphics provides a tool called motion dynamics. With this tool, a user can move and tumble objects with respect to a stationary observer, or can make objects stationary with the viewer moving around them. A typical example is a walk-through made by apartment builders to show the interiors of an apartment as well as the surroundings of the building. In many cases it is also possible to move both the objects and the viewer.
• The interactive graphics also provides a facility called update dynamics. With update
dynamics it is possible to change the shape, colour or other properties of the objects
being viewed.
• With the recent development of digital signal processing (DSP) and audio synthesis chips, interactive graphics can now provide audio feedback along with graphical feedback to make the simulated environment even more realistic.
Passive Graphics: It is also called non-interactive graphics. In non-interactive computer graphics, the image is rendered on the monitor and the user has no control over the rendered image, i.e., the user cannot alter it. Non-interactive graphics features only one-way communication between the computer and the user. The user is able to view the generated image but cannot alter it. One example is the titles shown on TV.
SELF-ASSESSMENT QUESTIONS - 3
RGB is an abbreviation for Red, Green, and Blue. This colour space is widely used in computer graphics. Red, green, and blue are the primary colours from which other colours are derived; here red, green, and blue light is added together in different ways to reproduce a broad array of colours. The main purpose of the RGB colour model is the sensing, representation, and display
of images in electronic systems, such as televisions and computers, though it has also been
used in conventional photography. RGB is a device-dependent colour model. Different
devices detect or reproduce a given RGB value differently, since the colour elements (such
as phosphors or dyes) and their response to the individual R, G, and B levels vary from
manufacturer to manufacturer, or even in the same device over time. Thus an RGB value does
not define the same colour across devices without some kind of colour management.
Typical RGB input devices are video cameras, image scanners, and digital cameras. Typical
RGB output devices are TV sets of various technologies (CRT, LCD, plasma, and so on.),
computer and mobile phone displays, video projectors, multicolor LED displays, and large
screens such as JumboTron. Colour printers, on the other hand, are not RGB devices, but
subtractive colour devices (typically CMYK colour model).
To form a colour with RGB, three colored light beams (one red, one green, and one blue) must
be superimposed (for example by emission from a black screen, or by reflection from a white
screen). Each of the three beams is called a component of that colour, and each of them can
have an arbitrary intensity, from fully off to fully on, in the mixture. The RGB colour model is
additive in the sense that the three light beams are added together, and their light spectra
add, wavelength for wavelength, to make the final colour's spectrum.
Zero intensity for each component gives the darkest colour (no light, considered to be black), and full intensity of each gives white. The quality of this white depends on the nature of the
primary light sources, but if they are properly balanced, the result is a neutral white
matching the system's white point. When the intensities for all the components are the same,
the result is a shade of gray, darker or lighter depending on the intensity. When the
intensities are different, the result is a colourized hue, more or less saturated depending on
the difference of the strongest and weakest of the intensities of the primary colours
employed.
When one of the components has the strongest intensity, the colour is a hue near this primary
colour (reddish, greenish, or bluish), and when two components have the same strongest
intensity, then the colour is a hue of a secondary colour (a shade of cyan, magenta or yellow).
A secondary colour is formed by the sum of two primary colours of equal intensity: cyan is
green +blue, magenta is red +blue and yellow is red +green. Every secondary color is the
complement of one primary color. When a primary and its complementary secondary colour
are added together, the result is white: cyan complements red, magenta complements green
and yellow complements blue.
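As a small illustration of additive mixing, the following hedged C sketch (the RGB struct and the add_rgb helper are illustrative, not from the text) adds two colours component-wise and clamps each primary at the maximum 8-bit intensity:

typedef struct { unsigned char r, g, b; } RGB;

RGB add_rgb (RGB p, RGB q)
{
    RGB out;
    /* component-wise additive mixing, clamped at the maximum intensity 255 */
    out.r = (p.r + q.r > 255) ? 255 : p.r + q.r;
    out.g = (p.g + q.g > 255) ? 255 : p.g + q.g;
    out.b = (p.b + q.b > 255) ? 255 : p.b + q.b;
    return out;
}
/* Example: green (0,255,0) + blue (0,0,255) = cyan (0,255,255);
   cyan + red (255,0,0) = white (255,255,255), matching the complements above. */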
The RGB colour model itself does not define what is meant by red, green, and blue colorimetrically, and so the results of mixing them are not specified as absolute, but relative
to the primary colours. When the exact chromaticities of the red, green, and blue primaries
are defined, the colour model becomes an absolute colour space.
SELF-ASSESSMENT QUESTIONS - 4
6. DIRECT CODING
In computer graphics, direct coding is a scheme that allocates a certain amount of storage space for each pixel so that it can be assigned a colour; images are simply collections of coloured pixels. For example, one may allocate 3 bits for each pixel, with one bit for each primary colour (refer figure 1.1). This 3-bit representation allows each primary to vary independently between two intensity levels: 0 (off) and 1 (on). Hence each pixel can take on one of the eight colours that correspond to the corners of the RGB colour cube.
A widely accepted industry standard uses 3 bytes, or 24 bits, per pixel, with one byte for each primary colour. This way each primary colour is allowed to have 256 different intensity levels, corresponding to binary values from 00000000 to 11111111. Thus a pixel can take on a colour from 256 x 256 x 256, or 16.7 million, possible choices. This 24-bit format is commonly referred to as the true colour representation, since the difference between two colours that differ by one intensity level in one or more of the primaries is virtually undetectable under normal viewing conditions. Hence a more precise representation involving more bits is of little use in terms of perceived colour accuracy.
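To make the three-bytes-per-pixel layout concrete, here is a hedged C sketch (Pixel24, pack_rgb, and the accessor names are illustrative assumptions, not from the text) that packs the three 8-bit primaries into a single 24-bit value and extracts them again:

typedef unsigned int Pixel24;   /* illustrative name: holds one 24-bit colour value */

Pixel24 pack_rgb (unsigned char r, unsigned char g, unsigned char b)
{
    /* one byte per primary: red in bits 16-23, green in bits 8-15, blue in bits 0-7 */
    return ((Pixel24) r << 16) | ((Pixel24) g << 8) | (Pixel24) b;
}

unsigned char red_of   (Pixel24 p) { return (p >> 16) & 0xFF; }
unsigned char green_of (Pixel24 p) { return (p >> 8)  & 0xFF; }
unsigned char blue_of  (Pixel24 p) { return  p        & 0xFF; }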
A notable special case of direct coding is the representation of black-and-white (bi-level) and gray-scale images, where the three primaries always have the same value and hence need not be coded separately. A black-and-white image requires only one bit per pixel, with bit value 0 representing black and bit value 1 representing white. A gray-scale image is typically coded with 8 bits per pixel to allow a total of 256 intensity or gray levels.
Although the direct coding method features simplicity and has supported a variety of
applications, there is a relatively high demand for storage space when it comes to the 24-bit
standard. For example, a 1000x1000 true colour image would take up three million bytes.
Further, even if every pixel in that image had a different colour, there would be only one
million colours in the image. In many applications the number of colours that appear in any
one particular image is much less. Therefore the 24-bit representation’s ability to have 16.7
million different colours appear simultaneously in a single image seems to be overkill.
SELF-ASSESSMENT QUESTIONS - 5
7. LOOKUP TABLE
Lookup tables are tables that store numeric data in a multidimensional array format. In the
simpler two-dimensional case, lookup tables can be represented by matrices. Each element
of a matrix is a numerical quantity, which can be precisely located in terms of two indexing
variables. At higher dimensions, lookup tables can be represented by multidimensional
matrices, whose elements are described in terms of a corresponding number of indexing
variables.
Image representation using a lookup table can be viewed as a compromise between the
desire to have a lower storage requirement and the need to support a reasonably sufficient
number of simultaneous colors. In this approach pixel values do not code colours directly.
Instead, they are addresses or indices into a table of colour values. The colour of a particular
pixel is determined by the colour value in the table entry that the value of the pixel
references.
Figure 1.2 shows a lookup table with 256 entries. The entries have addresses 0 through 255.
Each entry contains a 24-bit RGB colour value. Considering that pixel values are 1-byte, or 8-bit, quantities, the colour of a pixel whose value is i, where 0 ≤ i ≤ 255, is determined by the colour value in the table entry whose address is i. This 24-bit, 256-entry lookup table
representation is often referred to as the 8-bit format. It reduces the storage requirement of
a 1000 x 1000 image to one million bytes plus 768 bytes for the colour values in the lookup
table. It allows 256 simultaneous colours that are chosen from 16.7 million possible colours.
It is important to remember that, using the lookup table representation, an image is defined not only by its pixel values but also by the colour values in the corresponding lookup table. These colour values form a colour map for the image.
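The following hedged C sketch (array names and sizes are illustrative) shows how an 8-bit indexed image and its 256-entry colour map might be laid out; the displayed colour of a pixel is obtained by indexing the table with the pixel value:

#define WIDTH  1000
#define HEIGHT 1000

unsigned int  colour_map[256];        /* 256 entries, each holding a 24-bit RGB value */
unsigned char image[HEIGHT][WIDTH];   /* each pixel stores an 8-bit index into the map */

/* the displayed colour of pixel (x, y) is found by indexing the colour map */
unsigned int pixel_colour (int x, int y)
{
    return colour_map[image[y][x]];
}

For a 1000 x 1000 image this arrangement needs one million bytes for the indices plus 768 bytes (256 x 3) for the colour map, matching the figures given above.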
SELF-ASSESSMENT QUESTIONS - 6
20. The 24-bit, 256-entry lookup table representation is often referred to as the _______________ representation.
21. _____________ is commonly referred to as the true colour representation.
22. The colour of a particular pixel is determined by the ___________ in the lookup table entry.
8. GRAPHICS DEVICES:
Graphics storage devices: Memory cards, external or internal hard drives, cloud storage, USB flash drives, and laptop computers are all suitable for digital image storage.
Graphics input and output devices: Graphics input devices are classified into two types: manual data entry devices and direct data entry devices. Manual input devices are those peripheral devices through which the user can enter the data manually (by hand) at the time of processing, e.g., keyboard, mouse, joystick, touch screen, touchpad, etc. Direct data entry devices are those peripheral devices through which we can directly input the data from the source and transfer it to the computer system, e.g., scanner, barcode reader, MICR (Magnetic Ink Character Reader), OCR (Optical Character Recognition), sensors, biometric systems, etc.
Computer graphics software: Graphics software is a type of computer programme used for
image creation and editing. There is a vast selection of graphics software available on the
market, ranging from simple applications that allow users to generate and edit simple images
to complex tools that can be used to build intricate 3D models and animations. Adobe
Photoshop, Corel Painter, and Autodesk Maya are a few of the most popular graphics
software applications.
The graphics software components are the tools that you use to create and manipulate your
graphic images. These components include the following:
Image editors: These are the tools used to produce and modify graphic images. Photoshop, Illustrator, and Inkscape are widespread examples.
Vector graphics editors: Tools used to generate or modify vector graphics. CorelDRAW and Inkscape are well-known vector graphics editors.
3D modeling software: This is the software that you use to create three-dimensional models.
Common 3D modeling software includes Maya, 3ds Max, and Cinema 4D.
Animation software: This is the software that you use to create animations. Common
animation software includes Adobe After Effects, Apple Motion, and Autodesk Maya.
Video editing software: This is the software that you use to edit videos. Common video editing software includes Adobe Premiere Pro, Apple Final Cut Pro, and Avid Media Composer.
• Vector graphics software: This type of software is used to create images made up of
lines and shapes, which can be scaled without losing quality. Vector graphics are often
used for logos, illustrations, and diagrams.
• Raster graphics software: This type of software is used to create images made up of
pixels, which cannot be scaled without losing quality. Raster graphics are often used for
photos and web graphics.
• 3D graphics software: This type of software is used to create three-dimensional
images and animations. 3D graphics are often used for product visualization and
gaming.
• Animation software: This type of software is used to create moving images, either by
animating existing graphics or by creating new ones from scratch. Animation software
is often used for movies, commercials, and video games.
SELF-ASSESSMENT QUESTIONS - 7
9. SUMMARY
We shall now summarise the unit content. We started our discussion with the overview of
graphics. Typically, the term computer graphics refers to several different things: the
representation and manipulation of image data by a computer, and the various technologies
used to create and manipulate the images. We also discussed the advantages of interactive graphics. We can say that computer graphics concerns the pictorial synthesis of real or imaginary objects. We also explored the concept of visualization. Today, visualization has ever-expanding applications in science, education, engineering (for example, product visualization), interactive multimedia, medicine, etc. The RGB colour model is an additive colour model in which red, green, and blue light is added together in various ways to reproduce a broad array of colours. We concluded this unit with a discussion of direct coding and the lookup table, both of which deal with pixel colour representation.
11. ANSWERS
Self Assessment Questions
1. User interaction
2. True
3. Visual display
4. Computer Aided Drawing
5. It is interactive graphics facility to change shape, color and other properties.
6. Digital Signal Processing.
7. Passive graphics
8. b) Pattern detection
9. Optical character recognition
10. Picture analysis
11. Data Visualization
12. Traditional areas of scientific visualization are flow visualization, medical visualization,
astrophysical visualization, and chemical visualization
13. True
14. Darkest color black
15. True
16. Primary color
17. 16.7 million
18. gray-scale images
19. Yes
20. 8 bit format
21. Colour value
22. Four
23. Raster graphics software
24. Manual data entry devices and Direct data entry devices.
1. Typically, the term computer graphics refers to several different things: the representation and manipulation of image data by a computer, and the various technologies used to create and manipulate images. For further details refer section 1.2.
2. Today, a high-quality graphics display on a personal computer provides one of the most natural means of communication with a computer. It provides tools for producing pictures not only of concrete, “real-world” objects but also of abstract ones. For more details refer section 1.3.
3. Image enhancement deals with the improvement of image quality by eliminating noise or by increasing image contrast. Pattern detection and recognition deal with the detection and classification of standard patterns and with finding deviations from these patterns. For more details refer section 1.4.
4. Visualization is a technique for creating images, diagrams, or animations to
communicate a message. Visualization through visual imagery has been an effective way
to communicate both abstract and concrete ideas. For more details refer section 1.5.
5. The RGB color model is an additive colour model in which red, green, and blue light is
added together in various ways to reproduce a broad array of colours. For more details
refer section 1.6.
6. Image representation is essentially the representation of pixel colours. Using direct
coding we allocate a certain amount of storage space for each pixel to code its colour. In
this approach pixel values do not code colours directly. Instead, they are addresses or
indices into a table of colour values. For more details refer section 1.8.
7. Computer-aided drawing is used in the design of structures, vehicles, and aeroplanes. This aids in adding minute details to the drawing and produces more precise, sharp, and detailed drawings with superior specifications. For more details refer section 1.1.
8. Graphics input devices are classified into two types: manual data entry devices and direct data entry devices. Manual input devices are those peripheral devices through which the user can enter the data manually (by hand) at the time of processing, e.g., keyboard, mouse, joystick, touch screen, touchpad, etc. Direct data entry devices are those peripheral devices through which we can directly input the data from the source and transfer it to the computer system.
DCA3142
GRAPHICS AND MULTIMEDIA
Unit 2
Scan Conversion
1. INTRODUCTION
In the previous unit, we discussed the different types of storage and display devices used for
the graphical system. We explored the various display devices like storage graphics display,
raster-scan display and 3D viewing devices. We also learnt about the various input and
output units which are exclusively used for graphical purposes like plotters, printers,
digitizers, light pens and so on. Active and passive graphic devices were discussed and the
unit concluded with a note on the various components of computer graphic software.
In this unit we will discuss what points are and how to draw lines in graphics. A line drawing
algorithm is a graphical algorithm for approximating a line segment on discrete graphical
media. In computer graphics, a hardware or software implementation of a digital differential
analyzer (DDA) is used for linear interpolation of variables over an interval between start
and end point. In this unit DDA algorithm is discussed for the purpose of learning to draw a
line. We will also learn about one more line drawing algorithm called Bresenham’s line
algorithm, which determines which points in an n-dimensional raster should be plotted in
order to form a close approximation to a straight line between two given points. It is
commonly used to draw lines on a computer screen. Along with line drawing algorithms we
have a discussion on the properties of a circle and the algorithm used to draw a circle.
Bresenham's circle algorithm (also known as the midpoint circle algorithm) is an algorithm for determining the points needed for drawing a circle with a given radius and centre. It scan-converts a circle of a specified radius, centered at a specified location. We
will also discuss the ellipse generating algorithm, scan filling algorithm, boundary and flood
fill algorithms.
1.1 Objectives:
Line drawing is accomplished by calculating intermediate positions along the line path
between two specified endpoint positions. An output device is then directed to fill in these
positions between the endpoints. For analog devices, such as a vector pen plotter or a
random-scan display, a straight line can be drawn smoothly from one endpoint to the other.
Linearly varying horizontal and vertical deflection voltages are generated that are
proportional to the required changes in the x and y directions to produce a smooth line.
Digital devices display a straight-line segment by plotting discrete points between the two
endpoints. Discrete coordinate positions along the line path are calculated from the equation
of the line. For a raster video display, the line color (intensity) is then loaded into the frame
buffer at the corresponding pixel coordinates. The video controller, by reading from the
frame buffer "plots" the screen pixels. Screen locations are referenced with integer values,
so plotted positions may only approximate actual line positions between two specified
endpoints. For example, a computed line position of (10.48, 20.51) would be converted to pixel position (10, 21). Thus the rounding of coordinate values to integers causes lines to be displayed with a stairstep appearance ("the jaggies"), as represented in Fig. 2.1.
The characteristic stairstep shape of raster lines is particularly noticeable on systems with
low resolution, and their appearance can be somewhat improved by displaying them on
high-resolution systems. More effective techniques for smoothing raster lines are based on
adjusting pixel intensities along the line paths. In this case it can be assumed that pixel
positions are referenced according to scan-line number and column number (pixel position
across a scan line). This addressing scheme is illustrated in Fig. 2.2. Scan lines are numbered
consecutively from 0, starting at the bottom of the screen; and pixel columns are numbered
from 0, left to right across each scan line.
Figure 2.2: Pixel positions referenced by scan line number and column number
To load a specified color into the frame buffer at a position corresponding to column x along scan line y, we will assume that we have available a low-level procedure of the form SetPixel(x, y). We may also want to be able to retrieve the current frame-buffer intensity setting for a specified location. This can be accomplished with the low-level function getPixel(x, y).
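As a tiny usage sketch (not from the text), the following loop uses the assumed SetPixel routine to set a horizontal span of pixels on scan line y:

extern void SetPixel (int x, int y);   /* the assumed low-level routine described above */

void horizontalSpan (int x1, int x2, int y)
{
    int x;
    for (x = x1; x <= x2; x++)   /* colour each pixel from column x1 to x2 on scan line y */
        SetPixel (x, y);
}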
SELF-ASSESSMENT QUESTIONS -1
A line connects two points. It is a basic element in graphics. To draw a line, you need two
points between which you can draw a line. A line drawing algorithm is a graphical algorithm
for approximating a line segment on discrete graphical media.
y = m · x + b (Equation 2-1)
Here, m represents the slope of the line and b the y intercept, given that the two endpoints of a line segment are specified at positions (x1, y1) and (x2, y2), as shown in Fig. 2.3.
Figure 2.3: Line path between endpoint positions (x1, y1) and (x2, y2).
We can determine values for the slope m and the y intercept b with the following calculations:

m = (y2 − y1) / (x2 − x1)   (Equation 2-2)

b = y1 − m · x1   (Equation 2-3)

Algorithms for displaying straight lines are based on the line equation 2-1 and the calculations given in equations 2-2 and 2-3.

For any given x interval ∆x along a line, we can compute the corresponding y interval ∆y from equation 2-2 as

∆y = m · ∆x   (Equation 2-4)

Similarly, the x interval ∆x corresponding to a specified ∆y is

∆x = ∆y / m   (Equation 2-5)
These equations (2-1 to 2-5) form the basis for determining deflection voltages in analog devices. For lines with slope magnitudes |m| < 1, ∆x can be set proportional to a small horizontal deflection voltage and the corresponding vertical deflection is then set proportional to ∆y as calculated from equation 2-4. For lines whose slopes have magnitudes |m| > 1, ∆y can be set proportional to a small vertical deflection voltage with the corresponding horizontal deflection voltage set proportional to ∆x, calculated from equation 2-5. For lines with m = 1, ∆x = ∆y and the horizontal and vertical deflection voltages are equal. In each case, a smooth line with slope m is generated between the specified endpoints.
On raster systems, lines are plotted with pixels, and step sizes in the horizontal and vertical
directions are constrained by pixel separations. That is, a line must be "sampled" at discrete
positions and the nearest pixel to the line determined at each sampled position. This scan
conversion process for straight lines is illustrated in Fig. 2.4, for a near horizontal line with
discrete sample positions along the x axis.
Figure 2.4: A straight line segment with five sampling positions along the x axis between x1 and x2.

DDA Algorithm

The digital differential analyzer (DDA) is a scan-conversion line algorithm based on calculating either ∆y or ∆x using equation 2-4 or 2-5. For a line with a positive slope less than or equal to 1, we sample at unit x intervals (∆x = 1) and compute each successive y value as

yk+1 = yk + m   (Equation 2-6)

Subscript k takes integer values starting from 1 for the first point, and increases by 1 until the final endpoint is reached. Since m can be any real number between 0 and 1, the calculated y values must be rounded to the nearest integer. For lines with a positive slope greater than 1, the roles of x and y are reversed, which means that sampling is done at unit y intervals (∆y = 1) and each succeeding x value is calculated as

xk+1 = xk + 1/m   (Equation 2-7)
Equations 2-6 and 2-7 are based on the assumption that lines are to be processed from the left endpoint to the right endpoint (Fig. 2.3). If this processing is reversed so that the starting endpoint is at the right, then either we have ∆x = -1 and

yk+1 = yk − m   (Equation 2-8)

or (when the slope is greater than 1) we have ∆y = -1 with

xk+1 = xk − 1/m   (Equation 2-9)

Equations 2-6 through 2-9 can also be used to calculate pixel positions along a line with negative slope. If the absolute value of the slope is less than 1 and the start endpoint is at the left, we set ∆x = 1 and calculate y values with equation 2-6. When the start endpoint is at the right (for the same slope), we set ∆x = -1 and obtain y positions from equation 2-8. Similarly, when the absolute value of a negative slope is greater than 1, we use ∆y = -1 and equation 2-9, or we use ∆y = 1 and equation 2-7.
This algorithm is summarized in the following procedure, which accepts as input the two
endpoint pixel positions. Horizontal and vertical differences between the endpoint positions
are assigned to parameters dx and dy. The difference with the greater magnitude determines
the value of parameter steps. Starting with pixel position (xa, ya), the offset needed at each
step to generate the next pixel position along the line path can be determined. We loop
through this process steps times. If the magnitude of dx is greater than the magnitude of dy
and xa is less than xb, the values of the increments in the x and y directions are 1 and m,
respectively. If the greater change is in the x direction, but xa is greater than xb, then the
decrements -1 and -m are used to generate each new point on the line. Otherwise, we use a unit increment (or decrement) in the y direction and an x increment (or decrement) of 1/m.
#include "device.h"    /* assumed to declare setpixel() */
#include <stdlib.h>
#define ROUND(a) ((int)((a) + 0.5))
void lineDDA (int xa, int ya, int xb, int yb)
{
    int dx = xb - xa, dy = yb - ya, k;
    int steps = (abs (dx) > abs (dy)) ? abs (dx) : abs (dy);   /* axis with greater change */
    float xIncrement = dx / (float) steps, yIncrement = dy / (float) steps;
    float x = xa, y = ya;
    setpixel (ROUND(x), ROUND(y));
    for (k = 0; k < steps; k++) {
        x += xIncrement;
        y += yIncrement;
        setpixel (ROUND(x), ROUND(y));
    }
}
The DDA algorithm is a faster method for calculating pixel positions than the direct use of equation 2-1. It eliminates the multiplication in equation 2-1 by making use of raster characteristics, so that appropriate increments are applied in the x or y direction to step to pixel positions along the line path. The accumulation of round-off error in successive additions of the floating-point increment, however, can cause the calculated pixel positions to drift away from the true line path for long line segments. Further, the rounding operations and floating-point arithmetic in procedure lineDDA are still time-consuming. The performance of the DDA algorithm can be improved by separating the increments m and 1/m into integer and fractional parts so that all calculations are reduced to integer operations.
SELF-ASSESSMENT QUESTIONS -2
Figure 2.5 illustrates Bresenham's line drawing algorithm visually. From the figure it can be understood that it is impossible to draw the true line that we want because of the pixel spacing (in other words, there is not enough precision for drawing true lines on a PC monitor, especially when dealing with low resolutions).
The way the algorithm works is described here. First it decides which axis is the major axis
and which is the minor one. The major axis is longer than the minor axis. In the picture
illustrated above, the major axis is the X axis. Each iteration progresses the current value of
the major axis (starting from the original position), by exactly one pixel. Then it decides
which pixel on the minor axis is appropriate for the current pixel of the major axis.
How can one approximate the right pixel on the minor axis that matches the pixel on the
major axis? - That is what Bresenham's line-drawing algorithm is all about. It does so by
checking which pixel's center is closer to the true line. In the picture above it would be easy
to identify these pixels by looking at them. The center of each pixel is marked with a dot. The
algorithm takes the coordinates of that dot and compares it to the true line. If the span from
the center of the pixel to the true line is less or equal to 0.5, the pixel is drawn at that location.
That span is more generally known as the error term.
It must be understood here that the whole algorithm is done in straight integer math with
no multiplication or division in the main loops (no fixed point math either). Basically, during
each iteration through the main drawing loop, the error term is tossed around to identify the
right pixel as close as possible to the true line. Let us now consider these two deltas between
the length and height of the line: dx = x1 - x0; dy = y1 - y0; This is a matter of precision and
since we are working with integers it is necessary to scale the deltas by 2, generating two
new values: dx2 = dx*2; dy2 = dy*2. These are values that will be used to change the error
term. The error term must be first initialized to 0.5 and that cannot be done using an integer.
Further, finally the scaled values must be subtracted by either dx or dy (the original, unscaled
delta values) depending on what the major axis is (either x or y).
Consider drawing a line on a raster grid where we restrict the allowable slopes of the line to
the range 0 ≤ 𝑚 ≤ 1 . If we further restrict the line- drawing routine so that it always
increments x as it plots, it becomes clear that, having plotted a point at (x, y), the routine has
a severely limited range of options as to where it may put the next point on the line:
So, working in the first positive octant of the plane, line drawing becomes a matter of
deciding between two possibilities at each step. The diagram 3.6 depicts the situation where
the plotting program finds itself having plotted (x, y).
In plotting (x, y) the line drawing routine will, in general, be making a compromise between
what it would like to draw and what the resolution of the screen actually allows it to draw.
Usually the plotted point (x, y) will be in error, and the actual, mathematical point on the line
will not be addressable on the pixel grid. So we associate an error, ∈, with each y ordinate: the real value of y should be y + ∈. This error will range from -0.5 to just under +0.5. In moving from
x to x+1 we increase the value of the true (mathematical) y-ordinate by an amount equal to
the slope of the line, m. We will need to choose to plot (x+1, y) if the difference between this
new value and y is less than 0.5.
Otherwise we will plot (x+1, y+1). It should be clear that by doing so, we minimize the total
error between the mathematical line segment and what actually gets drawn on the display.
The error resulting from this new point can now be written back into ∈. This will allow us to repeat the whole process for the next point along the line, at x+2. The new value of the error can adopt one of two possible values, depending on which new point is plotted. If (x+1, y) is chosen, the new value of the error is given by:

∈new = ∈ + m

Otherwise it is:

∈new = ∈ + m − 1
This gives an algorithm for a DDA which avoids rounding operations, instead using the error variable to control plotting:

∈ ← 0, y ← y1
for x ← x1 to x2 do
    Plot(x, y)
    if (∈ + m < 0.5) then
        ∈ ← ∈ + m
    else
        y ← y + 1, ∈ ← ∈ + m − 1
    end if
end for
This still employs floating-point values. Consider, however, what would happen if we multiply across both sides of the plotting test by ∆x and then by 2:

∈ + m < 0.5
∈ + ∆y/∆x < 0.5
2∈∆x + 2∆y < ∆x

The update rules for the error on each step may also be cast into this scaled form. Multiplying

∈ ← ∈ + m
∈ ← ∈ + m − 1

through by ∆x gives

∈∆x ← ∈∆x + ∆y
∈∆x ← ∈∆x + ∆y − ∆x

which, writing ∈′ = ∈∆x, is in ∈′ form:

∈′ ← ∈′ + ∆y
∈′ ← ∈′ + ∆y − ∆x
Using this new “error” value, ∈′, with the new test and update equations gives Bresenham's integer-only line drawing algorithm:

∈′ ← 0, y ← y1
for x ← x1 to x2 do
    Plot(x, y)
    if (2(∈′ + ∆y) < ∆x) then
        ∈′ ← ∈′ + ∆y
    else
        y ← y + 1, ∈′ ← ∈′ + ∆y − ∆x
    end if
end for
So we can conclude that the integer-only version is efficient and fast, and the multiplication by 2 can be implemented by a left shift. This version is limited to slopes in the first octant, 0 ≤ m ≤ 1.
Below is the Bresenham algorithm for line segments in the first octant.
void bresenham (int x1, int y1, int x2, int y2, int colour)
{
    /* assumes 0 <= slope <= 1 and x1 < x2; setpixel here takes a colour argument */
    int dx = x2 - x1, dy = y2 - y1;
    int y = y1, eps = 0;               /* eps is the scaled error term ∈′ */
    for (int x = x1; x <= x2; x++) {
        setpixel (x, y, colour);
        eps += dy;
        if ((eps << 1) >= dx) {        /* 2(∈′ + ∆y) >= ∆x: step up in y */
            y++;
            eps -= dx;
        }
    }
}
SELF-ASSESSMENT QUESTIONS -3
Since the circle is a frequently used component in pictures and graphs, a procedure for
generating either full circles or circular arcs is included in most graphics packages. Generally,
a single procedure can be provided to display either circular or elliptical curves.
A circle is defined as the set of points that are all at a given distance r from a center position (xc, yc), as shown in figure 2.7. This distance relationship is expressed by the Pythagorean theorem in Cartesian coordinates as

(x − xc)² + (y − yc)² = r²   (Equation 2-10)

Figure 2.7: Circle with center coordinates (xc, yc) and radius r.

This equation can be used to calculate the position of points on the circle circumference by stepping along the x axis in unit steps from xc − r to xc + r and calculating the corresponding y values at each position as

y = yc ± √(r² − (x − xc)²)   (Equation 2-11)

We can use equation 2-11 to compute the pixels of the circle. At the end of it, we are likely to find code that looks something like the sketch below.
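A minimal sketch of this direct computation, assuming the two-argument setpixel routine introduced earlier (circleSimple is an illustrative name, not from the text):

#include <math.h>
extern void setpixel (int x, int y);   /* assumed low-level routine */

void circleSimple (int xc, int yc, int r)
{
    int x;
    for (x = xc - r; x <= xc + r; x++) {
        /* solve equation 2-11 for the two y values at this x */
        double dy = sqrt ((double) r * r - (double) (x - xc) * (x - xc));
        setpixel (x, (int) (yc + dy + 0.5));   /* upper half */
        setpixel (x, (int) (yc - dy + 0.5));   /* lower half */
    }
}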
But this is not the best method for generating a circle. One problem with this approach is that it involves considerable computation at each step. Moreover, the spacing between plotted pixel positions is not uniform, as demonstrated in figure 2.8.
The spacing can be adjusted by interchanging x and y (stepping through y values and
calculating x values) whenever the absolute value of the slope of the circle is greater than 1.
But this merely increases the computation and processing required by the algorithm.
Figure 2.8: Positive half of a circle plotted with equation 2-11 and with (xc, yc) = (0, 0)
Another way to eliminate the unequal spacing shown in figure 2.8 is to calculate points along
the circular boundary using polar coordinates r and θ (figure 2.9). Expressing the circle equation in parametric polar form yields the following pair of equations:

x = xc + r cos θ
y = yc + r sin θ

When a display is generated with these equations using a fixed angular step size, a circle is plotted with equally spaced points along the circumference. The step size chosen for θ depends on the application and the display device. Larger angular separations along the circumference can be connected with straight line segments to approximate the circular path. For a more continuous boundary on a raster display, the step size can be set to 1/r. This plots pixel positions that are approximately 1 unit apart.
Computation can be reduced by considering the symmetry of the circle. The shape of the circle is similar in each quadrant. One can generate the circle section in the second quadrant of the xy plane by noting that the two circle sections are symmetric with respect to the y axis. The circle sections in the third and fourth quadrants can be obtained from sections in the first and second quadrants by considering symmetry about the x axis. This can be taken further by noting that there is also symmetry between octants. Circle sections in adjacent octants within one quadrant are symmetric with respect to the 45° line dividing the two octants. Figure 2.9 depicts that the calculation of a circle point (x, y) in one octant yields the circle points shown for the other seven octants.
These symmetry conditions are illustrated in figure 2.9, where a point at position (x, y) on a
one-eighth circle sector is mapped into the seven circle points in the other octants of the xy
plane. Taking advantage of the circle symmetry in this way it is possible to generate all pixel
positions around a circle by calculating only the points within the sector from x = 0 to x = y.
In computer graphics, the midpoint circle algorithm is an algorithm used to determine the
points needed for drawing a circle. The algorithm is a variant of Bresenham's line algorithm,
and is thus sometimes known as Bresenham's circle algorithm.
Our first objective is to simplify the function evaluation that takes place on each iteration of the circle-drawing algorithm. In the midpoint circle algorithm the screen center point is located at (xc, yc) and the calculated pixel position is (x, y); each calculated point is moved to its proper screen position by adding the center coordinates:

xscreen = xc + x
yscreen = yc + y

The circle function is defined as

fcircle(x, y) = x² + y² − r²

Any point (x, y) on the boundary of the circle with radius r satisfies the equation fcircle(x, y) = 0; points inside the circle give a negative function value, and points outside give a positive value. Figure 2.11(a) depicts the situation for the circle when fcircle(x, y) = 0.

This circle-function test is performed for the midpoints between pixels near the circle path at each sampling step. Figure 2.11(b) shows the midpoint between two pixels. Assuming that we have just plotted the pixel at (xk, yk), we next need to determine whether the pixel at position (xk + 1, yk) or the one at position (xk + 1, yk − 1) is closer to the circle. The decision parameter is the circle function evaluated at the midpoint between these two pixels:

pk = fcircle(xk + 1, yk − ½) = (xk + 1)² + (yk − ½)² − r²
If pk < 0, this midpoint is inside the circle and the pixel on scan line yk is closer to the circle boundary. Otherwise, the midpoint is outside or on the circle boundary, and we select the pixel on scan line yk − 1.
Successive decision parameters are obtained using incremental calculations:

pk+1 = fcircle(xk+1 + 1, yk+1 − ½)
     = [(xk + 1) + 1]² + (yk+1 − ½)² − r²

or

pk+1 = pk + 2(xk + 1) + [(yk+1)² − (yk)²] − (yk+1 − yk) + 1
where yk+1 is either yk or yk-1, depending on the sign of pk. Increments for obtaining pk+1
are either 2xk+1 + 1 (if pk is negative) or 2xk+1 + 1 – 2yk+1. Evaluation of the terms 2xk+1
and 2yk+1 can also be done incrementally as
2xk+1 = 2xk + 2
2yk+1 = 2yk – 2
Each successive value is obtained by adding 2 to the previous value of 2x and subtracting 2 from the previous value of 2y. The initial decision parameter is obtained by evaluating the circle function at the start position (x0, y0) = (0, r):

p0 = fcircle(1, r − ½)
   = 1 + (r − ½)² − r²

or

p0 = 5/4 − r
1. Input radius r and circle centre (xc, yc), and obtain the first point on the circumference
of a circle centered on the origin as (x0, y0) = (0, r).
2. Calculate the initial value of the decision parameter as p0 = 5/4 – r.
3. At each xk position, starting at k = 0, perform the following test: if pk < 0, the next point along the circle centered on (0, 0) is (xk + 1, yk) and

pk+1 = pk + 2xk+1 + 1

Otherwise, the next point along the circle is (xk + 1, yk − 1) and

pk+1 = pk + 2xk+1 + 1 − 2yk+1

where 2xk+1 = 2xk + 2 and 2yk+1 = 2yk − 2.
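A hedged C sketch of these steps, again assuming the two-argument setpixel routine; it plots all eight symmetric points per iteration, and the integer starting value p = 1 − r is the usual rounding of 5/4 − r (routine names are illustrative):

extern void setpixel (int x, int y);   /* assumed low-level routine */

/* plot the point (x, y) in all eight symmetric octants about the centre (xc, yc) */
static void plotCirclePoints (int xc, int yc, int x, int y)
{
    setpixel (xc + x, yc + y); setpixel (xc - x, yc + y);
    setpixel (xc + x, yc - y); setpixel (xc - x, yc - y);
    setpixel (xc + y, yc + x); setpixel (xc - y, yc + x);
    setpixel (xc + y, yc - x); setpixel (xc - y, yc - x);
}

void circleMidpoint (int xc, int yc, int r)
{
    int x = 0, y = r;
    int p = 1 - r;                    /* integer form of the initial value 5/4 - r */
    while (x <= y) {
        plotCirclePoints (xc, yc, x, y);
        x++;
        if (p < 0)
            p += 2 * x + 1;           /* midpoint inside: keep y, pk+1 = pk + 2xk+1 + 1 */
        else {
            y--;
            p += 2 * (x - y) + 1;     /* midpoint outside: step y down, subtract 2yk+1 */
        }
    }
}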
SELF-ASSESSMENT QUESTIONS - 4
12. In a circle, the spacing between plotted pixels can be adjusted by interchanging x and y when the absolute value of the slope of the circle is __________ .
13. The midpoint circle algorithm is otherwise called _______________.
14. Circle sections in adjacent octants within one quadrant are symmetric with respect to the ____________ degree line dividing the two octants.
15. Given fcircle(x, y) = x² + y² − r², any point (x, y) on the boundary of the circle with radius r satisfies the equation when fcircle(x, y) = __________ .
Loosely stated, an ellipse is an elongated circle. Therefore, elliptical curves can be generated
by modifying circle-drawing procedures to take into account the different dimensions of an
ellipse along the major and minor axes.
An ellipse is defined as the set of points such that the sum of the distances from two fixed positions (foci) is the same for all points, as depicted in figure 2.12.

If the distances to the two foci from any point P = (x, y) on the ellipse are labeled d1 and d2, then the general equation of an ellipse can be stated as

d1 + d2 = constant   (Equation 2-14)

Expressing the distances d1 and d2 in terms of the focal coordinates F1 = (x1, y1) and F2 = (x2, y2), we have

√((x − x1)² + (y − y1)²) + √((x − x2)² + (y − y2)²) = constant   (Equation 2-15)

By squaring this equation, isolating the remaining radical, and then squaring again, we can rewrite the general ellipse equation in the form

Ax² + By² + Cxy + Dx + Ey + F = 0   (Equation 2-16)
where the coefficients A, B, C, D, E, and F are evaluated in terms of the focal coordinates and
the dimensions of the major and minor axes of the ellipse. The major axis is the straight line
segment extending from one side of the ellipse to the other through the foci. The minor axis
spans the shorter dimension of the ellipse, bisecting the major axis at the halfway position
(ellipse center) between the two foci.
An interactive method for specifying an ellipse in an arbitrary orientation is to input the two foci and a point on the ellipse boundary. With these three coordinate positions, it is possible to evaluate the constant in equation 2-15. Then, the coefficients in equation 2-16 can be evaluated and used to generate pixels along the elliptical path. Ellipse equations are greatly simplified if the major and minor axes are oriented to align with the coordinate axes. In figure 2.13, an ellipse is shown in "standard position" with major and minor axes oriented parallel to the x and y axes. Parameter rx in this case labels the semi-major axis, and parameter ry the semi-minor axis.
Figure 2.13: Ellipse centered at (xc, yc) with major and minor axis
The equation of the ellipse shown in figure 2.13 can be written in terms of the ellipse center coordinates and the parameters rx and ry as

((x − xc)/rx)² + ((y − yc)/ry)² = 1   (Equation 2-17)
Using polar coordinates r and θ, we can also describe the ellipse in standard position with the parametric equations:

x = xc + rx cos θ
y = yc + ry sin θ
An ellipse has somewhat less symmetry than a circle, so more computations may be required to find its pixel coordinates. To avoid computations where the slope (the rate of change of one variable with respect to the other) is too large, we split the first quadrant into a region where |∆y/∆x| ≤ 1 and a region where |∆x/∆y| ≤ 1. As with the circle algorithm, we can start at the top point (0, b) and take unit steps in the x direction until we reach the point where the slope is -1, where we switch to unit steps in the (negative) y direction. At each step we will have to compute the slope of the tangent line:
dy/dx = (dy/dm) / (dx/dm) = −(2b²x) / (2a²y)
At the point where dy/dx = −1 we have b²x = a²y, so we can use the sign of a²y − b²x to determine where to switch from the ∆x region to the ∆y region. Starting at (0, b), the midpoint algorithm, as applied to the ellipse, entails evaluating the decision parameter:

pxk = f(xk + 1, yk − ½) = b²(xk + 1)² + a²(yk − ½)² − a²b²
Again, the decision parameter is negative only if the midpoint between adjacent candidate pixels is inside the ellipse boundary. The coordinates of the next pixel to colour will be either (xk+1, yk) or (xk+1, yk−1), depending on whether the decision parameter is negative or not.
This can be adapted to the midpoint algorithm to plot ellipses with rotated axes by using Ax² + By² + Cxy + Dx + Ey + F = 0 to work out the test parameter for the entire perimeter, or by plotting the ellipse without rotation and then rotating it using some efficient rotation routine.
So we will now discuss the pseudo code for an ellipse midpoint algorithm:
1. Get the parameters a, b, h, and k, for center coordinates (h, k) and major and minor axis lengths 2a and 2b.
2. Calculate the initial decision parameter value in the first region:
px0 = b² − a²b + a²/4
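A hedged C sketch of the complete two-region midpoint ellipse procedure (the region-2 initial parameter follows the standard midpoint formulation, which is not shown in this extract; routine names are illustrative and the two-argument setpixel routine is assumed):

extern void setpixel (int x, int y);   /* assumed low-level routine */

/* plot the four symmetric points of the ellipse about the centre (h, k) */
static void plotEllipsePoints (int h, int k, int x, int y)
{
    setpixel (h + x, k + y); setpixel (h - x, k + y);
    setpixel (h + x, k - y); setpixel (h - x, k - y);
}

void ellipseMidpoint (int h, int k, int a, int b)
{
    long a2 = (long) a * a, b2 = (long) b * b;
    int x = 0, y = b;
    /* region 1: |dy/dx| < 1, initial parameter px0 = b^2 - a^2*b + a^2/4 */
    double p = b2 - a2 * b + 0.25 * a2;
    while (2 * b2 * x < 2 * a2 * y) {
        plotEllipsePoints (h, k, x, y);
        x++;
        if (p < 0)
            p += b2 * (2 * x + 1);
        else {
            y--;
            p += b2 * (2 * x + 1) - 2 * a2 * y;
        }
    }
    /* region 2: initial parameter evaluated at the last (x, y) of region 1 */
    p = b2 * (x + 0.5) * (x + 0.5) + a2 * (y - 1.0) * (y - 1.0) - (double) a2 * b2;
    while (y >= 0) {
        plotEllipsePoints (h, k, x, y);
        y--;
        if (p > 0)
            p += a2 * (1 - 2 * y);
        else {
            x++;
            p += 2 * b2 * x + a2 * (1 - 2 * y);
        }
    }
}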
SELF-ASSESSMENT QUESTIONS - 5
The scan line fill algorithm is an ingenious way of filling in irregular polygons. The algorithm
begins with a set of points. Each point is connected to the next, and the line between them is
considered to be an edge of the polygon. The points of each edge are adjusted to ensure that
the point with the smaller y value appears first. Next, a data structure is created which
contains a list of edges that begin on each scan line of the image. The program progresses
upwards from the first scan line. For each line, pixels that contain an intersection between
this scan line and an edge of the polygon are filled in. Then, the algorithm progresses along
the scan line, turning on when it reaches a polygon pixel and turning off when it reaches
another one, all the way across the scan line.
There are two special cases that are solved by this algorithm. First, a problem may arise if
the polygon contains edges that are partially or completely out of the image. The algorithm
solves this problem by moving pixel values that are outside the image to the boundaries of
the image. This method is preferable to eliminating the pixel completely, because its deletion
could result in a "backwards" coloring of the scan line which means that pixels that should
be on are off and vice versa.
The second case has to do with the concavity of the polygon. If the polygon has a concave
portion, the algorithm will work correctly. The pixel on which the two edges meet will be
marked twice, so that it is turned off and then on. If, however, the polygon is convex at the
intersection of two edges, the coloring will turn on and then immediately off, resulting in
"backwards" coloring for the rest of the scan line. The problem is solved by using the vertical
location of the next point in the polygon to determine the concavity of the current portion.
Overall, the algorithm is very robust; the main challenge it faces is with polygons that have a large number of edges, like circles and ellipses. Filling in such a polygon can be very costly.
1. Find the intersections of the scan line with all edges of the polygon.
2. Sort the intersections by increasing x-coordinate.
3. Fill in all pixels between pairs of intersections.
We will discuss now the scan line algorithm with an example depicted in figure 2.14.
It can be observed in figure 2.14 that for scan line number 8 the sorted list of x-coordinates is (2, 4, 9, 13) (b and c are initially not integers). Therefore pixels are filled between x-coordinates 2-4 and 9-13.
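The following hedged C sketch (not the text's own listing; fillScanLine and the fixed-size array of at most 64 intersections are illustrative assumptions) fills a single scan line by intersecting it with every polygon edge, sorting the intersection x-coordinates, and filling between successive pairs:

#include <math.h>
extern void setpixel (int x, int y);   /* assumed low-level routine */

/* fill one scan line of a polygon given as n vertices (x[i], y[i]), even-odd rule */
void fillScanLine (int n, const int x[], const int y[], int scanY)
{
    double xs[64];                 /* intersection x-coordinates (assumes n <= 64) */
    int count = 0, i, j, px;
    for (i = 0; i < n; i++) {
        j = (i + 1) % n;           /* edge from vertex i to vertex j */
        /* the edge crosses the scan line only if its endpoints straddle scanY */
        if ((y[i] <= scanY && y[j] > scanY) || (y[j] <= scanY && y[i] > scanY))
            xs[count++] = x[i] + (double) (scanY - y[i]) * (x[j] - x[i]) / (y[j] - y[i]);
    }
    for (i = 1; i < count; i++) {  /* insertion sort by increasing x */
        double key = xs[i];
        for (j = i - 1; j >= 0 && xs[j] > key; j--)
            xs[j + 1] = xs[j];
        xs[j + 1] = key;
    }
    for (i = 0; i + 1 < count; i += 2)          /* fill between successive pairs */
        for (px = (int) ceil (xs[i]); px <= (int) floor (xs[i + 1]); px++)
            setpixel (px, scanY);
}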
Edge Coherence
Processing Polygons
• Active edges are sorted according to increasing X. Filling the scan line starts at the
leftmost edge intersection and stops at the second. It restarts at the third intersection
and stops at the fourth. . .
1. The edge table (ET), shown in figure 2.15, with edge entries sorted in increasing y and x of the lower end.
2. ymax: max y-coordinate of edge
3. xmin: x-coordinate of lowest edge point
1. Set y to smallest y with entry in ET, that is, y for the first non-empty bucket.
2. Init Active Edge Table (AET) to be empty.
3. Repeat until AET and ET are empty:
a. Move from ET bucket y to the AET those edges whose ymin = y (entering edges)
b. Remove from AET those edges for which y=ymax (not involved in next scan line),
then sort AET (remember: ET is presorted)
c. Fill desired pixel values on scan line y by using pairs of x-coords from AET
d. Increment y by 1 (next scan line)
e. For each nonvertical edge remaining in AET, update x for new y
With reference to the polygon depicted in figure 2.16, the AET contents for scan line 9 and scan line 10 can be written out in the same way.
SELF-ASSESSMENT QUESTIONS - 6
We will now discuss the boundary fill algorithm. Here we start at a point inside the figure and paint outward with a particular colour; this filling continues until a boundary colour is encountered. There are two ways to do this. First we consider the four-connected fill, where the procedure calls itself recursively for the four neighbouring pixels: left, right, up, and down. One problem with the four-connected fill is that it may fail to fill regions that can be reached only through diagonal connections; this leads to the eight-connected fill algorithm, where we test all eight adjacent pixels by adding four more recursive calls for the diagonal neighbours, as in the sketch below.
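A hedged C sketch of both fills (not the text's missing listings; it assumes a getpixel that returns the stored colour and a three-argument setpixel that takes a fill colour):

extern int  getpixel (int x, int y);              /* assumed: returns the stored colour */
extern void setpixel (int x, int y, int colour);  /* assumed: three-argument variant    */

void boundaryFill4 (int x, int y, int fill, int boundary)
{
    int c = getpixel (x, y);
    if (c != boundary && c != fill) {
        setpixel (x, y, fill);
        boundaryFill4 (x + 1, y, fill, boundary);   /* right */
        boundaryFill4 (x - 1, y, fill, boundary);   /* left  */
        boundaryFill4 (x, y + 1, fill, boundary);   /* up    */
        boundaryFill4 (x, y - 1, fill, boundary);   /* down  */
    }
}

/* the 8-connected version also recurses on the four diagonal neighbours */
void boundaryFill8 (int x, int y, int fill, int boundary)
{
    int c = getpixel (x, y);
    if (c != boundary && c != fill) {
        setpixel (x, y, fill);
        boundaryFill8 (x + 1, y, fill, boundary);     boundaryFill8 (x - 1, y, fill, boundary);
        boundaryFill8 (x, y + 1, fill, boundary);     boundaryFill8 (x, y - 1, fill, boundary);
        boundaryFill8 (x + 1, y + 1, fill, boundary); boundaryFill8 (x - 1, y - 1, fill, boundary);
        boundaryFill8 (x + 1, y - 1, fill, boundary); boundaryFill8 (x - 1, y + 1, fill, boundary);
    }
}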
The above 4-fill and 8-fill algorithms involve deep recursion, which may consume memory and time. Better algorithms are faster but more complex; they make use of pixel runs (horizontal groups of pixels).
Sometimes we may want to fill in (or recolour) an area that is not defined within a single
colour boundary. We can paint such areas by replacing a specified interior colour instead of
searching for a boundary colour value. This approach is called a flood-fill algorithm. Here
we start from a specified interior point (x, y) and reassign all pixel values that are currently
set to a given interior colour with the desired fill colour. If the area we want to paint has
more than one interior colour, we can first reassign pixel values so that all interior points
have the same color. Using either a 4-connected or 8-connected approach, we then step
through pixel positions until all interior points have been repainted. The following
procedure flood-fills a 4-connected region recursively, starting from the input position; a sketch is given below.
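A hedged sketch of such a procedure, under the same getpixel/setpixel assumptions as above (the routine name floodFill4 matches the one referred to in the next paragraph, but the body here is a reconstruction, not the text's own listing):

extern int  getpixel (int x, int y);              /* assumed: returns the stored colour */
extern void setpixel (int x, int y, int colour);  /* assumed: three-argument variant    */

void floodFill4 (int x, int y, int fillcolour, int oldcolour)
{
    if (getpixel (x, y) == oldcolour) {
        setpixel (x, y, fillcolour);
        floodFill4 (x + 1, y, fillcolour, oldcolour);
        floodFill4 (x - 1, y, fillcolour, oldcolour);
        floodFill4 (x, y + 1, fillcolour, oldcolour);
        floodFill4 (x, y - 1, fillcolour, oldcolour);
    }
}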
We can modify procedure floodFill4 to reduce the storage requirements of the stack by filling
horizontal pixel spans, as discussed for the boundary-fill algorithm. In this approach, we
stack only the beginning positions for those pixel spans having the value oldcolor. Starting
at the first position of each span, the pixel values are replaced until a value other than
oldcolor is encountered.
We display a filled polygon in PHlGS and GKS with the function fillArea (n, wcvertices). The
displayed polygon area is bound by a series of n straight line segments connecting the set of
vertex positions specified in wcvertices. These packages do not provide fill functions for
objects with curved boundaries. Implementation of the fillArea function depends on the
selected type of interior fill. We can display the polygon boundary surrounding a hollow
interior, or we can choose a solid color or pattern fill with no border for the display of the
polygon. For solid fill, the fillArea function is implemented with the scan-line fill algorithm
to display a single color area. Another polygon primitive available in PHIGS is fillAreaSet.
This function allows a series of polygons to be displayed by specifying the list of vertices for
each polygon. Also, in other graphics packages, functions are often provided for displaying a
variety of commonly used fill areas besides general polygons.
SELF-ASSESSMENT QUESTIONS - 7
20. __________ and _____________ are the two algorithms used for the boundary fill process.
21. How many parameters are required for the eight_fill procedure?
a. 3 b. 4 c. 5 d. 2
22. The 4-fill and 8-fill algorithms, which involve recursion, will not consume memory and time. (True/False)
10. SUMMARY
This unit provides information about what points and lines are and the role of the line
drawing algorithm. A line drawing algorithm is a graphical algorithm for approximating a
line segment on discrete graphical media. On discrete media, such as pixel-based displays
and printers, line drawing requires such an approximation (in nontrivial cases). By contrast, continuous media do not require an algorithm to draw a line. In this unit we also discussed two line drawing algorithms, the DDA and the Bresenham algorithm. The digital differential analyzer (DDA) is a scan-conversion line algorithm based on calculating either Δy or Δx. The line is sampled at unit intervals in one coordinate and the corresponding integer values nearest to the line path are determined for the other coordinate, considering first a line with positive slope. The
Bresenham line algorithm is an algorithm which determines which points in an n-
dimensional raster should be plotted in order to form a close approximation to a straight line
between two given points. It is commonly used to draw lines on a computer screen. The simple circle drawing algorithm leaves large gaps where the slope approaches the vertical and is also inefficient in its calculations. Eight-way symmetry gives a more efficient way of generating a circle centered at (0, 0). Similarly there is an incremental algorithm for drawing circles called the mid-
point circle algorithm using eight-way symmetry. Elliptical curves can be generated by
modifying circle-drawing procedures to take into account the different dimensions of an
ellipse along the major and minor axes. We also discussed the algorithm which generates an ellipse. The scan-line fill algorithm intersects each scan line with the polygon edges and fills between pairs of intersections. We concluded this unit with a discussion on the boundary-fill and flood-fill algorithms, which fill a bounded region with the specified colour.
12. ANSWERS
1. Points and lines are two of the most fundamental concepts in Geometry, but they are also the most difficult to define. A point is a location in space, and a line segment is specified by two points. For more details, refer to section 2.1.
2. A line is a continuous object. However, the computer monitor (screen) consists of a matrix of pixels, so the question is how to represent this continuous object on a discrete matrix. For more details, refer to section 2.3.
3. The digital differential analyzer (DDA) is a scan-conversion line algorithm based on calculating either Δy or Δx. The line is sampled at unit intervals in one coordinate and the corresponding integer values nearest to the line path are determined for the other coordinate. For more details, refer to sub-section 2.1.
4. Bresenham's line-drawing algorithm is based on drawing an approximation of the true line. The true line is indicated in a bright colour, and its approximation is indicated in black pixels. For more details, refer to sub-section 2.2.
5. Consider drawing a line on a raster grid where we restrict the allowable slopes of the line to the range 0 ≤ m ≤ 1. If we further restrict the line-drawing routine so that it always increments x as it plots, it becomes clear that, having plotted a point at (x, y), the next point can only be (x+1, y) or (x+1, y+1). For more details, refer to sub-section 2.2.
6. A circle is a simple shape of Euclidean geometry consisting of the set of points in a plane
that are a given distance from a given point, the centre. The distance between any of
the points and the centre is called the radius. Circles are simple closed curves which
divide the plane into two regions. For more details Refer section 2.6.
7. The midpoint circle algorithm is an algorithm used to determine the points needed for
drawing a circle. The algorithm is a variant of Bresenham's line algorithm. For more
details Refer section 2.7.
8. An ellipse is defined as the set of points such that the sum of the distances from two
fixed positions (foci) is the same for all points. For more details Refer section (2.7).
9. The scan-line algorithm intersects each scan line with the polygon edges and fills between pairs of intersections. For more details, refer to section 2.8.
10. In the boundary fill algorithm, we start at a point inside the figure and paint with a particular color; this filling continues until a boundary color is encountered. For more details, refer to section 2.8.
11. Sometimes we want to fill in (or recolor) an area that is not defined within a single color
boundary. We can paint such areas by replacing a specified interior color instead of
searching for a boundary color value. For more details Refer section 2.6.
DCA3142
GRAPHICS AND MULTIMEDIA
Unit 3
2D Transformation
Table of Contents
8 Terminal Questions - - 26
9 Answers - - 27 - 28
1. INTRODUCTION
In the previous unit, we discussed the scan conversion process which included circle
generation algorithm, ellipse generating algorithm, scan line polygon, fill algorithm and flood
fill algorithm.
In this unit we will discuss the basic geometric transformations of 2D like translation,
rotation, and scaling. Other transformations that are often applied to objects include
reflection and shear. We will also discuss the basic transformations which can be expressed
in matrix form. But many graphic applications involve sequences of geometric
transformations. Hence a general form of matrix is required for representing such
transformations. In geometry, a coordinate system is a system which uses one or more
numbers or coordinates, to uniquely determine the position of a point or other geometric
elements. We will also discuss the transformation of the coordinate system.
1.1 Objectives:
2. BASIC TRANSFORMATIONS
Animations are produced by moving the 'camera' or the objects in a scene along animation
paths. Changes in orientation, size and shape are accomplished with geometric
transformations that alter the coordinate descriptions of the objects. The basic geometric
transformations are translation, rotation, and scaling. Other transformations that are often
applied to objects include reflection and shear. In 3D graphics, it is necessary to use 3D
transformations. However, 3D transformations can be quite confusing so it helps to begin
with 2D.
2.1 Translation
The most basic transformation is translation. The formal definition of a translation is "every
point of the pre-image is moved the same distance in the same direction to form the image."
In figure 3.1, where the triangle translation explains the concept.
Each translation follows a rule. In this case, the rule is "5 to the right and 3 up." It is also
possible to translate a pre-image to the left, down, or any combination of two of the four
directions. More advanced transformation geometry is done on the coordinate plane. The
transformation for this example would be T(x, y) = (x+5, y+3).
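As a small illustration (not from the text), the rule T(x, y) = (x + 5, y + 3) can be applied to a hypothetical list of pre-image vertices in Python:

```python
# Apply the translation rule T(x, y) = (x + 5, y + 3) to every vertex.
def translate(points, tx, ty):
    return [(x + tx, y + ty) for (x, y) in points]

triangle = [(0, 0), (4, 0), (2, 3)]          # hypothetical pre-image
print(translate(triangle, 5, 3))             # [(5, 3), (9, 3), (7, 6)]
```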
2.2 Rotation
A rotation is a transformation that is performed by "spinning" the object around a fixed point
known as the center of rotation. It is also possible to rotate the object at any degree measure,
but 90° and 180° are two of the most common rotations; by convention, rotations are measured counterclockwise.
The figure 3.2 (a) shown at the right is a rotation of 90° about the center of rotation. It is essential to note that each dashed line from the center of rotation to a point of the image has the same length as the dashed line to the corresponding point of the pre-image, and that corresponding pairs of dashed lines form 90° angles. That is what makes the rotation a rotation of 90°. Figure 3.2 (b) shows an example of a rotation of 180°.
Some geometry lessons go back to algebra by describing the formula that explains the transformation. In the example above, for a 180° rotation about the origin, the formula is R(x, y) = (−x, −y). This type of transformation is often called coordinate geometry because of its connection back to the coordinate plane.
2.3 Scaling
Take the example of wanting to double the size of a 2-D object. What does double mean here?
Does it mean double the size, width only, height only, or double along some line? When we
talk about scaling, it usually means some amount of scaling along each dimension. That is, it
must be specified as to how much the size must be changed along each dimension. In figure
3.3, shown are a triangle and a house both of which have been doubled in width and height
(note that the area is more than doubled).
The scaling for the x dimension does not have to be the same as the y dimension. If these are
different, then the object is distorted as we can observe in figure 3.4.
When the size is doubled, what happens to the resulting object? In the figure 3.4, the scaled
object is always shifted to the right. This is because it is scaled with respect to the origin.
That is, the point at the origin is left fixed. Thus scaling by more than 1, moves the object
away from the origin and scaling of less than 1, moves the object towards the origin. This is
because of how basic scaling is done. The figure 3.5 has been scaled simply by multiplying each of its points by the appropriate scaling factor. For example, the point p = (1.5, 2) has been scaled by 2 along x and 0.5 along y. Thus, the new point is q = (2 × 1.5, 0.5 × 2) = (3, 1).
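The same arithmetic can be checked directly with a trivial snippet, using the point and factors from the example:

```python
# Scale the point p = (1.5, 2) by sx = 2 and sy = 0.5 about the origin.
sx, sy = 2, 0.5
px, py = 1.5, 2
q = (sx * px, sy * py)
print(q)   # (3.0, 1.0)
```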
SELF-ASSESSMENT QUESTIONS - 1
Recall section 3.2, where each basic transformation took the form P′ = M1 · P + M2 (3-1), with a multiplicative matrix part M1 and, for translation, an additive part M2.
For a sequence of operations, we may first perform scaling, followed by a translation and
then a rotation, each time, calculating newer coordinates from the old. This sequence can be
performed in “one go” by a composite matrix multiplication, without the additive term M2 in
(3-1), if we employ special coordinates known as homogeneous coordinates.
Homogeneous coordinates
First, represent each (x, y) with the homogeneous coordinate triple (xh, yh, h) where
$$x = \frac{x_h}{h}, \qquad y = \frac{y_h}{h}$$
Thus the homogeneous coordinate representation of the 2D point (x, y) is (h·x, h·y, h). Since any nonzero value of h can be used as the homogeneous parameter, we conveniently choose h = 1 (we will require other values in 3D), so that (x, y) has as its homogeneous coordinate counterpart (x, y, 1).
We can now show that all 2D geometric transformations are just matrix multiplications in
homogeneous coordinates.
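As a brief sketch of this idea (numpy is assumed here purely for convenience, not something the text prescribes), the translation, rotation, and scaling matrices developed below can be built and composed into a single 3×3 matrix:

```python
import numpy as np

# 2D transformations as 3x3 matrix multiplications in homogeneous coordinates;
# the point (x, y) is represented as the column vector (x, y, 1).
def translation(tx, ty):
    return np.array([[1, 0, tx], [0, 1, ty], [0, 0, 1]], dtype=float)

def rotation(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]], dtype=float)

def scaling(sx, sy):
    return np.array([[sx, 0, 0], [0, sy, 0], [0, 0, 1]], dtype=float)

p = np.array([2.0, 1.0, 1.0])                       # the point (2, 1)
# Scale, then rotate, then translate -- composed into one matrix.
M = translation(5, 3) @ rotation(np.pi / 2) @ scaling(2, 2)
print(M @ p)                                        # approximately (3, 7, 1)
```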
A 2D point (x, y) is translated to the point (x ‘, y’) by applying a shift (or translation) vector (
tx, ty ) to it as follows:
x’ = x+ tx , y’ = y + ty
$$\mathbf{P} = \begin{bmatrix} x \\ y \end{bmatrix}, \qquad \mathbf{P}' = \begin{bmatrix} x' \\ y' \end{bmatrix}, \qquad \mathbf{T} = \begin{bmatrix} t_x \\ t_y \end{bmatrix}$$

$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \underbrace{\begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1 \end{bmatrix}}_{T(t_x,\,t_y)} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \qquad (3\text{-}3)$$

$$[T(t_x, t_y)]^{-1} = \begin{bmatrix} 1 & 0 & -t_x \\ 0 & 1 & -t_y \\ 0 & 0 & 1 \end{bmatrix} \quad \text{(Prove this!)}$$
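The "Prove this!" check is immediate by multiplying the two matrices:

$$T(t_x,t_y)\,[T(t_x,t_y)]^{-1}
= \begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & -t_x \\ 0 & 1 & -t_y \\ 0 & 0 & 1 \end{bmatrix}
= \begin{bmatrix} 1 & 0 & t_x - t_x \\ 0 & 1 & t_y - t_y \\ 0 & 0 & 1 \end{bmatrix} = I$$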
ii) The transformation equations for the point (x, y) rotated through an angle θ about (0,
0) to the point (x’, y’) as
$$x' = x\cos\theta - y\sin\theta, \qquad y' = x\sin\theta + y\cos\theta$$

$$R = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$
We can write
$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}, \quad \text{or } \mathbf{P}' = R \cdot \mathbf{P}$$

$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \underbrace{\begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}}_{R(\theta)} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \qquad (3\text{-}5)$$

$$R(\theta)^{-1} = R(-\theta) = \begin{bmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} = R(\theta)^{T} \qquad (3\text{-}7)$$
iii) A scaling transformation alters the size of an object. Here the new coordinates (x’, y’)
are given by
x′ = sₓ·x,  y′ = s_y·y,
where sₓ, s_y > 0 are scaling factors in the x and y directions respectively. In matrix form,

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} s_x & 0 \\ 0 & s_y \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}, \quad \text{or } \mathbf{P}' = S \cdot \mathbf{P} \text{ with } S = \begin{bmatrix} s_x & 0 \\ 0 & s_y \end{bmatrix}$$

For scaling about the origin as the fixed point we write

$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \underbrace{\begin{bmatrix} s_x & 0 & 0 \\ 0 & s_y & 0 \\ 0 & 0 & 1 \end{bmatrix}}_{S(s_x,\,s_y)} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \qquad (3\text{-}8)$$
$$S(s_x, s_y)^{-1} = \begin{bmatrix} 1/s_x & 0 & 0 \\ 0 & 1/s_y & 0 \\ 0 & 0 & 1 \end{bmatrix} \qquad (3\text{-}10)$$
Further, to rotate about a general pivot or scale with regards to general fixed point, we can
use a succession of transformations about the origin.
SELF-ASSESSMENT QUESTIONS - 2
1. Translate so that the origin (x0, y0) of (X’, Y’) --> (0, 0).
2. Rotate the X’ axis onto the X axis (that is, thro’ -θ)
$$T(-x_0, -y_0) = \begin{bmatrix} 1 & 0 & -x_0 \\ 0 & 1 & -y_0 \\ 0 & 0 & 1 \end{bmatrix} \qquad (3\text{-}11)$$
$$R(-\theta) = \begin{bmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} \qquad (3\text{-}12)$$
This result can also be obtained directly by deriving the relation between the respective coordinate-distance pairs in the two systems. An alternative to giving the orientation of X′Y′ relative to XY as an angle, as depicted in figure 3.7, is to use unit vectors in the Y′ and X′ directions:
Thus, suppose V is a point vector in the XY system and is in the same direction as the +ve Y’
coordinate axis. If a unit vector along the +ve Y’ coordinate axis is, say,
$$v = \frac{V}{|V|} = (v_x, v_y) \qquad (3\text{-}14)$$

$$R = \begin{bmatrix} u_x & u_y & 0 \\ v_x & v_y & 0 \\ 0 & 0 & 1 \end{bmatrix} \qquad (3\text{-}16)$$

$$R = \begin{bmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
This result also follows from (3-12) by setting θ = 90°. Typically, in interactive applications, it is more convenient to choose a direction V relative to a position P0, which is the origin of the X′Y′ system, rather than relative to the XY origin, as referred to in figure 3.8.
Then we can use the unit vector

$$\frac{P_1 - P_0}{|P_1 - P_0|} = \frac{V}{|V|} = v \equiv (v_x, v_y) \qquad (3\text{-}17)$$
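A minimal Python sketch of building the matrix in (3-16) from such a direction vector (numpy is assumed, and the points P0 and P1 in the example are hypothetical):

```python
import numpy as np

def rotation_from_direction(p0, p1):
    """Build the XY -> X'Y' rotation matrix from a direction V giving the +Y' axis."""
    v = np.array(p1, dtype=float) - np.array(p0, dtype=float)
    v /= np.linalg.norm(v)               # unit vector along +Y'   (eq. 3-17)
    u = np.array([v[1], -v[0]])          # unit vector along +X' (v rotated -90 deg)
    return np.array([[u[0], u[1], 0],
                     [v[0], v[1], 0],
                     [0,    0,    1]])

# Example: Y' pointing along -X reproduces the 90-degree matrix shown above.
print(rotation_from_direction((0, 0), (-1, 0)))
```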
SELF-ASSESSMENT QUESTIONS - 3
5. OTHER TRANSFORMATIONS IN 2D
Apart from scaling, rotation, and translation, some of the other transformation techniques in
2D are as follows:
5.1 Reflection
This produces a mirror image of an object by rotating it 180° about an axis of rotation. For reflection about the X-axis, the x-values remain unchanged, but the y-values are flipped; the path of rotation is perpendicular to the XY plane for all points in the body.
$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \qquad (3\text{-}20)$$
For reflection about the Y-axis, the y-values remain unchanged, but the x-values are flipped; the path of rotation is perpendicular to the XY plane for all points in the body.
$$\begin{bmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \qquad (3\text{-}21)$$
For reflection about the origin, both the x-values and the y-values are flipped. The rotation is about an axis through (0, 0) perpendicular to the XY plane for all points in the body.
$$\begin{bmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \qquad (3\text{-}22)$$
Note that the above matrix is the same as the rotation matrix R(θ) with θ = 180°; so both these operations are equivalent.
Reflection about any other point, as referred to in figure 3.9, is the same as a rotation about an axis through the fixed reflection point Pref = (xref, yref) and perpendicular to the XY plane for all points in the body.
$$\begin{bmatrix} -1 & 0 & 2x_{ref} \\ 0 & -1 & 2y_{ref} \\ 0 & 0 & 1 \end{bmatrix} \qquad (3\text{-}23)$$
Prove that (3-23) is the transformation matrix by concatenating the matrices for a translation T(−xref, −yref), a rotation through 180°, and a translation T(xref, yref).
It is essential to note that this process is also equivalent to reflection about the X-axis followed by a rotation through +90°.
Variations
• Reflections about the coordinate axes or (0,0) can also be implemented as scaling with
negative scaling factors
• The elements of the reflection matrix can also be set to values other than ±1:
➢ For magnitudes greater than 1 the mirror image is shifted further away from the axis.
➢ For magnitudes less than 1 the mirror image is shifted nearer to the axis.
5.2 Shear
A shear transformation distorts the shape of an object, causing “internal layers to slide over”.
Two types here are i) a shift in x-values and ii) a shift in y-values.
x-shear:
Given by x′ = x + shₓ·y, y′ = y, with matrix

$$\begin{bmatrix} 1 & sh_x & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \qquad (3\text{-}24)$$

Here shₓ is any real number. For example, shₓ = 2 changes the square below into a parallelogram:
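A short check of this example in plain Python (the unit square used here is hypothetical):

```python
# Apply the x-shear of (3-24) with sh_x = 2 to the corners of a unit square;
# the square is mapped to a parallelogram.
sh_x = 2
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
sheared = [(x + sh_x * y, y) for (x, y) in square]
print(sheared)   # [(0, 0), (1, 0), (3, 1), (2, 1)]
```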
(3-26)
y-shear relative to the line x = xref:

Given by x′ = x, y′ = y + sh_y (x − x_ref), with matrix

$$\begin{bmatrix} 1 & 0 & 0 \\ sh_y & 1 & -sh_y \cdot x_{ref} \\ 0 & 0 & 1 \end{bmatrix} \qquad (3\text{-}27)$$

This shifts coordinate positions vertically by an amount proportional to their distance from the reference line x = xref; e.g., with sh_y = ½ relative to the line x = xref = −1 we have:
Remark: shears may also be expressed as compositions of the basic transformations. For example, the shear in (3-24) may be written as a combination of rotations and a scaling.
6. COMPOSITE TRANSFORMATIONS IN 2D
In effecting a sequence of transformations, we shall find that the result is equivalent to matrix
multiplication with a single composite (concatenated) matrix. We thus consider: -
i) Composite Translations
Let (tx1, ty1) and (tx2, ty2) be two successive translation vectors applied to P. Then the final position (in homogeneous coordinates) is

$$\mathbf{P}' = T(t_{x2}, t_{y2}) \cdot \{T(t_{x1}, t_{y1}) \cdot \mathbf{P}\} = T(t_{x1} + t_{x2},\, t_{y1} + t_{y2}) \cdot \mathbf{P}$$

Thus, two successive translations are additive (add their matrix arguments for the translation parts).
ii) Composite Rotations

$$\mathbf{P}' = R(\theta_2) \cdot [R(\theta_1) \cdot \mathbf{P}] = R(\theta_1 + \theta_2) \cdot \mathbf{P}$$
Thus, two successive rotations are additive (add the rotation angles in the corresponding
matrix arguments).
Thus two successive scalings are multiplicative. For example, if we triple the size of an object
twice => final scaling is 9 × original.
In effecting a sequence of transformations, we shall find that the result is equivalent to matrix
multiplications.
If we have a function for rotation about the origin, we can obtain rotation about any pivot
point (xr, yr) by means of the operations:
Diagrammatically we have:
We can use (3-33) to write a function that accepts a general pivot point, (xr, yr)
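A small sketch of such a function (numpy assumed), composing the translate-to-origin, rotate, and translate-back sequence described above into one matrix:

```python
import numpy as np

def rotate_about_pivot(theta, xr, yr):
    """Rotation about pivot (xr, yr) built from the basic matrices."""
    T_to_origin = np.array([[1, 0, -xr], [0, 1, -yr], [0, 0, 1]], dtype=float)
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]], dtype=float)
    T_back = np.array([[1, 0, xr], [0, 1, yr], [0, 0, 1]], dtype=float)
    return T_back @ R @ T_to_origin          # single composite matrix

# Rotating the point (3, 2) by 90 degrees about the pivot (1, 1).
p = np.array([3.0, 2.0, 1.0])
print(rotate_about_pivot(np.pi / 2, 1, 1) @ p)   # approximately (0, 3, 1)
```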
Similarly, if we have a function that scales only w.r.t. (0, 0), we can scale w.r.t. an arbitrary fixed point (xf, yf) by an analogous sequence of operations:
Diagrammatically we have:
$$\begin{bmatrix} 1 & 0 & x_f \\ 0 & 1 & y_f \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} s_x & 0 & 0 \\ 0 & s_y & 0 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 & -x_f \\ 0 & 1 & -y_f \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} s_x & 0 & x_f(1 - s_x) \\ 0 & s_y & y_f(1 - s_y) \\ 0 & 0 & 1 \end{bmatrix}$$
Or
We can use (3-34) to write a function that accepts a general fixed point, (xf, yf)
Thus, the scaling factors (sx, sy) apply stretching along the usual OX–OY directions. But when stretching is required along some other OS1–OS2 directions, referred to in figure 3.10, with scale factors (s1, s2),
we proceed as follows:
• Apply a rotation so that the OS1–OS2 axes coincide with the OX–OY axes
• Now scale as before
• Apply reverse rotation to the original directions
Example: using (3-35) with s1 = 1, s2 = 2, θ = 45° sends the unit square to a parallelogram.
Note that if the scaling above were performed w.r.t. an arbitrary fixed point rather than the origin, then an additional translation matrix would have to be incorporated into (3-35).
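A sketch of this rotate–scale–rotate-back composition (numpy assumed; the scale factors and angle come from the example above):

```python
import numpy as np

def scale_along_directions(s1, s2, theta):
    """Scale by (s1, s2) along directions making angle theta with the x-axis."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]], dtype=float)
    S = np.array([[s1, 0, 0], [0, s2, 0], [0, 0, 1]], dtype=float)
    return R @ S @ np.linalg.inv(R)       # R(theta) . S(s1, s2) . R(-theta)

# s1 = 1, s2 = 2, theta = 45 degrees applied to the unit square.
M = scale_along_directions(1, 2, np.radians(45))
for corner in [(0, 0), (1, 0), (1, 1), (0, 1)]:
    print((M @ np.array([*corner, 1.0]))[:2])   # a parallelogram results
```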
SELF-ASSESSMENT QUESTIONS - 4
7. SUMMARY
This unit provides information about the basic transformations, namely translation, rotation and scaling, and about other transformations like reflection and shear. We discussed matrix representations and homogeneous coordinates: scenes involving animation may need all of translation, rotation and scaling, and to process such combined operations more efficiently we re-formulated the previous transformation equations. Apart from applying transformations to an object in a particular coordinate system, it is necessary in many situations to be able to transform a description from one coordinate system to another, which calls for transformations between coordinate systems. We concluded with the other transformations, reflection and shear: reflection produces a mirror image of an object, whereas the shear transform distorts the shape of an object, causing internal layers to slide over.
8. TERMINAL QUESTIONS
9. ANSWERS
Terminal Questions
distorts the shape of an object, causing "internal layers to slide over". For more details refer section 3.5.
5. In effecting a sequence of transformations, we shall find that the result is equivalent to
matrix multiplication with a single composite (concatenated) matrix. Composite
translation, composite rotation and composite scaling are involved in composite
transformation. For more details refer section 3.6.
DCA3142
GRAPHICS AND MULTIMEDIA
Unit 4
2D Viewing
Table of Contents
8 Answers - - 22 - 23
1. INTRODUCTION
In the previous unit, we discussed various 2D transformations like translation, rotation, and
scaling. Other transformations that are often applied to objects include reflection and shear.
We also discussed matrix representation and homogeneous coordinates as well as
transformations between coordinate systems. We concluded the unit with a discussion on
the various composite transformations of 2D.
In this unit we will explore the concept of 2D viewing as well as 2D viewing pipeline that will
allow the designer to pan the image. The window-to-viewport mapping is the transformation that maps the window from world coordinates into the viewport in screen coordinates. This process can be achieved in the following sequence: window in world coordinates, window translated to the origin, window scaled to the size of the viewport, and finally translated to its final position.
We will then discuss clipping. Three types of clipping are used in this unit: point, line, and polygon clipping.
1.1 Objectives:
2. 2D VIEWING PIPELINE
The section of a scene that is to be shown on the screen is usually referred to as a clipping
window. This is because the unwanted portions are to be discarded or clipped off. After this
section is mapped to device coordinates, its placement can be controlled within the display
window on the screen by putting the mapped image of the clipping window into another
window known as the viewport. What is to be seen is selected by the clipping window, and
where it is shown on the output is the function of the viewport. In fact, several clipping
windows can be defined with each one mapping to a separate viewport, either on one device
or distributed amongst many. For better understanding, let us take the case in figure 4.1.
1. Construct the scene in world coordinates, using modeling coordinates for each part.
2. Set up a 2D viewing system with an oriented window
3. Transform to viewing coordinates
4. Define a viewport in normalized (0..1 or -1...1) coordinates and map it from the view
coordinates
5. Clip all parts outside the viewport
6. Transform to device coordinates
When all the transformations are done, clipping can be done in normalized coordinates or
device coordinates. The clipping process is fundamental in computer graphics.
SELF-ASSESSMENT QUESTIONS - 1
Once object descriptions have been transferred to the viewing reference frame, one must
choose the window extents in viewing coordinates and select the viewport limits in
normalized coordinates. Object descriptions are then transferred to normalized device
coordinates. This is done using a transformation that maintains the same relative placement
of objects in normalized space as was done in viewing coordinates. If for instance, a
coordinate position is at the center of the viewing window, it will be displayed at the center
of the viewport.
Figure 4.2 (a) illustrates the window-to-viewport mapping. A point at position (xw, yw) in
the window is mapped into position (xv, yv) in the associated view-port. To maintain the
same relative placement in the viewport as in the window, it is required that,
In figure 4.2 (b) a point at the position (xw, yw) in a designated window is mapped to viewport coordinates (xv, yv) so that relative positions in the two areas are the same. Solving these expressions for the viewport position (xv, yv), we have

xv = xvmin + (xw − xwmin) · sx,  yv = yvmin + (yw − ywmin) · sy,

where the scaling factors are sx = (xvmax − xvmin) / (xwmax − xwmin) and sy = (yvmax − yvmin) / (ywmax − ywmin).
Equations can also be derived with a set of transformations that converts the window area
into the viewport area. This conversion is performed with the following sequence of
transformations:
1. Perform a scaling transformation using a fixed-point position (xwmin, ywmin) that scales
the window area to the size of the viewport.
2. Translate the scaled window area to the position of the viewport. Relative proportions of objects are maintained if the scaling factors are the same (sx = sy). Otherwise, world objects will be stretched or contracted in either the x or the y direction when displayed on the output device.
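A minimal Python sketch of this window-to-viewport mapping (the window and viewport limits used in the example are hypothetical):

```python
def window_to_viewport(xw, yw, window, viewport):
    """Map a world point (xw, yw) from the clipping window into the viewport."""
    xw_min, yw_min, xw_max, yw_max = window
    xv_min, yv_min, xv_max, yv_max = viewport
    sx = (xv_max - xv_min) / (xw_max - xw_min)       # scale about (xw_min, yw_min)
    sy = (yv_max - yv_min) / (yw_max - yw_min)
    return xv_min + (xw - xw_min) * sx, yv_min + (yw - yw_min) * sy

# A point at the centre of the window maps to the centre of the viewport.
print(window_to_viewport(5, 5, window=(0, 0, 10, 10), viewport=(0, 0, 400, 300)))
# (200.0, 150.0)
```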
Character strings can be handled in two ways when they are mapped to a viewport. The
simplest mapping maintains a constant character size, even though the viewport area may
be enlarged or reduced, relative to the window. This method is employed when text is
formed with standard character fonts that cannot be changed. In systems that allow for
changes in character size, string definitions can be windowed just as other primitives. For
characters formed with line segments, the mapping to the viewport can be carried out as a
sequence of line transformations.
From normalized coordinates, object descriptions are mapped to the viewport display
devices. Any number of output devices can be open in a particular application, and another
window-to-viewport transformation can be performed for each open output device. This
mapping, called the workstation transformation, is accomplished by selecting a window area
in normalized space and a viewport area in the coordinates of the display device. With the
workstation transformation, we gain some additional control over the positioning of parts of
a scene on individual output devices. As illustrated in figure 4.3, it is possible to use work
station transformations to partition a view so that different parts of normalized space can be
displayed on different output devices.
SELF-ASSESSMENT QUESTIONS - 2
3. Character strings can be handled in two ways when they are mapped to a
viewport. State True/False.
4. Which sequence is followed to transform Window to viewport process?
a. scaling, translation
b. translation, scaling
c. rotation, scaling
d. Scaling, rotation
5. What is workstation transformation?
4. CLIPPING OPERATION
Clipping operations are used to remove unwanted portions of a picture – we either remove
the outside or the inside of a clipping window, which can be rectangular (most common),
polygonal or curved (complex).
We will first discuss the basic concepts of clipping before we start with point clipping. It is
desirable to restrict the effect of graphics primitives to a sub- region of the canvas in order
to protect other portions of the canvas. All primitives are clipped to the boundaries of this
clipping rectangle; that is, primitives lying outside the clip rectangle are not drawn. The
default clipping rectangle is the full canvas (the screen), and it is obvious that we cannot see
any graphics primitives outside the screen. A simple example of line clipping can illustrate
this idea: Figure 4.4 shows a simple example of line clipping: the display window is the
canvas and also the default clipping rectangle, thus all line segments inside the canvas are
drawn. The thick line box is the clipping rectangle and the dotted line is the extension of the
four edges of the clipping rectangle.
Now with the assumption of a rectangular clip window, point clipping is easy. The point can
be saved if:
If the x coordinate boundaries of the clipping rectangle are Xmin and Xmax, and the y coordinate boundaries are Ymin and Ymax, then the following inequalities must be satisfied for a point at (X, Y) to be inside the clipping rectangle: Xmin ≤ X ≤ Xmax and Ymin ≤ Y ≤ Ymax.
This section deals with clipping of lines against rectangles. Although there are specialized
algorithms for rectangle and polygon clipping, it is important to note that other graphic
primitives can be clipped by repeated application of the line clipper.
x = x1 + u (x2 − x1),  y = y1 + u (y2 − y1)
If the value of u for an intersection with a clipping edge is outside the range
0 to 1, then the line does not enter the interior of the window at that boundary. If the value
of u is within this range, then the line does enter the interior of the window at that boundary.
To clip a line, one needs to consider only its endpoints and not its many interior points. If both endpoints of a line lie inside the clip rectangle (for instance, AB in the figure 4.4 example), the entire line lies inside the clip rectangle and can be accepted. If one endpoint lies inside and one outside (for example CD), the line intersects the clip rectangle and it is necessary to compute the intersection point. If both endpoints are outside the clip rectangle,
the line may or may not intersect within the clip rectangle (EF, GH, and IJ), and it is necessary
to perform further calculations to determine whether there are any intersections. The brute-
force approach to clipping a line that cannot be accepted is to intersect that line with each of
the four clip-rectangle edges to see whether any intersection points lie on those edges. If that
is the case, the line cuts the clip rectangle and is partially inside. For each line and clip-
rectangle edge, we must take the two mathematically infinite lines that contain them and
intersect them. Next, it is essential to test whether this intersection point is "interior" – that
is, whether it lies within both the clip rectangle edge and the line. In that case, there is an
intersection with the clip rectangle. In the first example, intersection points G' and H' are
interior, but I' and J' are not.
In the algorithm, first of all, it is detected whether line lies inside the screen or it is outside
the screen. All lines come under any one of the following categories:
1. Visible
2. Not Visible
3. Clipping Case
1. Visible: If a line lies within the window, i.e., both endpoints of the line lie within the window, the line is visible and will be displayed as it is.
2. Not Visible: If a line lies completely outside the window, it is invisible and rejected. Such lines are not displayed.
3. Clipping Case: If the line is neither completely visible nor completely invisible, it is considered a clipping case.
The more efficient Cohen-Sutherland Algorithm performs initial tests on a line to determine
whether intersection calculations can be avoided.
End-points pairs of a line are checked for trivial acceptance or trivial reject using out code.
In case there is neither trivial acceptance nor trivial rejection, the line is divided into two segments at a clip edge. The line is iteratively clipped by testing for trivial acceptance or trivial rejection, and divided into two segments, until it is completely inside or trivially rejected.
To perform trivial accept and reject tests, the edges of the clip rectangle must be extended to divide the plane of the clip rectangle into nine regions as shown in figure 4.5. Each region
is assigned a 4- bit code determined by where the region lies with respect to the outside half
planes of the clip- rectangle edges. Each bit in the out code is set to either 1 (true) or 0 (false);
the 4 bits in the code correspond to the following conditions:
Bit 1: outside half plane of top edge, above top edge Y > Ymax
Bit 2: outside half plane of bottom edge, below bottom edge Y < Ymin
Bit 3: outside half plane of right edge, to the right of right edge X > Xmax
Bit 4: outside half plane of left edge, to the left of left edge X < Xmin
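A minimal Python sketch of the out-code computation and the trivial accept/reject tests just described (the clip-window limits used in the example are hypothetical):

```python
# Region out codes: a bit is set when the endpoint lies in the corresponding
# outside half-plane of the clip rectangle.
TOP, BOTTOM, RIGHT, LEFT = 8, 4, 2, 1

def out_code(x, y, xmin, ymin, xmax, ymax):
    code = 0
    if y > ymax: code |= TOP
    if y < ymin: code |= BOTTOM
    if x > xmax: code |= RIGHT
    if x < xmin: code |= LEFT
    return code

def trivial_tests(p1, p2, clip):
    c1 = out_code(*p1, *clip)
    c2 = out_code(*p2, *clip)
    if c1 == 0 and c2 == 0:
        return "trivially accepted"            # both endpoints inside
    if c1 & c2:
        return "trivially rejected"            # both outside the same half-plane
    return "needs intersection with a clip edge"

clip = (0, 0, 10, 10)                          # xmin, ymin, xmax, ymax
print(trivial_tests((2, 3), (8, 9), clip))     # trivially accepted
print(trivial_tests((12, 3), (15, 9), clip))   # trivially rejected
```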
In conclusion, the Cohen-Sutherland algorithm is efficient when out code testing can be done
cheaply (for example, by doing bit-wise operations in assembly language) and trivial
acceptance or rejection is applicable to the majority of line segments. (For example, large
windows - everything is inside, or small windows - everything is outside).
Liang-Barsky Algorithm
Faster line clippers have been developed that are based on analysis of the parametric
equation of a line segment, which can be written as:
x = x1 + u Δx,  y = y1 + u Δy
where Δx = x2 − x1 and Δy = y2 − y1. Using these parametric equations, Cyrus and Beck developed an algorithm that is generally more efficient than the Cohen-Sutherland algorithm. Later, Liang and Barsky independently devised an even faster parametric line-clipping algorithm. Following the Liang-Barsky approach, we first write the point-clipping conditions in a parametric way:
p1 = −Δx, q1 = x1 − xmin
p2 = Δx, q2 = xmax − x1
p3 = −Δy, q3 = y1 − ymin
p4 = Δy, q4 = ymax − y1
Any line that is parallel to one of the clipping boundaries has pk = 0 for the value of k corresponding to that boundary (k = 1, 2, 3, 4 correspond to the left, right, bottom, and top boundaries, respectively). If, for that value of k, one also finds qk >= 0, the line is inside the parallel clipping boundary.
When pk < 0, the infinite extension of the line proceeds from the outside to the inside of the infinite extension of the particular clipping boundary. If pk > 0, the line proceeds from the inside to the outside. For a nonzero value of pk, one can calculate the value of u that corresponds to the point where the infinitely extended line intersects the extension of boundary k as:
u = qk / pk
For each line, one can calculate values for parameters u1 and u2 that defines that part of the
line that lies within the clip rectangle. The value of u1 is determined by looking at the
rectangle edges for which the line proceeds from the outer side to the inner side. (p < 0). For
these edges we calculate rk = qk / pk.
The value of u1 is taken as the largest of the set consisting of 0 and the various values of r.
Conversely, the value of u2 is determined by examining the boundaries for which the line
proceeds from inside to outside (p > 0). A value of rk is calculated for each of these boundaries
and the value of u2 is the minimum of the set consisting of 1 and the calculated r values. If u1
> u2, the line is completely outside the clip window and it can be rejected. Otherwise, the end
points of the clipped line are calculated from the two values of parameter u.
This algorithm is presented in the following procedure. Line intersection parameters are initialized to the values u1 = 0 and u2 = 1. For each clipping boundary, the appropriate values for p and q are calculated and used by the function clipTest to determine whether the line can be rejected or whether the intersection parameters are to be adjusted.
When p < 0, the parameter r is used to update u1; when p > 0, the parameter r is used to update u2.
Otherwise, one can update the appropriate u parameter only if the new value results in the
shortening of the line. When p = 0 and q < 0, one can discard the line since it is parallel to and
outside of this boundary. If the line has not been rejected after all four values of p and q have
been tested, the endpoints of the clipped line are determined from values of u1 and u2.
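The procedure itself is not reproduced in this extract, so the following is only a compact Python sketch of the Liang-Barsky logic as described above (names and the example line are mine):

```python
def liang_barsky(x1, y1, x2, y2, xmin, ymin, xmax, ymax):
    """Clip a line segment; return the clipped endpoints or None if rejected."""
    dx, dy = x2 - x1, y2 - y1
    p = [-dx, dx, -dy, dy]                      # left, right, bottom, top
    q = [x1 - xmin, xmax - x1, y1 - ymin, ymax - y1]
    u1, u2 = 0.0, 1.0
    for pk, qk in zip(p, q):
        if pk == 0:                             # parallel to this boundary
            if qk < 0:
                return None                     # parallel and outside: reject
            continue
        r = qk / pk
        if pk < 0:
            u1 = max(u1, r)                     # entering: raise u1
        else:
            u2 = min(u2, r)                     # leaving: lower u2
    if u1 > u2:
        return None                             # completely outside the window
    return (x1 + u1 * dx, y1 + u1 * dy), (x1 + u2 * dx, y1 + u2 * dy)

print(liang_barsky(-5, 5, 15, 5, 0, 0, 10, 10))   # ((0.0, 5.0), (10.0, 5.0))
```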
SELF-ASSESSMENT QUESTIONS - 3
5. POLYGON CLIPPING
A polygon is usually defined by a sequence of vertices and edges. If the polygons are unfilled, line-clipping techniques are sufficient; however, if the polygons are filled, the process is more complicated. A polygon may be fragmented into several polygons in the clipping process, and the original colour must be associated with each of them. The Sutherland-Hodgeman clipping algorithm
clips any polygon against a convex clip polygon. The Weiler- Atherton clipping algorithm can
clip any polygon against any clip polygon. The polygons may even have holes.
An algorithm that clips a polygon must deal with many different cases. One case is particularly noteworthy: a concave polygon may be clipped into two separate polygons.
Overall the task of clipping seems rather complex. Each edge of the polygon must be tested
against each edge of the clip rectangle; new edges must be added, and existing edges must
be discarded, retained, or divided. Multiple polygons may result from clipping a single
polygon. It is important to have an organized method to deal with all these cases.
It is essential to note the difference between this strategy for a polygon and the Cohen-
Sutherland algorithm for clipping a line: The polygon clipper clips against four edges in
succession, whereas the line clipper tests outcode to see which edge is crossed, and clips only
when necessary.
• Polygons can be clipped against each edge of the window one at a time. Window-edge intersections, if any, are easy to find since the X or Y coordinates are already known.
• Vertices which are kept after clipping against one window edge are saved for clipping
against the remaining edges.
• Note that the number of vertices usually changes and will often increase.
• The Divide and Conquer approach is used.
Figure 4.6 shows the clip boundary that determines a visible and an invisible region. The edges from vertex i to vertex i+1 can be one of four types: both endpoints outside, outside-to-inside, both inside, or inside-to-outside, each contributing zero, one, or two vertices to the output.
Because clipping against one edge is independent of all others, it is possible to arrange the
clipping stages in a pipeline. The input polygon is clipped against one edge and any points
that are kept are passed on as input to the next stage of the pipeline. In this way four polygons
can be at different stages of the clipping process simultaneously. This is often implemented
in hardware.
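A minimal Python sketch of one such clipping stage, clipping against the left window edge; the other three edges follow the same pattern with their own inside/intersection tests, and the triangle in the example is hypothetical:

```python
def clip_against_edge(polygon, inside, intersect):
    """One Sutherland-Hodgeman stage: clip a vertex list against a single edge."""
    output = []
    for i in range(len(polygon)):
        s, p = polygon[i - 1], polygon[i]       # polygon edge from s to p
        if inside(p):
            if not inside(s):
                output.append(intersect(s, p))  # entering: add intersection first
            output.append(p)
        elif inside(s):
            output.append(intersect(s, p))      # leaving: add intersection only
    return output

def clip_left(polygon, xmin):
    def inside(v): return v[0] >= xmin
    def intersect(s, p):
        t = (xmin - s[0]) / (p[0] - s[0])
        return (xmin, s[1] + t * (p[1] - s[1]))
    return clip_against_edge(polygon, inside, intersect)

# Clip a triangle against the left window edge x = 0.
print(clip_left([(-2, 0), (4, 0), (4, 4)], xmin=0))
```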
An array of records shows the most recent point that was clipped for each clip-window boundary. The main routine passes each vertex p to the clipPoint routine for clipping against the first window boundary. If the line defined by the endpoints p and s (the previously processed vertex) crosses this window boundary, the intersection is calculated and passed to the next clipping stage. If p is inside the window, it is passed to the next clipping stage. Any point that survives clipping against all window boundaries is then entered into the output array of points. The array firstPoint stores, for each window boundary, the first point clipped against that boundary. After
all polygon vertices have been processed, a closing routine clips lines defined by the first and
last points clipped against each boundary.
Convex polygons are correctly clipped by the Sutherland-Hodgeman algorithm, but concave
polygons may be displayed with extraneous lines. This occurs when the clipped polygon
should have two or more separate sections. But since there is only one output vertex list, the
last vertex in the list is always joined to the first vertex. There are several things that can be
done to correct display concave polygons. For instance one could split the concave polygon
into two or more convex polygons and process each convex polygon separately.
Another approach to check the final vertex list for multiple vertex points is, along any clip
window boundary and correctly join pairs of vertices. Finally, one could use a more general
polygon clipper, such as wither, the Weiler- Atherton algorithm or the Weiler algorithm
described in the next section.
In this technique, the vertex-processing procedures for window boundaries are modified so
that concave polygons are displayed correctly. This clipping procedure was developed as a
method for identifying visible surfaces, and it can be applied with arbitrary polygon-clipping
regions.
The basic idea in this algorithm is that instead of always proceeding around the polygon edges as vertices are processed, one can sometimes follow the window boundaries; which path is followed depends on the polygon-processing direction and on whether the current pair of edges crosses from outside to inside or from inside to outside of the clip region.
In figure 4.7, the processing direction in the Weiler-Atherton algorithm and the resulting clipped polygon are shown for a rectangular clipping window.
SELF-ASSESSMENT QUESTIONS - 4
6. SUMMARY
Let us recapitulate the unit now. This unit started with the discussion of 2D viewing pipeline
that would allow the designer to pan the image content. We then covered the window-to-viewport coordinate transformation, in which, once the object descriptions have been transferred to the viewing reference frame, one chooses the window extents in viewing coordinates and selects the viewport limits in normalized coordinates. We also discussed the various clipping operations
like point, line and polygon. The Cohen-Sutherland algorithm is efficient when out code
testing can be done cheaply (for example, by doing bit-wise operations in assembly
language) and trivial acceptance or rejection is applicable to the majority of line segments.
The Liang-Barsky algorithm is more efficient than the Cohen Sutherland algorithm, since
intersection calculations are reduced.
7. TERMINAL QUESTIONS
8. ANSWERS
1. Clipping window
2. Normalized and device coordinates.
3. True.
4. a) scaling, translation
5. Any number of output devices can be open in a particular application, and another
window-to-viewport transformation can be performed for each open output device. This
mapping, called the workstation transformation
6. Clipping
7. Graphic primitives
8. True
9. Cohen-Sutherland Line-Clipping
10. Liang-Barsky
11. Vertices and edges
12. True
13. Divide-and-conquer
14. Concave polygons
Terminal Questions
common), polygonal or curved (complex). Clipping can be done on line, point and
polygon. For more details refer section 6.4.
4. The more efficient Cohen-Sutherland Algorithm performs initial tests on a line to
determine whether intersection calculations can be avoided. For more details refer sub
section 6.4.2.
5. Faster line clippers have been developed that are based on analysis of the parametric equation of a line segment, which we can write in parametric form. For more details refer subsection 6.4.2.
6. A polygon is usually defined by a sequence of vertices and edges. If the polygons are unfilled, line-clipping techniques are sufficient; however, if the polygons are filled, the process is more complicated. For more details refer section 6.5.
DCA3142
GRAPHICS AND MULTIMEDIA
Unit 5
3D Transformation & Viewing
Table of Contents
8 Terminal Questions - - 26
9 Answers - - 27 - 28
1. INTRODUCTION
In the previous unit, we discussed the concept of 2D viewing, including the viewing pipeline, window-to-viewport coordinate transformation, and various clipping operations like point, line and polygon clipping.
In this unit we will discuss 3D transformation, which includes basic transformations like
rotation, scaling and translation and the other transformations like reflection and shearing.
These transformations allow the developer to reposition, resize, and reorient models
without changing the base values that define them. We will also discuss the concept of
rotation about an arbitrary axis in space and reflection through an arbitrary plane. 3D
projection is a method of mapping three-dimensional points to a two- dimensional plane. We
will also look at parallel projection in this unit. This unit will conclude with a discussion on
3D viewing pipeline and coordinate system.
1.1 Objectives:
3D transformations refer to the process of changing the position, orientation, and size of a
three-dimensional object in space. These transformations are typically applied to 3D models
in computer graphics, video games, and virtual reality applications to create the illusion of
movement and depth. 3D transformations allow a developer to reposition, resize, and
reorient models without changing the base values that define them
Points are now represented with 3 numbers: <x, y, z>. This particular method of representing
3D space is the "left-handed" coordinate system. In the left-handed system the x axis
increases going to the right, the y axis increases going up, and the z axis increases going into
the page/screen. The right-handed system is similar but with the z-axis pointing in the
opposite direction.
Transformations
A static set of 3D points or other geometric shapes on a screen may not be very interesting.
A paint program is enough to produce one of these. To make a program interesting, one may
want a dynamic landscape on the screen. For this the points will have to move in the world
coordinate system, and even the point-of-view (POV) may be required to move. In short, the
objective could be to model the real world. The process of moving points in space is called transformation, and can be divided into translation, rotation, scaling, and other kinds of transformations.
2.1 Translation
For any point P (x, y, z) after translation, there is P′ (x′, y′, z′) wherein,
x′ = x + tx,
y′ = y + ty,
z′ = z + tz
Where
$$P = \begin{bmatrix} x \\ y \\ z \end{bmatrix}, \qquad P' = \begin{bmatrix} x' \\ y' \\ z' \end{bmatrix}, \qquad T = \begin{bmatrix} t_x \\ t_y \\ t_z \end{bmatrix}$$
Now we will discuss the 3D translation with an example. Take the instance of a case where
one may want to move a point “3 meters east, -2 meters up, and 4 meters north.”
Homogeneous Coordinates
$$\begin{bmatrix} x' \\ y' \\ z' \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & t_x \\ 0 & 1 & 0 & t_y \\ 0 & 0 & 1 & t_z \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} = \begin{bmatrix} x + t_x \\ y + t_y \\ z + t_z \\ 1 \end{bmatrix}$$
This shows that each of the 3 coordinates gets translated by the corresponding translation
distance.
SELF-ASSESSMENT QUESTIONS - 1
2.2 Rotation
Rotation is the process of moving a point in space in a non-linear manner. More particularly,
it involves moving the point from one position on a sphere whose center is at the origin, to
another position on the sphere. But what could be the purpose of doing something like this?
Allowing the point of view to move around is only an illusion – projection requires that the
POV be at the origin. When the user thinks the POV is moving, it actually means that all the
points are being translated in the opposite direction; and when the user thinks the POV is
looking down a new vector, all the points are being rotated in the opposite direction.
Normalization: The process of moving points in such a manner that the POV is at the origin
looking down the +Z axis is called normalization. Rotation of a point requires the need to
know the coordinates for the point, and the rotation angles.
It is important to know three different angles: how far to rotate around the X axis (YZ
rotation, or “pitch”); how far to rotate around the Y axis (XZ plane, or “yaw”); and how far to
rotate around the Z axis (XY rotation, or “roll”). Conceptually, three rotations are done
separately. First, one rotates around one axis, followed by another, then the last. The order
of rotations is important when rotations are cascaded; one must rotate first around the Z
axis, then around the X axis, and finally around the Y axis.
To show how the rotation formulas are derived, let us rotate the point
<x, y, z> around the Z axis with an angle of θ degrees as shown in figure 5.2
In the figure 5.2, it can be noted that, when one rotates around the Z axis, the Z element of
the point does not change. In fact, it is possible to just ignore the Z as its position post rotation
is known. If the Z element is ignored, then it is similar to rotating the two-dimensional point
<x, y> through the angle θ. This is the way a 2-D point is rotated as shown in figure 5.3. For
simplicity, consider the pivot at origin and rotate point P (x, y) where x = r cosФ and y = r
sinФ
If rotated by θ then:

x′ = r cos(Ф + θ) = x cos θ − y sin θ

and

y′ = r sin(Ф + θ) = x sin θ + y cos θ
Now for rotation around other axes shown in figure 5.4, cyclic permutation helps form the
equations for yaw and pitch as well: In the equations (5-1), replacing x with y and y with z
gives equations for rotation around x-axis. In the modified equations if y is replaced with z
and z with x then the equations for rotation around y-axis is arrived at.
x′ = x
y′ = y cos θ − z sin θ
z′ = y sin θ + z cos θ
$$\begin{bmatrix} x_1 & y_1 & z_1 \\ x_2 & y_2 & z_2 \\ x_3 & y_3 & z_3 \end{bmatrix}$$
This is designated as a 3×3 matrix (the first 3 is the number of rows, and the second 3 is the
number of columns). The “rows” of the matrix are the horizontal vectors that make it up; in
this case, <x1, y1, z1>, <x2, y2, z2>, and <x3, y3, z3>. In mathematics, vertical vectors are
called “columns.” In this case they are < x1, x2, x3>, <y1, y2, y3> and <z1, z2, z3>. The most
important thing to be done with a matrix is to multiply it by a vector or another matrix. One
simple rule followed when multiplying something by a matrix: multiply each column by a
multiplicand and store this as an element in the result. As mentioned earlier, each column
can be considered a vector, so when multiplying by a matrix, one is merely doing a bunch of
vector multiplies. So which vector multiply does one use-the dot product, or the cross
product? The dot product is used.
Another simple rule followed when multiplying a matrix by something is: multiply each row
by the multiplier. Again, rows are just vectors, and the type of multiplication is the dot
product.
Let us look at some examples. To begin with let us assume that there is a matrix M, and it
must be multiplied by a point < x, y, z>. The first thing known is that the vector rows of the
matrix must contain three elements (in other words, three columns). Because these rows
should be multiplied by a point using a dot product, and to do that, the two vectors must have
the same number of elements. Since there will be a dot product for each row in M, one is
likely to end up with a tuple that has one element for each row in
M. As stated earlier, we work almost exclusively with square matrices. If the requirement for
us is three columns, M will also have three rows. Let us see:
$$\langle x, y, z \rangle \cdot \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \{\langle x, y, z\rangle \cdot \langle 1, 0, 0\rangle,\ \langle x, y, z\rangle \cdot \langle 0, 1, 0\rangle,\ \langle x, y, z\rangle \cdot \langle 0, 0, 1\rangle\} = \langle x, y, z \rangle$$
Using matrices for rotation: roll (rotation about the z axis) has the matrix representation shown in figure 5.5:

$$\begin{bmatrix} x' \\ y' \\ z' \\ 1 \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta & 0 & 0 \\ \sin\theta & \cos\theta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}$$

Pitch (rotation about the x axis):

$$\begin{bmatrix} x' \\ y' \\ z' \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta & 0 \\ 0 & \sin\theta & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}$$

Yaw (rotation about the y axis):

$$\begin{bmatrix} x' \\ y' \\ z' \\ 1 \end{bmatrix} = \begin{bmatrix} \cos\theta & 0 & \sin\theta & 0 \\ 0 & 1 & 0 & 0 \\ -\sin\theta & 0 & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}$$
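A small numpy sketch of these three matrices (the example point is hypothetical):

```python
import numpy as np

def roll(theta):                      # rotate about z
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]])

def pitch(theta):                     # rotate about x
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]])

def yaw(theta):                       # rotate about y
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0, s, 0], [0, 1, 0, 0], [-s, 0, c, 0], [0, 0, 0, 1]])

p = np.array([1.0, 0.0, 0.0, 1.0])
print(roll(np.pi / 2) @ p)            # the x axis is rotated onto the y axis
```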
2.3 Scaling
x′ = x · Sx
y′ = y · Sy
z′ = z · Sz
Scaling an object with transformation changes the size of the object and repositions the
object relative to the coordinate origin. If the transformation parameters are not all equal,
relative dimensions in the object are changed.
Uniform Scaling: Here the original shape of an object is preserved with uniform scaling (Sx =
Sy = Sz)
Differential Scaling: Here the original shape of an object is not preserved because of
differential scaling (Sx <> Sy <> Sz)
$$\begin{bmatrix} S_x & 0 & 0 & 0 \\ 0 & S_y & 0 & 0 \\ 0 & 0 & S_z & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
Scaling with respect to a selected fixed position (Xf, Yf, Zf) can be represented with the
following transformation sequence:
For these three transformations one can have a composite transformation matrix by
multiplying three matrices into one as shown below.
$$\begin{bmatrix} 1 & 0 & 0 & X_f \\ 0 & 1 & 0 & Y_f \\ 0 & 0 & 1 & Z_f \\ 0 & 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} S_x & 0 & 0 & 0 \\ 0 & S_y & 0 & 0 \\ 0 & 0 & S_z & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 & -X_f \\ 0 & 1 & 0 & -Y_f \\ 0 & 0 & 1 & -Z_f \\ 0 & 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} S_x & 0 & 0 & (1 - S_x)X_f \\ 0 & S_y & 0 & (1 - S_y)Y_f \\ 0 & 0 & S_z & (1 - S_z)Z_f \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
SELF-ASSESSMENT QUESTIONS - 2
3. OTHER TRANSFORMATIONS
Apart from translation, rotation and scaling, following are the other two transformations.
3.1 Reflection
The matrix representation for reflection of points relative to the XZ plane (the y-values are flipped) is

$$\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

The matrix representation for reflection of points relative to the YZ plane (the x-values are flipped) is

$$\begin{bmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

The matrix representation for reflection of points relative to the XY plane (the z-values are flipped) is

$$\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
3.2 Shear
Z-axis Shear:

$$\begin{bmatrix} 1 & 0 & a & 0 \\ 0 & 1 & b & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
Parameters a & b can be assigned as well as real values. The effect of this transformation
matrix is to alter X and Y coordinate values by an amount that is proportional to the z value,
while leaving the z coordinate unchanged.
Y-axis Shear:
$$\begin{bmatrix} 1 & a & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & c & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
X-axis Shear:
$$\begin{bmatrix} 1 & 0 & 0 & 0 \\ b & 1 & 0 & 0 \\ c & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
SELF-ASSESSMENT QUESTIONS - 3
To perform a rotation about an arbitrary axis in space, we need to first define the axis of
rotation. The axis of rotation can be represented by a unit vector, which indicates the
direction of the line of rotation. Next, we need to specify the angle of rotation, which is the
amount of rotation around the axis.
The general case of rotation about an arbitrary axis in space frequently happens in, robotics,
animation, and simulation. Since the technique for rotation about a coordinate axis is known,
the underlying procedural idea is to make the arbitrary rotation axis coincident with one of
the coordinate axes as shown in figure 5.6. Assume an arbitrary axis in space passing through
the point (x0, y0,z0) with direction cosines (cx, cy, cz). Rotation about this axis by some angle
δ is accomplished using the following procedure
Translate so that the point (x0, y0, z0) is at the origin of the coordinate system.
➢ Perform appropriate rotations to make the axis of rotation coincident with the z-axis.
➢ Rotate about the z-axis by the angle δ.
➢ Perform the inverse of the combined rotation transformation.
➢ Perform the inverse of the translation.
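The whole procedure can be sketched compactly in code. The version below uses column vectors (the listings (5-5) to (5-8) derived below use the equivalent row-vector form), numpy is assumed, and the axis point, direction, and angle in the example are hypothetical:

```python
import numpy as np

def rotate_about_axis(point, axis_point, axis_dir, delta):
    """Rotate a point by delta about the axis through axis_point with direction axis_dir."""
    cx, cy, cz = np.asarray(axis_dir, float) / np.linalg.norm(axis_dir)
    d = np.hypot(cy, cz)
    T = np.eye(4); T[:3, 3] = -np.asarray(axis_point, float)   # move axis point to origin
    Rx = np.eye(4)
    if d != 0:                                   # axis already lies in the xz plane if d == 0
        Rx[1:3, 1:3] = [[cz / d, -cy / d], [cy / d, cz / d]]   # rotate into the xz plane
    Ry = np.eye(4); Ry[0, 0] = Ry[2, 2] = d; Ry[0, 2] = -cx; Ry[2, 0] = cx  # onto the z-axis
    c, s = np.cos(delta), np.sin(delta)
    Rz = np.eye(4); Rz[:2, :2] = [[c, -s], [s, c]]             # rotate by delta about z
    A = Ry @ Rx                                                # alignment with +z
    M = np.linalg.inv(T) @ np.linalg.inv(A) @ Rz @ A @ T       # undo alignment and translation
    return (M @ np.append(np.asarray(point, float), 1.0))[:3]

# Rotating (1, 0, 0) by 90 degrees about the z-axis through the origin gives (0, 1, 0).
print(rotate_about_axis((1, 0, 0), (0, 0, 0), (0, 0, 1), np.pi / 2))
```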
In general making an arbitrary axis passing through the origin coincident with one of the
coordinate axes requires two successive rotations about the other two coordinate axes. To
make the arbitrary rotation axis coincident with the z-axis, first rotate about the x-axis and
then about the y-axis. To determine the rotation angle , α, about the x-axis used to place the
arbitrary axis in the xz plane, first project the unit vector along the axis onto the yz plane as
shown in fig 5.7 (a). The y and z components of the projected vector are cy and cz, the
direction cosines of the unit vector along the arbitrary axis. In fig 7.7 (a) it is shown that
$$d = \sqrt{c_y^2 + c_z^2}$$
and
$$\cos\alpha = \frac{c_z}{d}, \qquad \sin\alpha = \frac{c_y}{d} \qquad (5\text{-}1) \text{ and } (5\text{-}2)$$
After rotation about the x-axis into the xz plane, the z component of the unit vector is d, and the x component is cx, the direction cosine in the x direction, as shown in fig 5.7(b). The length of the unit vector is, of course, 1. Thus, the rotation angle β about the y-axis required to make the arbitrary axis coincident with the z-axis is given by cos β = d and sin β = cx (5-3).
$$[T] = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -x_0 & -y_0 & -z_0 & 1 \end{bmatrix} \qquad (5\text{-}5)$$
$$[R_x] = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\alpha & \sin\alpha & 0 \\ 0 & -\sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & c_z/d & c_y/d & 0 \\ 0 & -c_y/d & c_z/d & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \qquad (5\text{-}6)$$
Figure 5.7: Rotation required to make the unit vector op coincident with the z-axis a)
rotation about x b) rotation about y.
Finally, the rotation about the arbitrary axis is given by a z-axis rotation matrix,
$$[R_\delta] = \begin{bmatrix} \cos\delta & \sin\delta & 0 & 0 \\ -\sin\delta & \cos\delta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \qquad \text{(5-8)}$$
In practice the angles α and β are not explicitly calculated. The elements of the rotation
matrices [Rx] and [Ry] in equation (5-4) are obtained from equations (5-1 to 5-3) at the
expense of two divisions and a square root calculation. Although developed with the
arbitrary axis in the first quadrant, these results are applicable in all quadrants.
If the direction cosines of the arbitrary axis are not known, they can be obtained knowing a second point on the axis (x1, y1, z1) by normalizing the vector from the first to the second point. Specifically, the vector along the axis from (x0, y0, z0) to (x1, y1, z1) is
[(x1 - x0)   (y1 - y0)   (z1 - z0)]
and dividing each component by the length of this vector gives the direction cosines (cx, cy, cz).
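The whole procedure can be sketched in code. The short Python/NumPy sketch below uses illustrative names and the column-vector convention (so its matrices are the transposes of the row-vector forms in equations (5-5) to (5-8)); it builds the composite rotation by angle delta about the axis through points p0 and p1.

import numpy as np

def rotate_about_axis(p0, p1, delta):
    """4x4 rotation by angle delta about the axis through p0 and p1
    (column-vector convention: p' = M @ p)."""
    p0 = np.asarray(p0, dtype=float)
    p1 = np.asarray(p1, dtype=float)
    cx, cy, cz = (p1 - p0) / np.linalg.norm(p1 - p0)   # direction cosines
    d = np.hypot(cy, cz)

    T = np.eye(4);  T[:3, 3] = -p0                     # translate p0 to the origin
    Ti = np.eye(4); Ti[:3, 3] = p0                     # inverse translation

    Rx = np.eye(4)                                     # rotate the axis into the xz plane
    if d > 1e-12:                                      # if d ~ 0 the axis already lies on x
        Rx[1, 1], Rx[1, 2] = cz / d, -cy / d
        Rx[2, 1], Rx[2, 2] = cy / d,  cz / d

    Ry = np.eye(4)                                     # rotate the axis onto the z axis
    Ry[0, 0], Ry[0, 2] = d, -cx
    Ry[2, 0], Ry[2, 2] = cx, d

    c, s = np.cos(delta), np.sin(delta)
    Rz = np.eye(4)                                     # rotation about the z axis by delta
    Rz[0, 0], Rz[0, 1] = c, -s
    Rz[1, 0], Rz[1, 1] = s,  c

    # apply the steps, then unwind them (orthonormal inverses are transposes)
    return Ti @ Rx.T @ Ry.T @ Rz @ Ry @ Rx @ T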
5. REFLECTION THROUGH AN ARBITRARY PLANE
The reflection matrices given earlier cause reflections through the x = 0, y = 0, and z = 0 coordinate planes, respectively. Often it is necessary to reflect an object through a plane other than one of these. This can be accomplished using a procedure incorporating the previously defined simple transformations. One possible procedure is:
➢ Translate a known point P, which lies in the reflection plane, to the origin of the coordinate system.
➢ Rotate the normal vector to the reflection plane at the origin until it is coincident with the +z-axis (Eqs. 5-6 and 5-7); this makes the reflection plane the z = 0 coordinate plane.
➢ After applying the above transformations to the object, reflect the object through the z = 0 coordinate plane.
➢ Perform the inverse transformations to those given above to achieve the desired result.
Here the matrices [T], [Rx], and [Ry] are given by equations (5-5) to (5-7), respectively; (x0, y0, z0) = (px, py, pz), the components of the point P in the reflection plane; and (cx, cy, cz) are the direction cosines of the normal to the reflection plane.
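The same result can also be obtained directly, without building the intermediate rotation matrices, by using a Householder reflection about the plane's unit normal. The Python/NumPy sketch below (illustrative names, column-vector convention) is one such shortcut, not the step-by-step procedure above.

import numpy as np

def reflect_through_plane(point_on_plane, normal):
    """4x4 matrix that reflects points through the plane containing
    point_on_plane with the given normal (column-vector convention)."""
    p = np.asarray(point_on_plane, dtype=float)
    n = np.asarray(normal, dtype=float)
    n = n / np.linalg.norm(n)

    R = np.eye(3) - 2.0 * np.outer(n, n)   # Householder reflection about the plane
    M = np.eye(4)
    M[:3, :3] = R
    M[:3, 3] = p - R @ p                   # keeps the chosen plane point fixed
    return M

# Example: reflect through the plane z = 1 (normal along +z)
M = reflect_through_plane([0, 0, 1], [0, 0, 1])
print(M @ np.array([0, 0, 3, 1]))          # -> [0, 0, -1, 1]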
SELF-ASSESSMENT QUESTIONS - 4
10. To make arbitrary rotation axis coincident with the z-axis, first rotate about
the____________ and then about the ________ .
11. The simple reflection transformations cause reflection through the x = ____, y = ____, z = ____
coordinate planes.
a. 0, 0, 0 b. 0, 0, 1 c. 1, 1, 0 d. 1, 1, 1
$$\text{Parallel } M = \begin{bmatrix} 1 & 0 & L_1\cos\varphi & 0 \\ 0 & 1 & L_1\sin\varphi & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
Perspective projection is a technique that simulates the way the human eye perceives depth
and distance. In this technique, objects that are farther away from the viewer appear smaller
than objects that are closer. The projection is achieved by drawing lines from the viewer's
eye through each vertex of the 3D object and intersecting them with a projection plane. The
resulting image has the appearance of depth and perspective.
Orthographic projection is a technique that does not take into account the viewer's
perspective. In this technique, the 3D object is projected onto a 2D plane along parallel lines
that are perpendicular to the plane. This results in an image that is flat and does not have the
appearance of depth or perspective.
In addition to perspective and orthographic projection, other techniques such as ray tracing
and rasterization can be used for 3D viewing. Ray tracing involves tracing the path of light
rays as they interact with objects in the scene to create a realistic image. Rasterization
involves converting the 3D object into a series of pixels that can be displayed on a 2D surface.
Viewing a scene in 3D is much more complicated than 2D viewing. In the latter, the viewing
plane on which a scene is projected from WCs is basically the screen, except for its
dimensions. In 3D, one can choose different viewing planes, directions to view from and
positions to view from. There is also a choice in how to project from the WC scene onto the
viewing plane. In the process of viewing a 3D scene, a coordinate system for viewing is set
up, which holds the viewing or “camera” parameters: position and orientation of a viewing
or projection plane (~ camera “film”).
To establish the 3D viewing reference frame, we first select a world-coordinate position P0 = (x0, y0, z0) for the viewing origin. This is also called the view point or viewing position. We then choose a view-up vector V, which defines the yv direction, and a vector giving the direction along which viewing is done, which defines the zv direction. The view plane, or projection plane, is usually taken as a plane perpendicular to the zv-axis, set at a position zvp from the origin. Its orientation is specified by choosing a view-plane normal vector N, which also specifies the direction of the positive zv axis. Figure 5.14 shows the right-handed systems typically employed in this setup.
The direction of viewing is usually taken as the −N (or −zv) direction for right-handed coordinate systems (or the opposite direction for left-handed coordinate systems).
Having chosen N, the unit normal vector n is formed for the zv direction; the unit vector u is then formed for the xv direction, and V is adjusted to give a new unit vector v for the yv direction. Cross-products are used so that each vector is orthogonal to the plane of the other two:
$$\vec n = \frac{\vec N}{|\vec N|} = (n_x, n_y, n_z)$$
$$\vec u = \frac{\vec V \times \vec n}{|\vec V \times \vec n|} = (u_x, u_y, u_z)$$
$$\vec v = \vec n \times \vec u = (v_x, v_y, v_z)$$
Finally, the view plane is chosen as a plane perpendicular to n (that is, to the zv-axis), at some point on the axis at a chosen distance from the view-frame origin.
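A minimal Python/NumPy sketch of these cross-product formulas (the function name is illustrative):

import numpy as np

def viewing_frame(N, V):
    """Return the unit axes (u, v, n) of the viewing reference frame,
    given the view-plane normal N and the view-up vector V."""
    n = N / np.linalg.norm(N)          # z_v direction
    u = np.cross(V, n)
    u = u / np.linalg.norm(u)          # x_v direction
    v = np.cross(n, u)                 # y_v direction (already unit length)
    return u, v, n

# Example: normal along +z with +y as "up" gives the usual axes
u, v, n = viewing_frame(np.array([0.0, 0.0, 1.0]), np.array([0.0, 1.0, 0.0]))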
SELF-ASSESSMENT QUESTIONS - 5
7. SUMMARY
This unit provided information about the 3D transformations rotation, translation, and scaling, and about the other transformations, reflection and shear. Translation is used to move a point, or a set of points, linearly in space. Rotation moves a point in space along a circular path about an axis. Scaling an object changes its size and repositions the object relative to the coordinate origin. Shearing transformations can be used to modify object shapes. Rotation about an arbitrary axis in space frequently occurs in cases like robotics, animation, and simulation. Reflection through an arbitrary plane generalizes the simple reflections through the x = 0, y = 0, and z = 0 coordinate planes. Projection is a way of converting an object in an N-dimensional system to N−1 dimensions. The unit also discussed parallel projection with its types, orthographic and oblique projection, and focused on 3D viewing and the steps involved from the actual construction of a 3D scene to its ultimate depiction on a device.
8. TERMINAL QUESTIONS
1. Explain 3D transformations.
2. Discuss the other transformations of 3D.
3. Explain rotation about an arbitrary axis in space.
4. Explain the reflection through an arbitrary plane.
5. Explain 3D viewing.
9. ANSWERS
1. point-of-view
2. Translation
3. A point is similar to its 2D counterpart; we simply add an extra component, Z, for the 3rd
axis
4. Rotation
5. Ordered set of numbers
6. uniform scaling
7. True
8. reflection axis.
9. Modify
10. x-axis y-axis
11. a) 0, 0,0
12. view point
Terminal Questions
1. A static set of 3D points or other geometric shapes on screen is not very interesting. You could just use a paint program to produce one of these. To make your program interesting, you will want a dynamic landscape on the screen. For more details refer section 5.2.
2. Reflection and shear are the two other 3D transformations. These transformations allow the developer to reposition, resize, and reorient models without changing the base values that define them. For more details refer section 5.3.
3. Since the technique for rotation about a coordinate axis is known, the underlying
procedural idea is to make the arbitrary rotation axis coincident with one of the
coordinate axes. For more details refer section 5.4.
4. The simple reflection matrices cause reflections through the x = 0, y = 0, and z = 0 coordinate planes, respectively. Often it is necessary to reflect an object through a plane other than one of these. For more details refer section 5.5.
5. Viewing a scene in 3D is much more complicated than 2D viewing, where in the latter
the viewing plane on which a scene is projected from WCs is basically the screen, except
for its dimensions. For further details refer section 5.6.
DCA3142
GRAPHICS AND MULTIMEDIA
Unit 6
Curves
Table of Contents
8 Terminal Questions - - 22
9 Answers - - 22 - 23
1. INTRODUCTION
Every curve or surface can be defined by a set of parametric functions. For instance (x, y, z)
coordinates of the points of the curve can be given as:
x= X (t), y = Y (t), z=Z (t), t being the parameter and x, y, z being polynomial functions in t. If
x, y and z are 1st degree polynomials, a line segment will be defined. In that case, two knowns
only (that is two points or a point and a slope) will be sufficient to define this curve. If x, y, z
are 2nd degree polynomials, a parabola segment is defined and 3 knowns will be necessary
to describe it (that is, 3 points or two points and a tangent). For higher degree polynomials,
describing the curve will involve more knowns. This number of knowns is what is called the
order of the curve, and is always the degree of the polynomials plus 1. Lower degree
polynomials describe very restrictive curves, being either lines or parabolas, which are
always planar curves. Various approaches have been conceptualized by mathematicians, for
instance, Bezier curves, B-Spline curves, Non uniform rational B-spline curves and surfaces,
which are discussed in this unit.
1.1 Objectives:
Every graphics system has some form of primitive for drawing lines. Using these primitives one can
draw many complex shapes. However, as these shapes get more complex and finely detailed,
so does the data needed to describe them accurately.
The worst case scenario is the curve. A curve can be described by a finite number of short
straight segments. However, on close inspection, this is only an approximation. To get a
better approximation one can use more segments per unit length. This increases the amount
of data required to store the curve and makes it difficult to manipulate. It is necessary to have
a method for representing these curves in a mathematical fashion. Ideally, descriptions will
be:
➢ Reproducible - the representation should give the same curve every time;
➢ Computationally Quick;
➢ Easy to manipulate, especially important for design purposes;
➢ Flexible
➢ Easy to combine with other segments of curve.
The types of curve that will be discussed fall into two broad categories: interpolating and
approximation curves. Interpolating curves pass through the points used to describe them,
whereas an approximating curve only comes near those points. The exact definition of ‘near’ will
be discussed later. The points through which the curve passes are known as knots; the curve
described by the equation is often referred to as a spline. This term originated in manual
design, where a spline is a thin strip, held in place by weights to create a curve which could
then be traced. In the same way knots are now used to describe a curve.
Everyone who has ever tried to apply simple linear interpolation to find a value between
pairs of data points will be aware that such attempts are unlikely to provide reliable results
if the data being used is anything other than broadly linear. In an attempt to deal with
inherent non-linearity, the next step usually involves some sort of polynomial interpolation.
This generally leads to far more stable and robust interpolation and fitting, but is also
potentially a difficult area as the end points, monotonicity, convexity and continuity of
derivatives, all make their influences felt, in often contradictory ways. One of the most
popular ways of dealing with these issues is to use splines. In their most general form, splines
can be considered as a mathematical model that associate a continuous representation of a
curve or surface with a discrete set of points in a given space. Spline fitting is an extremely
popular form.
The following properties can be regarded as a convenient set of features against which the usefulness of various spline types can be measured:
curvature, generally referred to as shape parameters), which should allow the user to
pull the spline locally toward one or more control points in an intuitive fashion.
• Existence of Refinement Algorithms: The spline model should lend itself to the use
of refinement or subdivision techniques which serve to increase the degree of freedom
for a spline without modifying its shape.
• Conic Representation: The spline model should permit the representation of conic
sections and therefore support a wide range of curves and surfaces such as circles,
ellipses, spheres and cylinders etc.
• Approximation/Interpolation: Spline models should provide both approximation
and interpolation splines in a unified formulation.
SELF-ASSESSMENT QUESTIONS - 1
3. BEZIER CURVES
Bezier curves are a mathematical representation of smooth curves often used in computer
graphics, animation, and other related fields. They were named after French engineer Pierre
Bezier, who first used them in the design of automobile bodies for Renault in the 1960s.
A Bezier curve is defined by a set of control points that define its shape. These control points
are used to calculate the curve's position and shape by interpolating between them. The
curve itself is defined as a polynomial function of one variable, which can be represented
using a parametric equation.
Bezier curves can be of different orders, meaning they can have different numbers of control
points. For example, a quadratic Bezier curve has three control points, while a cubic Bezier
curve has four control points. Higher-order curves can also be created by adding more
control points.
Bezier curves are often used in computer graphics software to create smooth shapes and
curves, such as lines, curves, and arcs. They are also used in animation to create motion paths
for objects and characters. Additionally, they are widely used in vector graphics editors, such
as Adobe Illustrator and Inkscape, to create shapes and curves.
Bezier curves are defined using four control points, known as knots. Two of these are the
end points of the curve, while the other two effectively define the gradient at the end points.
These two points control the shape of the curve. The curve itself is a blend of the knots. This is a recurring theme of approximation curves: a curve is defined as a blend of the values of several control points. Figure 6.1 shows a Bezier curve and how the shape of the curve is affected by changing the knots.
Bezier curves are more useful than any other type mentioned so far; however, they still do
not manage much local control. Increasing the number of control points does lead to slightly
more complex curves, but as is evident from the figure 6.2, the detail suffers due to the nature
of blending of all the curve points together.
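To make the idea of blending concrete, a Bezier point can be evaluated with de Casteljau's algorithm, which repeatedly interpolates between neighbouring control points. The short Python sketch below (illustrative names) works for any number of control points, including the four-knot cubic case described above.

def bezier_point(control_points, t):
    """Evaluate a Bezier curve at parameter t in [0, 1] using de Casteljau's
    algorithm; control_points is a list of (x, y) tuples."""
    pts = [tuple(p) for p in control_points]
    while len(pts) > 1:
        # linearly interpolate between each pair of neighbouring points
        pts = [tuple((1 - t) * a + t * b for a, b in zip(p, q))
               for p, q in zip(pts[:-1], pts[1:])]
    return pts[0]

# Example: two end points and two gradient-controlling knots
curve = [bezier_point([(0, 0), (1, 2), (3, 2), (4, 0)], i / 20) for i in range(21)]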
SELF-ASSESSMENT QUESTIONS - 2
4. B-SPLINE CURVES
B-spline curves are a type of mathematical curve that is commonly used in computer
graphics, CAD (computer-aided design), and other areas where curves need to be accurately
represented. B-splines are a generalization of Bezier curves and can represent a wide variety
of curve shapes.
The "B" in B-spline stands for "basis," which refers to the set of functions that are used to
construct the curve. These basis functions are typically piecewise polynomial functions of a
certain degree, such as quadratic or cubic. The degree of the basis functions determines the
smoothness of the resulting curve.
To construct a B-spline curve, a set of control points is used to define the curve's shape. The
curve is then created by smoothly interpolating between the control points using the basis
functions. This results in a smooth and continuous curve that passes through each of the
control points.
B-spline curves have a number of advantages over other types of curves. For example, they
are more flexible than Bezier curves and can represent a wider range of curve shapes. They
are also more efficient to calculate, as they can be easily subdivided and modified without
changing the shape of the curve.
B-spline curves have a wide range of applications, including computer graphics, animation,
industrial design, and architecture. They are also used in the automotive and aerospace
industries to design curves for car bodies, airplane wings, and other complex shapes.
The main problem with Bezier curves is their lack of local control. Simply increasing the
number of control points adds little local control to the curve. This is due to the nature of the
blending used for Bezier curves. They combine all the points to create the curve. The obvious
solution is to combine only those points nearest to the current parameter. For this the points
can be defined as lying in a parametric space at equal intervals. Figure 6.3 shows the
positions of the knots.
These points are labeled internally from 0 to (number of points)-1. To calculate the curve at
any parameter t, a Gaussian curve is placed over the parameter space. This curve is actually
an approximation of a Gaussian as shown in the figure 6.4; it does not extend to infinity at
each end, just to +/- 2 by using the following equations:
This curve peaks at a value of 2/3, and at +/- 1 its value is 1/6. When this curve is placed over the array of control points, it gives the weighting of each point. As the curve is drawn, each point will in turn become the most heavily weighted, thereby gaining local control. The resulting curve is shown in the corresponding figure; notice how the curve seems to go haywire at either end.
At P0, the Gaussian curve covers points from -1 to 1 (at points -2 and 2 the Gaussian weight
is zero). The point at -1 is not defined, so the curve has an undefined value. In this example
it is being pulled towards the origin.
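The approximated "Gaussian" described above behaves exactly like the uniform cubic B-spline basis function (value 2/3 at 0, 1/6 at +/-1, zero beyond +/-2). Assuming that basis, a minimal Python sketch of the blending is:

def cubic_bspline_basis(t):
    """Uniform cubic B-spline basis: 2/3 at t = 0, 1/6 at t = +/-1, 0 for |t| >= 2."""
    t = abs(t)
    if t < 1.0:
        return 2.0 / 3.0 - t * t + 0.5 * t ** 3
    if t < 2.0:
        return (2.0 - t) ** 3 / 6.0
    return 0.0

def bspline_point(knots, u):
    """Blend the knots nearest to parameter u using the basis above."""
    i0 = int(u)
    x = y = 0.0
    for i in range(i0 - 1, i0 + 3):
        if 0 <= i < len(knots):                 # knots outside the range are undefined,
            w = cubic_bspline_basis(u - i)      # which is why the ends misbehave
            x += w * knots[i][0]
            y += w * knots[i][1]
    return x, y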
Non-Uniform Rational B-Spline (NURBS) curves are a type of B-spline curve that adds
additional flexibility by allowing for non-uniform knot vectors and weights.
In a NURBS curve, each control point is assigned a weight, which determines its influence on
the shape of the curve. The weights can be used to create curves with varying thickness or to
add more control over the shape of the curve. Additionally, the knot vector can be non-
uniform, meaning that the basis functions are not evenly spaced along the curve. This allows
for greater control over the shape of the curve and can be used to create more complex
shapes.
A NURBS curve (Non Uniform Rational B-Spline Curve) is defined by its order, a set of
weighted control points, and a knot vector. NURBS curves and surfaces are generalizations
of both B-splines and Bézier curves and surfaces, the primary difference being the weighting
of the control points which makes NURBS curves rational (non-rational B-splines are a
special case of rational B-splines). Whereas Bézier curves evolve in only one parametric direction, usually called s or u, NURBS surfaces evolve in two parametric directions, called s and t or u and v.
By evaluating a NURBS curve at various values of the parameter, the curve can be
represented in Cartesian two- or three-dimensional space. Likewise, by evaluating a NURBS
surface at various values of the two parameters, the surface can be represented in Cartesian
space.
• They are invariant under affine as well as perspective transformations: operations like
rotations and translations can be applied to NURBS curves and surfaces by applying
them to their control points.
• They offer one common mathematical form for both standard analytical shapes
(example conics) and free-form shapes.
• They provide the flexibility to design a large variety of shapes.
• They reduce the memory consumption when storing shapes (compared to simpler
methods).
• They can be evaluated reasonably quickly by numerically stable and accurate
algorithms.
In the next sections, NURBS is discussed in one dimension (curves). It should be noted that
all of this can be generalized to two or even more dimensions.
Control Points
The control points determine the shape of the curve. Typically, each point of the curve is
computed by taking a weighted sum of a number of control points. The weight of each point
varies according to the governing parameter. For a curve of degree d, the weight of any
control point is only nonzero in d+1 intervals of the parameter space. Within those intervals,
the weight changes according to a polynomial function (basis functions) of degree d. At the
boundaries of the intervals, the basis functions go smoothly to zero, the smoothness being
determined by the degree of the polynomial. For example, the basis function of degree one is a triangle function. It rises from zero to one, then falls to zero again. While it rises, the basis
function of the previous control point falls. In that way, the curve interpolates between the
two points, and the resulting curve is a polygon, which is continuous, but not differentiable
at the interval boundaries, or knots. Higher degree polynomials have correspondingly more
continuous derivatives. It must be noted that within the interval the polynomial nature of
the basis functions and the linearity of the construction make the curve perfectly smooth, so
it is only at the knots that discontinuity can arise.
The fact that a single control point only influences those intervals where it is active is a highly
desirable property, known as local support. In modeling, it allows the changing of one part
of a surface while keeping other parts equal.
Adding more control points allows better approximation to a given curve, although only a
certain class of curves can be represented exactly with a finite number of control points.
NURBS curves also feature a scalar weight for each control point. This allows for more
control over the shape of the curve without unduly raising the number of control points. In
particular, it adds conic sections like circles and ellipses to the set of curves that can be
represented exactly. The term rational in NURBS refers to these weights.
Knot Vector
The knot vector is a sequence of parameter values that determines where and how the
control points affect the NURBS curve. The number of knots is always equal to the number
of control points plus curve degree plus one. The knot vector divides the parametric space
in the intervals mentioned earlier, usually referred to as knot spans. Each time the
parameter value enters a new knot span, a new control point becomes active, while an old
control point is discarded. It follows the rule that the values in the knot vector should be in
non decreasing order, so (0, 0, 1, 2, 3, 3) is valid while (0, 0, 2, 1, 3, 3) is not.
Consecutive knots can have the same value. This then defines a knot span of zero length,
which implies that two control points are activated at the same time (and of course two
control points become deactivated). This has an impact on continuity of the resulting curve
or its higher derivatives. For instance, it allows the creation of corners in an otherwise
smooth NURBS curve. A number of coinciding knots is sometimes referred to as a knot with
a certain multiplicity. Knots with multiplicity two or three are known as double or triple
knots respectively. The multiplicity of a knot is limited to the degree of the curve, since a
higher multiplicity would split the curve into disjoint parts and it would leave control points
unused. For first-degree NURBS, each knot is paired with a control point.
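These two structural rules (non-decreasing knot values, and a knot count equal to the number of control points plus the degree plus one) are easy to check; a small illustrative Python sketch:

def is_valid_knot_vector(knots, num_control_points, degree):
    """Check the structural rules for a B-spline/NURBS knot vector."""
    non_decreasing = all(a <= b for a, b in zip(knots, knots[1:]))
    right_length = len(knots) == num_control_points + degree + 1
    return non_decreasing and right_length

print(is_valid_knot_vector([0, 0, 1, 2, 3, 3], 4, 1))   # True, as in the text
print(is_valid_knot_vector([0, 0, 2, 1, 3, 3], 4, 1))   # False: values decrease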
Using the definitions of the basis functions $N_{i,n}$, a NURBS curve takes the following form:
$$C(u) = \sum_{i=1}^{k} \frac{N_{i,n}(u)\, w_i}{\sum_{j=1}^{k} N_{j,n}(u)\, w_j}\, P_i = \frac{\sum_{i=1}^{k} N_{i,n}(u)\, w_i\, P_i}{\sum_{i=1}^{k} N_{i,n}(u)\, w_i}$$
In this, k is the number of control points Pi and wi is the corresponding weights. The
denominator is a normalizing factor that evaluates to one if all weights are one. This can be
seen from the partition of unity property of the basis functions. It is customary to write this
as,
$$C(u) = \sum_{i=1}^{k} R_{i,n}(u)\, P_i, \qquad \text{where} \qquad R_{i,n}(u) = \frac{N_{i,n}(u)\, w_i}{\sum_{j=1}^{k} N_{j,n}(u)\, w_j}$$
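Putting the pieces together, the rational blend can be evaluated directly once basis-function values are available. The Python sketch below (illustrative names) uses the standard Cox-de Boor recursion, which is one common way to compute the basis, and then forms the weighted average above; it assumes u lies inside the valid knot span so the denominator is non-zero.

def bspline_basis(i, n, u, knots):
    """Cox-de Boor recursion for the B-spline basis function N_{i,n}(u)."""
    if n == 0:
        return 1.0 if knots[i] <= u < knots[i + 1] else 0.0
    left = right = 0.0
    if knots[i + n] != knots[i]:
        left = (u - knots[i]) / (knots[i + n] - knots[i]) * bspline_basis(i, n - 1, u, knots)
    if knots[i + n + 1] != knots[i + 1]:
        right = (knots[i + n + 1] - u) / (knots[i + n + 1] - knots[i + 1]) \
                * bspline_basis(i + 1, n - 1, u, knots)
    return left + right

def nurbs_point(control_points, weights, knots, degree, u):
    """Evaluate C(u) as the rational, weighted blend of the control points."""
    k = len(control_points)
    Nw = [bspline_basis(i, degree, u, knots) * weights[i] for i in range(k)]
    denom = sum(Nw)
    x = sum(Nw[i] * control_points[i][0] for i in range(k)) / denom
    y = sum(Nw[i] * control_points[i][1] for i in range(k)) / denom
    return x, y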
SELF-ASSESSMENT QUESTIONS – 3
6. INTRODUCTION TO SURFACES
When comparing mathematics in two and three dimensions, there are many similarities.
Very often, the techniques used in the simpler two-dimensional case can be easily extended
to cover three dimensions. Some of the curve representations presented in the previous
sections easily extend to three dimensions and can therefore represent surfaces.
When creating a curve, a single parametric dimension is used; points are defined within this dimension, and these are used to create the curve. For a surface, two orthogonal parametric dimensions of points are required. These form a rectangular mesh. At any point in parametric space, two blending functions are used, one in each parametric direction. For every knot defined, the product of the two blending functions is calculated and this is the weight given to that knot. The sum of all the weights will still be one, as it was for a
curve. The most commonly used methods of representing curved surfaces in computing are
Bézier surfaces and B-spline surfaces and these are discussed here.
Bezier surfaces are a type of mathematical surface that are commonly used in computer
graphics, CAD (computer-aided design), and other areas where surfaces need to be
accurately represented. Bezier surfaces are a generalization of Bezier curves and can
represent a wide variety of surface shapes.
Like Bezier curves, Bezier surfaces are defined by a set of control points. The surface is then
created by smoothly interpolating between the control points using a set of basis functions.
The basis functions used for Bezier surfaces are typically tensor product polynomials of a
certain degree, such as quadratic or cubic.
To construct a Bezier surface, a rectangular grid of control points is used to define the
surface's shape. The surface is then created by smoothly interpolating between the control
points using the basis functions. This results in a smooth and continuous surface that passes
through each of the control points.
Bezier surfaces have a number of advantages over other surface representations. For example, they are flexible and can represent a wide range of surface shapes. They are also efficient to calculate, as they can be easily subdivided and modified without changing the shape of the surface.
Bezier surfaces have a wide range of applications, including computer graphics, animation,
industrial design, and architecture. They are also used in the automotive and aerospace
industries to design surfaces for car bodies, airplane wings, and other complex shapes.
To create a Bézier surface, a mesh of Bézier curves is blended using the blending function
$$P(u, v) = \sum_{j=0}^{m} \sum_{k=0}^{n} P_{j,k}\, B_{j,m}(u)\, B_{k,n}(v)$$
where j and k index points in parametric space and $P_{j,k}$ represents the location of the knots in real space. The Bézier blending functions specify the weighting of a particular knot; they are the Bernstein polynomials
$$B_{k,n}(u) = C(n, k)\, u^{k} (1 - u)^{n-k}$$
where C(n, k) represents the binomial coefficients. When u = 0, the function is one for k = 0 and zero for all other points. When two orthogonal parameters are combined, a Bézier curve can be found along each edge of the surface, as defined by the points along that edge. Bézier surfaces are useful for interactive design and were first applied to car body design.
B-spline surfaces are a type of mathematical surface that are commonly used in computer
graphics, CAD (computer-aided design), and other areas where surfaces need to be
accurately represented. B-spline surfaces are a generalization of B-spline curves and can
represent a wide variety of surface shapes.
To construct a B-spline surface, a rectangular grid of control points is used to define the
surface's shape. The surface is then created by smoothly interpolating between the control
points using a set of basis functions. The basis functions used for B-spline surfaces are
typically tensor product polynomials of a certain degree, such as quadratic or cubic.
Like B-spline curves, B-spline surfaces can have a non-uniform knot vector and weights
assigned to each control point. The knot vector determines the distribution of the basis
functions along the surface, while the weights determine the influence of each control point
on the shape of the surface. This added flexibility allows for more control over the shape of
the surface and can be used to create more complex shapes.
B-spline surfaces have a number of advantages over other surface representations. For example, they are more flexible than Bézier surfaces and can represent a wider range of surface shapes. They are also efficient to calculate, as they can be easily subdivided and modified without changing the shape of the surface.
B-spline surfaces have a wide range of applications, including computer graphics, animation,
industrial design, and architecture. They are also used in the automotive and aerospace
industries to design surfaces for car bodies, airplane wings, and other complex shapes. B-
spline surfaces are supported by many popular CAD and 3D modeling software packages,
including AutoCAD, SolidWorks, and Rhino.
A B-Spline surface can be created using a similar method as the Bézier surface. For B-Spline
curves, two phantom knots are used to clamp the ends of the curve. For a surface, phantom
knots will be needed all around the knots as shown below for an M+1 by N+1 knot surface.
There are two extra rows and two extra columns of knots in parametric space surrounding
the real knots, where these knots are placed determines the shape of the surface at the edges.
The method described here gives similar results to the method used for Bézier surfaces; that
is, the edges of the surface form a B-Spline curve of the edge knots. This means some of the
boundary conditions are:
for 0 <= m <= M and 0 <= n <= N. These conditions are essentially the same as in the two-
dimensional case. It means that the weighting of a sample taken at the boundary m=0 is
dependent only on knots along the m=0 boundary (the phantom knots at m=-1 balance out
the real knots at m=1). The remaining boundary conditions make the surface corners and
the corner knots coincide. The co-ordinate of the corner as set by P0,0 (and hence the
parametric knot at {-1,-1}) is
This gives us a surface that interpolates the corner knots and forms B- Spline curves down
each side.
SELF-ASSESSMENT QUESTIONS – 4
7. SUMMARY
Let us recapitulate the contents of this unit. The types of curves discussed fall into two broad categories: interpolating and approximating curves. Interpolating curves pass through the points used to describe them, whereas an approximating curve only comes near those points. Parametric equations can be used to generate curves that are more general than explicit equations of the form y = f(x). We also discussed Bezier curves, which are defined using four control points, known as knots; two of these are the end points of the curve, while the other two effectively define the gradient at the end points. B-Spline curves overcome the shortfall of Bezier curves, namely their lack of local control. NURBS (Non-Uniform Rational B-Spline) curves can be used to represent a curve in Cartesian two- or three-dimensional space, and we explored the reasons for the usefulness of NURBS curves and surfaces. We concluded this unit with a discussion of Bezier and B-Spline surfaces.
8. TERMINAL QUESTIONS
9. ANSWERS
1. Curve
2. interpolating and approximation curves. 3. y=f(x).
3. True
4. parametric equations
5. Bezier curves are defined using four control points
6. True
7. Lack of local control
8. Non Uniform Rational B-Spline Curve.
9. False, Which reduces the memory consumption
10. Curve
11. knot vector.
12. Bézier , B-spline
13. True.
Terminal Questions
1. Everyone who has ever tried to apply simple linear interpolation to find a value between pairs of data points will be aware that such attempts are unlikely to provide reliable results unless the data is broadly linear. For more details refer section 6.2.
2. Bezier curves are defined using four control points, known as knots. Two of these are
the end points of the curve, while the other two effectively define the gradient at the
end points. For more details refer section 6.3.
3. NURBS curves and surfaces are generalizations of both B-splines and Bézier curves and
surfaces, the primary difference being the weighting. For more details refer section 6.4.
4. The most commonly used methods of representing curved surfaces in computing are Bézier surfaces and B-spline surfaces. For more details refer section 6.6.
DCA3142
GRAPHICS AND MULTIMEDIA
Unit 7
Hidden Surfaces
Table of Contents
1. INTRODUCTION
In the previous unit, we discussed curve and surface representation, including Bezier curves, B-Spline curves, and rational B-Spline curves.
In this unit, we will discuss hidden surfaces, the determination of hidden surfaces, and visible surface detection methods. We will also discuss the z-buffer algorithm, the depth-sort algorithm, the back-face detection method, the BSP tree method, the scan-line method, and so on. We will conclude this unit with a discussion of fractal geometry and wire-frame methods.
1.1 Objectives:
There are many techniques for hidden surface determination. These are fundamentally an
exercise in sorting, and usually vary as per the order in which the sorting is done and manner
in which the problem is subdivided. Sorting large quantities of graphics primitives is usually
done by the divide and conquer method.
• Hidden surface removal (HSR) determines which polygons are nearest to the viewer at
a given pixel
• Key criterion: A point P occludes a point Q (and thus Q is “hidden”) if P and Q lie on the
same ray (line) from the camera or eye and P is between the camera location and Q as
shown in figure 7.1.
• Calculating this ray is tough with a frustum, but normalizing that frustum to a cube
(which the projection matrix does) transforms the oblique rays to straightforward
parallelism with the z axis.
• Thus, at the earliest, HSR happens after the projection matrix is applied, which explains the separation between the projection and viewport transformations.
SELF-ASSESSMENT QUESTIONS -1
Before moving on to visible surface detection, the following terms shall be reviewed:
1. Modeling Transformation:
In this stage, objects are transformed in their local modeling coordinate systems into a
common coordinate system called the world coordinates.
2. Perspective Transformation:
x' = x * (d/z),  y' = y * (d/z),
where (x, y, z) is the original position of a vertex, (x', y', z') is the transformed position of the vertex, and d is the distance of the image plane from the center of projection.
3. Clipping:
In 3D clipping, all objects and parts of objects which are outside of the view volume are
removed. After perspective transformation has been done, the 6 clipping planes, which
form the parallelepiped, are parallel to the 3 axes and hence clipping is straight
forward. Hence the clipping operation can be performed in 2D. For example, one may
first perform the clipping operations on the x-y plane and then on the x-z plane.
The goal is to identify the parts of a scene that are visible from a chosen viewing position; surfaces which are obscured by other opaque surfaces along the line of sight (projection) are invisible to the viewer.
Characteristics of Approaches:
Considerations:
– Available equipment
– Static or animated
1. Object-space Methods
➢ Compare objects and parts of objects to each other within the scene definition to
determine which surfaces, as a whole, should be labelled as visible:
Begin
➢ Determine those parts of the object whose view is unobstructed by other parts of it, or
any other object with respect to the viewing specification.
➢ Draw those parts in the object color.
End
➢ Compare each object with all the other objects to determine the visibility of the object
parts.
➢ If there are n objects in the scene, complexity = O(n2)
➢ Calculations are performed at the resolution in which the objects are defined (only
limited by the computation hardware).
➢ Process is unrelated to display resolution or the individual pixel in the image and the
result of the process is applicable to different display resolutions.
➢ Display is more accurate but computationally more expensive as compared to image
space methods because step 1 is typically more complex. For instance, it could be due
to the possibility of intersection between surfaces.
➢ Suitable for scenes with small number of objects and objects with simple relationship
with each other.
2. Image-Space Methods (Mostly used)
Visibility is determined point by point at each pixel position on the projection plane.
➢ Determine the object closest to the viewer that is pierced by the projector through
the pixel
➢ Draw the pixel in the object color.
End
➢ For each pixel, examine all n objects to determine the one closest to the viewer.
➢ If there are p pixels in the image, complexity depends on n and p (O (np)).
➢ Accuracy of the calculation is bound by the display resolution.
• Making use of the results calculated for one part of the scene or image for other nearby
parts.
• Coherence is the result of local similarity
• As objects have continuous spatial extent, object properties vary smoothly within a
small local region in the scene. Calculations can then be made incremental.
Types of coherence:
1. Object Coherence:
2. Face Coherence:
Surface properties computed for one part of a face can be applied to adjacent parts after
small incremental modification. (For example, if the face is small, it can sometimes be
assumed that if one part of the face is invisible to the viewer, the entire face is also
invisible).
3. Edge Coherence:
The visibility of an edge changes only when it crosses another edge, so if one segment
of a non-intersecting edge is visible, the entire edge is also visible.
4. Scan Line Coherence:
Line or surface segments visible in one scan line are also likely to be visible in adjacent
scan lines. Consequently, the image of a scan line is similar to the image of adjacent scan
lines.
5. Area Coherence:
A group of adjacent pixels in an image is often covered by the same visible object. This
coherence is based on the assumption that a small enough region of pixels will most
likely lie within a single polygon. This reduces computation effort involved in searching
for those polygons which contain a given screen area (region of pixels) as in some
subdivision algorithms.
6. Depth Coherence:
7. Frame Coherence:
Pictures of the same scene at successive points in time are likely to be similar, despite
small changes in objects and viewpoint, except near the edges of moving objects. Most
visible surface detection methods make use of one or more of these coherence
properties of a scene.
4. BACK-FACE DETECTION
In a solid object, there are surfaces which face the viewer (front faces) and surfaces which face away from the viewer (back faces), as shown in figure 7.2. These back faces contribute approximately half of the total number of surfaces. As these surfaces cannot be seen, they can be removed from the clipping process with a simple step, which saves processing time.
Each surface has a normal vector. If this vector points in the direction of the center of
projection, it is a front face and can be seen by the viewer. If it points away from the center
of projection, it is a back face and cannot be seen by the viewer. The test is very simple, if the
z component of the normal vector is positive, then, it is a back face. If the z component of the
vector is negative, it is a front face. It must be noted that this technique only works well for
non-overlapping convex polyhedra. In other cases where there are concave polyhedra or
overlapping objects, it is necessary to apply other methods to further determine where the
obscured faces are partially or completely hidden by other objects (For example, using the
Depth-Buffer Method or Depth-sort Method).
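A short Python/NumPy sketch of this test, following the convention used in the text (a positive z component of the surface normal marks a back face); the vertex order is assumed to define which side of the polygon is the front:

import numpy as np

def is_back_face(v0, v1, v2):
    """Back-face test: true when the z component of the face normal is positive."""
    v0, v1, v2 = (np.asarray(v, dtype=float) for v in (v0, v1, v2))
    normal = np.cross(v1 - v0, v2 - v0)   # normal from the polygon's vertex winding
    return normal[2] > 0.0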
5. BSP TREE METHOD
The BSP (Binary Space Partitioning) tree algorithm is based on the observation that a polygon will be scan-converted correctly (i.e., will not overlap incorrectly or be overlapped incorrectly by other polygons) if the polygons farther from the viewer are scan-converted before it and the polygons nearer to the viewer are scan-converted after it; in other words, scan conversion proceeds from the far side of the scene toward the viewer.
It must be ensured that this is so for each polygon. The algorithm makes it easy to determine
a correct order for scan conversion by building a binary tree of polygons, the BSP tree. The
BSP tree’s root is a polygon selected from those to be displayed and the algorithm works
correctly, no matter which is picked. The root polygon is used to partition the environment
into two half-spaces. One half-space contains all the remaining polygons in front of the root
polygon, relative to its surface normal; the other contains all polygons behind the root
polygon. Any polygon lying on both sides of the root polygon’s plane is split by the plane, and
its front and back pieces are assigned to the appropriate half-space. One polygon each from
the root polygon’s front and back half-spaces becomes its front and back child respectively,
and each child is recursively used to divide the remaining polygons in its half-space in the
same fashion.
Build a BSP tree. To display the scene, traverse the tree relative to the viewpoint: if the viewpoint (such as V1 in figure 7.4) is in front of a node's plane, traverse the back subtree first, draw the node's polygon, and then traverse the front subtree; if the viewpoint (such as V2) is behind the plane, traverse the front subtree first.
As per figure 7.4, the painting order from V1 is 3, 5, 1, 4b, 2, 6, 4a and the painting order from V2 is 6, 4a, 2, 4b, 1, 3, 5.
Figure 7.4
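A minimal back-to-front traversal sketch in Python (the node layout and the classify() helper, which reports on which side of a node's plane the viewpoint lies, are illustrative assumptions rather than a specific library API):

class BSPNode:
    def __init__(self, polygon, front=None, back=None):
        self.polygon, self.front, self.back = polygon, front, back

def paint(node, viewpoint, draw, classify):
    """Painter-order traversal: far subtree first, then the node, then the near one."""
    if node is None:
        return
    if classify(viewpoint, node.polygon) > 0:       # viewer in front: back side is farther
        paint(node.back, viewpoint, draw, classify)
        draw(node.polygon)
        paint(node.front, viewpoint, draw, classify)
    else:                                           # viewer behind: front side is farther
        paint(node.front, viewpoint, draw, classify)
        draw(node.polygon)
        paint(node.back, viewpoint, draw, classify)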
SELF-ASSESSMENT QUESTIONS -2
6. DEPTH-SORT ALGORITHM
The painter's algorithm is a simplified version of the depth-sort algorithm. In the depth-sort
algorithm, there are 3 steps that are performed:
Step 1: Sort all faces in order of their smallest (farthest) z coordinate.
Step 2: Resolve ambiguities, rearranging the object list and splitting faces as necessary.
Step 3: Render each face on the rearranged list in ascending order of smallest z coordinate
(from back to front).
Resolving ambiguities in step 2 is what replaces simple rendering of the whole face of the object. Let the object farthest away be called P. Before this object is rendered, it must be tested against every other object Q whose z extent overlaps the z extent of P. This is done to prove that P cannot obscure Q and that P can therefore be written before Q. Up to 5 tests are performed, in order of increasing complexity:
1. Do the x extents of P and Q not overlap?
2. Do the y extents of P and Q not overlap?
3. Is P entirely on the opposite side of Q's plane from the viewpoint?
4. Is Q entirely on the same side of P's plane as the viewpoint?
5. Do the projections of P and Q onto the screen not overlap?
If all 5 tests fail, the assumption is that P obscures Q. So to test whether Q could be rendered
before P, tests 3 and 4 are performed again, with the polygons reversed:
3'. Is Q entirely on the opposite side of P's plane from the viewpoint? 4'. Is P entirely on the
same side of Q's plane as the viewpoint?
Figure 7.5 is a top-down view, relative to the viewpoint, of objects P and Q.
In this case, test 3' succeeds. So, Q is moved to the end of the list of objects, and the old Q
becomes the new P. The next picture is a front view. The tests are inconclusive: The objects
intersect each other.
Figure 7.8 shows a more subtle case: it is possible to move each object to the end of the list to place it in the correct order relative to one, but not both, of the other objects. This would result in an infinite loop. To avoid looping, any object that is moved to the end of the list is marked. If the first 5 tests fail and the current Q is marked, tests 3' and 4' are not performed. Instead, one of the objects is split, as if tests 3' and 4' had both failed, and the pieces are reinserted into the list.
7. Z-BUFFER ALGORITHM
The basic idea is to test the z-depth of each surface to determine the closest (visible) surface.
To do this, declare an array z buffer (x, y) with one entry for
each pixel position. Initialize the array to the maximum depth. Note: if one has performed a
perspective depth transformation, then all z values satisfy 0.0 <= z(x, y) <= 1.0. So it is necessary
to initialize all values to 1.0. The algorithm is as follows:
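The original listing is not reproduced here, but the core test can be sketched as follows (a minimal Python sketch with illustrative names; depths are assumed normalized to [0, 1] as noted above, and rasterize() is an assumed helper that yields pixel samples for a polygon):

def render_with_zbuffer(width, height, polygons, rasterize):
    """z-buffer sketch: keep, at each pixel, the sample with the smallest depth."""
    z_buffer = [[1.0] * width for _ in range(height)]    # initialize to the maximum depth
    frame = [[None] * width for _ in range(height)]
    for poly in polygons:
        for x, y, z, color in rasterize(poly):           # (x, y, z, color) samples
            if z < z_buffer[y][x]:                       # nearer than what is stored?
                z_buffer[y][x] = z
                frame[y][x] = color
    return frame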
The polyscan procedure can easily be modified to add the z-buffer test. The computation of
the z_depth (x, y) is done using coherence calculations similar to the x-intersection
calculations. It is actually a bi-linear interpolation, that is, interpolation both down (y) and
across (x) scan lines.
Disadvantages:
• It may paint the same pixel several times, and computing the color of a pixel may be expensive. As a remedy, the color can be computed only if the pixel passes the z-buffer test. It is also possible to sort the polygons and scan front to back (the reverse of the painter's algorithm). This still tests all the polygons, but avoids the expense of computing the intensity and writing it to the frame buffer for hidden pixels.
• Large memory requirements: if a real (4-byte) value is used per pixel, then for 640 x 480 resolution the requirement would be 640 x 480 x 4 bytes = 1,228,800 bytes. Usually one uses a 24-bit z-buffer (921,600 bytes) or a 16-bit z-buffer (614,400 bytes). Note: for VGA mode 19 (320 x 200, using only 240 x 200) the requirement is only 96,000 bytes for a 16-bit z-buffer. However, one may need additional z-buffers for special effects, like shadows.
An alternative method for computing z depth values using plane equations is as follows:
Now recall the 2D viewing transformation in the procedure Point Viewing Transform, with scale factors Sx (VTScaleX) and Sy (VTScaleY) and constants Cx (VTConstX) and Cy (VTConstY), all of which are functions of the window, the viewport, and the PDC. In Point Viewing Transform, xp = Sx*x + Cx, so x = (xp - Cx)/Sx = xp/Sx - Cx/Sx, and Cx/Sx and Cy/Sy can be computed once. Now, from the plane equation,
Z(Xw, Yw) = (-A*Xw - B*Yw + D) / C
so stepping one pixel across the scan line (a step of 1/Sx in world coordinates) gives
Z(Xw + 1/Sx, Yw) = (-A*(Xw + 1/Sx) - B*Yw + D) / C = Z(Xw, Yw) - A/(C*Sx).
Thus one can find Xw, Yw, and Zw at the polygon vertices and use this incremental relation to compute the remaining Zw values.
SELF-ASSESSMENT QUESTIONS - 3
8. SCAN-LINE METHOD
In this method, as each scan line is processed, all polygon surfaces intersecting that line are examined to determine which ones are visible. Across each scan line, depth calculations are made for each overlapping surface to determine which is nearest to the view plane. When the visible surface has been determined, the intensity value for that position is entered into the image buffer. Figure 7.9 depicts the scan-line projection of two surfaces.
– Step 2 is not efficient because not all polygons necessarily intersect with the scan line.
– Depth calculation in 2a is not needed if only one polygon in the scene is mapped onto
a segment of the scan line.
Using a similar idea, every scan line can be filled span by span, as shown in figure 7.10.
When a polygon overlaps on a scan line, depth calculations are performed at their edges to
determine which polygon should be visible at which span. Any number of overlapping
polygon surfaces can be processed using this method. Depth calculations are performed only
when there are overlapping polygons. It is necessary to take advantage of coherence along
the scan lines as one passes from one scan line to the next. If there are no changes in the
pattern of the intersection of polygon edges with the successive scan lines, it is not necessary
to do depth calculations. This works only if surfaces do not cut through or otherwise
cyclically overlap each other. If cyclic overlap happens, the surfaces can be divided to
eliminate the overlaps.
SELF-ASSESSMENT QUESTIONS - 4
10. Depth calculation in 2a is not needed if only ____________ polygon in the scene is
mapped onto a segment of the scan line.
11. _________ method all polygon surfaces intersecting that line are examined to
determine which are visible.
12. When polygon overlaps on a scan line, we perform depth calculations. (State
True/False)
9. FRACTAL GEOMETRY
Since the 19th century, fractals have been regarded as merely a form of mathematics and
geometry with little or no practical purpose. However, in the 1970's, the mathematician
Benoit Mandelbrot adopted a more abstract definition of dimension than what is generally
used in standard Euclidean geometry. He suggested that the fractal must be handled
mathematically as though it has fractional dimensions, rather than strictly a whole number
of dimensions.
Then in 1987, Dr. Michael Barnsley discovered the Fractal Transform which can detect
fractal codes in real-world images and natural formations. This led to some practical uses,
such as fractal image compression which is widely used in multimedia computer
applications.
These images were generated using the freeware Fractint, by the Stone Soup Group, an in-
depth and versatile fractal program which has been perfected and added-to for many years.
One can use the program to zoom in on the fractals, in which case the displayed pattern is
recalculated at the higher resolution and new detail is revealed, or to rotate to different
angles, generate a 3D map of the fractal or project it onto a 3D surface, and to change the
color palette and even cycle the colors to produce some very dynamic and hypnotic effects. Literally, an infinite number of patterns is possible using even a single fractal type or function.
Figure 7.11: Beautiful lambda function with a zoomed-in view.
10. WIRE FRAME METHODS
A wire frame model is a visual presentation of a three-dimensional or physical object used in 3D computer graphics. It is created by specifying each edge of the physical object where two mathematically continuous smooth surfaces meet, or by connecting an object's constituent vertices using straight lines or curves. The object is projected onto the computer screen by drawing lines at the location of each edge. The term wireframe comes from designers using metal wire to represent the 3D shape of solid objects. 3D wireframe allows the construction and manipulation of solids and solid surfaces. Compared with conventional line drawing, 3D solid modeling techniques produce higher-quality representations of solids. Figure 7.12 exhibits a wireframe image using hidden-line removal.
Using a wire frame model allows the visualization of the underlying design structure of a 3D
model. Traditional 2-D views and drawings can be created by appropriate rotation of the
object and selection of hidden line removal via cutting planes.
Since wireframe renderings are relatively simple and fast to calculate, they are often used in
cases where a high screen frame rate is needed (for instance, when working with a
particularly complex 3D model, or in real-time systems that model exterior phenomena).
When greater graphical detail is desired, surface textures can be added automatically after
completion of the initial rendering of the wireframe. This allows the designer to quickly
review changes or rotate the object to new desired views without long delays associated
with more realistic rendering.
SELF-ASSESSMENT QUESTIONS - 5
11. SUMMARY
This unit provided information about hidden surfaces. In 3D computer graphics, hidden surface determination (also known as hidden surface removal (HSR), occlusion culling (OC), or visible surface determination (VSD)) is the process of deciding which surfaces are visible from a chosen viewpoint. We also discussed visible surface detection methods, and the advantages and disadvantages of the z-buffer algorithm were studied. In a solid object there are surfaces which face the viewer (front faces) and surfaces which face away from the viewer (back faces). The BSP tree algorithm is based on the observation that a polygon will be scan-converted correctly if a suitable order is followed. The painter's algorithm is a simplified version of the depth-sort algorithm. In the scan-line algorithm, as each scan line is processed, all polygon surfaces intersecting that line are examined to determine which are visible. A fractal is a simple mathematical expression that generates an infinitely complex geometric shape. A wire frame model is a visual presentation of a three-dimensional or physical object used in 3D computer graphics.
13. ANSWERS
Terminal Questions
1. Hidden surface removal (HSR) determines which polygons are nearest to the viewer at
a given pixel. For more details refer section 7.2.
2. Before moving on to visible surface detection, we first review the modeling transformation, the perspective transformation, and clipping. For more details refer section 7.3.
3. In a solid object, there are surfaces which are facing the viewer (front faces) and there
are surfaces which are opposite to the viewer (back faces). For more details refer
section 7.4.
4. The BSP tree algorithm is based on the observation that a polygon will be scan
converted correctly. For more details refer section 7.5.
DCA3142
GRAPHICS AND MULTIMEDIA
Unit 8
Coloring and Shading Models
Table of Contents
1. INTRODUCTION
In the previous unit, we discussed the concept of hidden surfaces, depth comparison, the z-buffer algorithm, back-face detection, the BSP tree method, the painter's algorithm, the scan-line algorithm, hidden line elimination, wire-frame methods, and fractal geometry.
In this unit we will discuss colour and shading models. A colour model is an orderly system for creating a whole range of colours from a small set of primary colours; the two well-known colour models, RGB and CMYK, are discussed here. Shading deals with the appearance of objects, which depends, among other things, on the lighting that illuminates the scene and on the interaction of light with the objects in the scene. Also discussed here is texture mapping, which covers texture resources, mapping from surfaces into texture space, texture and Phong reflectance, as well as aliasing.
1.1 Objectives:
The process of altering the color of an object, surface, or polygon in a 3D scene can vary
depending on the software or programming language being used, but in general, it involves
the following steps:
1. Access the object/surface/polygon: First, you need to identify the specific object,
surface, or polygon that you want to change the color of. This can be done by selecting
it from the 3D scene or referencing it in your code.
2. Define the new color: Next, you need to define the new color that you want to apply.
This can be done using RGB values (red, green, blue), hexadecimal codes, or other color
models.
3. Apply the new color: Finally, you need to apply the new color to the
object/surface/polygon. This can be done by setting the object's material properties,
changing the color of the polygon's vertices, or using shaders to modify the color of the
object in real-time.
In most 3D software and programming languages, there are specific functions or APIs that
allow you to perform these steps. For example, in the popular 3D modeling software Blender,
you can change the color of an object by selecting it, opening the Properties panel, and
modifying the Material properties. In the programming language Python, you can use the
PyOpenGL library to modify the color of objects in a 3D scene.
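As a package-neutral illustration of step 2 (defining the new colour), the snippet below converts a hexadecimal code to normalized RGB values and stores it in a simple material record; the Material class and its field name are purely illustrative and do not belong to any particular 3D API.

from dataclasses import dataclass

@dataclass
class Material:
    diffuse_rgb: tuple               # normalized (r, g, b), each component in [0, 1]

def hex_to_rgb(code):
    """Convert '#RRGGBB' to a normalized (r, g, b) tuple."""
    code = code.lstrip('#')
    return tuple(int(code[i:i + 2], 16) / 255.0 for i in (0, 2, 4))

brick_red = Material(diffuse_rgb=hex_to_rgb('#B22222'))   # step 3 would apply this material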
3.1 Light
So far, we have studied the geometric aspects of how objects are transformed and projected to images. We now discuss the shading of objects: the appearance of an object depends, among other things, on the lighting that illuminates the scene and on the interaction of light with the objects in the scene. Some of the basic qualitative properties of lighting and object reflectance that need to be modelled include:
Light Source – There are different types of light sources, such as point sources (example, a small light at a distance), extended sources (example, the sky on a cloudy day), and secondary reflections (example, light that bounces from one surface to another).
Reflectance – Different objects reflect light in different ways. For example, diffuse
surfaces appear the same when viewed from different directions, whereas a mirror
looks very different from different points of view.
Discussed here is a simplified model of lighting that is easy to implement and fast to
compute, and used in many real-time systems such as OpenGL. This model is an
approximation and does not fully capture all of the effects observed in the real world.
Diffuse reflection
Let us begin with the diffuse reflectance model. A diffuse surface is one that appears similarly bright from all viewing directions. That is, the emitted light appears independent of the viewing location. Let p̄ be a point on a diffuse surface with normal n⃗, lit by a point light source in direction s⃗ from the surface. The reflected intensity of light is represented as:

Ld(p̄) = rd I max(0, s⃗ · n⃗)

where I is the intensity of the light source, rd is the diffuse reflectance of the surface, and s⃗ is the direction of the light source. This equation requires the vectors to be normalized, i.e., ||s⃗|| = 1, ||n⃗|| = 1.
The s⃗ · n⃗ term is called the foreshortening term. When a light source projects light obliquely at a surface, that light is spread over a larger area, and less of the light hits any specific point. For example, imagine pointing a flashlight directly at a wall versus in a
direction nearly parallel. In the latter case, the light from the flashlight will spread over a
greater area, and individual points on the wall will not be as bright.
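A small sketch of this diffuse (Lambertian) term follows; it is only an illustration under the assumptions above (normalized vectors, a single point source), written here in Python with NumPy, and the function name diffuse_intensity is ours rather than part of any standard API:

import numpy as np

def diffuse_intensity(n, s, I, r_d):
    # Lambertian reflection: Ld = r_d * I * max(0, s . n)
    n = n / np.linalg.norm(n)          # surface normal, normalized
    s = s / np.linalg.norm(s)          # direction to the light source, normalized
    return r_d * I * max(0.0, float(np.dot(s, n)))

# Example: a light directly above a horizontal surface gives the full diffuse intensity
print(diffuse_intensity(np.array([0.0, 0.0, 1.0]), np.array([0.0, 0.0, 1.0]), I=1.0, r_d=0.8))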
For colour rendering, it is important to specify the reflectance in colour (as (rd,R, rd,G, rd,B)), and to specify the light source in colour as well (IR, IG, IB). The reflected colour of the surface is then represented as:

Ld,c(p̄) = rd,c Ic max(0, s⃗ · n⃗), for c ∈ {R, G, B}
For pure specular (mirror) surfaces, the incident light from each incident direction d⃗i is reflected toward a unique emittant direction d⃗e. The emittant direction lies in the same plane as the incident direction d⃗i and the surface normal n⃗, and the angle between n⃗ and d⃗e is equal to that between n⃗ and d⃗i, as shown in figure 8.1. One can show that the emittant direction is given by d⃗e = 2(n⃗ · d⃗i)n⃗ − d⃗i.
Many materials exhibit a significant specular component in their reflectance, but only a few
are perfect mirrors. First, most specular surfaces do not reflect all light, and that is easily
handled by introducing a scalar constant to attenuate intensity. Second, most specular
surfaces exhibit some form of off-axis specular reflection. That is, many polished and shiny
surfaces (like plastics and metals) emit light in the perfect mirror direction and in some
nearby directions as well. These off-axis specularities may however look a little blurred.
Good examples are highlights on plastics and metals. More precisely, the light from a distant point source in the direction of s⃗ is reflected into a range of directions about the perfect mirror direction m⃗ = 2(n⃗ · s⃗)n⃗ − s⃗. One common model for this is the following:

Ls(d⃗e) = rs I max(0, m⃗ · d⃗e)^α

where rs is called the specular reflection coefficient, I is the incident power from the point source, and α ≥ 0 is a constant that determines the width of the specular highlights. As α increases, the effective width of the specular reflection decreases. In the limit, as α increases, this becomes a mirror. The intensity of the specular region is proportional to max(0, cos φ)^α, where φ is the angle between m⃗ and d⃗e. One way to understand the nature of specular reflection is to plot this function, as depicted in figure 8.2.
Ambient Illumination
The diffuse and specular shading models are easy to compute, but often appear artificial. The
biggest issue is the point light source assumption, the most obvious consequence of which is
that any surface normal pointing away from the light source will have a radiance of zero. A
better approximation to the light source is a uniform ambient term plus a point light source.
This is still a remarkably crude model, but it is much better than the point source by itself.
Ambient illumination is modelled by:
La( 𝑃̅ ) = ra Ia (8-6)
where ra is often called the ambient reflection coefficient, and Ia denotes the integral of the
uniform illuminant.
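Putting the ambient, diffuse and specular terms together gives the kind of simplified local shading model used in real-time systems such as OpenGL. The sketch below is a plain Python/NumPy illustration under the assumptions of this section (a single point source plus a uniform ambient term); the function name shade_point and its parameter names are ours:

import numpy as np

def shade_point(n, s, v, I, Ia, r_a, r_d, r_s, alpha):
    # n: surface normal, s: direction to the light, v: direction to the viewer
    n, s, v = (x / np.linalg.norm(x) for x in (n, s, v))
    m = 2.0 * np.dot(n, s) * n - s                                # perfect mirror direction
    ambient  = r_a * Ia                                           # La = ra * Ia
    diffuse  = r_d * I * max(0.0, float(np.dot(s, n)))            # Ld = rd * I * max(0, s . n)
    specular = r_s * I * max(0.0, float(np.dot(m, v))) ** alpha   # Ls = rs * I * max(0, m . de)^alpha
    return ambient + diffuse + specular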
SELF-ASSESSMENT QUESTIONS - 1
A colour model is an abstract mathematical model describing the way colours can be represented as tuples of numbers, typically as three or four values or colour components. When this model is associated with a precise description of how the components are to be interpreted (viewing conditions, and so on), the resulting set of colours is called a colour space. This section describes the methods by which human colour vision can be modelled.
3.2 Color Model
The RGB color model is a color model used in digital imaging and computer graphics.
The name "RGB" stands for Red, Green, and Blue, which are the primary colors of light.
The RGB color model works by combining these three primary colors in various
proportions to produce a wide range of colors.
In this model, each color is represented by a value between 0 and 255, with 0 indicating
no color and 255 indicating the maximum amount of color. Thus, any color can be
represented by a combination of three numbers, one for each primary color.
For example, pure red would be represented as (255, 0, 0), pure green as (0, 255, 0),
and pure blue as (0, 0, 255). White would be represented as (255, 255, 255), while black
would be represented as (0, 0, 0).
The RGB color model is widely used in computer graphics, as it is the basis for the colors
displayed on computer screens and other digital devices. It is also used in digital
cameras, scanners, and other imaging devices.
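Because each primary is an 8-bit value from 0 to 255, an RGB triple maps directly to the familiar six-digit hexadecimal colour codes used on the web. A small Python sketch of that mapping (the helper names are ours, chosen only for illustration):

def rgb_to_hex(r, g, b):
    # Pack three 8-bit components (0-255) into a code such as '#FF0000'
    return "#{:02X}{:02X}{:02X}".format(r, g, b)

def hex_to_rgb(code):
    code = code.lstrip("#")
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))

print(rgb_to_hex(255, 0, 0))     # '#FF0000'  (pure red)
print(hex_to_rgb("#FFFFFF"))     # (255, 255, 255)  (white)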
The main purpose of the RGB colour model is for sensing, representation, and display
of images in electronic systems, such as televisions and computers, though it has also
been used in conventional photography. Before the electronic age, the RGB colour
model already had a solid theory behind it, based in human perception of colours. It
suggested that media that transmit light (such as television) use additive colour mixing
with primary colours of red, green, and blue, each of which stimulates one of the three
types of the eye's colour receptors with as little stimulation as possible of the other two.
This is called "RGB" colour space. Mixtures of light of these primary colours cover a
large part of the human colour space and thus produce a large part of human colour
experiences. This is why colour television sets or colour computer monitors need only
to produce mixtures of red, green and blue light. Other primary colours could be used
in principle, but with red, green and blue the largest portion of the human colour space
can be captured. Unfortunately there is no exact consensus as to what loci in the
chromaticity diagram the red, green, and blue colours should have. RGB is a device-
dependent colour model. Different devices detect or reproduce a given RGB value
differently, since the colour elements (such as phosphors or dyes) and their response
to the individual R, G, and B levels vary from manufacturer to manufacturer, or even
within the same device over time. Thus an RGB value does not define the same colour
across devices without some kind of colour management.
HSV and HSL are two alternative color models used to represent colors in digital
imaging and computer graphics.
HSV (Hue, Saturation, Value) and HSL (Hue, Saturation, Lightness) are both cylindrical
coordinate systems, which means they represent colors in three dimensions. However,
while they share some similarities, they are different in their approach to color
representation.
HSV represents colors based on their hue, saturation, and value. Hue refers to the color
itself, such as red, green, or blue. Saturation represents the intensity or purity of the
color, with 0 being a shade of gray and 100 being the most intense or pure color. Value
represents the brightness of the color, with 0 being black and 100 being the brightest
possible color.
HSL, on the other hand, represents colors based on their hue, saturation, and lightness.
Hue again refers to the color itself, while saturation represents the intensity or purity
of the color, as in the HSV model. Lightness, however, is different from value. It
represents the perceived brightness of the color, with 0 being black and 100 being
white.
Both models are often used in computer graphics and image editing software, as they
offer an alternative way to manipulate and adjust colors. For example, adjusting the hue
in HSV can shift the color spectrum, while adjusting the lightness in HSL can make an
image appear brighter or darker.
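Python's standard colorsys module can be used to move between RGB and these cylindrical models; a brief sketch (colorsys expects components in the range 0 to 1, and note that its HLS functions use the order hue, lightness, saturation):

import colorsys

r, g, b = 0.8, 0.2, 0.2                      # a reddish colour, components in [0, 1]
h, s, v = colorsys.rgb_to_hsv(r, g, b)       # HSV representation
h2, l2, s2 = colorsys.rgb_to_hls(r, g, b)    # HLS (i.e. HSL with L and S swapped)

# Darken the colour by halving V, then convert back to RGB
darker = colorsys.hsv_to_rgb(h, s, v * 0.5)
print(darker)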
Because HSL and HSV are simple transformations of device-dependent RGB models, the
physical colours they define depend on the colours of the red, green, and blue primaries
of the device or of the particular RGB space, and on the gamma correction used to
represent the amounts of those primaries. Each unique RGB device therefore, has
unique HSL and HSV spaces to accompany it, while numerical HSL or HSV values
describe the different colours for each basis RGB space. Both of these representations
are used widely in computer graphics, and one or the other is often more convenient
than RGB, but both are also criticized for not adequately separating colour-making
attributes.
CMYK colour model (process colour, four colour) is a subtractive colour model, used in
colour printing, and is also used to describe the printing process itself. CMYK refers to
the four inks used in some types of colour printing: cyan, magenta, yellow, and key
(black). Though it varies according to the print house, press operator, press
manufacturer and press run, the ink is typically applied in the order of the abbreviation.
The "K" in CMYK stands for key since in four-colour printing cyan, magenta, and yellow
printing plates are carefully keyed or aligned with the key of the black key plate. Some
sources suggest that the "K" in CMYK comes from the last letter in "black" and was
chosen because B already indicates blue. This explanation, though plausible and useful
as a mnemonic, is incorrect. The CMYK model works by partially or entirely masking
colours on a lighter, usually white, background. The ink reduces the light that would
otherwise be reflected. Such a model is called subtractive because inks "subtract"
brightness from white.
In additive colour models such as RGB, white is the "additive" combination of all
primary coloured lights, while black is the absence of light. In the CMYK model, it is the
opposite: white is the natural colour of the paper or other background, while black
results from a full combination of coloured inks. To save money on ink, and to produce
deeper black tones, unsaturated and dark colours are produced by using black ink
instead of the combination of cyan, magenta and yellow.
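The subtractive relationship between RGB and CMYK can be illustrated with the common naive conversion formula below (a sketch only; real print workflows rely on device profiles and colour management rather than this direct arithmetic):

def rgb_to_cmyk(r, g, b):
    # Naive conversion from 8-bit RGB to CMYK fractions in [0, 1]
    if (r, g, b) == (0, 0, 0):
        return 0.0, 0.0, 0.0, 1.0            # pure black: key ink only
    r_, g_, b_ = r / 255.0, g / 255.0, b / 255.0
    k = 1.0 - max(r_, g_, b_)                # black generation
    c = (1.0 - r_ - k) / (1.0 - k)
    m = (1.0 - g_ - k) / (1.0 - k)
    y = (1.0 - b_ - k) / (1.0 - k)
    return c, m, y, k

print(rgb_to_cmyk(255, 0, 0))                # red -> (0.0, 1.0, 1.0, 0.0)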
SELF-ASSESSMENT QUESTIONS - 2
4. The main purpose of the RGB colour model is for the ______________ ,
______________ and _______________ in electronic systems.
5. RGB is a device-dependent colour model. (State True/False).
6. HSL stands for ______________.
Shading is a process used in drawing for depicting levels of darkness on paper by applying
media more densely or with a darker shade for darker areas, and less densely or with a
lighter shade for lighter areas. There are various techniques of shading including cross
hatching where perpendicular lines of varying closeness are drawn in a grid pattern to shade
an area. The closer the lines are, the darker the area appears. Likewise, the farther apart the
lines are, the lighter the area appears.
Flat shading is a lighting technique used in 3D computer graphics to shade each polygon of
an object based on the angle between the polygon's surface normal and the direction of the
light source, their respective colors and the intensity of the light source. It is usually used for
high speed rendering where more advanced shading techniques are computationally
expensive. As a result of flat shading, all the polygon's vertices are coloured with one colour,
allowing differentiation between adjacent polygons. Specular highlights are rendered poorly
with flat shading. If there happens to be a large specular component at the representative
vertex, that brightness is drawn uniformly over the entire face. If a specular highlight doesn’t
fall on the representative point, it is missed entirely. Consequently, one does not include the
specular reflection component in the shading computation.
The idea of interpolative shading is to avoid computing the full lighting equation at each pixel
by interpolating quantities at the vertices of the faces.
Gouraud Shading
Gouraud shading is considered superior to flat shading, which requires significantly less
processing than Gouraud shading but usually results in a faceted look. If a mesh covers more pixels in screen space than it has vertices, interpolating colour values from samples of expensive lighting calculations at the vertices is less processor intensive than performing the lighting calculation for each pixel, as in Phong shading. However, highly localized lighting effects will
not be rendered correctly, and if a highlight lies in the middle of a polygon but does not
spread to the polygon's vertex, it will not be apparent in Gouraud rendering. Conversely, if a
highlight occurs at the vertex of a polygon, it will be rendered correctly at this vertex (as this
is where the lighting model is applied), but will be spread unnaturally across all neighboring
polygons via the interpolation method. The problem is easily spotted in a rendering which
ought to have a specular highlight moving smoothly across the surface of a model as it
rotates. Gouraud shading will instead produce a highlight continuously fading in and out
across neighboring portions of the model, peaking in intensity when the intended specular
highlight passes over a vertex of the model.
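The core of Gouraud shading is therefore just an interpolation of colours that were computed only at the vertices. A minimal sketch using barycentric weights (the function name, and the assumption that the rasterizer supplies the weights, are ours):

import numpy as np

def gouraud_pixel_colour(bary, vertex_colours):
    # The lighting model has already been evaluated at the three vertices;
    # the pixel colour is just the barycentric blend of those vertex colours.
    w0, w1, w2 = bary                                        # weights sum to 1
    c0, c1, c2 = (np.asarray(c, dtype=float) for c in vertex_colours)
    return w0 * c0 + w1 * c1 + w2 * c2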
Phong Shading
Phong shading is a technique used in computer graphics to produce a smooth shading effect on 3D surfaces by interpolating surface normals across a polygon mesh. It is also called Phong interpolation or normal-vector interpolation shading. This shading method was described by Bui Tuong Phong in 1975, and it has since become a fundamental tool for rendering realistic images in 3D graphics. Specifically, it interpolates surface normals across rasterized polygons and computes pixel colours based on the interpolated normals and a reflection model.
Phong shading improves upon Gouraud shading and provides a better approximation of the
shading of a smooth surface. Phong shading assumes a smoothly varying surface normal
vector. The Phong interpolation method works better than Gouraud shading when applied
to a reflection model that has small specular highlights such as the Phong reflection model.
The most serious problem with Gouraud shading occurs when specular highlights are found
in the middle of a large polygon. Since these specular highlights are absent from the
polygon's vertices and Gouraud shading interpolates based on the vertex colors, the specular
highlight will be missing from the polygon's interior. This problem is fixed by Phong shading.
Unlike Gouraud shading, which interpolates colours across polygons, in Phong shading a
normal vector is linearly interpolated across the surface of the polygon from the polygon's
vertex normals. The surface normal is interpolated and normalized at each pixel and then
used in a reflection model, like the Phong reflection model, to obtain the final pixel colour.
Phong shading is more computationally expensive than Gouraud shading
since the reflection model must be computed at each pixel level instead of at each vertex.
In modern graphics hardware, variants of this algorithm are implemented using pixel or
fragment shaders.
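In contrast to Gouraud shading, the quantity interpolated here is the normal itself, which is then renormalized and fed to the reflection model at every pixel. A hedged Python sketch (the shade callback stands in for a reflection model such as the one sketched earlier; both names are ours):

import numpy as np

def phong_pixel_colour(bary, vertex_normals, shade):
    # Interpolate the vertex normals with barycentric weights, renormalize,
    # and only then evaluate the reflection model for this pixel.
    w0, w1, w2 = bary
    n0, n1, n2 = (np.asarray(n, dtype=float) for n in vertex_normals)
    n = w0 * n0 + w1 * n1 + w2 * n2
    n = n / np.linalg.norm(n)              # interpolation shortens the vector
    return shade(n)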
SELF-ASSESSMENT QUESTIONS - 3
Texture Mapping
Texture mapping is the process of determining where in a particular space a texture will be
applied. A texture consists of a series of pixels (also called texels), each occupying a texture
coordinate determined by the width and height of the texture. These texture coordinates are
then mapped into values ranging from 0 to 1 along the u and v axes (u is width, v is height). This process is called UV mapping, and the resulting coordinates are UV coordinates. Texture mapping is usually used to give objects a more varied and realistic appearance through complex variations in reflectance, for example surface markings, which are variations in albedo (that is, the total light reflected from the ambient and diffuse components of reflection). Areas covered in this section are: where textures come from, how to map textures onto surfaces, how texture changes reflectance and shading, scan conversion under perspective warping, and aliasing.
A) Texture Sources
Digital Images
To map an arbitrary digital image to a surface, one can define texture coordinates (u, v) ∈ [0, 1]². For each point (u0, v0) in texture space, one gets a point in the texture image, as shown in figure 8.4. For each face of a mesh, it is necessary to specify a point (μi, νi) for each vertex p̄i. Then define a continuous mapping from the parametric form of the surface s̄(α, β) onto the texture, that is, define m such that (μ, ν) = m(α, β).
Example: For a surface of revolution, 𝑠(α, β) = (cx(α) cos(β), cx(α) sin(β), cz(α)). So let 0 ≤ α ≤
1 and 0 ≤ β ≤ 2π.
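One natural choice of m for this example simply rescales the two parameters into the unit square; this is our own illustrative assumption rather than the only possible mapping:

import math

def revolution_uv(alpha, beta):
    # Map surface-of-revolution parameters (0 <= alpha <= 1, 0 <= beta <= 2*pi)
    # into texture coordinates (u, v) in [0, 1] x [0, 1].
    return alpha, beta / (2.0 * math.pi)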
Scale texture values in the source image to be in the range 0 ≤ τ ≤ 1 and use them to
scale the reflection coefficients rd and ra. That is,
r̃d = τ rd ,
r̃a = τ ra .
One could also multiply τ by the specular reflection, in which case it would mean simply
scaling E from the Phong model.
D) Aliasing
A problem with high resolution texturing is aliasing, which occurs when adjacent
pixels in a rendered image are sampled from pixels that are far apart in a texture image.
By down-sampling or reducing the size of a texture, aliasing can be reduced for far away
or small objects, but then textured objects look blurry when close to the viewer. What
one really wants is a high resolution texture for nearby viewing, and down-sampled
textures for distant viewing. A technique called mipmapping gives us this by pre-
rendering a texture image at several different scales as shown in figure 8.6. For
example, a 256x256 image might be down-sampled to 128x128, 64x64, 32x32, 16x16,
and so on. Then it is up to the renderer to select the correct mipmap to reduce aliasing
artifacts at the scale of the rendered texture.
Figure 8.6: An aliased high resolution texture image (left) and the same texture after
mipmapping (right)
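A mipmap pyramid can be built by repeatedly averaging 2x2 blocks of texels. The sketch below (plain NumPy, assuming a square power-of-two texture; the function name is ours) shows the idea:

import numpy as np

def build_mipmaps(texture):
    # Level 0 is the full-resolution texture; each further level halves the resolution
    # by averaging every 2x2 block of texels.
    levels = [np.asarray(texture, dtype=float)]
    while levels[-1].shape[0] > 1:
        t = levels[-1]
        h, w = t.shape[0], t.shape[1]
        t2 = 0.25 * (t[0:h:2, 0:w:2] + t[1:h:2, 0:w:2] +
                     t[0:h:2, 1:w:2] + t[1:h:2, 1:w:2])
        levels.append(t2)
    return levels

# A 256x256 texture yields levels of size 256, 128, 64, 32, 16, ... down to 1x1.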
SELF-ASSESSMENT QUESTIONS – 4
5. SUMMARY
This unit provides information about the RGB and CMYK colouring models. RGB is a colour
model that uses the three primary (red, green, blue) additive colors, which can be mixed to
make all other colours. HSV and HSL (hue, saturation, value and hue, saturation, lightness),
were introduced in the late 1970s. HSV and HSL improve on the colour cube representation
of RGB by arranging colours of each hue in a radial slice, around a central axis of neutral
colours which ranges from black at the bottom to white at the top. The section on basic lighting and reflection discussed simple reflection models. Shading is a process
used in drawing for depicting levels of darkness on paper by applying media more densely
or with a darker shade for darker areas, and less densely or with a lighter shade for lighter
areas. Flat and interpolative shadings were covered in this section. Texture mapping is the
process of determining where in a particular space a texture will be applied. Also discussed here were the sources of texture. This unit concluded with a discussion of texture and Phong reflectance and aliasing.
6. TERMINAL QUESTIONS
7. ANSWERS
1. Light
2. A diffuse surface is one that appears similarly bright from all viewing directions
3. False.
4. sensing, representation, and display of images
5. True.
6. a. hue, saturation, and lightness
7. Key(black)
8. Shading
9. True.
10. Gouraud and phong
11. Texture procedure and Digital images.
12. Aliasing.
Terminal Questions
1. A color model is an abstract mathematical model describing the way colors can be
represented as tuples of numbers, typically as three or four values or color components.
When this model is associated with a precise description of how the components are to be interpreted (viewing conditions, etc.), the resulting set of colours is called a colour space. For further details refer section 8.3.
2. How the appearance of objects depends, among other things, on the lighting that
illuminates the scene, and on the interaction of light with the objects in the scene. For
more details refer section 8.3.
3. Simplified model of lighting that is easy to implement and fast to compute, and used in
many real-time systems such as OpenGL. This model will be an approximation and does
not fully capture all of the effects we observe in the real world. For more details refer
sub-section 8.4.1.
4. Shading is a process used in drawing for depicting levels of darkness on paper by
applying media more densely or with a darker shade for darker areas, and less densely
or with a lighter shade for lighter areas. For more details refer section 8.5.
Unit 9 Multimedia
Structure:
9.1 Introduction
Objectives
9.2 Introduction and Concepts of Multimedia
Definition
Medium
9.3 Uses of Multimedia
9.4 Role of Hypertext and Hypermedia
9.5 Image and Video
9.6 Standards in Multimedia
9.7 Summary
9.8 Terminal Questions
9.9 Answers
9.1 Introduction
We perceive the environment through our senses. These senses, that is, sight
and hearing are brought into play as we interact with our surroundings. Our
sensory organs send signals to the brain, which interprets this interaction.
The process of communication which is sending messages from one person
to another is dependent on the understanding abilities of our senses. In
general, the more information that is perceived by the receiver, the more
effective communication will be.
Let us take the case of a person wanting to tell a friend the details of a trip
made during the vacations. With advancements in technology, there are
different ways through which this communication can happen. Some of the
methods to communicate are:
Case 1: Assuming a letter is written to the friend describing the trip. In this
case the friend can just read the text but not see the expression and
excitement of the writer. Similarly, the writer would need to wait for the friend
to reply to know how he/she felt.
Case 2: Assuming a few photographs taken during the trip are sent along with
the letter, then the friend can visualize the fun the writer had.
Case 3: Assuming that communication was over phone, then the friend can
hear the person’s excitement over the trip in his/her voice and understand
the emotions better. Similarly the friend’s reactions are also spontaneous over
the phone.
Case 4: Assuming the case of a video chat with the friend. Here it is possible
for both participants to see each other, hear each other and share a
conversation.
In each case, the message or information is conveyed but with a different
approach. Therefore the more information is sent, the greater is the impact
of the communication and the different media discussed like letter (text),
photograph (image), telephone (voice), video chat (video) form the basic
components of multimedia.
In this unit, we will discuss the definition and basic concepts of multimedia
and where the multimedia can be used. We will also discuss the concept of
hypertext, hypermedia and its relationship with multimedia. Finally we will
conclude this unit with the discussion of standards available for images, video
and audio
Objectives:
After studying this unit, you should be able to:
• discuss the basic concepts of multimedia
• list and explain the various mediums of multimedia
• explain the uses of multimedia
• differentiate and explain hypertext and hypermedia
• discuss the image, video and audio standards
The introduction to terminology begins with the notion of multimedia, followed by the description of media and the important properties of multimedia systems. The word "Multimedia" comes from the Latin words 'multus', which means "numerous", and 'medium', which means "middle". Incidentally, the word 'media' conveys the meaning 'intermediary'; therefore, multimedia means multiple intermediaries or multiple means. The multiple
means by which the information/data is stored, transmitted or presented are:
➢ Text (example books, letters, and newspapers) includes both unformatted text comprising characters from a limited character set, and formatted text strings that are used for the structuring, access and presentation of electronic documents.
➢ Images and Graphics (example photographs, charts, maps, logos, and
sketches) include computer generated images, comprising lines, curves
and circles, and digitized images of documents and pictures.
➢ Audio/Sound (example radio, gramophone, records and audio
cassettes) includes both low-fidelity speech as used in telephony as well as high-fidelity stereophonic music as used on CDs.
➢ Video and Animation (example TV, video cassettes and motion pictures) includes short sequences of moving images (in video clips) and complete movies/films.
9.2.2 Medium
The meaning of the word media varies according to the context in which it is
used. Our definition of medium is a means to distribute and represent
information. Media can be text, graphics, pictures, voice, sound and music.
Media can be classified with respect to different criteria that is, perception,
representation, presentation, storage, transmission, and information
exchange. Each of these criteria will be discussed in detail.
Perception Medium: The perception media helps us to sense our
environment. We can perceive information mostly through seeing and hearing
the information. The perception of information through seeing or visual
media includes text, graphics, image and video. The perception of
information through hearing which is auditory media includes music, sound
and voice.
Representation Medium: Representation media refers to how the
information is represented internally by the computer. There are various
formats used for representing media information in a computer. For example,
9.3 Uses of Multimedia
Multimedia has found large applications in various areas including, but not
limited to, advertisements, art, education, entertainment, engineering,
medicine, mathematics, business, scientific research and spatial temporal
applications. Some examples are as follows:
Entertainment: Multimedia is heavily used in the entertainment industry,
especially to develop special effects in movies and animations. Computer
games are also one of the main applications of multimedia because of the
high amount of interactivity involved.
Education: In education, multimedia is used to produce computer-based
training courses (popularly called CBTs) and reference books like
encyclopedia and manuals. A CBT lets the user go through a series of
presentations, text about a particular topic, and associated illustrations in
various information formats. Edutainment is an informal term used to describe
the combination of education with entertainment, especially multimedia
entertainment.
Industry: In the industrial sector, multimedia is used as a way to help present
information to shareholders, superiors and co-workers. Multimedia is also
helpful for providing employee training, advertising and selling products all
over the world via virtually unlimited web-based technology. For example, in
case of tourism and travel industry, travel companies can market packaged
Hypertext System
A hypertext system is mainly determined through non-linear links of
information. Pointers connect the nodes. The data of different nodes can be
represented with one or several media types. In a pure text system, only text
Multimedia System
A multimedia system contains information which is coded at least in a
continuous and discrete medium. For example, if only links to text data are
present, then it is not a multimedia system, it is a hypertext. A video
conference, with simultaneous transmission of text and graphics, generated
by a document processing program, is a multimedia application, although it
does not have any relation to hypertext and hypermedia.
Hypermedia System
As the figure 9.2 shows, a hypermedia system includes the non-linear
information links of hypertext systems and continuous and discrete media of
multimedia systems. For example, if a non-linear link consists of text and
video data, then this is a hypermedia, multimedia and hypertext system.
Self Assessment Questions
8. Hypertext can be either ______________ or ______________.
9. What is hypermedia?
10. A non-interactive cinema presentation is an example of ______________.
9.5.1 Digital video: refers to a video signal that has been converted into a digital format, allowing
it to be processed, stored, and transmitted using digital technology. In contrast to analog video, which
is represented as a continuous waveform, digital video is represented as discrete numerical values,
typically in binary code, that can be manipulated and stored by computers and other digital devices.
• In the early 1980s, Sony introduced the Betacam recording system for broadcast television. It recorded video onto magnetic tape, providing higher quality and greater flexibility than earlier recording methods.
• The 1990s saw the introduction of several new digital video formats.
• In 1995 and 1996, professional digital videotape formats and DVD players were released, and WRAL-TV became the first television station in the United States to broadcast a digital (HDTV) signal.
• In the early 2000s, digital video began to be widely adopted for online streaming and distribution, and in 2009 all full-power TV stations in the US completed the transition to digital broadcasting.
Digital media is used in the fields of:
• Education
• Entertainment
• Information
• Advertising
Characteristics of video :
Analog video signals are continuous and vary in voltage or amplitude over time. Analog video signals
are typically transmitted through analog broadcast or recorded on analog media such as VHS tapes
or analog film. Analog video can degrade over time or through repeated copying, resulting in lower
quality and visual noise or distortion.
Digital video, on the other hand, represents video as a series of binary numbers. Digital video is
typically recorded on digital media such as DVDs, Blu-ray discs, or digital files such as MP4 or MOV.
Digital video is less prone to degradation over time and can be copied and distributed without loss of
quality.
Aspect Ratio
• Dimension of width to height: Aspect ratio in computer graphics refers to the proportional relationship between the width and height of an image or screen. It is usually expressed as a ratio of the width to the height, such as 4:3, 16:9, or 21:9.
Frame Rate: The speed at which video frames appear, measured in frames per second (fps). Frame rate refers to the number of frames or still images displayed per second in a video or animation. The frame rate is an important factor in determining the quality and smoothness of a video or animation. A higher frame rate generally results in smoother motion and less flicker, while a lower frame rate can result in a choppy or stuttering appearance.
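Frame rate, resolution and colour depth together determine how much raw data a video produces per second. A small back-of-the-envelope sketch (uncompressed figures only; real files are far smaller because of the compression discussed later):

def raw_video_bitrate(width, height, bits_per_pixel, fps):
    # Uncompressed bit rate in bits per second
    return width * height * bits_per_pixel * fps

# Example: 1920x1080, 24-bit colour, 30 frames per second
# 1920 * 1080 * 24 * 30 = 1,492,992,000 bits/s, roughly 1.5 Gbit/s before compression
print(raw_video_bitrate(1920, 1080, 24, 30))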
File formats :
There are several common video file formats used in computer graphics and video production. Some
of the most popular ones include:
MP4: This is a widely used video format that is compatible with most devices and platforms. It
provides good compression while maintaining high quality, making it suitable for streaming and
sharing online.
AVI: This is a popular format used for storing video files on Windows-based computers. It supports
multiple codecs and is capable of storing high-quality video files.
MOV: This is a video format developed by Apple that is widely used for video editing and production.
It supports high-quality video and audio, and is compatible with both Mac and Windows operating
systems.
WMV: This is a video format developed by Microsoft that is commonly used for streaming and sharing
online. It provides good compression and is compatible with most Windows-based devices.
FLV: This is a video format commonly used for online video streaming and sharing, especially on
websites such as YouTube. It provides good compression and is compatible with most web browsers.
9.5.2 Image:
• Image representation deals with the creation, representation and management of images on the computer display.
• It also deals with the professional and practical perspectives of virtual image synthesis.
Images can be stored in digital or physical formats and can be viewed or displayed in a variety of
ways, including on screens, printed on paper, or projected onto surfaces.
An image made by a computer can show a basic scene as well as complex scenes.
Images can be created using a variety of techniques, such as scanning a physical photograph or
drawing, rendering a 3D model, or drawing directly onto a computer using a digital tablet or stylus.
Once an image is created, it can be manipulated and edited using specialized software to adjust
color, brightness, contrast, and other visual properties.
In 1963, Ivan Sutherland created a groundbreaking computer program called Sketchpad, which
allowed users to create and manipulate simple images using a light pen and a computer display. This
marked the beginning of interactive computer graphics, and Sketchpad is considered one of the first
examples of a graphical user interface.
Throughout the 1960s and 1970s, researchers continued to develop new techniques and
technologies for computer graphics, including raster graphics, vector graphics, and 3D graphics. In
1972, Pong, one of the first commercially successful video games, was created using computer graphics, and it quickly became a cultural phenomenon.
Image files are digital files that contain visual information, typically in the form of a bitmap or a vector
graphic. There are several types of image files, each with its own characteristics and intended uses.
• An image file refers to any pictorial representation that is stored in the computer memory.
• An image file format refers to the particular format in which an image file is stored.
• A file format stores the number of rows and columns of image pixels.
• File formats are important in the processes of printing, scanning and internet use.
File Formats :
Here are some of the most common types of image files:
JPEG: JPEG (Joint Photographic Experts Group) is a popular file format for digital photos and other
images with complex color gradients. It uses lossy compression, which means that some of the
original image data is discarded to reduce the file size.
PNG: PNG (Portable Network Graphics) is a file format that supports transparent backgrounds and
is commonly used for web graphics and logos. It uses lossless compression, which means that the
original image data is preserved and can be edited without losing quality.
GIF: GIF (Graphics Interchange Format) is a file format that supports animation and is commonly
used for small, simple animations and web graphics. It uses lossless compression and a limited color
palette, which makes it less suitable for photos and other complex images.
TIFF: TIFF (Tagged Image File Format) is a high-quality file format that is often used for printing and
professional graphics applications. It supports lossless compression and preserves full image detail, which typically results in larger files than lossy formats such as JPEG.
Image compression :
There are two types of image compression: lossless and lossy.
Lossless compression algorithms preserve all the original data in the image, while still reducing its
size. Common lossless compression techniques include Run-Length Encoding (RLE), Huffman
coding, and Lempel-Ziv-Welch (LZW) compression. Lossless compression is commonly used in
medical imaging and other fields where it is important to preserve all the original data.
On the other hand, lossy compression algorithms discard some of the original data to achieve greater
compression. The degree of loss can be controlled to balance the reduction in file size with the loss
in image quality. Lossy compression algorithms are commonly used in digital photography and web
graphics, where the file size is more important than the minor loss in image quality.
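As a concrete illustration of the lossless case, the Run-Length Encoding mentioned above can be sketched in a few lines of Python (an illustrative toy, not a production codec; decoding exactly reverses encoding, so no data is lost):

def rle_encode(pixels):
    # Store each run of identical values as a [value, count] pair
    runs = []
    for p in pixels:
        if runs and runs[-1][0] == p:
            runs[-1][1] += 1
        else:
            runs.append([p, 1])
    return runs

def rle_decode(runs):
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out

row = [255, 255, 255, 0, 0, 255]
assert rle_decode(rle_encode(row)) == row      # lossless: the original row is recovered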
JPEG (Joint Photographic Experts Group): This is a standard for compressing digital images, which is widely used for photos and graphics on the web and in digital media. The JPEG standard was first introduced in 1992 and has since become one of the most widely used formats for storing and sharing digital photos and other images. JPEG uses a lossy compression algorithm to reduce the size of digital images. This means that some information is lost during compression, resulting in a reduction in image quality. However, JPEG compression is designed to minimize the loss of image quality while still achieving significant reductions in file size. JPEG files can be opened and viewed on a wide range of devices and software applications, making it a highly versatile format for sharing and storing digital images. However, the lossy compression used by JPEG means that it may not be suitable for all applications, such as professional photography or scientific imaging, where preserving image quality and accuracy is paramount.
MHEG (Multimedia and Hypermedia information coding Expert Group) is a standard for authoring interactive multimedia content, such as television programs and interactive digital TV services. It was developed by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) in the 1990s as a way to enable interactivity and multimedia content in broadcast television. MHEG is based on the concept of declarative programming, which means that instead of writing instructions for the computer to follow, the author of the content describes what the content should look like and how it should behave. This makes it easier for non-programmers to create interactive multimedia content, such as TV shows with interactive elements or quizzes, without needing to know how to write computer code. MHEG is designed to work with a range of devices, including TVs, set-top boxes, and other devices that support digital TV services.
• DC (Dublin Core) is a standard for describing and organizing digital resources, such as web
pages, images, and videos. It was developed by the Dublin Core Metadata Initiative, an
international organization focused on developing metadata standards for describing digital
resources. The Dublin Core standard consists of a set of 15 elements, such as title, creator,
and date, that can be used to describe digital resources in a standardized way. These
elements provide basic information about the resource and are intended to be used in
combination with other metadata standards to provide more detailed information about the
resource.
RDF (Resource Description Framework) is a standard for describing and representing information
on the web. It provides a way to express metadata about resources on the web, such as web pages,
images, and videos, in a machine-readable format that can be understood by computers. RDF is
based on the idea of using simple statements, or "triples," to describe relationships between
resources. Each triple consists of a subject, a predicate, and an object, which together express a
statement about the resource. For example, a triple might describe the relationship between a web
page, its author, and the date it was created. RDF is designed to be flexible and extensible, allowing
users to define their own vocabularies and ontologies for describing resources.
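A tiny sketch of the triple idea, using hypothetical example.org resources and Dublin Core-style predicate names purely for illustration:

# RDF-style statements as (subject, predicate, object) triples
triples = [
    ("http://example.org/page1", "dc:creator", "A. Author"),
    ("http://example.org/page1", "dc:date",    "2023-01-15"),
    ("http://example.org/page1", "dc:title",   "An Example Page"),
]

# Query: everything stated about a given subject
about_page1 = [(p, o) for (s, p, o) in triples if s == "http://example.org/page1"]
print(about_page1)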
9.6 Summary
In this unit we discussed the meaning of multimedia and various
developments in the field. Let us summarize the important points discussed
in the unit:
Multimedia involves multiple media like text, image, graphics, audio, video
and animation. These provide more effective ways to communicate ideas and
views when compared to the traditional textual form. Hypertext allows non-
sequential reading and writing of documents by using embedded links to
jump from one place in the document to another. Hypermedia is a computer-based information retrieval system that enables a user to gain or provide access to texts, audio and video recordings.
A multimedia system enables the user to navigate to specific portions of the
content as desired thus providing non-linearity property. Interactivity helps the
user to get involved with the system. Multimedia is used in different fields in order to improve the quality of the work.
This unit concludes with the discussion of multimedia standard, which refers
to the exchange of content. It is important to be aware of the standards and
use them wisely. The goal of the standards was to develop a Coded
Representation of Multimedia and Hypermedia Information.
The development of powerful multimedia computers and the evolution of the
Internet have led to a wide range of applications of multimedia.
9.7 Terminal Questions
1. List and explain the various criterions of multimedia.
2. Explain the uses of multimedia.
3. Discuss in detail the difference between hypertext and hypermedia.
4. Explain the types and uses of hypertext.
5. Discuss the file formats of video.
6. Explain the Standards of multimedia in detail.
9.8 Answers
Self-Assessment Questions
1. "rich media"
2. multus
3. True
4. Voice
5. a. Engineering
6. Education
7. tele - medicine
8. Static or dynamic
9. It is used as a logical extension of the term hypertext; different mediums
are intertwining to create a generally nonlinear medium of information.
10. Multimedia
11. Ivan Sutherland
12. TIFF
13. Analog
14. Multimedia and Hypermedia information coding Expert Group
15. SGML
16. RDF
Terminal Questions
1. Media can be classified with respect to different criteria i.e., perception,
representation, presentation, storage, transmission, and information
exchange. Refer sub sections 9.2.
2. Multimedia has found large applications in various areas including
advertisements, art, education, entertainment, engineering etc. Refer
section 9.3.
3. Hypertext links information based on user demand, leading the user towards the related information. Refer section 9.4.
4. Hypertext documents can either be static (prepared and stored in
advance) or dynamic (continually changing in response to user input).
Refer section 9.4.
5. File formats: There are several common video file formats used in
computer graphics and video production. Some of the most popular ones
include: MP4,AVI,MOV,WMV,FLV. Refer section 9.5
6. Standards in multimedia refer to agreed-upon specifications and
guidelines for creating and delivering multimedia content, including
audio, video, images, and interactive media. Refer section 9.6.
Unit 10 Audio
Structure:
10.1 Introduction
Objectives
10.2 Standard and the compression technique
10.3 Digital Audio
10.4 MIDI
MIDI Basic Concepts
MIDI Devices
MIDI Messages
10.5 Processing and Sampling Sound
10.6 Compression
Differential Pulse Code Modulation
Adaptive Differential PCM
Adaptive Predictive Coding
Linear Predictive Coding
10.7 Summary
10.8 Terminal Questions
10.9 Answers
10.1 Introduction
In the previous unit, we discussed the basic concepts and applications of
multimedia and explored the difference between hypertext and hypermedia
as well as its significance. We also discussed the standards of image, video
and audio. Apart from image and text, audio too plays an important role in
various multimedia applications. Audio comprises continuously varying analog signals, which are converted into digital form by the digitization process known as PCM (Pulse Code Modulation). MIDI, which stands for
“Musical Instrument Digital Interface,” is a system that allows electronic
musical instruments and computers to send instructions to each other. It
sounds simple, but MIDI provides some profound creative opportunities that
are discussed in this unit. The process of sampling is the reduction of a
continuous signal to a discrete signal. Finally we will discuss the concept of
audio compression. It is a form of data compression designed to reduce the
transmission bandwidth requirement of digital audio streams and the storage
size of audio files.
Objectives:
After studying this unit, you should be able to:
• explain the concepts of standards in audio
• explain the concepts of digital audio
• discuss the MIDI concepts
• explain the concept of processing and sampling sound
• list and discuss various audio compression techniques
Standards in audio refer to the technical specifications and guidelines that are used
to ensure interoperability, compatibility, and quality in the creation, recording,
processing, and distribution of audio signals. There are several standards in audio,
including:
1. Sampling rate and bit depth: The sampling rate and bit depth are technical specifications that define the quality and resolution of an audio signal. The most common sampling rate for audio is 44.1 kHz, while the most common bit depth is 16 bits (a short sketch after this list shows how these two parameters determine the size of uncompressed audio).
2. Digital Audio Workstation (DAW) standards: DAWs are software applications
that are used for recording, editing, and producing audio. Common DAW standards
include the Audio Engineering Society (AES) and the Broadcast Wave Format
(BWF).
3. Audio codecs: Audio codecs are software algorithms that are used to
compress and decompress audio data. Common audio codecs include MP3, AAC,
and FLAC.
4. Audio file formats: Audio file formats are used to store and distribute audio
data. Common audio file formats include WAV, AIFF, MP3, and FLAC.
5. Loudness standards: Loudness standards are used to ensure that audio
signals have consistent volume levels across different platforms and devices. The
most common loudness standards include the European Broadcasting Union (EBU)
R128 and the Advanced Television Systems Committee (ATSC) A/85.
Adhering to these standards ensures that audio signals can be recorded, processed,
and distributed with consistency, compatibility, and high quality.
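The sketch referred to above: a simple calculation of uncompressed PCM audio size from sampling rate, bit depth and channel count (plain Python; the function name is ours):

def pcm_size_bytes(duration_s, sample_rate, bit_depth, channels):
    # duration * samples per second * bytes per sample * channels
    return duration_s * sample_rate * (bit_depth // 8) * channels

# One minute of CD-quality stereo: 60 * 44100 * 2 * 2 = 10,584,000 bytes (about 10 MB)
print(pcm_size_bytes(60, 44100, 16, 2))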
10.4 MIDI
MIDI stands for Musical Instrument Digital Interface. It is a technical standard that
was developed in the 1980s to allow electronic musical instruments, computers, and
other devices to communicate with each other. MIDI uses a standardized protocol to
transmit information about musical events, such as notes played, pitch, duration,
velocity, and other performance parameters.
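To make the idea of a MIDI message concrete, the sketch below builds the raw three bytes of a Note On message (status byte 0x90 plus channel number, followed by note and velocity); the helper name is ours, and real applications would normally use a MIDI library rather than raw bytes:

def note_on(channel, note, velocity):
    # Note On: status byte 0x90 | channel, followed by note and velocity (each 0-127)
    return bytes([0x90 | (channel & 0x0F), note & 0x7F, velocity & 0x7F])

# Middle C (note number 60) on channel 0, velocity 100
print(note_on(0, 60, 100))     # b'\x90<d'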
MIDI data can be recorded and edited in a MIDI sequencer or Digital Audio
Workstation (DAW), allowing musicians to create and manipulate musical
compositions using software instruments, virtual synthesizers, and other digital tools.
MIDI files can also be played back on MIDI-compatible hardware devices, such as
synthesizers, drum machines, and sequencers.
MIDI has revolutionized the way music is created, produced, and performed. It has
enabled musicians to create complex arrangements and orchestral scores with ease,
and has made it possible to synchronize music with video, lighting, and other
multimedia elements. MIDI has also paved the way for the development of electronic
dance music (EDM), video game music, and other forms of electronic music that rely
heavily on digital instrumentation and production techniques.
instrument to the MIDI IN of another instrument, and vice versa. For example,
the following figure 10.2 shows the connection between a computer's MIDI
interface and a MIDI keyboard that has built-in sounds.
Figure 10.2: Connection between computer MIDI interface and MIDI keyboard
10.4.2MIDI devices
Any musical instrument that satisfies both components of the MIDI standard is capable of communicating with other MIDI devices through
PCM System
Two basic operations in the conversion of analog signal into the digital is time
discretization and amplitude discretization. In the context of PCM, the former
is accomplished with the sampling operation and the latter by means of
quantization. In addition, PCM involves another step, namely, conversion of
quantized amplitudes into a sequence of simpler pulse patterns (usually
binary), generally called as code words. (The word code in pulse code
modulation refers to the fact that every quantized sample is converted to an
R-bit code word.) Figure 10.5 illustrates a PCM system. The decoder reconstructs an approximation of the original analog signal, denoted by m̂(t); if there are no channel errors, m̂(t) ≈ m(t).
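A toy sketch of the two discretization steps (sampling, then uniform quantization to an R-bit code word), written in Python/NumPy under our own simplifying assumptions (signal amplitude confined to the range -1 to 1, simple uniform quantizer):

import numpy as np

def pcm_encode(signal, sample_times, n_bits):
    levels = 2 ** n_bits
    samples = np.array([signal(t) for t in sample_times])   # time discretization (sampling)
    codes = np.floor((samples + 1.0) / 2.0 * levels)         # amplitude discretization (quantization)
    return np.clip(codes, 0, levels - 1).astype(int)         # R-bit code words

# Example: a 1 kHz tone sampled at 8 kHz and quantized to 8 bits
times = np.arange(0.0, 0.01, 1.0 / 8000.0)
codes = pcm_encode(lambda t: np.sin(2 * np.pi * 1000 * t), times, 8)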
10.6 Compression
Compression involves the encoding of digital audio data to take up less
storage space and transmission bandwidth. Audio compression typically uses lossy methods, which eliminate bits that are not restored at the other end.
10.6.1 Differential Pulse Code Modulation
Differential Pulse Code Modulation (DPCM) is derived from PCM and is based
on the fact that most audio signals show significant correlation between
successive samples. As a result, encoding each sample independently incorporates redundancy in the sample values. Therefore, in DPCM only the difference between two successive samples is considered instead of the original sample signal. In DPCM, only the digitized difference signal is used to encode the waveform, thus taking fewer bits as compared to the PCM signal with the same sampling
rate. Figure 10.6 shows a DPCM encoder and decoder.
(a)
(b)
Figure 10.6: DPCM (a) Encoder (b) Decoder
The DPCM encoder as shown in figure 10.6 (a) consists of a Register (R). It
is a temporary storage where the previous digitized sample of the analog input
signal is stored. The difference signal (that is, DPCM in figure 10.6(a)) is
computed by a subtractor which subtracts the current contents of the register
from the new digitized sample or in other words the output by the ADC (PCM).
As shown in figure 10.6, the contents of register R is updated by adding the
current register contents and the computed difference signal output by the
subtractor. The DPCM values thus computed are fed to a parallel-to-serial converter and then transmitted.
The block diagram of the decoder is shown in figure 10.6 (b). The decoder
performs the reverse operation of the encoder. So, it operates by adding the
received DPCM to the previously computed signal held in the register. The
compression achieved through DPCM is typically limited to 1 bit per sample. Therefore the bit rate required for standard PCM voice signals is reduced from 64 kbps to 56 kbps.
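The encoder/decoder pair described above can be sketched in a few lines (a plain Python toy that transmits exact differences and ignores quantization of the difference signal; the function names are ours):

def dpcm_encode(samples):
    register = 0
    diffs = []
    for s in samples:
        diffs.append(s - register)   # difference between the new sample and the register contents
        register = s                 # register now holds the previous digitized sample
    return diffs

def dpcm_decode(diffs):
    register = 0
    out = []
    for d in diffs:
        register += d                # add the received DPCM value to the previous value
        out.append(register)
    return out

samples = [10, 12, 13, 13, 11]
assert dpcm_decode(dpcm_encode(samples)) == samples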
The accuracy of each computed difference signal is determined by the
accuracy of the previous signal/value held in the register. So the overall
accuracy of DPCM depends on the register value; moreover, the previous value in the register is just an approximation. Hence, a more enhanced technique has been developed to estimate a more accurate value of the previous signal. This technique is known as prediction. In this technique, the value of the previous signal is predicted by using not only the estimate of the current signal but also varying proportions of other immediately preceding estimated signals.
(a)
(b)
Figure 10.7: Predictive DPCM (a) Encoder (b) Decoder
(a)
(b)
Figure 10.8: (a) ADPCM encoder (b) ADPCM decoder
As shown in figure 10.8 (a), the encoder consists of two filters. One filter
passes frequency in the range 50Hz to 3.5 kHz while the other passes
frequency in the range 3.5 kHz to 7 kHz. Therefore, the input signal is divided
into two signals: lower sub-band and upper sub-band signal. Each sub-band
signal is then sampled and encoded independently using ADPCM. The
sampling rate for the upper sub-band signal is 16ksps (kilo samples per
seconds) as it contains the higher frequency components and the sampling
rate for lower sub-band signal is 8ksps. Therefore, different bit rates can be
used for each of them. The bit rates can be 64, 56, or 48 kbps. Considering
a bit rate of 64 kbps, then the lower sub-band is ADPCM encoded at
48 kbps and the upper sub-band is encoded at 16 kbps. The two bit streams
are then merged together using a multiplexer to produce the signal to be
transmitted.
On receiving the encoded stream the de-multiplexer in the decoder divides
the stream into two separate streams depending on the frequency range as
shown in figure 10.8 (b). The stream within the low band range is decoded by a lower sub-band ADPCM decoder. Similarly, the stream within the upper sub-band range is decoded by the upper sub-band ADPCM decoder. The decoded
stream is passed through a low pass filter which produces the speech signal.
10.6.3 Adaptive Predictive Coding
The principle of Adaptive Predictive Coding (APC) is to make the predictor coefficients adaptive, since the predictor coefficients vary continuously as they are dependent on the characteristics of the audio signal being digitized.
Therefore the input signal is divided into fixed time segments and the
characteristics are determined for each segment. The optimum set of
coefficients is then computed and these are used to predict the previous
signal more accurately. Using this method, a high level of compression is
achieved thereby reducing the bandwidth requirement to 8 kbps, while still
maintaining an acceptable perceived signal.
10.6.4 Linear Predictive Coding
The algorithms discussed in the previous sections 10.6.1 to 10.6.3 are based
on sampling and then quantizing only the difference signal. An alternative
approach also exists and is called Linear Predictive Coding (LPC), which
involves the analysis of the audio waveform to determine a selection of the perceptual features it contains.
(a)
(b)
Figure 10.9: (a) LPC encoder (b) LPC decoder
10.7 Summary
In this unit we began with a discussion on the basic concepts of audio and
about the MIDI (Musical Instrument Digital Interface). MIDI is a protocol used
to perform a direct connection between a MIDI instrument and the computer.
This protocol is widely used by composers, musicians and performers as a
tool. All sound waves including speech and music can be approximated using
audio signals. We discussed the Pulse Code Modulation technique by which
a sampled analog signal is converted into digital audio signal.
We also learnt some of the basic principles of audio compression. In DPCM,
the difference between the consecutive signals is considered for encoding
instead of the original signal. Therefore the overall accuracy of DPCM
depends on the accuracy of the previous signal. To overcome this, an
enhanced technique known as Predictive DPCM was designed in which the
value of the previous signal is predicted by using not only the estimate of the
current signal but also the varying proportions of other immediately preceding
estimated signals. Another technique similar to DPCM called Adaptive DPCM
(ADPCM) uses variable bits to represent the difference signals. We also
discussed another technique called LPC which is based on the perceptual
features of the signal.
10.9 Answers
Self Assessment Questions
1. Digitizing
2. True
3. Musical Instrument Digital Interface
4. MIDI IN and MIDI OUT
5. d. Omni On/Off
6. Pulse code Modulation
7. Band limited signal.
8. amplitudes
9. Period
10. LPC
Terminal Questions
1. Digital audio has emerged because of its usefulness in the recording,
manipulation, mass-production, and distribution of sound. Refer section
10.3
2. The MIDI is a standard protocol that enables electronic musical
instruments) and computers to communicate and synchronize with each
other. Refer Sub-sections 10.4.1,10.4.2,10.4.3.
3. Sampling the audio signal at a minimum rate which is twice that of the
maximum frequency component of the signal. Refer Section 10.5
4. Differential Pulse Code Modulation (DPCM) is derived from PCM and is
based on the fact that most audio signals show significant correlation.
Refer Sub-section 10.6.1.
5. ADPCM is based on the same principle as that of DPCM except that the
difference signal obtained is represented by variable number of bits
depending on its amplitude. Refer Sub-section 10.6.2.
6. LPC involves the analysis of the audio waveform to determine a selection
of the perceptual features it contains. Refer Sub-section 10.6.4.
Unit 11 Video
Structure:
11.1 Introduction: The concept of video in Multimedia
Objectives
11. 2 Compression Techniques
11.3 MPEG Compression Standard
MPEG-1
MPEG-2
MPEG-4
11.4 Compression Through spatial and temporal Redundancy
11.5 Inter Frame and Intra Frame Compression
11.6 Summary
11.7 Terminal Questions
11.8 Answers
Objectives:
After studying this unit, you should be able to:
• list and discuss various MPEG compression standards.
• discuss the techniques of compression through redundancy.
• list and explain the types of frames.
• explain the concept of interframe and intraframe compression.
Video data may be represented as a series of still frames, or fields for interlaced
video. The sequence of frames will almost certainly contain both spatial and
temporal redundancy that video compression algorithms can use. Most video
compression algorithms use both spatial compression, based on redundancy
within a single frame or field, and temporal compression, based on
redundancy between different video frames.
Lossy algorithms
Lossy compression is typically used when a file can afford to lose some data, or when storage space needs to be drastically reduced. Here, an algorithm scans image files and reduces their size by discarding information considered less important or undetectable to the human eye.
Lossless algorithms
With lossless compression the file data is restored and rebuilt in its original form after decompression, enabling the image to take up less space without any discernible loss in picture quality. No data is lost and, as the process can be reversed, it is also known as reversible compression.
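As a simple illustration of exploiting temporal redundancy, the sketch below is a minimal, lossless example only; real codecs combine the idea with motion compensation and transform coding. It stores the first frame in full and only the pixel differences for the frames that follow.

# Minimal sketch of temporal compression: store frame 0 in full, then only the
# pixel differences for each following frame (lossless in this toy example).
import numpy as np

def encode_sequence(frames):
    """frames: list of equally sized 2-D arrays (grey-level frames)."""
    encoded = [frames[0].copy()]                          # reference frame in full
    for prev, curr in zip(frames, frames[1:]):
        encoded.append(curr.astype(int) - prev.astype(int))  # difference frame
    return encoded

def decode_sequence(encoded):
    frames = [encoded[0].copy()]
    for diff in encoded[1:]:
        frames.append(frames[-1] + diff)                  # add the differences back
    return frames

# A static background with one moving pixel: the difference frame is almost
# entirely zeros, which an entropy coder can compress heavily.
f0 = np.zeros((4, 4), dtype=int); f0[1, 1] = 255
f1 = np.zeros((4, 4), dtype=int); f1[1, 2] = 255
enc = encode_sequence([f0, f1])
assert (decode_sequence(enc)[1] == f1).all()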
11.3.1 MPEG-1
The MPEG-1 video standard is defined in ISO Recommendation 11172. It is
basically used for VHS-quality audio and video on CD-ROM at a bit rate
of 1.5 Mbps. MPEG-1 uses a combination of I-frames only, I- and P-frames
only, or I-, P- and B-frames. It does not support D-frames. The compression
algorithm used is based on the H.261 standard with two main differences. The
first is that timestamps are inserted to enable the decoder to resynchronize
quickly when there are one or more corrupted or missing macroblocks. The
number of macroblocks between two timestamps is known as a slice, and a
slice can comprise from 1 up to the maximum number of macroblocks in a
frame, which is 22. The second difference arises because B-frames are
supported by MPEG-1, which increases the time interval between two
consecutive I- or P-frames.
11.3.2 MPEG-2
The MPEG-2 video standard is defined in ISO Recommendation 13818. It is
used in the recording and transmission of studio quality audio and video. The
basic coding structure of MPEG-2 video is the same as that of MPEG-1 with
some differences. In this case there are different levels of video resolution
possible:
Low: It is based on the SIF digitization format with a resolution of 352 x 288
pixels. It is comparable with the MPEG-1 format and produces VHS-quality
video. The audio is of CD quality and the target bit rate is up to 4 Mbps.
Main: It is based on the 4:2:0 digitization format with a resolution of 720 x 576
pixels. It produces studio-quality video and audio with a bit rate up to 15 Mbps,
or 20 Mbps with the 4:2:2 digitization format.
High 1440: It is based on the 4:2:0 digitization format with a resolution of
1440 x 1152 pixels. It is proposed for high definition television (HDTV) at bit
rates up to 60 Mbps, or 80 Mbps with the 4:2:2 digitization format.
High: It is based on the 4:2:0 digitization format with a resolution of
1920 x 1152 pixels. It is proposed for wide-screen HDTV at bit rates up to
80 Mbps, or 100 Mbps with the 4:2:2 digitization format.
For each of the above levels, MPEG-2 provides five profiles: simple, main,
spatial resolution, quantization accuracy, and high. The four levels and five
profiles collectively form a two-dimensional table which acts as a framework
for all standards activities associated with MPEG-2. Since the Low level is
compatible with MPEG-1, only the main profile at the main level (MP@ML)
is discussed here.
MP@ML: The main application of MP@ML is digital television broadcasting.
Hence interlaced scanning is used with a frame refresh rate of 30 Hz for NTSC
and 25 Hz for PAL. The 4:2:0 digitization format is used and the bit rate ranges
from 4 Mbps to 15 Mbps. The coding scheme is similar to MPEG-1 with only
a difference in the scanning method: it uses interlaced scanning instead of
progressive scanning, which results in each frame having two fields, odd and
even, each with a refresh rate of 60 Hz for NTSC or 50 Hz for PAL.
Therefore for an I-frame the DCT blocks have to be derived from each
macroblock, and there are two alternatives:
• Field Mode: In this mode the DCT blocks are encoded from each field
separately. It is used when a large amount of motion is present, since the
time difference between successive fields is shorter than that between
successive frames and the resulting compression ratio is higher. For
example, a live cricket match can be encoded using this mode as there is
a large amount of movement.
• Frame Mode: In this mode the DCT blocks are encoded from each
complete frame. It is used when there is only a small amount of movement,
so that the differences between the two fields of a frame are small. For
example, a news broadcast can be encoded using this mode since there is
relatively little movement from frame to frame.
Similarly, for encoding P- and B-frames, three different modes are possible:
field, frame and mixed. The field and frame modes work in the same way as
for I-frames, with the additional consideration of the preceding frame or field
for both P- and B-frames and, for B-frames, the immediately succeeding field.
In the mixed mode, the motion vectors for both the frame and field modes are
computed and the one with the smallest value is selected.
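The mixed-mode decision described above can be sketched as follows, interpreting "smallest value" as the shorter motion vector; the vector values below are hypothetical, whereas a real encoder would obtain them from motion estimation.

# Sketch of the mixed-mode choice: compute both candidate motion vectors and
# keep the one with the smaller magnitude (values are hypothetical).
import math

def choose_mode(frame_mv, field_mv):
    """frame_mv, field_mv: (dx, dy) motion vectors; returns the chosen mode and vector."""
    frame_len = math.hypot(*frame_mv)
    field_len = math.hypot(*field_mv)
    return ("frame", frame_mv) if frame_len <= field_len else ("field", field_mv)

print(choose_mode(frame_mv=(4, 1), field_mv=(1, 0)))   # -> ('field', (1, 0))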
11.3.3 MPEG-4
This standard is related to interactive audio and video over the Internet and
other entertainment networks. It has a feature that allows the user not only to
access a video sequence but also to influence, independently, the individual
elements that make up the video sequence. In short, using the MPEG-4
standard the user can be given not only the start, stop and pause functions
but also the ability to reposition, delete and alter the movement of individual
characters within a scene. The MPEG-4 standard has a very high coding
efficiency and therefore it can be used over low bit-rate networks such as
wireless networks and PSTNs (Public Switched Telephone Networks). The
MPEG-4 standard is a good alternative to the H.263 standard.
An important feature of MPEG-4 is content-based coding, which means that
before the video is compressed, each scene is defined in the form of a
background and one or more foreground audio-visual objects (AVOs). Each
AVO is composed of one or more video objects and/or audio objects. Taking
the example of a news broadcast, the laptop in a scene can be considered a
single video object, while the news reader can be defined using both an audio
and a video object. Similarly, each video and audio object can further be made
up of a number of sub-objects. In the news reader example, there is movement
only in his/her eyes and mouth, so the reader's face can be defined in the form
of three sub-objects: one each for the head, the eyes and the mouth. Once the
content-based coding is done, the encoding of the background and of each
AVO is carried out separately.
Each audio and video object is described by an object descriptor which
enables a user to manipulate the object. The language used to describe the
objects and to define functions for manipulating their shape, size and location
is called the Binary Format for Scenes (BIFS). In a complete scene there may
be many AVOs, and some relation may exist between them; the relation
between the AVOs is defined by a scene descriptor. Each video scene is
segmented into a number of Video Object Planes (VOPs), each of which
corresponds to an AVO of interest. For example, in the news broadcast
example, VOP0 represents the news reader, VOP1 represents the laptop on
the table, and VOP2 represents the background setting of the studio. Each
VOP is encoded separately based on its shape, motion and texture, as shown
in figure 11.3.
The resulting bit stream from the VOP encoder is encoded for transmission
by multiplexing the VOPs together with the related object and scene
descriptors, as shown in figure 11.4(a). Similarly, at the receiver, the bit
stream is first demultiplexed and the individual VOPs decoded. The individual
VOPs, together with the object and scene descriptors, are then used to create
the video frame that is played out on the terminal, as shown in figure 11.4(b).
Figure 11.4: MPEG-4 (a) Encoder (b) Decoder
The audio associated with an AVO is compressed using any one of a number
of algorithms, depending on the available bit rate of the transmission channel
and the sound quality required. For example, CELP can be used for video
telephony. The audio encoder and decoder are included inside the MPEG-4
encoder and decoder, as shown in figures 11.4(a) and (b).
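The object-based organization described above can be illustrated with the sketch below. It is an illustrative data structure only, not the actual BIFS syntax; all names and values are hypothetical.

# Illustrative sketch (not the real BIFS format) of an MPEG-4 style scene:
# a background plus separately coded AVOs, each with an object descriptor,
# tied together by a scene descriptor.
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class ObjectDescriptor:
    name: str                      # e.g. "news_reader" (hypothetical label)
    position: Tuple[int, int]      # placement of the VOP in the composed frame
    scale: float = 1.0

@dataclass
class AudioVisualObject:
    descriptor: ObjectDescriptor
    vop_bitstream: bytes           # separately encoded shape/motion/texture
    audio_bitstream: bytes = b""   # optional associated audio (e.g. CELP)

@dataclass
class SceneDescriptor:
    background: bytes              # encoded background VOP
    objects: List[AudioVisualObject] = field(default_factory=list)

# A receiver would demultiplex the stream, decode each VOP and then use the
# descriptors to composite the frame; a user could reposition or delete an
# object simply by editing its descriptor before compositing.
scene = SceneDescriptor(background=b"", objects=[
    AudioVisualObject(ObjectDescriptor("news_reader", (100, 40)), b""),
    AudioVisualObject(ObjectDescriptor("laptop", (320, 200)), b""),
])
scene.objects[1].descriptor.position = (280, 200)   # the user moves the laptop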
Self Assessment Questions
1. MPEG-1 does not support D-frames. (True/False)
2. The MPEG-2 video standard is defined in ISO Recommendation _______.
3. PSTN stands for _______.
a) Public Standard Telephone Network
b) Public Switched Telephone Network
c) Private Switched Telephone Network
d) Private Standard Telephone Network
4. In a VOP encoder, the relation between the AVOs is defined by _______.
B-frames have the highest level of compression and, because they are not
involved in the coding of other frames, they do not propagate errors.
When B-frames are received at the destination, they are first decoded and
then the resulting information is used, together with the decoded information
of the preceding I- or P-frame and of the immediately succeeding I- or P-frame,
to derive the decoded frame contents. While decoding a B-frame, if either the
preceding or the succeeding I- or P-frame is not available, the time required
to decode the B-frame increases. Therefore, to minimize the decoding time,
all the required frames should be made available and the frames are
reordered. For example, if the uncoded frame sequence is
I B B P B B P B B I ... then,
Let us number these frames for easy understanding:
0 1 2 3 4 5 6 7 8 9
I B B P B B P B B I ...
Then the reordered sequence would be:
0 3 1 2 6 4 5 9 7 8
I P B B P B B I B B ...
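The reordering rule, namely that every reference I- or P-frame is transmitted before the B-frames that depend on it, can be sketched as follows, assuming the simple dependency pattern used in the example above (each B-frame references the nearest preceding and succeeding I/P frame).

# Sketch of reordering a display-order GOP into transmission order so that the
# reference frames a B-frame depends on arrive before it.
def transmission_order(frames):
    """frames: list of (index, type) in display order, type in {'I','P','B'}."""
    out, pending_b = [], []
    for idx, ftype in frames:
        if ftype == 'B':
            pending_b.append((idx, ftype))      # hold until the next reference frame
        else:
            out.append((idx, ftype))            # send the reference frame first
            out.extend(pending_b)               # then the B-frames that need it
            pending_b = []
    return out + pending_b

display = list(enumerate("IBBPBBPBBI"))
print([i for i, _ in transmission_order(display)])
# -> [0, 3, 1, 2, 6, 4, 5, 9, 7, 8], i.e. I P B B P B B I B B

The printed order matches the reordered sequence shown above.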
The more aggressively lossy compression is applied, the farther one pushes
beyond the perception threshold, with the result being a noticeable reduction
in image quality.
Self Assessment Questions
5. The repetition of the same information many times is known as _______.
6. Additional information that indicates the small differences between the predicted and
actual positions of the moving segments is called
a. motion estimation
b. motion calculation
c. motion compensation
d. motion segmentation
7. The number of frames between a P-frame and the immediately preceding I-frame or
P-frame is called _______.
8. A spatially compressed frame is often referred to as an _______.
9. CODEC is the short form of Compressor/Decompressor. (True/False)
10. _______ identifies the differences between frames and stores only those differences.
11.6 Summary
This unit provides information about the various video compression
techniques. Let us recapitulate the unit content. The popular MPEG family of
standards has evolved from the original MPEG-1 audio and video
compression schemes into MPEG-2, now used for digital cable, satellite and
terrestrial broadcasting, DVD, high definition and many other applications,
and into MPEG-4, which includes several improvements over MPEG-2, such
as multi-directional motion vectors, ¼-pixel offsets, object coding for greater
efficiency, separate control of individual picture elements and behavior coding
for interactive use. Spatial and temporal compression are the two categories
of compression techniques that help compress video efficiently. Interframe
and intraframe compression help save large amounts of data compared with
storing or transmitting a full description of every pixel in the original image.
11.8 Answers
Self-Assessment Questions
1. True
2. 13818
3. (b) Public Switched Telephone Network
4. scene descriptor.
5. redundancy
6. (c) motion compensation
7. Prediction span
8. intraframe
9. True
10. Temporal compression
Terminal Questions
1. The MPEG-1 video standard is defined in ISO Recommendation 11172.
Refer subsection 11.3.1.
2. Each video scene is segmented into a number of Video Object Planes
(VOP), each of which corresponds to an AVO of interest. Refer sub
section 11.3.3.
3. The resulting bit stream from the VOP encoder is encoded for
transmission by multiplexing the VOPs together with the related object
and scene descriptors. Similarly, at the receiver, the bit stream is first
demultiplexed and the individual VOPs decoded. Refer Sub Section
11.3.3.
4. The repetition of same information many times is known as redundancy.
Redundancy has been categorized into two types they are spatial and
temporal redundancy. Refer Section 11.4.
5. Redundancy can be eliminated / reduced by taking into consideration only
those portions in a frame that involves some changes compared to the
previous frame. Depending on the type of redundancy exploited there are
different types of frames. Refer sub section 11.4.1.
6. Intraframe compression looks for redundant or imperceptible data within
each frame and eliminates it, but keeps each frame separate from all
others. Interframe compression looks for portions of the image which don't
change from frame to frame and encodes them only once. Refer Section
11.5.
Unit 12 Animation
Structure:
12.1 Introduction
Objectives
12.2 Role of animation in Multimedia
12.3 Types and Techniques of Animation
• Stop Motion
• Computer Animation
• Traditional Animation
12.4 Key Frame Animation
12.5 Utility
12.6 Morphing
12.7 Virtual Reality Concepts
Types of VR
12.8 Summary
12.9 Terminal Questions
12.10 Answers
12.1 Introduction
In the previous unit we discussed the MPEG standard compression. We dealt
in detail about how the compression can be done through spatial and temporal
redundancy. We also explored the techniques of interframe and intraframe
compression.
In this unit we will discuss the topic of animation. Animation is a visual
technique that provides the illusion of motion by displaying a collection of
images in rapid sequence. Each image contains a small change, for example
a leg moves slightly, or the wheel of a car turns. When the images are viewed
rapidly, the human eye fills in the details and the illusion of movement is
complete.
Technically, animation can be defined as a simulation of movement created
by displaying a series of pictures, or frames. Cartoons on television are one
example of animation. Animation on computers is one of the chief ingredients
of multimedia presentations. There are many software applications that
enable one to create animations that can be displayed on a computer monitor.
We will also discuss the concept of key frame animation and morphing, and
examine in detail the different techniques of morphing. We will conclude this
unit with a discussion of virtual reality concepts and their various types.
Objectives:
After studying this unit, you should be able to:
• explain the concept of animation
• list and explain the techniques of animation.
• describe the role of key frame animation
• explain morphing
• analyse the concepts of virtual reality.
12.2 Role of Animation in Multimedia
Animation plays a crucial role in multimedia as it allows for the creation of engaging,
interactive, and visually appealing content that can be used in various applications
such as video games, movies, educational materials, marketing campaigns, and
more.
Here are some of the key roles of animation in multimedia:
• Enhancing engagement: Animation can bring characters and stories to life,
creating a more immersive and engaging experience for the audience.
• Communicating complex ideas: Animation can be used to convey complex
ideas or concepts in a simple and easy-to-understand way. This is particularly
useful in educational materials, where animations can be used to visualize
abstract or complex concepts.
• Creating visual appeal: Animation can be used to create visually stunning
and memorable content that captures the attention of the audience. This is
particularly important in marketing and advertising, where animations can be
used to create memorable brand experiences.
• Providing interactivity: Animation can be used to create interactive
experiences that allow users to engage with content in a more meaningful
way. This is particularly useful in video games and e-learning, where
interactivity can help to enhance learning outcomes and engagement.
Overall, animation is a powerful tool that can be used to create engaging,
interactive, and visually appealing content in multimedia.
12.3 Types and Techniques of Animation
12.3.1 Traditional Animation
Full Animation refers to the process of producing high-quality, traditionally
animated films, in styles ranging from the more realistic works of the Walt
Disney Studio (Beauty and the Beast, Aladdin, Lion King) to the more
'cartoony' styles of those produced by the Warner Bros. Animation Studio.
Limited Animation is a process of making animated cartoons that does not
redraw entire frames but variably reuses common parts between frames. One
of its major trademarks is the stylized design in all forms and shapes, which
in the early days was referred to as modern design. Pioneered by the artists
at the American studio, United Productions of America, limited animation can
be used as a method of stylized artistic expression. Its primary use, however,
has been in producing cost-effective animated content for media such as
television (the work of Hanna-Barbera, Filmation, and other TV animation
studios) and later on the Internet (web cartoons).
Rotoscoping is an animation technique in which animators trace over live-
action film movement, frame by frame, for use in animated films. Originally,
recorded live-action film images were projected onto a frosted glass panel
and re-drawn by an animator. This projection equipment is called a
rotoscope, although this device has been replaced by computers in recent
years. In the visual effects industry, the term rotoscoping refers to the
technique of manually creating a matte for an element on a live-action plate
so that it can be composited over another background.
Live Action/Animation is a motion picture that features a combination of real
actors or elements (live action) and animated elements, typically interacting.
Originally, animation was combined with live action in several ways,
sometimes as simply as double-printing two negatives onto the same release
print. More sophisticated techniques used optical printers or aerial image
animation cameras, which enable more exact positioning, and better
interaction of actors and animated characters. Often, every frame of the live
action film was traced by rotoscoping, so that the animator could add his
drawing in the exact position. The combination of live action and animation
is very common in TV commercials, especially those promoting products
appealing to children.
12.3.2 Stop Motion
Stop motion (also known as stop action) is an animation technique used to
make a physically manipulated object appear to move on its own. The object
is moved in small increments between individually photographed frames,
creating the illusion of movement when the series of frames is played as a
continuous sequence. Clay figures are often used in stop motion for their ease
of repositioning. Motion animation using clay is called clay animation or
claymation.
Puppet Animation typically involves stop-motion puppet figures interacting
with each other in a constructed environment, in contrast to the real-world
interaction in model animation. The puppets generally have an armature inside
of them to keep them still and steady as well as constraining their movements
to particular joints. Examples of puppet animation include The Tale of the Fox
(France, 1937), The Nightmare Before Christmas (US, 1993), Corpse Bride
(US, 2005), Coraline (US, 2009).
Puppetoon animation is a type of replacement animation, which is itself a
type of stop-motion animation. In traditional stop-motion, the puppets are
made with movable parts which are repositioned between frames to create
the illusion of motion when the frames are played in rapid sequence. In
puppetoon animation the puppets are rigid and static pieces. Each puppet is
typically used in a single frame and then switched with a separate, near-
duplicate puppet for the next frame. Thus puppetoon animation requires many
separate figures. It is thus more analogous in a certain sense to cel animation
than is traditional stop-motion as the characters are created from scratch for
each frame.
Clay Animation: Here, each object is sculpted in clay or a similarly pliable
material such as plasticine, usually around a wire skeleton called an
armature. As in other forms of object animation, the object is arranged on the
set (background), a film frame is exposed, and the object or character is then
moved slightly by hand. Another frame is taken, and the object is moved
slightly again. This cycle is repeated until the animator has achieved the
desired amount of film.
Graphic Animation is a variation of stop motion (and possibly more
conceptually associated with traditional flat cel animation and paper drawing
animation, but still technically qualifying as stop motion) consisting of the
animation of photographs (in whole or in parts) and other non-drawn flat visual
graphic material, such as newspaper and magazine clippings.
In its simplest form, graphic animation can take the form of the animation
camera merely panning up and down and/or across individual photographs.
rotations; the robot arm can have a total of 12 degrees of freedom. The human
body, in comparison, has over 200 degrees of freedom.
Self-Assessment Questions
5. Design and control of animation sequences are handled with a set of _______.
6. What are Parameterized systems?
12.5 Utility
Multimedia utility refers to the use of multimedia elements such as text, images,
audio, video, and animation to convey information or messages through various
digital devices. It has become an essential tool in modern society due to the
increasing use of digital devices and the internet. In this article, we will explore the
various applications of multimedia utility and its importance in today's world.
One of the main uses of multimedia utility is in the field of education. With the advent
of technology, it has become easier for students to access a wide range of
multimedia resources, including online textbooks, videos, and interactive learning
materials. This has revolutionized the way in which education is delivered, making it
more engaging and interactive for students. Multimedia utility has also made it
possible for educators to personalize learning and cater to the individual needs of
students, which was previously difficult to achieve with traditional classroom
methods.
Multimedia utility is also important in the field of marketing and advertising. With the
increasing use of social media and online platforms, companies can now reach a
wider audience through various multimedia elements such as images, videos, and
interactive advertisements. The use of multimedia in marketing and advertising has
been shown to increase customer engagement and improve brand recognition,
leading to increased sales and revenue.
Multimedia utility has also been used in the field of healthcare, where it has been
used to improve patient care and education. The use of multimedia elements such
as videos, animations, and infographics has made it easier for healthcare
professionals to communicate complex medical information to patients. Additionally,
multimedia tools such as telemedicine have made it possible for patients to receive
medical care remotely, reducing the need for in-person visits and improving access
to care.
The use of multimedia utility has also had a significant impact on the field of
journalism. With the rise of digital media, journalists can now use a wide range of
multimedia elements to enhance their reporting, including images, videos, and audio
recordings. This has made it possible for journalists to provide more in-depth
coverage of events and issues, as well as to reach a wider audience through online
platforms.
In conclusion, multimedia utility has become an essential tool in today's world, with
applications in various fields such as education, entertainment, marketing,
healthcare, and journalism. Its ability to enhance communication, engagement, and
accessibility has made it a valuable resource for individuals and organizations alike.
As technology continues to advance, it is likely that the use of multimedia utility will
become even more widespread, leading to further innovation and development in
the field.
12.6 Morphing
Morphing is a special effect in motion pictures and animations that changes
one image into another through a seamless transition. It is frequently used
to depict one person turning into another through technological means or as
part of a fantasy or surreal sequence. Traditionally such a depiction would be
achieved through cross-fading techniques on film.
Transformation of object shapes from one form to another is called
morphing, which is a shortened form of metamorphosis. Morphing methods
can be applied to any motion or transition involving a change in shape. Given
two key frames for an object transformation, it is necessary to first adjust the
object specification in one of the frames so that the number of polygon edges
(or the number of vertices) is the same for the two frames. This preprocessing
step is illustrated in Figure 12.2. A straight-line segment in key frame k is
transformed into two line segments in key frame k+1. Since key frame k+1 has
an extra vertex, one needs to add a vertex between vertices 1 and 2 in key
frame k to balance the number of vertices (and edges) in the two key frames.
Figure 12.2: An edge with vertex positions 1 and 2 in key frame k evolves into
two connected edges in key frame k + 1
Using linear interpolation to generate the in-betweens, one can transition the
added vertex in key frame k into vertex 3 along the straight-line path as shown
in Figure 12.3.
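A minimal sketch of this in-betweening step is shown below. The vertex coordinates are hypothetical, and it assumes the two key frames have already been preprocessed so that they contain the same number of vertices.

# Minimal sketch of generating in-betweens by linear interpolation of matched
# vertex positions between key frame k and key frame k+1.
def in_betweens(key_k, key_k1, n_frames):
    """key_k, key_k1: lists of (x, y) vertices of equal length."""
    assert len(key_k) == len(key_k1), "balance the vertex counts first"
    frames = []
    for step in range(1, n_frames + 1):
        t = step / (n_frames + 1)               # 0 < t < 1 for the in-betweens
        frames.append([(x0 + t * (x1 - x0), y0 + t * (y1 - y0))
                       for (x0, y0), (x1, y1) in zip(key_k, key_k1)])
    return frames

# The added vertex of key frame k (midpoint of the original edge) glides along
# a straight line towards vertex 3 of key frame k+1 (coordinates are made up).
key_k  = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]   # vertex 1, added vertex, vertex 2
key_k1 = [(0.0, 0.0), (1.5, 0.5), (2.0, 2.0)]   # vertex 1, vertex 3, vertex 2
print(in_betweens(key_k, key_k1, 3))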
Pixel Manipulation
Morphing is a combination of two processes, each of which changes pixel
attributes. Cross-dissolving changes the image's colors pixel by pixel, while
warping changes the shapes of features in the image by shifting its pixels
around.
During the morphing process one needs to do the warp process first. If the
cross-dissolve is done with two arbitrary images, one gets the double-image
effect. Re-positioning of all pixels in the source images is required to avoid
or minimize the double-image effect.
Intermediate Images
In order to ensure smooth transition, each intermediate frame is created by
the combination of beginning and ending pictures. An image's position in the
sequence determines the influence that the beginning and ending frames will
have upon it.
It can be observed in figure 12.7 that the early intermediate images in the
sequence are much like the first source image. The middle image of the
sequence is the average of the first source image distorted halfway towards
the second one and the second source image distorted halfway back towards
the first one. The last images in the sequence are similar to the second source
image.
For example, in a sequence of ten total frames, the first frame is 100% of
the start image blended with 0% of the end image. The second frame is 90%
of the start image blended with 10% of the end image. Frame three is 80%
and 20%, and so forth.
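The blending schedule can be sketched as follows. This is a minimal example that assumes the two source images have already been warped into alignment; the exact weighting convention (for instance whether the last frame reaches 100% of the end image) varies between implementations.

# Sketch of the cross-dissolve step: each intermediate frame is a weighted
# blend of the start and end images, with the weight set by the frame's
# position in the sequence.
import numpy as np

def cross_dissolve(start_img, end_img, n_frames):
    """start_img, end_img: arrays of identical shape; returns n_frames blended frames."""
    frames = []
    for i in range(n_frames):
        t = i / (n_frames - 1)                  # 0.0 at the first frame, 1.0 at the last
        frames.append((1.0 - t) * start_img + t * end_img)
    return frames

# Toy 2x2 "images": the blend weights step linearly from 100%/0% to 0%/100%.
a = np.zeros((2, 2)); b = np.full((2, 2), 255.0)
seq = cross_dissolve(a, b, 10)
print([round(f[0, 0], 1) for f in seq])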
Self-Assessment Questions
7. Name the technique used to change shapes during morphing.
8. Specify the role of cross dissolving in morphing.
9. In order to make a transition smooth, each intermediate frame is created by
the combination of beginning and ending pictures.(True/False)
Currently, there are two main kinds of haptic interfaces, namely the off-body
interface and the on-body interface. The main difference is that the mass of
the on-body interface is supported by the operator while the off-body interface
rests on the floor. These days, most commercially available devices are off-
body.
Figure 12.9: (a) On-body interface (Exoskeleton) (b) Off-body interface (Phantom Desktop)
12.7.1 Types of VR
Although it is difficult to categorize all VR systems, most configurations fall
into three main categories and each category can be ranked by the sense
of immersion, or degree of presence it provides. Immersion or presence can
be regarded as how powerfully the attention of the user is focused on the task
in hand. Immersion or presence is generally believed to be the product of
several parameters, including the level of interactivity, image complexity,
stereoscopic view, field of regard and the update rate of the display.
Non-Immersive (Desktop) Systems
Non-immersive systems, as the name suggests, are the least immersive
implementation of VR techniques. Using the desktop system, the virtual
environment is viewed through a portal or window by utilizing a standard
high-resolution monitor. Interaction with the virtual environment can occur
by conventional means such as the keyboard, mouse and trackball, or may be
enhanced by using 3D interaction devices such as a SpaceBall or Data
Glove.
Non-immersive systems have the advantage that they do not require the
highest level of graphics performance or any special hardware, and they can
be implemented on high-specification PC clones. This means that these
systems can be regarded as the lowest-cost VR solution and can be used for
many applications. However, these systems are of little use where the
perception of scale is an important factor. Nevertheless, one would expect to
see an increase in the popularity of such systems for VR use in the near
future. This is because the Virtual Reality Modeling Language (VRML) is
expected to be adopted as a de facto standard for the transfer of 3D model
data and virtual worlds via the Internet. The advantage of VRML for the PC
desktop user is that this software runs relatively well on a PC, which is not
always the case for many proprietary VR authoring tools. Further, many
commercial VR software suppliers are now incorporating VRML capability
into their software and exploring the commercial possibilities of desktop VR
in general.
Self-Assessment Questions
12.8 Summary
Now let us recapitulate the content discussed in this unit. Animation is the
rapid display of a sequence of images of 2-D artwork or model positions in
order to create an illusion of movement due to the phenomenon of persistence
of vision. Also discussed were the various types and techniques of animation.
A key frame animation basically supports one by providing exact control over
the way one layers the animation. We understood from this unit that morphing
is a technology for transforming one image to another and morphing is popular
in the entertainment industry. Virtual reality is also known as artificial reality.
It represents computer interface technology that is designed to leverage our
natural human capabilities. We also discussed the types of virtual reality; the
major distinction between VR systems is the mode with which they interface
to the user.
12.9 Terminal Questions
1. List and explain the types of animation.
2. Describe the role of key frame animation.
3. Explain the techniques involved in morphing process.
4. What is virtual reality?
5. Explain the types of virtual reality.
12.10 Answers
Self-Assessment Questions
1. Limited animation
2. a. Puppets
3. True
4. An animation technique in which animators trace over live-action film
movement, frame by frame, for use in animated films.
5. Animation routines
6. It allows object-motion characteristics to be specified as part of the
object definitions
7. Warping
8. Changes the image color pixel by pixel
9. True
10. Head Mounted Display
11. Off body and on body interface
12. c. Fully Immersive
Terminal Questions
1. There are three primary types of animation: traditional, stop-motion, and
computer animation. For more details Refer section 12.3.
2. Key-frame systems are specialized animation languages designed simply
to generate the in-betweens from the user-specified key frames. For more
details Refer section 12.4.
3. Morphing is a combination of two processes, each of which changes pixel
attributes. Cross-dissolving changes the image's colors, pixel by pixel;
warping changes the shapes of features in the image by shifting its pixels
around. For more details Refer 12.6.
4. Virtual reality (VR) is a term that applies to computer-simulated
environments that can simulate physical presence in places in the real
world, as well as in imaginary worlds. For more details Refer section 12.7.
5. Non-immersive, semi-immersive and fully immersive are the three
important types of virtual reality. For more details Refer sub-section 12.7.1.