DCA3142
GRAPHICS AND MULTIMEDIA
Unit 1
Introduction to Computer Graphics and
Graphics System
1. INTRODUCTION
The art of drawing images using computer programming is known as computer graphics. It entails calculations, data creation, and manipulation. In other words, computer graphics can be thought of as a tool for creating and modifying images. Graphics can be two- or three-dimensional, while images can be completely synthetic or produced by manipulating photographs. So far, many powerful tools have been developed to visualize data.
Computer graphics has emerged as a sub-field of computer science which studies methods to digitally synthesize and manipulate visual content. Over the past decade, other specialized fields have evolved, such as information visualization and scientific visualization, the latter primarily focused on the visualization of three-dimensional phenomena, where the emphasis is on realistic renderings of volumes and surfaces.
Computer generated imagery can be categorized into several different types: 2D, 3D, and
animated graphics. With improvements in technology, 3D computer graphics have become
common, but 2D computer graphics too are widely used.
In this unit we will discuss graphics, passive and interactive picture presentation, image processing concepts such as picture analysis and visualization, the RGB colour model, direct coding, and the lookup table.
1.1 Objectives:
The term computer graphics has been broadly used to describe "almost everything on a
computer that is not text or sound". Typically, the term computer graphics refers to several
different things: the representation and manipulation of image data by a computer, the
various technologies used to create and manipulate images, the images so produced, and the
sub-field of computer science which studies methods for digitally synthesizing and
manipulating visual content. Today, computers and computer-generated images touch many
aspects of daily life. Computer imagery is found on television, in newspapers, for example in
weather reports, or in all kinds of medical investigation and surgical procedures. A well-
constructed graph can present complex statistics in a form that is easier to understand and
interpret. In the media, such graphs are used to illustrate papers, reports, theses, and other presentation material.
There are many other areas that involve computer graphics, but whether they are core graphics areas is debatable. These will all be touched upon in the text. Such areas include the following:
User Interaction: It deals with the interface between input devices such as the mouse and tablet, the application, feedback to the user in imagery, and other types of sensory feedback.
Historically, this area is associated with graphics largely because graphics researchers were
among the first to have access to the input/output devices that are now ubiquitous.
Virtual Reality: It attempts to immerse the user into a 3D virtual world. This typically
requires at least stereo graphics and response to head motion. For true virtual reality, sound
and force feedback should be provided as well. Because this area requires advanced 3D
graphics and advanced display technology, it is closely associated with graphics.
Visualization: Today, there is a much greater need for visualization than ever before. Data
visualization, for example, aids in discovering insights from data, and we also need the right
kind of visualization to check and study the behaviour of processes around us. This can be
done by using computer graphics in the right way.
Image Processing: Many kinds of photographs and pictures require editing before they can be used in different settings. One of the many applications of computer graphics is the processing of existing images into more refined ones for the purpose of improved interpretation.
Computational Photography: This deals with the use of computer graphics, computer
vision, and image processing methods to enable new ways of photographically capturing
objects, scenes, and environments.
Graphical User Interface: The use of photos, images, icons, pop-up menus, and graphical objects helps build a user-friendly environment in which working is simple and enjoyable. By employing computer graphics, we can create a setting where tasks can be automated and anyone can easily accomplish the required action.
SELF-ASSESSMENT QUESTIONS - 1
1. ___________ deals with the interface between input devices such as the mouse and tablets.
2. Virtual reality attempts to immerse the user into a 3D virtual world. State
True/False.
3. Visualization attempts to give users insight into complex information via ___________ .
4. ___________ helps in producing more sharp and detailed drawings with superior
specifications.
➢ Image enhancement
➢ Pattern detection and recognition
➢ Scene analysis and computer vision
Image enhancement deals with the improvement of image quality by eliminating noise or by increasing image contrast. Pattern detection and recognition deal with the detection and classification of standard patterns and with finding deviations from these patterns. Optical character recognition (OCR) technology is a practical example of pattern detection and recognition. Scene analysis and computer vision deal with the recognition and construction of 3D models of a scene from several 2D images.
The above three fields of image processing have proved their importance in many areas, such as fingerprint detection and recognition, and the modelling of buildings, ships, automobiles, and so on.
In their early stages, computer graphics and the computer processing of pictures (image processing) were quite separate disciplines. Nowadays they share some common features, the overlap between them is growing, and both use raster displays.
1. Visualization
Most people are familiar with the digital animations produced to present
meteorological data during weather reports on television, though only a few can
distinguish between the models of reality and the satellite photos that are also shown
on such programs. Television offers scientific visualizations when it shows computer
drawn and animated reconstructions of road or airplane accidents. Some of the most
popular examples of scientific visualizations are: computer-generated images that
show a real spacecraft in action, out in the void far beyond Earth, or on other planets.
Dynamic forms of visualization, such as educational animation or timelines, have the
potential to enhance learning about systems that change over time.
Apart from the distinction between interactive visualizations and animation, the most useful categorization is probably between abstract and model-based scientific visualizations.
Management and business graphics: Decision-making systems and graphic displays of data;
Education and instruction: Techniques for enhancing children's and adults' visual thinking
and creative abilities;
Electronic CAD/CAM: Generation of printed wiring board and integrated circuit design symbols and schematics;
Printing and publishing: Text and graphic integration in printed documents, page-layout
software, scanning systems, capability for direct-to-plate printing;
Statistical illustrations: Graphical ways for displaying vast quantities of data to enhance
comprehension of data analysis;
Visual arts and design: Computer graphics for graphic design, industrial design, advertising,
and interior design; standards based on design principles about colour, proportion,
positioning, and orientation of visual elements.
SELF-ASSESSMENT QUESTIONS - 2
Interactive graphics allow users to exert a great deal of influence over the layout and make
substantial modifications. Graphic designs that are interactive facilitate two-way contact
between users and the design. When you click a button on a website, use an app on a smartphone, use an ATM, a photo booth, an airport check-in station, or play a video game, you are interacting with interactive visuals. Even the operating system of a computer uses interactive graphics.
• Today, a high-quality graphics display on a personal computer offers one of the most natural means of communication with a computer.
• It provides tools for producing pictures not only of concrete, “real-world” objects but
also of abstract, synthetic objects such as mathematical surfaces in 4D, and of data that
may have no inherent geometry such as survey results.
• It has an ability to show moving pictures, and thus it is possible to produce animations
with interactive graphics.
• With interactive graphics one can also control the animation by adjusting the speed.
Interactive graphics provides a tool called motion dynamics. With this tool, a user can move and tumble objects with respect to a stationary observer, or can make objects stationary with the viewer moving around them. A typical example is a walk-through made by apartment builders to show the interiors of an apartment as well as the surroundings of the building. In many cases it is also possible to move both the objects and the viewer.
• The interactive graphics also provides a facility called update dynamics. With update
dynamics it is possible to change the shape, colour or other properties of the objects
being viewed.
• With the recent development of digital signal processing (DSP) and audio synthesis chips, interactive graphics can now provide audio feedback along with graphical feedback to make the simulated environment even more realistic.
Passive Graphics: It is also called non-interactive graphics. In non-interactive computer graphics, the image is rendered on the monitor and the user has no control over the rendered image, i.e., the user cannot alter it. Non-interactive graphics features only one-way communication between the computer and the user. The user is able to view the generated image but cannot alter it. One example is the titles shown on TV.
SELF-ASSESSMENT QUESTIONS - 3
RGB is an abbreviation for Red, Green, and Blue. This colour space is widely used in computer graphics. Red, green, and blue are the primary colours from which other colours are derived; here red, green, and blue light is added together in different ways to reproduce a broad array of colours. The main purpose of the RGB colour model is the sensing, representation, and display
of images in electronic systems, such as televisions and computers, though it has also been
used in conventional photography. RGB is a device-dependent colour model. Different
devices detect or reproduce a given RGB value differently, since the colour elements (such
as phosphors or dyes) and their response to the individual R, G, and B levels vary from
manufacturer to manufacturer, or even in the same device over time. Thus an RGB value does
not define the same colour across devices without some kind of colour management.
Typical RGB input devices are video cameras, image scanners, and digital cameras. Typical
RGB output devices are TV sets of various technologies (CRT, LCD, plasma, and so on.),
computer and mobile phone displays, video projectors, multicolor LED displays, and large
screens such as JumboTron. Colour printers, on the other hand, are not RGB devices, but
subtractive colour devices (typically CMYK colour model).
To form a colour with RGB, three colored light beams (one red, one green, and one blue) must
be superimposed (for example by emission from a black screen, or by reflection from a white
screen). Each of the three beams is called a component of that colour, and each of them can
have an arbitrary intensity, from fully off to fully on, in the mixture. The RGB colour model is
additive in the sense that the three light beams are added together, and their light spectra
add, wavelength for wavelength, to make the final colour's spectrum.
Zero intensity for each component gives the darkest colour (no light, considered to be black), and full intensity of each gives white. The quality of this white depends on the nature of the
primary light sources, but if they are properly balanced, the result is a neutral white
matching the system's white point. When the intensities for all the components are the same,
the result is a shade of gray, darker or lighter depending on the intensity. When the
intensities are different, the result is a colourized hue, more or less saturated depending on
the difference of the strongest and weakest of the intensities of the primary colours
employed.
When one of the components has the strongest intensity, the colour is a hue near this primary
colour (reddish, greenish, or bluish), and when two components have the same strongest
intensity, then the colour is a hue of a secondary colour (a shade of cyan, magenta or yellow).
A secondary colour is formed by the sum of two primary colours of equal intensity: cyan is
green +blue, magenta is red +blue and yellow is red +green. Every secondary color is the
complement of one primary color. When a primary and its complementary secondary colour
are added together, the result is white: cyan complements red, magenta complements green
and yellow complements blue.
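As a small illustration of additive mixing, the following hedged C sketch (the RGB struct and the add_rgb helper are illustrative, not from the text) adds two colours component-wise and clamps each primary at the maximum 8-bit intensity:

typedef struct { unsigned char r, g, b; } RGB;

RGB add_rgb (RGB p, RGB q)
{
    RGB out;
    /* component-wise additive mixing, clamped at the maximum intensity 255 */
    out.r = (p.r + q.r > 255) ? 255 : p.r + q.r;
    out.g = (p.g + q.g > 255) ? 255 : p.g + q.g;
    out.b = (p.b + q.b > 255) ? 255 : p.b + q.b;
    return out;
}
/* Example: green (0,255,0) + blue (0,0,255) = cyan (0,255,255);
   cyan + red (255,0,0) = white (255,255,255), matching the complements above. */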
The RGB colour model itself does not define what is meant by red, green, and blue colorimetrically, and so the results of mixing them are not specified as absolute, but relative
to the primary colours. When the exact chromaticities of the red, green, and blue primaries
are defined, the colour model becomes an absolute colour space.
SELF-ASSESSMENT QUESTIONS - 4
6. DIRECT CODING
In computer graphics, direct coding is a scheme that allocates a certain amount of storage space for each pixel so that it can be assigned a colour; images are simply collections of coloured pixels. For example, one may allocate 3 bits for each pixel, with one bit for each primary colour (refer figure 1.1). This 3-bit representation allows each primary to vary independently between two intensity levels: 0 (off) and 1 (on). Hence each pixel can take on one of the eight colours that correspond to the corners of the RGB colour cube.
A widely accepted industry standard uses 3 bytes, or 24 bits, per pixel, with one byte for each primary colour. This way each primary colour is allowed to have 256 different intensity levels, corresponding to binary values from 00000000 to 11111111. Thus a pixel can take on a colour from 256 x 256 x 256, or 16.7 million, possible choices. This 24-bit format is commonly referred to as the true colour representation, since the difference between two colours that differ by one intensity level in one or more of the primaries is virtually undetectable under normal viewing conditions. Hence a more precise representation involving more bits is of little use in terms of perceived colour accuracy.
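To make the three-bytes-per-pixel layout concrete, here is a hedged C sketch (Pixel24, pack_rgb, and the accessor names are illustrative assumptions, not from the text) that packs the three 8-bit primaries into a single 24-bit value and extracts them again:

typedef unsigned int Pixel24;   /* illustrative name: holds one 24-bit colour value */

Pixel24 pack_rgb (unsigned char r, unsigned char g, unsigned char b)
{
    /* one byte per primary: red in bits 16-23, green in bits 8-15, blue in bits 0-7 */
    return ((Pixel24) r << 16) | ((Pixel24) g << 8) | (Pixel24) b;
}

unsigned char red_of   (Pixel24 p) { return (p >> 16) & 0xFF; }
unsigned char green_of (Pixel24 p) { return (p >> 8)  & 0xFF; }
unsigned char blue_of  (Pixel24 p) { return  p        & 0xFF; }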
A notable special case of direct coding is the representation of black-and-white (bi-level) and gray-scale images, where the three primaries always have the same value and hence need not be coded separately. A black-and-white image requires only one bit per pixel, with bit value 0 representing black and bit value 1 representing white. A gray-scale image is typically coded with 8 bits per pixel to allow a total of 256 intensity or gray levels.
Although the direct coding method features simplicity and has supported a variety of
applications, there is a relatively high demand for storage space when it comes to the 24-bit
standard. For example, a 1000x1000 true colour image would take up three million bytes.
Further, even if every pixel in that image had a different colour, there would be only one
million colours in the image. In many applications the number of colours that appear in any
one particular image is much less. Therefore the 24-bit representation’s ability to have 16.7
million different colours appear simultaneously in a single image seems to be overkill.
SELF-ASSESSMENT QUESTIONS - 5
7. LOOKUP TABLE
Lookup tables are tables that store numeric data in a multidimensional array format. In the
simpler two-dimensional case, lookup tables can be represented by matrices. Each element
of a matrix is a numerical quantity, which can be precisely located in terms of two indexing
variables. At higher dimensions, lookup tables can be represented by multidimensional
matrices, whose elements are described in terms of a corresponding number of indexing
variables.
Image representation using a lookup table can be viewed as a compromise between the
desire to have a lower storage requirement and the need to support a reasonably sufficient
number of simultaneous colors. In this approach pixel values do not code colours directly.
Instead, they are addresses or indices into a table of colour values. The colour of a particular
pixel is determined by the colour value in the table entry that the value of the pixel
references.
Figure 1.2 shows a lookup table with 256 entries. The entries have addresses 0 through 255.
Each entry contains a 24-bit RGB colour value. Considering that pixel values are 1-byte, or 8-bit, quantities, the colour of a pixel whose value is i, where 0 ≤ i ≤ 255, is determined by the colour value in the table entry whose address is i. This 24-bit, 256-entry lookup table
representation is often referred to as the 8-bit format. It reduces the storage requirement of
a 1000 x 1000 image to one million bytes plus 768 bytes for the colour values in the lookup
table. It allows 256 simultaneous colours that are chosen from 16.7 million possible colours.
It is important to remember that, using the lookup table representation, an image is defined not only by its pixel values but also by the colour values in the corresponding lookup table. These colour values form a colour map for the image.
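The following hedged C sketch (array names and sizes are illustrative) shows how an 8-bit indexed image and its 256-entry colour map might be laid out; the displayed colour of a pixel is obtained by indexing the table with the pixel value:

#define WIDTH  1000
#define HEIGHT 1000

unsigned int  colour_map[256];        /* 256 entries, each holding a 24-bit RGB value */
unsigned char image[HEIGHT][WIDTH];   /* each pixel stores an 8-bit index into the map */

/* the displayed colour of pixel (x, y) is found by indexing the colour map */
unsigned int pixel_colour (int x, int y)
{
    return colour_map[image[y][x]];
}

For a 1000 x 1000 image this arrangement needs one million bytes for the indices plus 768 bytes (256 x 3) for the colour map, matching the figures given above.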
SELF-ASSESSMENT QUESTIONS - 6
20. The 24-bit, 256-entry lookup table representation is often referred to as the _______________ representation.
21. _____________ is commonly referred to as the true colour representation.
22. The colour of a particular pixel is determined by the ___________ in the lookup table entry.
8. GRAPHICS DEVICES:
Graphics storage devices: Memory cards, external or internal hard drives, cloud storage, USB flash drives, and laptop computers are all suitable for digital image storage.
Graphics input and output devices: Graphics input devices are classified into two types: manual data entry devices and direct data entry devices. Manual input devices are those peripheral devices through which the user can enter the data manually (by hand) at the time of processing, e.g., keyboard, mouse, joystick, touch screen, touchpad, etc. Direct data entry devices are those peripheral devices through which we can directly input the data from the source and transfer it to the computer system, e.g., scanner, barcode reader, MICR (Magnetic Ink Character Reader), OCR (Optical Character Recognition), sensors, biometric systems, etc.
Computer graphics software: Graphics software is a type of computer programme used for
image creation and editing. There is a vast selection of graphics software available on the
market, ranging from simple applications that allow users to generate and edit simple images
to complex tools that can be used to build intricate 3D models and animations. Adobe
Photoshop, Corel Painter, and Autodesk Maya are a few of the most popular graphics
software applications.
The graphics software components are the tools that you use to create and manipulate your
graphic images. These components include the following:
Image editors: These are the tools used to produce and modify graphic images. Photoshop, Illustrator, and Inkscape are widespread examples.
Vector graphics editors: Tools used to generate or modify vector graphics. CorelDRAW and Inkscape are well-known vector graphics editors.
3D modeling software: This is the software that you use to create three-dimensional models.
Common 3D modeling software includes Maya, 3ds Max, and Cinema 4D.
Animation software: This is the software that you use to create animations. Common
animation software includes Adobe After Effects, Apple Motion, and Autodesk Maya.
Video editing software: This is the software that you use to edit videos. Common video editing software includes Adobe Premiere Pro, Apple Final Cut Pro, and Avid Media Composer.
• Vector graphics software: This type of software is used to create images made up of
lines and shapes, which can be scaled without losing quality. Vector graphics are often
used for logos, illustrations, and diagrams.
• Raster graphics software: This type of software is used to create images made up of
pixels, which cannot be scaled without losing quality. Raster graphics are often used for
photos and web graphics.
• 3D graphics software: This type of software is used to create three-dimensional
images and animations. 3D graphics are often used for product visualization and
gaming.
• Animation software: This type of software is used to create moving images, either by
animating existing graphics or by creating new ones from scratch. Animation software
is often used for movies, commercials, and video games.
SELF-ASSESSMENT QUESTIONS - 7
9. SUMMARY
We shall now summarise the unit content. We started our discussion with the overview of
graphics. Typically, the term computer graphics refers to several different things: the
representation and manipulation of image data by a computer, and the various technologies
used to create and manipulate the images. We also discussed the advantages of interactive graphics. We can say that computer graphics concerns the pictorial synthesis of real or imaginary objects. We also explored the concept of visualization. Today, visualization has ever-expanding applications in science, education, engineering (for example, product visualization), interactive multimedia, medicine, etc. The RGB colour model is an additive colour model in which red, green, and blue light is added together in various ways to reproduce a broad array of colours. We concluded this unit with a discussion of direct coding and the lookup table, both of which deal with pixel colour representation.
11. ANSWERS
Self Assessment Questions
1. User interaction
2. True
3. Visual display
4. Computer Aided Drawing
5. It is interactive graphics facility to change shape, color and other properties.
6. Digital Signal Processing.
7. Passive graphics
8. b) Pattern detection
9. Optical character recognition
10. Picture analysis
11. Data Visualization
12. Traditional areas of scientific visualization are flow visualization, medical visualization,
astrophysical visualization, and chemical visualization
13. True
14. Darkest color black
15. True
16. Primary color
17. 16.7 million
18. gray-scale images
19. Yes
20. 8 bit format
21. Colour value
22. Four
23. Raster graphics software
24. Manual data entry devices and Direct data entry devices.
1. Typically, the term computer graphics refers to several different things: the representation and manipulation of image data by a computer, and the various technologies used to create and manipulate images. For further details refer section 1.2.
2. Today, a high-quality graphics display on a personal computer provides one of the most natural means of communication with a computer. It provides tools for producing pictures not only of concrete, “real-world” objects but also of abstract ones. For more details refer section 1.3.
3. Image enhancement deals with the improvement of image quality by eliminating noise or by increasing image contrast. Pattern detection and recognition deal with the detection and classification of standard patterns and with finding deviations from these patterns. For more details refer section 1.4.
4. Visualization is a technique for creating images, diagrams, or animations to
communicate a message. Visualization through visual imagery has been an effective way
to communicate both abstract and concrete ideas. For more details refer section 1.5.
5. The RGB color model is an additive colour model in which red, green, and blue light is
added together in various ways to reproduce a broad array of colours. For more details
refer section 1.6.
6. Image representation is essentially the representation of pixel colours. Using direct
coding we allocate a certain amount of storage space for each pixel to code its colour. In
this approach pixel values do not code colours directly. Instead, they are addresses or
indices into a table of colour values. For more details refer section 1.8.
7. Computer-aided drawing is used in the design of structures, vehicles, and aeroplanes. This aids in adding minute details to the drawing and produces more precise, sharp, and detailed drawings with superior specifications. For more details refer section 1.1.
8. Graphics input devices are classified into two types: manual data entry devices and direct data entry devices. Manual input devices are those peripheral devices through which the user can enter the data manually (by hand) at the time of processing, e.g., keyboard, mouse, joystick, touch screen, touchpad, etc. Direct data entry devices are those peripheral devices through which we can directly input the data from the source and transfer it to the computer system.
DCA3142
GRAPHICS AND MULTIMEDIA
Unit 2
Scan Conversion
1. INTRODUCTION
In the previous unit, we discussed the different types of storage and display devices used for
the graphical system. We explored the various display devices like storage graphics display,
raster-scan display and 3D viewing devices. We also learnt about the various input and
output units which are exclusively used for graphical purposes like plotters, printers,
digitizers, light pens and so on. Active and passive graphic devices were discussed and the
unit concluded with a note on the various components of computer graphic software.
In this unit we will discuss what points are and how to draw lines in graphics. A line drawing
algorithm is a graphical algorithm for approximating a line segment on discrete graphical
media. In computer graphics, a hardware or software implementation of a digital differential
analyzer (DDA) is used for linear interpolation of variables over an interval between start
and end point. In this unit DDA algorithm is discussed for the purpose of learning to draw a
line. We will also learn about one more line drawing algorithm called Bresenham’s line
algorithm, which determines which points in an n-dimensional raster should be plotted in
order to form a close approximation to a straight line between two given points. It is
commonly used to draw lines on a computer screen. Along with line drawing algorithms we
have a discussion on the properties of a circle and the algorithm used to draw a circle.
Bresenham's circle algorithm (also known as the midpoint circle algorithm) is an algorithm for determining the points needed for drawing a circle with a given radius and centre. It scan-converts a circle of a specified radius, centered at a specified location. We
will also discuss the ellipse generating algorithm, scan filling algorithm, boundary and flood
fill algorithms.
1.1 Objectives:
Line drawing is accomplished by calculating intermediate positions along the line path
between two specified endpoint positions. An output device is then directed to fill in these
positions between the endpoints. For analog devices, such as a vector pen plotter or a
random-scan display, a straight line can be drawn smoothly from one endpoint to the other.
Linearly varying horizontal and vertical deflection voltages are generated that are
proportional to the required changes in the x and y directions to produce a smooth line.
Digital devices display a straight-line segment by plotting discrete points between the two
endpoints. Discrete coordinate positions along the line path are calculated from the equation
of the line. For a raster video display, the line color (intensity) is then loaded into the frame
buffer at the corresponding pixel coordinates. The video controller, by reading from the
frame buffer "plots" the screen pixels. Screen locations are referenced with integer values,
so plotted positions may only approximate actual line positions between two specified
endpoints. For example, a computed line position of (10.48, 20.51) would be converted to pixel position (10, 21). Thus the rounding of coordinate values to integers causes lines to be displayed with a stairstep appearance ("the jaggies"), as represented in Fig. 2.1.
The characteristic stairstep shape of raster lines is particularly noticeable on systems with
low resolution, and their appearance can be somewhat improved by displaying them on
high-resolution systems. More effective techniques for smoothing raster lines are based on
adjusting pixel intensities along the line paths. In this case it can be assumed that pixel
positions are referenced according to scan-line number and column number (pixel position
across a scan line). This addressing scheme is illustrated in Fig. 2.2. Scan lines are numbered
consecutively from 0, starting at the bottom of the screen; and pixel columns are numbered
from 0, left to right across each scan line.
Figure 2.2: Pixel positions referenced by scan line number and column number
To load a specified color into the frame buffer at a position corresponding to column x along scan line y, we will assume that we have available a low-level procedure of the form SetPixel(x, y). We may also want to be able to retrieve the current frame-buffer intensity setting for a specified location. This can be accomplished with the low-level function getPixel(x, y).
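As a tiny usage sketch (not from the text), the following loop uses the assumed SetPixel routine to set a horizontal span of pixels on scan line y:

extern void SetPixel (int x, int y);   /* the assumed low-level routine described above */

void horizontalSpan (int x1, int x2, int y)
{
    int x;
    for (x = x1; x <= x2; x++)   /* colour each pixel from column x1 to x2 on scan line y */
        SetPixel (x, y);
}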
SELF-ASSESSMENT QUESTIONS -1
A line connects two points. It is a basic element in graphics. To draw a line, you need two
points between which you can draw a line. A line drawing algorithm is a graphical algorithm
for approximating a line segment on discrete graphical media.
y = m · x + b (Equation 2-1)
Here, m represents the slope of the line and b the y intercept, given that the two endpoints of a line segment are specified at positions (x1, y1) and (x2, y2), as shown in Fig. 2.3.
Figure 2.3: Line path between endpoint positions (x1, y1) and (x2, y2).
We can determine values for the slope m and the y intercept b with the following calculations:

m = (y2 − y1) / (x2 − x1)   (Equation 2-2)

b = y1 − m · x1   (Equation 2-3)

Algorithms for displaying straight lines are based on the line equation 2-1 and the calculations given in equations 2-2 and 2-3.

For any given x interval ∆x along a line, we can compute the corresponding y interval ∆y from equation 2-2 as

∆y = m · ∆x   (Equation 2-4)

Similarly, the x interval ∆x corresponding to a specified ∆y is

∆x = ∆y / m   (Equation 2-5)
These equations (2-1 to 2-5) form the basis for determining deflection voltages in analog devices. For lines with slope magnitudes |m| < 1, ∆x can be set proportional to a small horizontal deflection voltage and the corresponding vertical deflection is then set proportional to ∆y as calculated from equation 2-4. For lines whose slopes have magnitudes |m| > 1, ∆y can be set proportional to a small vertical deflection voltage with the corresponding horizontal deflection voltage set proportional to ∆x, calculated from equation 2-5. For lines with m = 1, ∆x = ∆y and the horizontal and vertical deflection voltages are equal. In each case, a smooth line with slope m is generated between the specified endpoints.
On raster systems, lines are plotted with pixels, and step sizes in the horizontal and vertical
directions are constrained by pixel separations. That is, a line must be "sampled" at discrete
positions and the nearest pixel to the line determined at each sampled position. This scan
conversion process for straight lines is illustrated in Fig. 2.4, for a near horizontal line with
discrete sample positions along the x axis.
Figure 2.4: A straight line segment with five sampling positions along the x axis between x1 and x2.

DDA Algorithm

The digital differential analyzer (DDA) is a scan-conversion line algorithm based on calculating either ∆y or ∆x using equation 2-4 or 2-5. For a line with a positive slope less than or equal to 1, we sample at unit x intervals (∆x = 1) and compute each successive y value as

yk+1 = yk + m   (Equation 2-6)

Subscript k takes integer values starting from 1 for the first point, and increases by 1 until the final endpoint is reached. Since m can be any real number between 0 and 1, the calculated y values must be rounded to the nearest integer. For lines with a positive slope greater than 1, the roles of x and y are reversed, which means that sampling is done at unit y intervals (∆y = 1) and each succeeding x value is calculated as

xk+1 = xk + 1/m   (Equation 2-7)
Equations 2-6 and 2-7 are based on the assumption that lines are to be processed from the left endpoint to the right endpoint (Fig. 2.3). If this processing is reversed so that the starting endpoint is at the right, then either we have ∆x = -1 and

yk+1 = yk − m   (Equation 2-8)

or (when the slope is greater than 1) we have ∆y = -1 with

xk+1 = xk − 1/m   (Equation 2-9)

Equations 2-6 through 2-9 can also be used to calculate pixel positions along a line with negative slope. If the absolute value of the slope is less than 1 and the start endpoint is at the left, we set ∆x = 1 and calculate y values with equation 2-6. When the start endpoint is at the right (for the same slope), we set ∆x = -1 and obtain y positions from equation 2-8. Similarly, when the absolute value of a negative slope is greater than 1, we use ∆y = -1 and equation 2-9, or we use ∆y = 1 and equation 2-7.
This algorithm is summarized in the following procedure, which accepts as input the two
endpoint pixel positions. Horizontal and vertical differences between the endpoint positions
are assigned to parameters dx and dy. The difference with the greater magnitude determines
the value of parameter steps. Starting with pixel position (xa, ya), the offset needed at each
step to generate the next pixel position along the line path can be determined. We loop
through this process steps times. If the magnitude of dx is greater than the magnitude of dy
and xa is less than xb, the values of the increments in the x and y directions are 1 and m,
respectively. If the greater change is in the x direction, but xa is greater than xb, then the
decrements -1 and -m are used to generate each new point on the line. Otherwise, we use a unit increment (or decrement) in the y direction and an x increment (or decrement) of 1/m.
#include "device.h"    /* assumed to declare setpixel() */
#include <stdlib.h>
#define ROUND(a) ((int)((a) + 0.5))
void lineDDA (int xa, int ya, int xb, int yb)
{
    int dx = xb - xa, dy = yb - ya, k;
    int steps = (abs (dx) > abs (dy)) ? abs (dx) : abs (dy);   /* axis with greater change */
    float xIncrement = dx / (float) steps, yIncrement = dy / (float) steps;
    float x = xa, y = ya;
    setpixel (ROUND(x), ROUND(y));
    for (k = 0; k < steps; k++) {
        x += xIncrement;
        y += yIncrement;
        setpixel (ROUND(x), ROUND(y));
    }
}
The DDA algorithm is a faster method for calculating pixel positions than the direct use of equation 2-1. It eliminates the multiplication in equation 2-1 by making use of raster characteristics, so that appropriate increments are applied in the x or y direction to step to pixel positions along the line path. The accumulation of round-off error in successive additions of the floating-point increment, however, can cause the calculated pixel positions to drift away from the true line path for long line segments. Further, the rounding operations and floating-point arithmetic in procedure lineDDA are still time-consuming. The performance of the DDA algorithm can be improved by separating the increments m and 1/m into integer and fractional parts so that all calculations are reduced to integer operations.
SELF-ASSESSMENT QUESTIONS -2
Figure 2.5 illustrates Bresenham's line drawing algorithm visually. From the figure it can be understood that it is impossible to draw the true line that we want because of the pixel spacing (in other words, there is not enough precision for drawing true lines on a PC monitor, especially when dealing with low resolutions).
The way the algorithm works is described here. First it decides which axis is the major axis
and which is the minor one. The major axis is longer than the minor axis. In the picture
illustrated above, the major axis is the X axis. Each iteration progresses the current value of
the major axis (starting from the original position), by exactly one pixel. Then it decides
which pixel on the minor axis is appropriate for the current pixel of the major axis.
How can one approximate the right pixel on the minor axis that matches the pixel on the
major axis? - That is what Bresenham's line-drawing algorithm is all about. It does so by
checking which pixel's center is closer to the true line. In the picture above it would be easy
to identify these pixels by looking at them. The center of each pixel is marked with a dot. The
algorithm takes the coordinates of that dot and compares it to the true line. If the span from
the center of the pixel to the true line is less or equal to 0.5, the pixel is drawn at that location.
That span is more generally known as the error term.
It must be understood here that the whole algorithm is done in straight integer math with
no multiplication or division in the main loops (no fixed point math either). Basically, during
each iteration through the main drawing loop, the error term is tossed around to identify the
right pixel as close as possible to the true line. Let us now consider these two deltas between
the length and height of the line: dx = x1 - x0; dy = y1 - y0; This is a matter of precision and
since we are working with integers it is necessary to scale the deltas by 2, generating two
new values: dx2 = dx*2; dy2 = dy*2. These are values that will be used to change the error
term. The error term must be first initialized to 0.5 and that cannot be done using an integer.
Further, finally the scaled values must be subtracted by either dx or dy (the original, unscaled
delta values) depending on what the major axis is (either x or y).
Consider drawing a line on a raster grid where we restrict the allowable slopes of the line to
the range 0 ≤ 𝑚 ≤ 1 . If we further restrict the line- drawing routine so that it always
increments x as it plots, it becomes clear that, having plotted a point at (x, y), the routine has
a severely limited range of options as to where it may put the next point on the line:
So, working in the first positive octant of the plane, line drawing becomes a matter of
deciding between two possibilities at each step. The diagram 3.6 depicts the situation where
the plotting program finds itself having plotted (x, y).
In plotting (x, y) the line drawing routine will, in general, be making a compromise between
what it would like to draw and what the resolution of the screen actually allows it to draw.
Usually the plotted point (x, y) will be in error, and the actual, mathematical point on the line
will not be addressable on the pixel grid. So we associate an error, ∈, with each y ordinate: the real value of y should be y + ∈. This error will range from -0.5 to just under +0.5. In moving from
x to x+1 we increase the value of the true (mathematical) y-ordinate by an amount equal to
the slope of the line, m. We will need to choose to plot (x+1, y) if the difference between this
new value and y is less than 0.5.
Otherwise we will plot (x+1, y+1). It should be clear that by doing so, we minimize the total
error between the mathematical line segment and what actually gets drawn on the display.
The error resulting from this new point can now be written back into ∈. This will allow us to repeat the whole process for the next point along the line, at x+2. The new value of the error can adopt one of two possible values, depending on which new point is plotted. If (x+1, y) is chosen, the new value of the error is given by:

∈new = ∈ + m

Otherwise it is:

∈new = ∈ + m − 1
This gives an algorithm for a DDA which avoids rounding operations, instead using the error variable to control plotting:

∈ ← 0, y ← y1
for x ← x1 to x2 do
    Plot(x, y)
    if (∈ + m < 0.5) then
        ∈ ← ∈ + m
    else
        y ← y + 1, ∈ ← ∈ + m − 1
    end if
end for
This still employs floating-point values. Consider, however, what would happen if we multiply across both sides of the plotting test by ∆x and then by 2:

∈ + m < 0.5
∈ + ∆y/∆x < 0.5
2∈∆x + 2∆y < ∆x

The update rules for the error on each step may also be cast into this scaled form. Multiplying

∈ ← ∈ + m
∈ ← ∈ + m − 1

through by ∆x gives

∈∆x ← ∈∆x + ∆y
∈∆x ← ∈∆x + ∆y − ∆x

which, writing ∈′ = ∈∆x, is in ∈′ form:

∈′ ← ∈′ + ∆y
∈′ ← ∈′ + ∆y − ∆x
Using this new “error” value, ∈′, with the new test and update equations gives Bresenham's integer-only line drawing algorithm:

∈′ ← 0, y ← y1
for x ← x1 to x2 do
    Plot(x, y)
    if (2(∈′ + ∆y) < ∆x) then
        ∈′ ← ∈′ + ∆y
    else
        y ← y + 1, ∈′ ← ∈′ + ∆y − ∆x
    end if
end for
So we can conclude that the integer-only version is efficient and fast, and the multiplication by 2 can be implemented by a left shift. This version is limited to slopes in the first octant, 0 ≤ m ≤ 1.
Below is the Bresenham algorithm for line segments in the first octant.
void bresenham (int x1, int y1, int x2, int y2, int colour)
{
    /* assumes 0 <= slope <= 1 and x1 < x2; setpixel here takes a colour argument */
    int dx = x2 - x1, dy = y2 - y1;
    int y = y1, eps = 0;               /* eps is the scaled error term ∈′ */
    for (int x = x1; x <= x2; x++) {
        setpixel (x, y, colour);
        eps += dy;
        if ((eps << 1) >= dx) {        /* 2(∈′ + ∆y) >= ∆x: step up in y */
            y++;
            eps -= dx;
        }
    }
}
SELF-ASSESSMENT QUESTIONS -3
Since the circle is a frequently used component in pictures and graphs, a procedure for
generating either full circles or circular arcs is included in most graphics packages. Generally,
a single procedure can be provided to display either circular or elliptical curves.
A circle is defined as the set of points that are all at a given distance r from a center position (xc, yc), as shown in figure 2.7. This distance relationship is expressed by the Pythagorean theorem in Cartesian coordinates as

(x − xc)² + (y − yc)² = r²   (Equation 2-10)

Figure 2.7: Circle with center coordinates (xc, yc) and radius r.

This equation can be used to calculate the position of points on the circle circumference by stepping along the x axis in unit steps from xc − r to xc + r and calculating the corresponding y values at each position as

y = yc ± √(r² − (x − xc)²)   (Equation 2-11)

We can use equation 2-11 to compute the pixels of the circle. At the end of it, we are likely to find code that looks something like the sketch below.
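A minimal sketch of this direct computation, assuming the two-argument setpixel routine introduced earlier (circleSimple is an illustrative name, not from the text):

#include <math.h>
extern void setpixel (int x, int y);   /* assumed low-level routine */

void circleSimple (int xc, int yc, int r)
{
    int x;
    for (x = xc - r; x <= xc + r; x++) {
        /* solve equation 2-11 for the two y values at this x */
        double dy = sqrt ((double) r * r - (double) (x - xc) * (x - xc));
        setpixel (x, (int) (yc + dy + 0.5));   /* upper half */
        setpixel (x, (int) (yc - dy + 0.5));   /* lower half */
    }
}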
But this is not the best method for generating a circle. One problem with this approach is that it involves considerable computation at each step. Moreover, the spacing between plotted pixel positions is not uniform, as demonstrated in figure 2.8.
The spacing can be adjusted by interchanging x and y (stepping through y values and
calculating x values) whenever the absolute value of the slope of the circle is greater than 1.
But this merely increases the computation and processing required by the algorithm.
Figure 2.8: Positive half of a circle plotted with equation 2-11 and with (xc, yc) = (0, 0)
Another way to eliminate the unequal spacing shown in figure 2.8 is to calculate points along
the circular boundary using polar coordinates r and θ (figure 2.9). Expressing the circle equation in parametric polar form yields the following pair of equations:

x = xc + r cos θ
y = yc + r sin θ

When a display is generated with these equations using a fixed angular step size, a circle is plotted with equally spaced points along the circumference. The step size chosen for θ depends on the application and the display device. Larger angular separations along the circumference can be connected with straight line segments to approximate the circular path. For a more continuous boundary on a raster display, the step size can be set to 1/r. This plots pixel positions that are approximately 1 unit apart.
Computation can be reduced by considering the symmetry of the circle. The shape of the circle is similar in each quadrant. One can generate the circle section in the second quadrant of the xy plane by noting that the two circle sections are symmetric with respect to the y axis. The circle sections in the third and fourth quadrants can be obtained from sections in the first and second quadrants by considering symmetry about the x axis. This can be taken further by noting that there is also symmetry between octants. Circle sections in adjacent octants within one quadrant are symmetric with respect to the 45° line dividing the two octants. Figure 2.9 depicts that the calculation of a circle point (x, y) in one octant yields the circle points shown for the other seven octants.
These symmetry conditions are illustrated in figure 2.9, where a point at position (x, y) on a
one-eighth circle sector is mapped into the seven circle points in the other octants of the xy
plane. Taking advantage of the circle symmetry in this way it is possible to generate all pixel
positions around a circle by calculating only the points within the sector from x = 0 to x = y.
In computer graphics, the midpoint circle algorithm is an algorithm used to determine the
points needed for drawing a circle. The algorithm is a variant of Bresenham's line algorithm,
and is thus sometimes known as Bresenham's circle algorithm.
Our first objective is to simplify the function evaluation that takes place on each iteration of the circle-drawing algorithm. In the midpoint circle algorithm the screen center point is located at (xc, yc) and the calculated pixel position is (x, y); each calculated point is moved to its proper screen position by adding the center coordinates:

xscreen = xc + x
yscreen = yc + y

The circle function is defined as

fcircle(x, y) = x² + y² − r²

Any point (x, y) on the boundary of the circle with radius r satisfies the equation fcircle(x, y) = 0; points inside the circle give a negative function value, and points outside give a positive value. Figure 2.11(a) depicts the situation for the circle when fcircle(x, y) = 0.

This circle-function test is performed for the midpoints between pixels near the circle path at each sampling step. Figure 2.11(b) shows the midpoint between two pixels. Assuming that we have just plotted the pixel at (xk, yk), we next need to determine whether the pixel at position (xk + 1, yk) or the one at position (xk + 1, yk − 1) is closer to the circle. The decision parameter is the circle function evaluated at the midpoint between these two pixels:

pk = fcircle(xk + 1, yk − ½) = (xk + 1)² + (yk − ½)² − r²
If pk < 0, this midpoint is inside the circle and the pixel on scan line yk is closer to the circle boundary. Otherwise, the midpoint is outside or on the circle boundary, and we select the pixel on scan line yk − 1.
Successive decision parameters are obtained using incremental calculations:

pk+1 = fcircle(xk+1 + 1, yk+1 − ½)
     = [(xk + 1) + 1]² + (yk+1 − ½)² − r²

or

pk+1 = pk + 2(xk + 1) + [(yk+1)² − (yk)²] − (yk+1 − yk) + 1
where yk+1 is either yk or yk-1, depending on the sign of pk. Increments for obtaining pk+1
are either 2xk+1 + 1 (if pk is negative) or 2xk+1 + 1 – 2yk+1. Evaluation of the terms 2xk+1
and 2yk+1 can also be done incrementally as
2xk+1 = 2xk + 2
2yk+1 = 2yk – 2
Each successive value is obtained by adding 2 to the previous value of 2x and subtracting 2 from the previous value of 2y. The initial decision parameter is obtained by evaluating the circle function at the start position (x0, y0) = (0, r):

p0 = fcircle(1, r − ½)
   = 1 + (r − ½)² − r²

or

p0 = 5/4 − r
1. Input radius r and circle centre (xc, yc), and obtain the first point on the circumference
of a circle centered on the origin as (x0, y0) = (0, r).
2. Calculate the initial value of the decision parameter as p0 = 5/4 – r.
3. At each xk position, starting at k = 0, perform the following test: if pk < 0, the next point along the circle centered on (0, 0) is (xk + 1, yk) and

pk+1 = pk + 2xk+1 + 1

Otherwise, the next point along the circle is (xk + 1, yk − 1) and

pk+1 = pk + 2xk+1 + 1 − 2yk+1

where 2xk+1 = 2xk + 2 and 2yk+1 = 2yk − 2.
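A hedged C sketch of these steps, again assuming the two-argument setpixel routine; it plots all eight symmetric points per iteration, and the integer starting value p = 1 − r is the usual rounding of 5/4 − r (routine names are illustrative):

extern void setpixel (int x, int y);   /* assumed low-level routine */

/* plot the point (x, y) in all eight symmetric octants about the centre (xc, yc) */
static void plotCirclePoints (int xc, int yc, int x, int y)
{
    setpixel (xc + x, yc + y); setpixel (xc - x, yc + y);
    setpixel (xc + x, yc - y); setpixel (xc - x, yc - y);
    setpixel (xc + y, yc + x); setpixel (xc - y, yc + x);
    setpixel (xc + y, yc - x); setpixel (xc - y, yc - x);
}

void circleMidpoint (int xc, int yc, int r)
{
    int x = 0, y = r;
    int p = 1 - r;                    /* integer form of the initial value 5/4 - r */
    while (x <= y) {
        plotCirclePoints (xc, yc, x, y);
        x++;
        if (p < 0)
            p += 2 * x + 1;           /* midpoint inside: keep y, pk+1 = pk + 2xk+1 + 1 */
        else {
            y--;
            p += 2 * (x - y) + 1;     /* midpoint outside: step y down, subtract 2yk+1 */
        }
    }
}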
SELF-ASSESSMENT QUESTIONS - 4
12. In a circle, the spacing between plotted pixels can be adjusted by interchanging x and y when the absolute value of the slope of the circle is __________ .
13. The midpoint circle algorithm is otherwise called _______________.
14. Circle sections in adjacent octants within one quadrant are symmetric with respect to the ____________ degree line dividing the two octants.
15. Given fcircle(x, y) = x² + y² − r², any point (x, y) on the boundary of the circle with radius r satisfies the equation when fcircle(x, y) = __________ .
Loosely stated, an ellipse is an elongated circle. Therefore, elliptical curves can be generated
by modifying circle-drawing procedures to take into account the different dimensions of an
ellipse along the major and minor axes.
An ellipse is defined as the set of points such that the sum of the distances from two fixed positions (foci) is the same for all points, as depicted in figure 2.12.

If the distances to the two foci from any point P = (x, y) on the ellipse are labeled d1 and d2, then the general equation of an ellipse can be stated as

d1 + d2 = constant   (Equation 2-14)

Expressing the distances d1 and d2 in terms of the focal coordinates F1 = (x1, y1) and F2 = (x2, y2), we have

√((x − x1)² + (y − y1)²) + √((x − x2)² + (y − y2)²) = constant   (Equation 2-15)

By squaring this equation, isolating the remaining radical, and then squaring again, we can rewrite the general ellipse equation in the form

Ax² + By² + Cxy + Dx + Ey + F = 0   (Equation 2-16)
where the coefficients A, B, C, D, E, and F are evaluated in terms of the focal coordinates and
the dimensions of the major and minor axes of the ellipse. The major axis is the straight line
segment extending from one side of the ellipse to the other through the foci. The minor axis
spans the shorter dimension of the ellipse, bisecting the major axis at the halfway position
(ellipse center) between the two foci.
An interactive method for specifying an ellipse in an arbitrary orientation is to input the two foci and a point on the ellipse boundary. With these three coordinate positions, it is possible to evaluate the constant in equation 2-15. Then, the coefficients in equation 2-16 can be evaluated and used to generate pixels along the elliptical path. Ellipse equations are greatly simplified if the major and minor axes are oriented to align with the coordinate axes. In figure 2.13, an ellipse is shown in "standard position" with major and minor axes oriented parallel to the x and y axes. Parameter rx in this case labels the semi-major axis, and parameter ry the semi-minor axis.
Figure 2.13: Ellipse centered at (xc, yc) with major and minor axis
The equation of the ellipse shown in figure 2.13 can be written in terms of the ellipse center coordinates and the parameters rx and ry as

((x − xc)/rx)² + ((y − yc)/ry)² = 1   (Equation 2-17)
Using polar coordinates r and θ, we can also describe the ellipse in standard position with the parametric equations:

x = xc + rx cos θ
y = yc + ry sin θ
An ellipse has somewhat less symmetry than a circle, so more computations may be required to find its pixel coordinates. To avoid computations where the slope (the rate of change of one variable with respect to the other) is too large, we split the first quadrant into a region where |∆y/∆x| ≤ 1 and a region where |∆x/∆y| ≤ 1. As with the circle algorithm, we can start at the top point (0, b) and take unit steps in the x direction until we reach the point where the slope is -1, where we switch to unit steps in the (negative) y direction. At each step we will have to compute the slope of the tangent line:
dy/dx = (dy/dm) / (dx/dm) = −(2b²x) / (2a²y)
At the point where dy/dx = −1 we have b²x = a²y, so we can use the sign of a²y − b²x to determine where to switch from the ∆x region to the ∆y region. Starting at (0, b), the midpoint algorithm, as applied to the ellipse, entails evaluating the decision parameter:

pxk = f(xk + 1, yk − ½) = b²(xk + 1)² + a²(yk − ½)² − a²b²
Again, the decision parameter is negative only if the midpoint between adjacent candidate pixels is inside the ellipse boundary. The coordinates of the next pixel to colour will be either (xk+1, yk) or (xk+1, yk−1), depending on whether the decision parameter is negative or not.
This can be adapted to the midpoint algorithm to plot ellipses with rotated axes by using Ax² + By² + Cxy + Dx + Ey + F = 0 to work out the test parameter for the entire perimeter, or by plotting the ellipse without rotation and then rotating it using some efficient rotation routine.
So we will now discuss the pseudo code for an ellipse midpoint algorithm:
1. Get the parameters a, b, h, and k, for center coordinates (h, k) and major and minor axis lengths 2a and 2b.
2. Calculate the initial decision parameter value in the first region:
px0 = b² − a²b + a²/4
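A hedged C sketch of the complete two-region midpoint ellipse procedure (the region-2 initial parameter follows the standard midpoint formulation, which is not shown in this extract; routine names are illustrative and the two-argument setpixel routine is assumed):

extern void setpixel (int x, int y);   /* assumed low-level routine */

/* plot the four symmetric points of the ellipse about the centre (h, k) */
static void plotEllipsePoints (int h, int k, int x, int y)
{
    setpixel (h + x, k + y); setpixel (h - x, k + y);
    setpixel (h + x, k - y); setpixel (h - x, k - y);
}

void ellipseMidpoint (int h, int k, int a, int b)
{
    long a2 = (long) a * a, b2 = (long) b * b;
    int x = 0, y = b;
    /* region 1: |dy/dx| < 1, initial parameter px0 = b^2 - a^2*b + a^2/4 */
    double p = b2 - a2 * b + 0.25 * a2;
    while (2 * b2 * x < 2 * a2 * y) {
        plotEllipsePoints (h, k, x, y);
        x++;
        if (p < 0)
            p += b2 * (2 * x + 1);
        else {
            y--;
            p += b2 * (2 * x + 1) - 2 * a2 * y;
        }
    }
    /* region 2: initial parameter evaluated at the last (x, y) of region 1 */
    p = b2 * (x + 0.5) * (x + 0.5) + a2 * (y - 1.0) * (y - 1.0) - (double) a2 * b2;
    while (y >= 0) {
        plotEllipsePoints (h, k, x, y);
        y--;
        if (p > 0)
            p += a2 * (1 - 2 * y);
        else {
            x++;
            p += 2 * b2 * x + a2 * (1 - 2 * y);
        }
    }
}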
SELF-ASSESSMENT QUESTIONS - 5
The scan line fill algorithm is an ingenious way of filling in irregular polygons. The algorithm
begins with a set of points. Each point is connected to the next, and the line between them is
considered to be an edge of the polygon. The points of each edge are adjusted to ensure that
the point with the smaller y value appears first. Next, a data structure is created which
contains a list of edges that begin on each scan line of the image. The program progresses
upwards from the first scan line. For each line, pixels that contain an intersection between
this scan line and an edge of the polygon are filled in. Then, the algorithm progresses along
the scan line, turning on when it reaches a polygon pixel and turning off when it reaches
another one, all the way across the scan line.
There are two special cases that are solved by this algorithm. First, a problem may arise if
the polygon contains edges that are partially or completely out of the image. The algorithm
solves this problem by moving pixel values that are outside the image to the boundaries of
the image. This method is preferable to eliminating the pixel completely, because its deletion
could result in a "backwards" coloring of the scan line which means that pixels that should
be on are off and vice versa.
The second case has to do with the concavity of the polygon. If the polygon has a concave
portion, the algorithm will work correctly. The pixel on which the two edges meet will be
marked twice, so that it is turned off and then on. If, however, the polygon is convex at the
intersection of two edges, the coloring will turn on and then immediately off, resulting in
"backwards" coloring for the rest of the scan line. The problem is solved by using the vertical
location of the next point in the polygon to determine the concavity of the current portion.
Overall, the algorithm is very robust; the main challenge it faces is with polygons that have a large number of edges, like circles and ellipses. Filling in such a polygon can be very costly.
1. Find the intersections of the scan line with all edges of the polygon.
2. Sort the intersections by increasing x-coordinate.
3. Fill in all pixels between pairs of intersections.
We will discuss now the scan line algorithm with an example depicted in figure 2.14.
It can be observed in figure 2.14 that for scan line number 8 the sorted list of x-coordinates is (2, 4, 9, 13) (b and c are initially not integers). Therefore pixels are filled between x-coordinates 2-4 and 9-13.
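The following hedged C sketch (not the text's own listing; fillScanLine and the fixed-size array of at most 64 intersections are illustrative assumptions) fills a single scan line by intersecting it with every polygon edge, sorting the intersection x-coordinates, and filling between successive pairs:

#include <math.h>
extern void setpixel (int x, int y);   /* assumed low-level routine */

/* fill one scan line of a polygon given as n vertices (x[i], y[i]), even-odd rule */
void fillScanLine (int n, const int x[], const int y[], int scanY)
{
    double xs[64];                 /* intersection x-coordinates (assumes n <= 64) */
    int count = 0, i, j, px;
    for (i = 0; i < n; i++) {
        j = (i + 1) % n;           /* edge from vertex i to vertex j */
        /* the edge crosses the scan line only if its endpoints straddle scanY */
        if ((y[i] <= scanY && y[j] > scanY) || (y[j] <= scanY && y[i] > scanY))
            xs[count++] = x[i] + (double) (scanY - y[i]) * (x[j] - x[i]) / (y[j] - y[i]);
    }
    for (i = 1; i < count; i++) {  /* insertion sort by increasing x */
        double key = xs[i];
        for (j = i - 1; j >= 0 && xs[j] > key; j--)
            xs[j + 1] = xs[j];
        xs[j + 1] = key;
    }
    for (i = 0; i + 1 < count; i += 2)          /* fill between successive pairs */
        for (px = (int) ceil (xs[i]); px <= (int) floor (xs[i + 1]); px++)
            setpixel (px, scanY);
}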
Edge Coherence
Processing Polygons
• Active edges are sorted according to increasing X. Filling the scan line starts at the
leftmost edge intersection and stops at the second. It restarts at the third intersection
and stops at the fourth. . .
1. The edge table (ET), shown in figure 2.15, with edge entries sorted in increasing y and x of the lower end.
2. ymax: max y-coordinate of edge
3. xmin: x-coordinate of lowest edge point
1. Set y to smallest y with entry in ET, that is, y for the first non-empty bucket.
2. Init Active Edge Table (AET) to be empty.
3. Repeat until AET and ET are empty:
a. Move from ET bucket y to the AET those edges whose ymin = y (entering edges)
b. Remove from AET those edges for which y=ymax (not involved in next scan line),
then sort AET (remember: ET is presorted)
c. Fill desired pixel values on scan line y by using pairs of x-coords from AET
d. Increment y by 1 (next scan line)
e. For each nonvertical edge remaining in AET, update x for new y
With reference to the polygon depicted in figure 2.16, the AET contents for scan line 9 and scan line 10 can be written out in the same way.
SELF-ASSESSMENT QUESTIONS - 6
We will now discuss the boundary fill algorithm. Here we start at a point inside the figure and paint outward with a particular colour; this filling continues until a boundary colour is encountered. There are two ways to do this. First we consider the four-connected fill, where the procedure calls itself recursively for the four neighbouring pixels: left, right, up, and down. One problem with the four-connected fill is that it may fail to fill regions that can be reached only through diagonal connections; this leads to the eight-connected fill algorithm, where we test all eight adjacent pixels by adding four more recursive calls for the diagonal neighbours, as in the sketch below.
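A hedged C sketch of both fills (not the text's missing listings; it assumes a getpixel that returns the stored colour and a three-argument setpixel that takes a fill colour):

extern int  getpixel (int x, int y);              /* assumed: returns the stored colour */
extern void setpixel (int x, int y, int colour);  /* assumed: three-argument variant    */

void boundaryFill4 (int x, int y, int fill, int boundary)
{
    int c = getpixel (x, y);
    if (c != boundary && c != fill) {
        setpixel (x, y, fill);
        boundaryFill4 (x + 1, y, fill, boundary);   /* right */
        boundaryFill4 (x - 1, y, fill, boundary);   /* left  */
        boundaryFill4 (x, y + 1, fill, boundary);   /* up    */
        boundaryFill4 (x, y - 1, fill, boundary);   /* down  */
    }
}

/* the 8-connected version also recurses on the four diagonal neighbours */
void boundaryFill8 (int x, int y, int fill, int boundary)
{
    int c = getpixel (x, y);
    if (c != boundary && c != fill) {
        setpixel (x, y, fill);
        boundaryFill8 (x + 1, y, fill, boundary);     boundaryFill8 (x - 1, y, fill, boundary);
        boundaryFill8 (x, y + 1, fill, boundary);     boundaryFill8 (x, y - 1, fill, boundary);
        boundaryFill8 (x + 1, y + 1, fill, boundary); boundaryFill8 (x - 1, y - 1, fill, boundary);
        boundaryFill8 (x + 1, y - 1, fill, boundary); boundaryFill8 (x - 1, y + 1, fill, boundary);
    }
}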
The above 4-fill and 8-fill algorithms involve deep recursion, which may consume memory and time. Better algorithms are faster but more complex; they make use of pixel runs (horizontal groups of pixels).
Sometimes we may want to fill in (or recolour) an area that is not defined within a single
colour boundary. We can paint such areas by replacing a specified interior colour instead of
searching for a boundary colour value. This approach is called a flood-fill algorithm. Here
we start from a specified interior point (x, y) and reassign all pixel values that are currently
set to a given interior colour with the desired fill colour. If the area we want to paint has
more than one interior colour, we can first reassign pixel values so that all interior points
have the same color. Using either a 4-connected or 8-connected approach, we then step
through pixel positions until all interior points have been repainted. The following
procedure flood-fills a 4-connected region recursively, starting from the input position; a sketch is given below.
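A hedged sketch of such a procedure, under the same getpixel/setpixel assumptions as above (the routine name floodFill4 matches the one referred to in the next paragraph, but the body here is a reconstruction, not the text's own listing):

extern int  getpixel (int x, int y);              /* assumed: returns the stored colour */
extern void setpixel (int x, int y, int colour);  /* assumed: three-argument variant    */

void floodFill4 (int x, int y, int fillcolour, int oldcolour)
{
    if (getpixel (x, y) == oldcolour) {
        setpixel (x, y, fillcolour);
        floodFill4 (x + 1, y, fillcolour, oldcolour);
        floodFill4 (x - 1, y, fillcolour, oldcolour);
        floodFill4 (x, y + 1, fillcolour, oldcolour);
        floodFill4 (x, y - 1, fillcolour, oldcolour);
    }
}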
We can modify procedure floodFill4 to reduce the storage requirements of the stack by filling
horizontal pixel spans, as discussed for the boundary-fill algorithm. In this approach, we
stack only the beginning positions for those pixel spans having the value oldcolor. Starting
at the first position of each span, the pixel values are replaced until a value other than
oldcolor is encountered.
We display a filled polygon in PHlGS and GKS with the function fillArea (n, wcvertices). The
displayed polygon area is bound by a series of n straight line segments connecting the set of
vertex positions specified in wcvertices. These packages do not provide fill functions for
objects with curved boundaries. Implementation of the fillArea function depends on the
selected type of interior fill. We can display the polygon boundary surrounding a hollow
interior, or we can choose a solid color or pattern fill with no border for the display of the
polygon. For solid fill, the fillArea function is implemented with the scan-line fill algorithm
to display a single color area. Another polygon primitive available in PHIGS is fillAreaSet.
This function allows a series of polygons to be displayed by specifying the list of vertices for
each polygon. Also, in other graphics packages, functions are often provided for displaying a
variety of commonly used fill areas besides general polygons.
SELF-ASSESSMENT QUESTIONS - 7
20. __________ and _____________ are the two algorithms used for the boundary fill process.
21. How many parameters are required for the eight_fill procedure?
a. 3 b. 4 c. 5 d. 2
22. The 4-fill and 8-fill algorithms, which involve recursion, will not consume memory and time. (True/False)
10. SUMMARY
This unit provides information about what points and lines are and the role of the line
drawing algorithm. A line drawing algorithm is a graphical algorithm for approximating a
line segment on discrete graphical media. On discrete media, such as pixel-based displays
and printers, line drawing requires such an approximation (in nontrivial cases). By contrast, continuous media do not require an algorithm to draw a line. In this unit we also discussed two line drawing algorithms, the DDA and the Bresenham algorithm. The digital differential analyzer (DDA) is a scan-conversion line algorithm based on calculating either Δy or Δx. The line is sampled at unit intervals in one coordinate and the corresponding integer values nearest to the line path are determined for the other coordinate, considering first a line with positive slope. The
Bresenham line algorithm is an algorithm which determines which points in an n-
dimensional raster should be plotted in order to form a close approximation to a straight line
between two given points. It is commonly used to draw lines on a computer screen. The simple circle drawing algorithm leaves large gaps where the slope approaches the vertical and is also inefficient in its calculations. Eight-way symmetry gives a more efficient way of generating a circle centered at (0, 0). Similarly there is an incremental algorithm for drawing circles called the mid-
point circle algorithm using eight-way symmetry. Elliptical curves can be generated by
modifying circle-drawing procedures to take into account the different dimensions of an
ellipse along the major and minor axes. We also discussed the algorithm which generates an ellipse. The scan-line fill algorithm intersects each scan line with the polygon edges and fills between pairs of intersections. We concluded this unit with a discussion on the boundary-fill and flood-fill algorithms, which fill a bounded region with the specified colour.
12. ANSWERS
1. Points and lines are two of the most fundamental concepts in Geometry, but they are also the most difficult to define. A point is a location in space, and a line segment is specified by two points. For more details, refer to section 2.1.
2. A line is a continuous object. However, the computer monitor (screen) consists of a matrix of pixels, so the question is how to represent this continuous object on a discrete matrix. For more details, refer to section 2.3.
3. The digital differential analyzer (DDA) is a scan-conversion line algorithm based on calculating either Δy or Δx. The line is sampled at unit intervals in one coordinate and the corresponding integer values nearest to the line path are determined for the other coordinate. For more details, refer to sub-section 2.1.
4. Bresenham's line-drawing algorithm is based on drawing an approximation of the true line. The true line is indicated in a bright colour, and its approximation is indicated in black pixels. For more details, refer to sub-section 2.2.
5. Consider drawing a line on a raster grid where we restrict the allowable slopes of the line to the range 0 ≤ m ≤ 1. If we further restrict the line-drawing routine so that it always increments x as it plots, it becomes clear that, having plotted a point at (x, y), the next point can only be (x+1, y) or (x+1, y+1). For more details, refer to sub-section 2.2.
6. A circle is a simple shape of Euclidean geometry consisting of the set of points in a plane
that are a given distance from a given point, the centre. The distance between any of
the points and the centre is called the radius. Circles are simple closed curves which
divide the plane into two regions. For more details Refer section 2.6.
7. The midpoint circle algorithm is an algorithm used to determine the points needed for
drawing a circle. The algorithm is a variant of Bresenham's line algorithm. For more
details Refer section 2.7.
8. An ellipse is defined as the set of points such that the sum of the distances from two
fixed positions (foci) is the same for all points. For more details Refer section (2.7).
9. The scan-line algorithm intersects each scan line with the polygon edges and fills between pairs of intersections. For more details, refer to section 2.8.
10. In the boundary fill algorithm, we start at a point inside the figure and paint with a particular color; this filling continues until a boundary color is encountered. For more details, refer to section 2.8.
11. Sometimes we want to fill in (or recolor) an area that is not defined within a single color
boundary. We can paint such areas by replacing a specified interior color instead of
searching for a boundary color value. For more details Refer section 2.6.
DCA3142
GRAPHICS AND MULTIMEDIA
Unit 3
2D Transformation
Table of Contents
8 Terminal Questions - - 26
9 Answers - - 27 - 28
1. INTRODUCTION
In the previous unit, we discussed the scan conversion process which included circle
generation algorithm, ellipse generating algorithm, scan line polygon, fill algorithm and flood
fill algorithm.
In this unit we will discuss the basic geometric transformations of 2D like translation,
rotation, and scaling. Other transformations that are often applied to objects include
reflection and shear. We will also discuss the basic transformations which can be expressed
in matrix form. But many graphic applications involve sequences of geometric
transformations. Hence a general form of matrix is required for representing such
transformations. In geometry, a coordinate system is a system which uses one or more
numbers or coordinates, to uniquely determine the position of a point or other geometric
elements. We will also discuss the transformation of the coordinate system.
1.1 Objectives:
2. BASIC TRANSFORMATIONS
Animations are produced by moving the 'camera' or the objects in a scene along animation
paths. Changes in orientation, size and shape are accomplished with geometric
transformations that alter the coordinate descriptions of the objects. The basic geometric
transformations are translation, rotation, and scaling. Other transformations that are often
applied to objects include reflection and shear. In 3D graphics, it is necessary to use 3D
transformations. However, 3D transformations can be quite confusing so it helps to begin
with 2D.
2.1 Translation
The most basic transformation is translation. The formal definition of a translation is "every
point of the pre-image is moved the same distance in the same direction to form the image."
In figure 3.1, where the triangle translation explains the concept.
Each translation follows a rule. In this case, the rule is "5 to the right and 3 up." It is also
possible to translate a pre-image to the left, down, or any combination of two of the four
directions. More advanced transformation geometry is done on the coordinate plane. The
transformation for this example would be T(x, y) = (x+5, y+3).
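As a small illustration (not from the text), the rule T(x, y) = (x + 5, y + 3) can be applied to a hypothetical list of pre-image vertices in Python:

```python
# Apply the translation rule T(x, y) = (x + 5, y + 3) to every vertex.
def translate(points, tx, ty):
    return [(x + tx, y + ty) for (x, y) in points]

triangle = [(0, 0), (4, 0), (2, 3)]          # hypothetical pre-image
print(translate(triangle, 5, 3))             # [(5, 3), (9, 3), (7, 6)]
```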
2.2 Rotation
A rotation is a transformation that is performed by "spinning" the object around a fixed point
known as the center of rotation. It is also possible to rotate the object at any degree measure,
but 90° and 180° are two of the most common rotations; by convention, rotations are measured counterclockwise.
The figure 3.2 (a) shown at the right is a rotation of 90° about the center of rotation. It is essential to note that each dashed line from the center of rotation to a point of the image has the same length as the dashed line to the corresponding point of the pre-image, and that corresponding pairs of dashed lines form 90° angles. That is what makes the rotation a rotation of 90°. Figure 3.2 (b) shows an example of a rotation of 180°.
Some geometry lessons go back to algebra by describing the formula that explains the transformation. In the example above, for a 180° rotation about the origin, the formula is R(x, y) = (−x, −y). This type of transformation is often called coordinate geometry because of its connection back to the coordinate plane.
2.3 Scaling
Take the example of wanting to double the size of a 2-D object. What does double mean here?
Does it mean double the size, width only, height only, or double along some line? When we
talk about scaling, it usually means some amount of scaling along each dimension. That is, it
must be specified as to how much the size must be changed along each dimension. In figure
3.3, shown are a triangle and a house both of which have been doubled in width and height
(note that the area is more than doubled).
The scaling for the x dimension does not have to be the same as the y dimension. If these are
different, then the object is distorted as we can observe in figure 3.4.
When the size is doubled, what happens to the resulting object? In the figure 3.4, the scaled
object is always shifted to the right. This is because it is scaled with respect to the origin.
That is, the point at the origin is left fixed. Thus scaling by more than 1, moves the object
away from the origin and scaling of less than 1, moves the object towards the origin. This is
because of how basic scaling is done. The figure 3.5 has been scaled simply by multiplying each of its points by the appropriate scaling factor. For example, the point p = (1.5, 2) has been scaled by 2 along x and 0.5 along y. Thus, the new point is q = (2 × 1.5, 0.5 × 2) = (3, 1).
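The same arithmetic can be checked directly with a trivial snippet, using the point and factors from the example:

```python
# Scale the point p = (1.5, 2) by sx = 2 and sy = 0.5 about the origin.
sx, sy = 2, 0.5
px, py = 1.5, 2
q = (sx * px, sy * py)
print(q)   # (3.0, 1.0)
```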
SELF-ASSESSMENT QUESTIONS - 1
Recall section 3.2, where each basic transformation took the form P′ = M1 · P + M2 (3-1), with a multiplicative matrix part M1 and, for translation, an additive part M2.
For a sequence of operations, we may first perform scaling, followed by a translation and
then a rotation, each time, calculating newer coordinates from the old. This sequence can be
performed in “one go” by a composite matrix multiplication, without the additive term M2 in
(3-1), if we employ special coordinates known as homogeneous coordinates.
Homogeneous coordinates
First, represent each (x, y) with the homogeneous coordinate triple (xh, yh, h) where
$$x = \frac{x_h}{h}, \qquad y = \frac{y_h}{h}$$
Thus the homogeneous coordinate representation of the 2D point (x, y) is (h·x, h·y, h). Since any nonzero value of h can be used as the homogeneous parameter, we conveniently choose h = 1 (we will require other values in 3D), so that (x, y) has as its homogeneous coordinate counterpart (x, y, 1).
We can now show that all 2D geometric transformations are just matrix multiplications in
homogeneous coordinates.
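As a brief sketch of this idea (numpy is assumed here purely for convenience, not something the text prescribes), the translation, rotation, and scaling matrices developed below can be built and composed into a single 3×3 matrix:

```python
import numpy as np

# 2D transformations as 3x3 matrix multiplications in homogeneous coordinates;
# the point (x, y) is represented as the column vector (x, y, 1).
def translation(tx, ty):
    return np.array([[1, 0, tx], [0, 1, ty], [0, 0, 1]], dtype=float)

def rotation(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]], dtype=float)

def scaling(sx, sy):
    return np.array([[sx, 0, 0], [0, sy, 0], [0, 0, 1]], dtype=float)

p = np.array([2.0, 1.0, 1.0])                       # the point (2, 1)
# Scale, then rotate, then translate -- composed into one matrix.
M = translation(5, 3) @ rotation(np.pi / 2) @ scaling(2, 2)
print(M @ p)                                        # approximately (3, 7, 1)
```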
A 2D point (x, y) is translated to the point (x ‘, y’) by applying a shift (or translation) vector (
tx, ty ) to it as follows:
x’ = x+ tx , y’ = y + ty
$$\mathbf{P} = \begin{bmatrix} x \\ y \end{bmatrix}, \qquad \mathbf{P}' = \begin{bmatrix} x' \\ y' \end{bmatrix}, \qquad \mathbf{T} = \begin{bmatrix} t_x \\ t_y \end{bmatrix}$$

$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \underbrace{\begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1 \end{bmatrix}}_{T(t_x,\,t_y)} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \qquad (3\text{-}3)$$

$$[T(t_x, t_y)]^{-1} = \begin{bmatrix} 1 & 0 & -t_x \\ 0 & 1 & -t_y \\ 0 & 0 & 1 \end{bmatrix} \quad \text{(Prove this!)}$$
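The "Prove this!" check is immediate by multiplying the two matrices:

$$T(t_x,t_y)\,[T(t_x,t_y)]^{-1}
= \begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & -t_x \\ 0 & 1 & -t_y \\ 0 & 0 & 1 \end{bmatrix}
= \begin{bmatrix} 1 & 0 & t_x - t_x \\ 0 & 1 & t_y - t_y \\ 0 & 0 & 1 \end{bmatrix} = I$$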
ii) The transformation equations for the point (x, y) rotated through an angle θ about (0,
0) to the point (x’, y’) as
$$x' = x\cos\theta - y\sin\theta, \qquad y' = x\sin\theta + y\cos\theta$$

$$R = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$
We can write
$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}, \quad \text{or } \mathbf{P}' = R \cdot \mathbf{P}$$

$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \underbrace{\begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}}_{R(\theta)} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \qquad (3\text{-}5)$$

$$R(\theta)^{-1} = R(-\theta) = \begin{bmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} = R(\theta)^{T} \qquad (3\text{-}7)$$
iii) A scaling transformation alters the size of an object. Here the new coordinates (x’, y’)
are given by
x′ = sₓ·x,  y′ = s_y·y,
where sₓ, s_y > 0 are scaling factors in the x and y directions respectively. In matrix form,

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} s_x & 0 \\ 0 & s_y \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}, \quad \text{or } \mathbf{P}' = S \cdot \mathbf{P} \text{ with } S = \begin{bmatrix} s_x & 0 \\ 0 & s_y \end{bmatrix}$$

For scaling about the origin as the fixed point we write

$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \underbrace{\begin{bmatrix} s_x & 0 & 0 \\ 0 & s_y & 0 \\ 0 & 0 & 1 \end{bmatrix}}_{S(s_x,\,s_y)} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \qquad (3\text{-}8)$$
$$S(s_x, s_y)^{-1} = \begin{bmatrix} 1/s_x & 0 & 0 \\ 0 & 1/s_y & 0 \\ 0 & 0 & 1 \end{bmatrix} \qquad (3\text{-}10)$$
Further, to rotate about a general pivot or scale with regards to general fixed point, we can
use a succession of transformations about the origin.
SELF-ASSESSMENT QUESTIONS - 2
1. Translate so that the origin (x0, y0) of (X’, Y’) --> (0, 0).
2. Rotate the X’ axis onto the X axis (that is, thro’ -θ)
$$T(-x_0, -y_0) = \begin{bmatrix} 1 & 0 & -x_0 \\ 0 & 1 & -y_0 \\ 0 & 0 & 1 \end{bmatrix} \qquad (3\text{-}11)$$
$$R(-\theta) = \begin{bmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} \qquad (3\text{-}12)$$
This result can also be obtained directly by deriving the relation between the respective coordinate-distance pairs in the two systems. An alternative to giving the orientation of X′Y′ relative to XY as an angle, as depicted in figure 3.7, is to use unit vectors in the Y′ and X′ directions:
Thus, suppose V is a point vector in the XY system and is in the same direction as the +ve Y’
coordinate axis. If a unit vector along the +ve Y’ coordinate axis is, say,
$$v = \frac{V}{|V|} = (v_x, v_y) \qquad (3\text{-}14)$$

$$R = \begin{bmatrix} u_x & u_y & 0 \\ v_x & v_y & 0 \\ 0 & 0 & 1 \end{bmatrix} \qquad (3\text{-}16)$$

$$R = \begin{bmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
This result also follows from (3-12) by setting θ = 90°. Typically, in interactive applications, it is more convenient to choose a direction V relative to a position P0, which is the origin of the X′Y′ system, rather than relative to the XY origin, as referred to in figure 3.8.
Then we can use the unit vector

$$\frac{P_1 - P_0}{|P_1 - P_0|} = \frac{V}{|V|} = v \equiv (v_x, v_y) \qquad (3\text{-}17)$$
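A minimal Python sketch of building the matrix in (3-16) from such a direction vector (numpy is assumed, and the points P0 and P1 in the example are hypothetical):

```python
import numpy as np

def rotation_from_direction(p0, p1):
    """Build the XY -> X'Y' rotation matrix from a direction V giving the +Y' axis."""
    v = np.array(p1, dtype=float) - np.array(p0, dtype=float)
    v /= np.linalg.norm(v)               # unit vector along +Y'   (eq. 3-17)
    u = np.array([v[1], -v[0]])          # unit vector along +X' (v rotated -90 deg)
    return np.array([[u[0], u[1], 0],
                     [v[0], v[1], 0],
                     [0,    0,    1]])

# Example: Y' pointing along -X reproduces the 90-degree matrix shown above.
print(rotation_from_direction((0, 0), (-1, 0)))
```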
SELF-ASSESSMENT QUESTIONS - 3
5. OTHER TRANSFORMATIONS IN 2D
Apart from scaling, rotation, and translation, some of the other transformation techniques in
2D are as follows:
5.1 Reflection
This produces a mirror image of an object by rotating it 180° about an axis of rotation. For reflection about the X-axis, the x-values remain unchanged, but the y-values are flipped; the path of rotation is perpendicular to the XY plane for all points in the body.
$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \qquad (3\text{-}20)$$
For reflection about the Y-axis, the y-values remain unchanged, but the x-values are flipped; the path of rotation is perpendicular to the XY plane for all points in the body.
$$\begin{bmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \qquad (3\text{-}21)$$
For reflection about the origin, both the x-values and the y-values are flipped. The rotation is about an axis through (0, 0) perpendicular to the XY plane for all points in the body.
$$\begin{bmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \qquad (3\text{-}22)$$
Note that the above matrix is the same as the rotation matrix R(θ) with θ = 180°; so both these operations are equivalent.
Reflection about any other point, as referred to in figure 3.9, is the same as a rotation about an axis through the fixed reflection point Pref = (xref, yref) and perpendicular to the XY plane for all points in the body.
$$\begin{bmatrix} -1 & 0 & 2x_{ref} \\ 0 & -1 & 2y_{ref} \\ 0 & 0 & 1 \end{bmatrix} \qquad (3\text{-}23)$$
Prove that (3-23) is the transformation matrix by concatenating the matrices for a translation T(−xref, −yref), a rotation through 180°, and a translation T(xref, yref).
It is essential to note that this process is also equivalent to reflection about the X-axis followed by a rotation through +90°.
Variations
• Reflections about the coordinate axes or (0,0) can also be implemented as scaling with
negative scaling factors
• The elements of the reflection matrix can also be set to values other than ±1:
➢ For magnitudes greater than 1 the mirror image is shifted further away from the axis.
➢ For magnitudes less than 1 the mirror image is shifted nearer to the axis.
5.2 Shear
A shear transformation distorts the shape of an object, causing “internal layers to slide over”.
Two types here are i) a shift in x-values and ii) a shift in y-values.
x-shear:
Given by x′ = x + shₓ·y, y′ = y, with matrix

$$\begin{bmatrix} 1 & sh_x & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \qquad (3\text{-}24)$$

Here shₓ is any real number. For example, shₓ = 2 changes the square below into a parallelogram:
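A short check of this example in plain Python (the unit square used here is hypothetical):

```python
# Apply the x-shear of (3-24) with sh_x = 2 to the corners of a unit square;
# the square is mapped to a parallelogram.
sh_x = 2
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
sheared = [(x + sh_x * y, y) for (x, y) in square]
print(sheared)   # [(0, 0), (1, 0), (3, 1), (2, 1)]
```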
(3-26)
y-shear relative to the line x = xref:

Given by x′ = x, y′ = y + sh_y (x − x_ref), with matrix

$$\begin{bmatrix} 1 & 0 & 0 \\ sh_y & 1 & -sh_y \cdot x_{ref} \\ 0 & 0 & 1 \end{bmatrix} \qquad (3\text{-}27)$$

This shifts coordinate positions vertically by an amount proportional to their distance from the reference line x = xref; e.g., with sh_y = ½ relative to the line x = xref = −1 we have:
Remark: shears may also be expressed as compositions of the basic transformations. For example, the shear in (3-24) may be written as a combination of rotations and a scaling.
6. COMPOSITE TRANSFORMATIONS IN 2D
In effecting a sequence of transformations, we shall find that the result is equivalent to matrix
multiplication with a single composite (concatenated) matrix. We thus consider: -
i) Composite Translations
Let (tx1, ty1) and (tx2, ty2) be two successive translation vectors applied to P. Then the final position (in homogeneous coordinates) is

$$\mathbf{P}' = T(t_{x2}, t_{y2}) \cdot \{T(t_{x1}, t_{y1}) \cdot \mathbf{P}\} = T(t_{x1} + t_{x2},\, t_{y1} + t_{y2}) \cdot \mathbf{P}$$

Thus, two successive translations are additive (add their matrix arguments for the translation parts).
ii) Composite Rotations

$$\mathbf{P}' = R(\theta_2) \cdot [R(\theta_1) \cdot \mathbf{P}] = R(\theta_1 + \theta_2) \cdot \mathbf{P}$$
Thus, two successive rotations are additive (add the rotation angles in the corresponding
matrix arguments).
Thus two successive scalings are multiplicative. For example, if we triple the size of an object
twice => final scaling is 9 × original.
In effecting a sequence of transformations, we shall find that the result is equivalent to matrix
multiplications.
If we have a function for rotation about the origin, we can obtain rotation about any pivot
point (xr, yr) by means of the operations:
Diagrammatically we have:
We can use (3-33) to write a function that accepts a general pivot point, (xr, yr)
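A small sketch of such a function (numpy assumed), composing the translate-to-origin, rotate, and translate-back sequence described above into one matrix:

```python
import numpy as np

def rotate_about_pivot(theta, xr, yr):
    """Rotation about pivot (xr, yr) built from the basic matrices."""
    T_to_origin = np.array([[1, 0, -xr], [0, 1, -yr], [0, 0, 1]], dtype=float)
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]], dtype=float)
    T_back = np.array([[1, 0, xr], [0, 1, yr], [0, 0, 1]], dtype=float)
    return T_back @ R @ T_to_origin          # single composite matrix

# Rotating the point (3, 2) by 90 degrees about the pivot (1, 1).
p = np.array([3.0, 2.0, 1.0])
print(rotate_about_pivot(np.pi / 2, 1, 1) @ p)   # approximately (0, 3, 1)
```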
Similarly, if we have a function that scales only w.r.t. (0, 0), we can scale w.r.t. an arbitrary fixed point (xf, yf) by an analogous sequence of operations:
Diagrammatically we have:
$$\begin{bmatrix} 1 & 0 & x_f \\ 0 & 1 & y_f \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} s_x & 0 & 0 \\ 0 & s_y & 0 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 & -x_f \\ 0 & 1 & -y_f \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} s_x & 0 & x_f(1 - s_x) \\ 0 & s_y & y_f(1 - s_y) \\ 0 & 0 & 1 \end{bmatrix}$$
Or
We can use (3-34) to write a function that accepts a general fixed point, (xf, yf)
Thus, the scaling factors (sx, sy) apply stretching along the usual OX–OY directions. But when stretching is required along some other OS1–OS2 directions, referred to in figure 3.10, with scale factors (s1, s2),
we proceed as follows:
• Apply a rotation so that the OS1–OS2 axes coincide with the OX–OY axes
• Now scale as before
• Apply reverse rotation to the original directions
Example: using (3-35) with s1 = 1, s2 = 2, θ = 45° sends the unit square to a parallelogram.
Note that if the scaling above were performed w.r.t. an arbitrary fixed point rather than the origin, then an additional translation matrix would have to be incorporated into (3-35).
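A sketch of this rotate–scale–rotate-back composition (numpy assumed; the scale factors and angle come from the example above):

```python
import numpy as np

def scale_along_directions(s1, s2, theta):
    """Scale by (s1, s2) along directions making angle theta with the x-axis."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]], dtype=float)
    S = np.array([[s1, 0, 0], [0, s2, 0], [0, 0, 1]], dtype=float)
    return R @ S @ np.linalg.inv(R)       # R(theta) . S(s1, s2) . R(-theta)

# s1 = 1, s2 = 2, theta = 45 degrees applied to the unit square.
M = scale_along_directions(1, 2, np.radians(45))
for corner in [(0, 0), (1, 0), (1, 1), (0, 1)]:
    print((M @ np.array([*corner, 1.0]))[:2])   # a parallelogram results
```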
SELF-ASSESSMENT QUESTIONS - 4
7. SUMMARY
This unit provides information about the basic transformations, namely translation, rotation and scaling, and about other transformations like reflection and shear. We discussed matrix representations and homogeneous coordinates: scenes involving animation may need all of translation, rotation and scaling, and to process such combined operations more efficiently we re-formulated the previous transformation equations. Apart from applying transformations to an object in a particular coordinate system, it is necessary in many situations to be able to transform a description from one coordinate system to another, which calls for transformations between coordinate systems. We concluded with the other transformations, reflection and shear: reflection produces a mirror image of an object, whereas the shear transform distorts the shape of an object, causing internal layers to slide over.
8. TERMINAL QUESTIONS
9. ANSWERS
Terminal Questions
distorts the shape of an object, causing "internal layers to slide over". For more details refer section 3.5.
5. In effecting a sequence of transformations, we shall find that the result is equivalent to
matrix multiplication with a single composite (concatenated) matrix. Composite
translation, composite rotation and composite scaling are involved in composite
transformation. For more details refer section 3.6.
DCA3142
GRAPHICS AND MULTIMEDIA
Unit 4
2D Viewing
Table of Contents
8 Answers - - 22 - 23
1. INTRODUCTION
In the previous unit, we discussed various 2D transformations like translation, rotation, and
scaling. Other transformations that are often applied to objects include reflection and shear.
We also discussed matrix representation and homogeneous coordinates as well as
transformations between coordinate systems. We concluded the unit with a discussion on
the various composite transformations of 2D.
In this unit we will explore the concept of 2D viewing as well as 2D viewing pipeline that will
allow the designer to pan the image. The window-to-viewport mapping is the transformation that maps the window from world coordinates into the viewport in screen coordinates. This process can be achieved in the following sequence: window in world coordinates, window translated to the origin, window scaled to the size of the viewport, and finally translated to its final position.
We will then discuss clipping. Three types of clipping are used in this unit: point, line, and polygon clipping.
1.1 Objectives:
2. 2D VIEWING PIPELINE
The section of a scene that is to be shown on the screen is usually referred to as a clipping
window. This is because the unwanted portions are to be discarded or clipped off. After this
section is mapped to device coordinates, its placement can be controlled within the display
window on the screen by putting the mapped image of the clipping window into another
window known as the viewport. What is to be seen is selected by the clipping window, and
where it is shown on the output is the function of the viewport. In fact, several clipping
windows can be defined with each one mapping to a separate viewport, either on one device
or distributed amongst many. For better understanding, let us take the case in figure 4.1.
1. Construct the scene in world coordinates, using modeling coordinates for each part.
2. Set up a 2D viewing system with an oriented window
3. Transform to viewing coordinates
4. Define a viewport in normalized (0..1 or -1...1) coordinates and map it from the view
coordinates
5. Clip all parts outside the viewport
6. Transform to device coordinates
When all the transformations are done, clipping can be done in normalized coordinates or
device coordinates. The clipping process is fundamental in computer graphics.
SELF-ASSESSMENT QUESTIONS - 1
Once object descriptions have been transferred to the viewing reference frame, one must
choose the window extents in viewing coordinates and select the viewport limits in
normalized coordinates. Object descriptions are then transferred to normalized device
coordinates. This is done using a transformation that maintains the same relative placement
of objects in normalized space as was done in viewing coordinates. If for instance, a
coordinate position is at the center of the viewing window, it will be displayed at the center
of the viewport.
Figure 4.2 (a) illustrates the window-to-viewport mapping. A point at position (xw, yw) in
the window is mapped into position (xv, yv) in the associated view-port. To maintain the
same relative placement in the viewport as in the window, it is required that,
In figure 4.2 (b) a point at the position (xw, yw) in a designated window is mapped to viewport coordinates (xv, yv) so that relative positions in the two areas are the same. Solving these expressions for the viewport position (xv, yv), we have

xv = xvmin + (xw − xwmin) · sx,  yv = yvmin + (yw − ywmin) · sy,

where the scaling factors are sx = (xvmax − xvmin) / (xwmax − xwmin) and sy = (yvmax − yvmin) / (ywmax − ywmin).
Equations can also be derived with a set of transformations that converts the window area
into the viewport area. This conversion is performed with the following sequence of
transformations:
1. Perform a scaling transformation using a fixed-point position (xwmin, ywmin) that scales
the window area to the size of the viewport.
2. Translate the scaled window area to the position of the viewport. Relative proportions of objects are maintained if the scaling factors are the same (sx = sy). Otherwise, world objects will be stretched or contracted in either the x or the y direction when displayed on the output device.
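A minimal Python sketch of this window-to-viewport mapping (the window and viewport limits used in the example are hypothetical):

```python
def window_to_viewport(xw, yw, window, viewport):
    """Map a world point (xw, yw) from the clipping window into the viewport."""
    xw_min, yw_min, xw_max, yw_max = window
    xv_min, yv_min, xv_max, yv_max = viewport
    sx = (xv_max - xv_min) / (xw_max - xw_min)       # scale about (xw_min, yw_min)
    sy = (yv_max - yv_min) / (yw_max - yw_min)
    return xv_min + (xw - xw_min) * sx, yv_min + (yw - yw_min) * sy

# A point at the centre of the window maps to the centre of the viewport.
print(window_to_viewport(5, 5, window=(0, 0, 10, 10), viewport=(0, 0, 400, 300)))
# (200.0, 150.0)
```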
Character strings can be handled in two ways when they are mapped to a viewport. The
simplest mapping maintains a constant character size, even though the viewport area may
be enlarged or reduced, relative to the window. This method is employed when text is
formed with standard character fonts that cannot be changed. In systems that allow for
changes in character size, string definitions can be windowed just as other primitives. For
characters formed with line segments, the mapping to the viewport can be carried out as a
sequence of line transformations.
From normalized coordinates, object descriptions are mapped to the viewport display
devices. Any number of output devices can be open in a particular application, and another
window-to-viewport transformation can be performed for each open output device. This
mapping, called the workstation transformation, is accomplished by selecting a window area
in normalized space and a viewport area in the coordinates of the display device. With the
workstation transformation, we gain some additional control over the positioning of parts of
a scene on individual output devices. As illustrated in figure 4.3, it is possible to use work
station transformations to partition a view so that different parts of normalized space can be
displayed on different output devices.
SELF-ASSESSMENT QUESTIONS - 2
3. Character strings can be handled in two ways when they are mapped to a
viewport. State True/False.
4. Which sequence is followed to transform Window to viewport process?
a. scaling, translation
b. translation, scaling
c. rotation, scaling
d. Scaling, rotation
5. What is workstation transformation?
4. CLIPPING OPERATION
Clipping operations are used to remove unwanted portions of a picture – we either remove
the outside or the inside of a clipping window, which can be rectangular (most common),
polygonal or curved (complex).
We will first discuss the basic concepts of clipping before we start with point clipping. It is
desirable to restrict the effect of graphics primitives to a sub- region of the canvas in order
to protect other portions of the canvas. All primitives are clipped to the boundaries of this
clipping rectangle; that is, primitives lying outside the clip rectangle are not drawn. The
default clipping rectangle is the full canvas (the screen), and it is obvious that we cannot see
any graphics primitives outside the screen. A simple example of line clipping can illustrate
this idea: Figure 4.4 shows a simple example of line clipping: the display window is the
canvas and also the default clipping rectangle, thus all line segments inside the canvas are
drawn. The thick line box is the clipping rectangle and the dotted line is the extension of the
four edges of the clipping rectangle.
Now with the assumption of a rectangular clip window, point clipping is easy. The point can
be saved if:
If the x coordinate boundaries of the clipping rectangle are Xmin and Xmax, and the y coordinate boundaries are Ymin and Ymax, then the following inequalities must be satisfied for a point at (X, Y) to be inside the clipping rectangle: Xmin ≤ X ≤ Xmax and Ymin ≤ Y ≤ Ymax.
This section deals with clipping of lines against rectangles. Although there are specialized
algorithms for rectangle and polygon clipping, it is important to note that other graphic
primitives can be clipped by repeated application of the line clipper.
x = x1 + u (x2 − x1),  y = y1 + u (y2 − y1)
If the value of u for an intersection with a clipping edge is outside the range
0 to 1, then the line does not enter the interior of the window at that boundary. If the value
of u is within this range, then the line does enter the interior of the window at that boundary.
To clip a line, one needs to consider only its endpoints and not its many interior points. If both endpoints of a line lie inside the clip rectangle (for instance, AB in the figure 4.4 example), the entire line lies inside the clip rectangle and can be accepted. If one endpoint lies inside and one outside (for example CD), the line intersects the clip rectangle and it is necessary to compute the intersection point. If both endpoints are outside the clip rectangle,
the line may or may not intersect within the clip rectangle (EF, GH, and IJ), and it is necessary
to perform further calculations to determine whether there are any intersections. The brute-
force approach to clipping a line that cannot be accepted is to intersect that line with each of
the four clip-rectangle edges to see whether any intersection points lie on those edges. If that
is the case, the line cuts the clip rectangle and is partially inside. For each line and clip-
rectangle edge, we must take the two mathematically infinite lines that contain them and
intersect them. Next, it is essential to test whether this intersection point is "interior" – that
is, whether it lies within both the clip rectangle edge and the line. In that case, there is an
intersection with the clip rectangle. In the first example, intersection points G' and H' are
interior, but I' and J' are not.
In the algorithm, first of all, it is detected whether line lies inside the screen or it is outside
the screen. All lines come under any one of the following categories:
1. Visible
2. Not Visible
3. Clipping Case
1. Visible: If a line lies within the window, i.e., both endpoints of the line lie within the window, the line is visible and will be displayed as it is.
2. Not Visible: If a line lies completely outside the window, it is invisible and rejected. Such lines are not displayed.
3. Clipping Case: If the line is neither completely visible nor completely invisible, it is considered a clipping case.
The more efficient Cohen-Sutherland Algorithm performs initial tests on a line to determine
whether intersection calculations can be avoided.
End-points pairs of a line are checked for trivial acceptance or trivial reject using out code.
In case there is neither trivial acceptance nor trivial rejection, the line is divided into two segments at a clip edge. The line is iteratively clipped by testing for trivial acceptance or trivial rejection, and divided into two segments, until it is completely inside or trivially rejected.
To perform trivial accept and reject tests, the edges of the clip rectangle must be extended to divide the plane of the clip rectangle into nine regions as shown in figure 4.5. Each region
is assigned a 4- bit code determined by where the region lies with respect to the outside half
planes of the clip- rectangle edges. Each bit in the out code is set to either 1 (true) or 0 (false);
the 4 bits in the code correspond to the following conditions:
Bit 1: outside half plane of top edge, above top edge Y > Ymax
Bit 2: outside half plane of bottom edge, below bottom edge Y < Ymin
Bit 3: outside half plane of right edge, to the right of right edge X > Xmax
Bit 4: outside half plane of left edge, to the left of left edge X < Xmin
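A minimal Python sketch of the out-code computation and the trivial accept/reject tests just described (the clip-window limits used in the example are hypothetical):

```python
# Region out codes: a bit is set when the endpoint lies in the corresponding
# outside half-plane of the clip rectangle.
TOP, BOTTOM, RIGHT, LEFT = 8, 4, 2, 1

def out_code(x, y, xmin, ymin, xmax, ymax):
    code = 0
    if y > ymax: code |= TOP
    if y < ymin: code |= BOTTOM
    if x > xmax: code |= RIGHT
    if x < xmin: code |= LEFT
    return code

def trivial_tests(p1, p2, clip):
    c1 = out_code(*p1, *clip)
    c2 = out_code(*p2, *clip)
    if c1 == 0 and c2 == 0:
        return "trivially accepted"            # both endpoints inside
    if c1 & c2:
        return "trivially rejected"            # both outside the same half-plane
    return "needs intersection with a clip edge"

clip = (0, 0, 10, 10)                          # xmin, ymin, xmax, ymax
print(trivial_tests((2, 3), (8, 9), clip))     # trivially accepted
print(trivial_tests((12, 3), (15, 9), clip))   # trivially rejected
```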
In conclusion, the Cohen-Sutherland algorithm is efficient when out code testing can be done
cheaply (for example, by doing bit-wise operations in assembly language) and trivial
acceptance or rejection is applicable to the majority of line segments. (For example, large
windows - everything is inside, or small windows - everything is outside).
Liang-Barsky Algorithm
Faster line clippers have been developed that are based on analysis of the parametric
equation of a line segment, which can be written as:
x = x1 + u Δx,  y = y1 + u Δy
where Δx = x2 − x1 and Δy = y2 − y1. Using these parametric equations, Cyrus and Beck developed an algorithm that is generally more efficient than the Cohen-Sutherland algorithm. Later, Liang and Barsky independently devised an even faster parametric line-clipping algorithm. Following the Liang-Barsky approach, we first write the point-clipping conditions in a parametric way:
p1 = −Δx, q1 = x1 − xmin
p2 = Δx, q2 = xmax − x1
p3 = −Δy, q3 = y1 − ymin
p4 = Δy, q4 = ymax − y1
Any line that is parallel to one of the clipping boundaries has pk = 0 for the value of k corresponding to that boundary (k = 1, 2, 3, 4 correspond to the left, right, bottom, and top boundaries, respectively). If, for that value of k, one also finds qk >= 0, the line is inside the parallel clipping boundary.
When pk < 0, the infinite extension of the line proceeds from the outside to the inside of the infinite extension of the particular clipping boundary. If pk > 0, the line proceeds from the inside to the outside. For a nonzero value of pk, one can calculate the value of u that corresponds to the point where the infinitely extended line intersects the extension of boundary k as:
u = qk / pk
For each line, one can calculate values for parameters u1 and u2 that defines that part of the
line that lies within the clip rectangle. The value of u1 is determined by looking at the
rectangle edges for which the line proceeds from the outer side to the inner side. (p < 0). For
these edges we calculate rk = qk / pk.
The value of u1 is taken as the largest of the set consisting of 0 and the various values of r.
Conversely, the value of u2 is determined by examining the boundaries for which the line
proceeds from inside to outside (p > 0). A value of rk is calculated for each of these boundaries
and the value of u2 is the minimum of the set consisting of 1 and the calculated r values. If u1
> u2, the line is completely outside the clip window and it can be rejected. Otherwise, the end
points of the clipped line are calculated from the two values of parameter u.
This algorithm is presented in the following procedure. Line intersection parameters are initialized to the values u1 = 0 and u2 = 1. For each clipping boundary, the appropriate values for p and q are calculated and used by the function clipTest to determine whether the line can be rejected or whether the intersection parameters are to be adjusted.
When p < 0, the parameter r is used to update u1; when p > 0, the parameter r is used to update u2.
Otherwise, one can update the appropriate u parameter only if the new value results in the
shortening of the line. When p = 0 and q < 0, one can discard the line since it is parallel to and
outside of this boundary. If the line has not been rejected after all four values of p and q have
been tested, the endpoints of the clipped line are determined from values of u1 and u2.
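The procedure itself is not reproduced in this extract, so the following is only a compact Python sketch of the Liang-Barsky logic as described above (names and the example line are mine):

```python
def liang_barsky(x1, y1, x2, y2, xmin, ymin, xmax, ymax):
    """Clip a line segment; return the clipped endpoints or None if rejected."""
    dx, dy = x2 - x1, y2 - y1
    p = [-dx, dx, -dy, dy]                      # left, right, bottom, top
    q = [x1 - xmin, xmax - x1, y1 - ymin, ymax - y1]
    u1, u2 = 0.0, 1.0
    for pk, qk in zip(p, q):
        if pk == 0:                             # parallel to this boundary
            if qk < 0:
                return None                     # parallel and outside: reject
            continue
        r = qk / pk
        if pk < 0:
            u1 = max(u1, r)                     # entering: raise u1
        else:
            u2 = min(u2, r)                     # leaving: lower u2
    if u1 > u2:
        return None                             # completely outside the window
    return (x1 + u1 * dx, y1 + u1 * dy), (x1 + u2 * dx, y1 + u2 * dy)

print(liang_barsky(-5, 5, 15, 5, 0, 0, 10, 10))   # ((0.0, 5.0), (10.0, 5.0))
```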
SELF-ASSESSMENT QUESTIONS - 3
5. POLYGON CLIPPING
A polygon is usually defined by a sequence of vertices and edges. If the polygons are unfilled, line-clipping techniques are sufficient; however, if the polygons are filled, the process is more complicated. A polygon may be fragmented into several polygons in the clipping process, and the original colour must be associated with each of them. The Sutherland-Hodgeman clipping algorithm
clips any polygon against a convex clip polygon. The Weiler- Atherton clipping algorithm can
clip any polygon against any clip polygon. The polygons may even have holes.
An algorithm that clips a polygon must deal with many different cases. One case is particularly noteworthy: a concave polygon may be clipped into two separate polygons.
Overall the task of clipping seems rather complex. Each edge of the polygon must be tested
against each edge of the clip rectangle; new edges must be added, and existing edges must
be discarded, retained, or divided. Multiple polygons may result from clipping a single
polygon. It is important to have an organized method to deal with all these cases.
It is essential to note the difference between this strategy for a polygon and the Cohen-
Sutherland algorithm for clipping a line: The polygon clipper clips against four edges in
succession, whereas the line clipper tests outcode to see which edge is crossed, and clips only
when necessary.
• Polygons can be clipped against each edge of the window one at a time. Window-edge intersections, if any, are easy to find since the X or Y coordinates are already known.
• Vertices which are kept after clipping against one window edge are saved for clipping
against the remaining edges.
• Note that the number of vertices usually changes and will often increase.
• The Divide and Conquer approach is used.
Figure 4.6 shows the clip boundary that determines a visible and an invisible region. The edges from vertex i to vertex i+1 can be one of four types: both endpoints outside, outside-to-inside, both inside, or inside-to-outside, each contributing zero, one, or two vertices to the output.
Because clipping against one edge is independent of all others, it is possible to arrange the
clipping stages in a pipeline. The input polygon is clipped against one edge and any points
that are kept are passed on as input to the next stage of the pipeline. In this way four polygons
can be at different stages of the clipping process simultaneously. This is often implemented
in hardware.
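A minimal Python sketch of one such clipping stage, clipping against the left window edge; the other three edges follow the same pattern with their own inside/intersection tests, and the triangle in the example is hypothetical:

```python
def clip_against_edge(polygon, inside, intersect):
    """One Sutherland-Hodgeman stage: clip a vertex list against a single edge."""
    output = []
    for i in range(len(polygon)):
        s, p = polygon[i - 1], polygon[i]       # polygon edge from s to p
        if inside(p):
            if not inside(s):
                output.append(intersect(s, p))  # entering: add intersection first
            output.append(p)
        elif inside(s):
            output.append(intersect(s, p))      # leaving: add intersection only
    return output

def clip_left(polygon, xmin):
    def inside(v): return v[0] >= xmin
    def intersect(s, p):
        t = (xmin - s[0]) / (p[0] - s[0])
        return (xmin, s[1] + t * (p[1] - s[1]))
    return clip_against_edge(polygon, inside, intersect)

# Clip a triangle against the left window edge x = 0.
print(clip_left([(-2, 0), (4, 0), (4, 4)], xmin=0))
```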
An array of records shows the most recent point that was clipped for each clip-window boundary. The main routine passes each vertex p to the clipPoint routine for clipping against the first window boundary. If the line defined by the endpoints p and s (the previously processed vertex) crosses this window boundary, the intersection is calculated and passed to the next clipping stage. If p is inside the window, it is passed to the next clipping stage. Any point that survives clipping against all window boundaries is then entered into the output array of points. The array firstPoint stores, for each window boundary, the first point clipped against that boundary. After
all polygon vertices have been processed, a closing routine clips lines defined by the first and
last points clipped against each boundary.
Convex polygons are correctly clipped by the Sutherland-Hodgeman algorithm, but concave
polygons may be displayed with extraneous lines. This occurs when the clipped polygon
should have two or more separate sections. But since there is only one output vertex list, the
last vertex in the list is always joined to the first vertex. There are several things that can be
done to correct display concave polygons. For instance one could split the concave polygon
into two or more convex polygons and process each convex polygon separately.
Another approach to check the final vertex list for multiple vertex points is, along any clip
window boundary and correctly join pairs of vertices. Finally, one could use a more general
polygon clipper, such as wither, the Weiler- Atherton algorithm or the Weiler algorithm
described in the next section.
In this technique, the vertex-processing procedures for window boundaries are modified so
that concave polygons are displayed correctly. This clipping procedure was developed as a
method for identifying visible surfaces, and it can be applied with arbitrary polygon-clipping
regions.
The basic idea in this algorithm is that instead of always proceeding around the polygon edges as vertices are processed, one can sometimes follow the window boundaries; which path is followed depends on the polygon-processing direction and on whether the current pair of edges crosses from outside to inside or from inside to outside of the clip region.
In figure 4.7, the processing direction in the Weiler-Atherton algorithm and the resulting clipped polygon are shown for a rectangular clipping window.
SELF-ASSESSMENT QUESTIONS - 4
6. SUMMARY
Let us recapitulate the unit now. This unit started with the discussion of 2D viewing pipeline
that would allow the designer to pan the image content. We then covered the window-to-viewport coordinate transformation, in which, once the object descriptions have been transferred to the viewing reference frame, one chooses the window extents in viewing coordinates and selects the viewport limits in normalized coordinates. We also discussed the various clipping operations
like point, line and polygon. The Cohen-Sutherland algorithm is efficient when out code
testing can be done cheaply (for example, by doing bit-wise operations in assembly
language) and trivial acceptance or rejection is applicable to the majority of line segments.
The Liang-Barsky algorithm is more efficient than the Cohen Sutherland algorithm, since
intersection calculations are reduced.
7. TERMINAL QUESTIONS
8. ANSWERS
1. Clipping window
2. Normalized and device coordinates.
3. True.
4. a) scaling, translation
5. Any number of output devices can be open in a particular application, and another
window-to-viewport transformation can be performed for each open output device. This
mapping, called the workstation transformation
6. Clipping
7. Graphic primitives
8. True
9. Cohen-Sutherland Line-Clipping
10. Liang-Barsky
11. Vertices and edges
12. True
13. Divide-and-conquer
14. Concave polygons
Terminal Questions
common), polygonal or curved (complex). Clipping can be done on line, point and
polygon. For more details refer section 6.4.
4. The more efficient Cohen-Sutherland Algorithm performs initial tests on a line to
determine whether intersection calculations can be avoided. For more details refer sub
section 6.4.2.
5. Faster line clippers have been developed that are based on analysis of the parametric equation of a line segment, which we can write in parametric form. For more details refer subsection 6.4.2.
6. A polygon is usually defined by a sequence of vertices and edges. If the polygons are unfilled, line-clipping techniques are sufficient; however, if the polygons are filled, the process is more complicated. For more details refer section 6.5.
DCA3142
GRAPHICS AND MULTIMEDIA
Unit 5
3D Transformation & Viewing
Table of Contents
8 Terminal Questions - - 26
9 Answers - - 27 - 28
1. INTRODUCTION
In the previous unit, we discussed the concept of 2D viewing, including the viewing pipeline, window-to-viewport coordinate transformation, and various clipping operations like point, line and polygon clipping.
In this unit we will discuss 3D transformation, which includes basic transformations like
rotation, scaling and translation and the other transformations like reflection and shearing.
These transformations allow the developer to reposition, resize, and reorient models
without changing the base values that define them. We will also discuss the concept of
rotation about an arbitrary axis in space and reflection through an arbitrary plane. 3D
projection is a method of mapping three-dimensional points to a two- dimensional plane. We
will also look at parallel projection in this unit. This unit will conclude with a discussion on
3D viewing pipeline and coordinate system.
1.1 Objectives:
3D transformations refer to the process of changing the position, orientation, and size of a
three-dimensional object in space. These transformations are typically applied to 3D models
in computer graphics, video games, and virtual reality applications to create the illusion of
movement and depth. 3D transformations allow a developer to reposition, resize, and
reorient models without changing the base values that define them
Points are now represented with 3 numbers: <x, y, z>. This particular method of representing
3D space is the "left-handed" coordinate system. In the left-handed system the x axis
increases going to the right, the y axis increases going up, and the z axis increases going into
the page/screen. The right-handed system is similar but with the z-axis pointing in the
opposite direction.
Transformations
A static set of 3D points or other geometric shapes on a screen may not be very interesting.
A paint program is enough to produce one of these. To make a program interesting, one may
want a dynamic landscape on the screen. For this the points will have to move in the world
coordinate system, and even the point-of-view (POV) may be required to move. In short, the
objective could be to model the real world. The process of moving points in space is called transformation, and can be divided into translation, rotation, scaling, and other kinds of transformations.
2.1 Translation
For any point P (x, y, z) after translation, there is P′ (x′, y′, z′) wherein,
x′ = x + tx,
y′ = y + ty,
z′ = z + tz
Where
$$P = \begin{bmatrix} x \\ y \\ z \end{bmatrix}, \qquad P' = \begin{bmatrix} x' \\ y' \\ z' \end{bmatrix}, \qquad T = \begin{bmatrix} t_x \\ t_y \\ t_z \end{bmatrix}$$
Now we will discuss the 3D translation with an example. Take the instance of a case where
one may want to move a point “3 meters east, -2 meters up, and 4 meters north.”
Homogeneous Coordinates
$$\begin{bmatrix} x' \\ y' \\ z' \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & t_x \\ 0 & 1 & 0 & t_y \\ 0 & 0 & 1 & t_z \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} = \begin{bmatrix} x + t_x \\ y + t_y \\ z + t_z \\ 1 \end{bmatrix}$$
This shows that each of the 3 coordinates gets translated by the corresponding translation
distance.
SELF-ASSESSMENT QUESTIONS - 1
2.2 Rotation
Rotation is the process of moving a point in space in a non-linear manner. More particularly,
it involves moving the point from one position on a sphere whose center is at the origin, to
another position on the sphere. But what could be the purpose of doing something like this?
Allowing the point of view to move around is only an illusion – projection requires that the
POV be at the origin. When the user thinks the POV is moving, it actually means that all the
points are being translated in the opposite direction; and when the user thinks the POV is
looking down a new vector, all the points are being rotated in the opposite direction.
Normalization: The process of moving points in such a manner that the POV is at the origin
looking down the +Z axis is called normalization. Rotation of a point requires the need to
know the coordinates for the point, and the rotation angles.
It is important to know three different angles: how far to rotate around the X axis (YZ
rotation, or “pitch”); how far to rotate around the Y axis (XZ plane, or “yaw”); and how far to
rotate around the Z axis (XY rotation, or “roll”). Conceptually, three rotations are done
separately. First, one rotates around one axis, followed by another, then the last. The order
of rotations is important when rotations are cascaded; one must rotate first around the Z
axis, then around the X axis, and finally around the Y axis.
To show how the rotation formulas are derived, let us rotate the point
<x, y, z> around the Z axis with an angle of θ degrees as shown in figure 5.2
In the figure 5.2, it can be noted that, when one rotates around the Z axis, the Z element of
the point does not change. In fact, it is possible to just ignore the Z as its position post rotation
is known. If the Z element is ignored, then it is similar to rotating the two-dimensional point
<x, y> through the angle θ. This is the way a 2-D point is rotated as shown in figure 5.3. For
simplicity, consider the pivot at origin and rotate point P (x, y) where x = r cosФ and y = r
sinФ
If rotated by θ then:

x′ = r cos(Ф + θ) = x cos θ − y sin θ

and

y′ = r sin(Ф + θ) = x sin θ + y cos θ
Now for rotation around other axes shown in figure 5.4, cyclic permutation helps form the
equations for yaw and pitch as well: In the equations (5-1), replacing x with y and y with z
gives equations for rotation around x-axis. In the modified equations if y is replaced with z
and z with x then the equations for rotation around y-axis is arrived at.
x′ = x
y′ = y cos θ − z sin θ
z′ = y sin θ + z cos θ
$$\begin{bmatrix} x_1 & y_1 & z_1 \\ x_2 & y_2 & z_2 \\ x_3 & y_3 & z_3 \end{bmatrix}$$
This is designated as a 3×3 matrix (the first 3 is the number of rows, and the second 3 is the
number of columns). The “rows” of the matrix are the horizontal vectors that make it up; in
this case, <x1, y1, z1>, <x2, y2, z2>, and <x3, y3, z3>. In mathematics, vertical vectors are
called “columns.” In this case they are < x1, x2, x3>, <y1, y2, y3> and <z1, z2, z3>. The most
important thing to be done with a matrix is to multiply it by a vector or another matrix. One
simple rule followed when multiplying something by a matrix: multiply each column by a
multiplicand and store this as an element in the result. As mentioned earlier, each column
can be considered a vector, so when multiplying by a matrix, one is merely doing a bunch of
vector multiplies. So which vector multiply does one use-the dot product, or the cross
product? The dot product is used.
Another simple rule followed when multiplying a matrix by something is: multiply each row
by the multiplier. Again, rows are just vectors, and the type of multiplication is the dot
product.
Let us look at some examples. To begin with let us assume that there is a matrix M, and it
must be multiplied by a point < x, y, z>. The first thing known is that the vector rows of the
matrix must contain three elements (in other words, three columns). Because these rows
should be multiplied by a point using a dot product, and to do that, the two vectors must have
the same number of elements. Since there will be a dot product for each row in M, one is
likely to end up with a tuple that has one element for each row in
M. As stated earlier, we work almost exclusively with square matrices. If the requirement for
us is three columns, M will also have three rows. Let us see:
$$\langle x, y, z \rangle \cdot \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \{\langle x, y, z\rangle \cdot \langle 1, 0, 0\rangle,\ \langle x, y, z\rangle \cdot \langle 0, 1, 0\rangle,\ \langle x, y, z\rangle \cdot \langle 0, 0, 1\rangle\} = \langle x, y, z \rangle$$
Using matrices for rotation: roll (rotation about the z axis) has the matrix representation shown in figure 5.5:

$$\begin{bmatrix} x' \\ y' \\ z' \\ 1 \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta & 0 & 0 \\ \sin\theta & \cos\theta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}$$

Pitch (rotation about the x axis):

$$\begin{bmatrix} x' \\ y' \\ z' \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta & 0 \\ 0 & \sin\theta & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}$$

Yaw (rotation about the y axis):

$$\begin{bmatrix} x' \\ y' \\ z' \\ 1 \end{bmatrix} = \begin{bmatrix} \cos\theta & 0 & \sin\theta & 0 \\ 0 & 1 & 0 & 0 \\ -\sin\theta & 0 & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}$$
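A small numpy sketch of these three matrices (the example point is hypothetical):

```python
import numpy as np

def roll(theta):                      # rotate about z
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]])

def pitch(theta):                     # rotate about x
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]])

def yaw(theta):                       # rotate about y
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0, s, 0], [0, 1, 0, 0], [-s, 0, c, 0], [0, 0, 0, 1]])

p = np.array([1.0, 0.0, 0.0, 1.0])
print(roll(np.pi / 2) @ p)            # the x axis is rotated onto the y axis
```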
2.3 Scaling
x′ = x · Sx
y′ = y · Sy
z′ = z · Sz
Scaling an object with transformation changes the size of the object and repositions the
object relative to the coordinate origin. If the transformation parameters are not all equal,
relative dimensions in the object are changed.
Uniform Scaling: Here the original shape of an object is preserved with uniform scaling (Sx =
Sy = Sz)
Differential Scaling: Here the original shape of an object is not preserved because of
differential scaling (Sx <> Sy <> Sz)
$$\begin{bmatrix} S_x & 0 & 0 & 0 \\ 0 & S_y & 0 & 0 \\ 0 & 0 & S_z & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
Scaling with respect to a selected fixed position (Xf, Yf, Zf) can be represented with the
following transformation sequence:
For these three transformations one can have a composite transformation matrix by
multiplying three matrices into one as shown below.
$$\begin{bmatrix} 1 & 0 & 0 & X_f \\ 0 & 1 & 0 & Y_f \\ 0 & 0 & 1 & Z_f \\ 0 & 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} S_x & 0 & 0 & 0 \\ 0 & S_y & 0 & 0 \\ 0 & 0 & S_z & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 & -X_f \\ 0 & 1 & 0 & -Y_f \\ 0 & 0 & 1 & -Z_f \\ 0 & 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} S_x & 0 & 0 & (1 - S_x)X_f \\ 0 & S_y & 0 & (1 - S_y)Y_f \\ 0 & 0 & S_z & (1 - S_z)Z_f \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
SELF-ASSESSMENT QUESTIONS - 2
3. OTHER TRANSFORMATIONS
Apart from translation, rotation and scaling, following are the other two transformations.
3.1 Reflection
The matrix representation for reflection of points relative to the XZ plane (the y-values are flipped) is

$$\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

The matrix representation for reflection of points relative to the YZ plane (the x-values are flipped) is

$$\begin{bmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

The matrix representation for reflection of points relative to the XY plane (the z-values are flipped) is

$$\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
3.2 Shear
Z-axis Shear:

$$\begin{bmatrix} 1 & 0 & a & 0 \\ 0 & 1 & b & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
Parameters a & b can be assigned as well as real values. The effect of this transformation
matrix is to alter X and Y coordinate values by an amount that is proportional to the z value,
while leaving the z coordinate unchanged.
Y-axis Shear:
$$\begin{bmatrix} 1 & a & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & c & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
X-axis Shear:
$$\begin{bmatrix} 1 & 0 & 0 & 0 \\ b & 1 & 0 & 0 \\ c & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
SELF-ASSESSMENT QUESTIONS - 3
To perform a rotation about an arbitrary axis in space, we need to first define the axis of
rotation. The axis of rotation can be represented by a unit vector, which indicates the
direction of the line of rotation. Next, we need to specify the angle of rotation, which is the
amount of rotation around the axis.
The general case of rotation about an arbitrary axis in space frequently happens in, robotics,
animation, and simulation. Since the technique for rotation about a coordinate axis is known,
the underlying procedural idea is to make the arbitrary rotation axis coincident with one of
the coordinate axes as shown in figure 5.6. Assume an arbitrary axis in space passing through
the point (x0, y0,z0) with direction cosines (cx, cy, cz). Rotation about this axis by some angle
δ is accomplished using the following procedure
Translate so that the point (x0, y0, z0) is at the origin of the coordinate system.
➢ Perform appropriate rotations to make the axis of rotation coincident with the z-axis.
➢ Rotate about the z-axis by the angle δ.
➢ Perform the inverse of the combined rotation transformation.
➢ Perform the inverse of the translation.
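The whole procedure can be sketched compactly in code. The version below uses column vectors (the listings (5-5) to (5-8) derived below use the equivalent row-vector form), numpy is assumed, and the axis point, direction, and angle in the example are hypothetical:

```python
import numpy as np

def rotate_about_axis(point, axis_point, axis_dir, delta):
    """Rotate a point by delta about the axis through axis_point with direction axis_dir."""
    cx, cy, cz = np.asarray(axis_dir, float) / np.linalg.norm(axis_dir)
    d = np.hypot(cy, cz)
    T = np.eye(4); T[:3, 3] = -np.asarray(axis_point, float)   # move axis point to origin
    Rx = np.eye(4)
    if d != 0:                                   # axis already lies in the xz plane if d == 0
        Rx[1:3, 1:3] = [[cz / d, -cy / d], [cy / d, cz / d]]   # rotate into the xz plane
    Ry = np.eye(4); Ry[0, 0] = Ry[2, 2] = d; Ry[0, 2] = -cx; Ry[2, 0] = cx  # onto the z-axis
    c, s = np.cos(delta), np.sin(delta)
    Rz = np.eye(4); Rz[:2, :2] = [[c, -s], [s, c]]             # rotate by delta about z
    A = Ry @ Rx                                                # alignment with +z
    M = np.linalg.inv(T) @ np.linalg.inv(A) @ Rz @ A @ T       # undo alignment and translation
    return (M @ np.append(np.asarray(point, float), 1.0))[:3]

# Rotating (1, 0, 0) by 90 degrees about the z-axis through the origin gives (0, 1, 0).
print(rotate_about_axis((1, 0, 0), (0, 0, 0), (0, 0, 1), np.pi / 2))
```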
In general making an arbitrary axis passing through the origin coincident with one of the
coordinate axes requires two successive rotations about the other two coordinate axes. To
make the arbitrary rotation axis coincident with the z-axis, first rotate about the x-axis and
then about the y-axis. To determine the rotation angle , α, about the x-axis used to place the
arbitrary axis in the xz plane, first project the unit vector along the axis onto the yz plane as
shown in fig 5.7 (a). The y and z components of the projected vector are cy and cz, the
direction cosines of the unit vector along the arbitrary axis. In fig 7.7 (a) it is shown that
$$d = \sqrt{c_y^2 + c_z^2}$$
and
$$\cos\alpha = \frac{c_z}{d}, \qquad \sin\alpha = \frac{c_y}{d} \qquad (5\text{-}1) \text{ and } (5\text{-}2)$$
After rotation about the x-axis into the xz plane, the z component of the unit vector is d, and the x component is cx, the direction cosine in the x direction, as shown in fig 5.7(b). The length of the unit vector is, of course, 1. Thus, the rotation angle β about the y-axis required to make the arbitrary axis coincident with the z-axis is given by cos β = d and sin β = cx (5-3).
$$[T] = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -x_0 & -y_0 & -z_0 & 1 \end{bmatrix} \qquad (5\text{-}5)$$
$$[R_x] = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\alpha & \sin\alpha & 0 \\ 0 & -\sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & c_z/d & c_y/d & 0 \\ 0 & -c_y/d & c_z/d & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \qquad (5\text{-}6)$$
Figure 5.7: Rotation required to make the unit vector op coincident with the z-axis a)
rotation about x b) rotation about y.
Finally, the rotation about the arbitrary axis is given by a z-axis rotation matrix,
$$[R_\delta] = \begin{bmatrix} \cos\delta & \sin\delta & 0 & 0 \\ -\sin\delta & \cos\delta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \qquad \text{(5-8)}$$
In practice the angles α and β are not explicitly calculated. The elements of the rotation
matrices [Rx] and [Ry] in equation (5-4) are obtained from equations (5-1 to 5-3) at the
expense of two divisions and a square root calculation. Although developed with the
arbitrary axis in the first quadrant, these results are applicable in all quadrants.
If the direction cosines of the arbitrary axis are not known, they can be obtained knowing a second point on the axis (x1, y1, z1) by normalizing the vector from the first to the second point. Specifically, the vector along the axis from (x0, y0, z0) to (x1, y1, z1) is
[(x1 - x0)   (y1 - y0)   (z1 - z0)]
and dividing each component by the length of this vector gives the direction cosines (cx, cy, cz).
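The whole procedure can be sketched in code. The short Python/NumPy sketch below uses illustrative names and the column-vector convention (so its matrices are the transposes of the row-vector forms in equations (5-5) to (5-8)); it builds the composite rotation by angle delta about the axis through points p0 and p1.

import numpy as np

def rotate_about_axis(p0, p1, delta):
    """4x4 rotation by angle delta about the axis through p0 and p1
    (column-vector convention: p' = M @ p)."""
    p0 = np.asarray(p0, dtype=float)
    p1 = np.asarray(p1, dtype=float)
    cx, cy, cz = (p1 - p0) / np.linalg.norm(p1 - p0)   # direction cosines
    d = np.hypot(cy, cz)

    T = np.eye(4);  T[:3, 3] = -p0                     # translate p0 to the origin
    Ti = np.eye(4); Ti[:3, 3] = p0                     # inverse translation

    Rx = np.eye(4)                                     # rotate the axis into the xz plane
    if d > 1e-12:                                      # if d ~ 0 the axis already lies on x
        Rx[1, 1], Rx[1, 2] = cz / d, -cy / d
        Rx[2, 1], Rx[2, 2] = cy / d,  cz / d

    Ry = np.eye(4)                                     # rotate the axis onto the z axis
    Ry[0, 0], Ry[0, 2] = d, -cx
    Ry[2, 0], Ry[2, 2] = cx, d

    c, s = np.cos(delta), np.sin(delta)
    Rz = np.eye(4)                                     # rotation about the z axis by delta
    Rz[0, 0], Rz[0, 1] = c, -s
    Rz[1, 0], Rz[1, 1] = s,  c

    # apply the steps, then unwind them (orthonormal inverses are transposes)
    return Ti @ Rx.T @ Ry.T @ Rz @ Ry @ Rx @ T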
5. REFLECTION THROUGH AN ARBITRARY PLANE
The reflection matrices given earlier cause reflections through the x = 0, y = 0, and z = 0 coordinate planes, respectively. Often it is necessary to reflect an object through a plane other than one of these. This can be accomplished using a procedure incorporating the previously defined simple transformations. One possible procedure is:
➢ Translate a known point P, which lies in the reflection plane, to the origin of the coordinate system.
➢ Rotate the normal vector to the reflection plane at the origin until it is coincident with the +z-axis (Eqs. 5-6 and 5-7); this makes the reflection plane the z = 0 coordinate plane.
➢ After applying the above transformations to the object, reflect the object through the z = 0 coordinate plane.
➢ Perform the inverse transformations to those given above to achieve the desired result.
Here the matrices [T], [Rx], and [Ry] are given by equations (5-5) to (5-7), respectively; (x0, y0, z0) = (px, py, pz), the components of the point P in the reflection plane; and (cx, cy, cz) are the direction cosines of the normal to the reflection plane.
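The same result can also be obtained directly, without building the intermediate rotation matrices, by using a Householder reflection about the plane's unit normal. The Python/NumPy sketch below (illustrative names, column-vector convention) is one such shortcut, not the step-by-step procedure above.

import numpy as np

def reflect_through_plane(point_on_plane, normal):
    """4x4 matrix that reflects points through the plane containing
    point_on_plane with the given normal (column-vector convention)."""
    p = np.asarray(point_on_plane, dtype=float)
    n = np.asarray(normal, dtype=float)
    n = n / np.linalg.norm(n)

    R = np.eye(3) - 2.0 * np.outer(n, n)   # Householder reflection about the plane
    M = np.eye(4)
    M[:3, :3] = R
    M[:3, 3] = p - R @ p                   # keeps the chosen plane point fixed
    return M

# Example: reflect through the plane z = 1 (normal along +z)
M = reflect_through_plane([0, 0, 1], [0, 0, 1])
print(M @ np.array([0, 0, 3, 1]))          # -> [0, 0, -1, 1]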
SELF-ASSESSMENT QUESTIONS - 4
10. To make arbitrary rotation axis coincident with the z-axis, first rotate about
the____________ and then about the ________ .
11. The simple reflection transformations cause reflection through the x = ____, y = ____, z = ____
coordinate planes.
a. 0, 0, 0 b. 0, 0, 1 c. 1, 1, 0 d. 1, 1, 1
$$\text{Parallel } M = \begin{bmatrix} 1 & 0 & L_1\cos\varphi & 0 \\ 0 & 1 & L_1\sin\varphi & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
Perspective projection is a technique that simulates the way the human eye perceives depth
and distance. In this technique, objects that are farther away from the viewer appear smaller
than objects that are closer. The projection is achieved by drawing lines from the viewer's
eye through each vertex of the 3D object and intersecting them with a projection plane. The
resulting image has the appearance of depth and perspective.
Orthographic projection is a technique that does not take into account the viewer's
perspective. In this technique, the 3D object is projected onto a 2D plane along parallel lines
that are perpendicular to the plane. This results in an image that is flat and does not have the
appearance of depth or perspective.
In addition to perspective and orthographic projection, other techniques such as ray tracing
and rasterization can be used for 3D viewing. Ray tracing involves tracing the path of light
rays as they interact with objects in the scene to create a realistic image. Rasterization
involves converting the 3D object into a series of pixels that can be displayed on a 2D surface.
Viewing a scene in 3D is much more complicated than 2D viewing. In the latter, the viewing
plane on which a scene is projected from WCs is basically the screen, except for its
dimensions. In 3D, one can choose different viewing planes, directions to view from and
positions to view from. There is also a choice in how to project from the WC scene onto the
viewing plane. In the process of viewing a 3D scene, a coordinate system for viewing is set
up, which holds the viewing or “camera” parameters: position and orientation of a viewing
or projection plane (~ camera “film”).
To establish the 3D viewing reference frame, we first select a world-coordinate position P0 = (x0, y0, z0) for the viewing origin. This is also called the view point or viewing position. We then choose a view-up vector V, which defines the yv direction, and a vector giving the direction along which viewing is done, which defines the zv direction. The view plane, or projection plane, is usually taken as a plane perpendicular to the zv-axis, set at a position zvp from the origin. Its orientation is specified by choosing a view-plane normal vector N, which also specifies the direction of the positive zv axis. Figure 5.14 shows the right-handed systems typically employed in this setup.
The direction of viewing is usually taken as the −N (or −zv) direction for right-handed coordinate systems (or the opposite direction for left-handed coordinate systems).
Having chosen N, the unit normal vector n is formed for the zv direction; the unit vector u is then formed for the xv direction, and V is adjusted to give a new unit vector v for the yv direction. Cross-products are used so that each vector is orthogonal to the plane of the other two:
$$\vec n = \frac{\vec N}{|\vec N|} = (n_x, n_y, n_z)$$
$$\vec u = \frac{\vec V \times \vec n}{|\vec V \times \vec n|} = (u_x, u_y, u_z)$$
$$\vec v = \vec n \times \vec u = (v_x, v_y, v_z)$$
Finally, the view plane is chosen as a plane perpendicular to n (that is, to the zv-axis), at some point on the axis at a chosen distance from the view-frame origin.
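A minimal Python/NumPy sketch of these cross-product formulas (the function name is illustrative):

import numpy as np

def viewing_frame(N, V):
    """Return the unit axes (u, v, n) of the viewing reference frame,
    given the view-plane normal N and the view-up vector V."""
    n = N / np.linalg.norm(N)          # z_v direction
    u = np.cross(V, n)
    u = u / np.linalg.norm(u)          # x_v direction
    v = np.cross(n, u)                 # y_v direction (already unit length)
    return u, v, n

# Example: normal along +z with +y as "up" gives the usual axes
u, v, n = viewing_frame(np.array([0.0, 0.0, 1.0]), np.array([0.0, 1.0, 0.0]))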
SELF-ASSESSMENT QUESTIONS - 5
7. SUMMARY
This unit provided information about the 3D transformations rotation, translation, and scaling, and about the other transformations, reflection and shear. Translation is used to move a point, or a set of points, linearly in space. Rotation moves a point in space along a circular path about an axis. Scaling an object changes its size and repositions the object relative to the coordinate origin. Shearing transformations can be used to modify object shapes. Rotation about an arbitrary axis in space frequently occurs in cases like robotics, animation, and simulation. Reflection through an arbitrary plane generalizes the simple reflections through the x = 0, y = 0, and z = 0 coordinate planes. Projection is a way of converting an object in an N-dimensional system to N−1 dimensions. The unit also discussed parallel projection with its types, orthographic and oblique projection, and focused on 3D viewing and the steps involved from the actual construction of a 3D scene to its ultimate depiction on a device.
8. TERMINAL QUESTIONS
1. Explain 3D transformations.
2. Discuss the other transformations of 3D.
3. Explain rotation about an arbitrary axis in space.
4. Explain the reflection through an arbitrary plane.
5. Explain 3D viewing.
9. ANSWERS
1. point-of-view
2. Translation
3. A point is similar to its 2D counterpart; we simply add an extra component, Z, for the 3rd
axis
4. Rotation
5. Ordered set of numbers
6. uniform scaling
7. True
8. reflection axis.
9. Modify
10. x-axis y-axis
11. a) 0, 0,0
12. view point
Terminal Questions
1. A static set of 3D points or other geometric shapes on screen is not very interesting. You could just use a paint program to produce one of these. To make your program interesting, you will want a dynamic landscape on the screen. For more details refer section 5.2.
2. Reflection and shear are the two other 3D transformations. These transformations allow the developer to reposition, resize, and reorient models without changing the base values that define them. For more details refer section 5.3.
3. Since the technique for rotation about a coordinate axis is known, the underlying
procedural idea is to make the arbitrary rotation axis coincident with one of the
coordinate axes. For more details refer section 5.4.
4. The simple reflection matrices cause reflections through the x = 0, y = 0, and z = 0 coordinate planes, respectively. Often it is necessary to reflect an object through a plane other than one of these. For more details refer section 5.5.
5. Viewing a scene in 3D is much more complicated than 2D viewing, where in the latter
the viewing plane on which a scene is projected from WCs is basically the screen, except
for its dimensions. For further details refer section 5.6.
DCA3142
GRAPHICS AND MULTIMEDIA
Unit 6
Curves
Table of Contents
8 Terminal Questions - - 22
9 Answers - - 22 - 23
1. INTRODUCTION
Every curve or surface can be defined by a set of parametric functions. For instance (x, y, z)
coordinates of the points of the curve can be given as:
x= X (t), y = Y (t), z=Z (t), t being the parameter and x, y, z being polynomial functions in t. If
x, y and z are 1st degree polynomials, a line segment will be defined. In that case, two knowns
only (that is two points or a point and a slope) will be sufficient to define this curve. If x, y, z
are 2nd degree polynomials, a parabola segment is defined and 3 knowns will be necessary
to describe it (that is, 3 points or two points and a tangent). For higher degree polynomials,
describing the curve will involve more knowns. This number of knowns is what is called the
order of the curve, and is always the degree of the polynomials plus 1. Lower degree
polynomials describe very restrictive curves, being either lines or parabolas, which are
always planar curves. Various approaches have been conceptualized by mathematicians, for
instance, Bezier curves, B-Spline curves, Non uniform rational B-spline curves and surfaces,
which are discussed in this unit.
1.1 Objectives:
Every graphics system has some form of primitive for drawing lines. Using these primitives one can
draw many complex shapes. However, as these shapes get more complex and finely detailed,
so does the data needed to describe them accurately.
The worst case scenario is the curve. A curve can be described by a finite number of short
straight segments. However, on close inspection, this is only an approximation. To get a
better approximation one can use more segments per unit length. This increases the amount
of data required to store the curve and makes it difficult to manipulate. It is necessary to have
a method for representing these curves in a mathematical fashion. Ideally, descriptions will
be:
➢ Reproducible - the representation should give the same curve every time;
➢ Computationally Quick;
➢ Easy to manipulate, especially important for design purposes;
➢ Flexible
➢ Easy to combine with other segments of curve.
The types of curve that will be discussed fall into two broad categories: interpolating and
approximation curves. Interpolating curves pass through the points used to describe them,
whereas an approximating curve only comes near those points. The exact definition of ‘near’ will
be discussed later. The points through which the curve passes are known as knots; the curve
described by the equation is often referred to as a spline. This term originated in manual
design, where a spline is a thin strip, held in place by weights to create a curve which could
then be traced. In the same way knots are now used to describe a curve.
Everyone who has ever tried to apply simple linear interpolation to find a value between
pairs of data points will be aware that such attempts are unlikely to provide reliable results
if the data being used is anything other than broadly linear. In an attempt to deal with
inherent non-linearity, the next step usually involves some sort of polynomial interpolation.
This generally leads to far more stable and robust interpolation and fitting, but is also
potentially a difficult area as the end points, monotonicity, convexity and continuity of
derivatives, all make their influences felt, in often contradictory ways. One of the most
popular ways of dealing with these issues is to use splines. In their most general form, splines
can be considered as a mathematical model that associate a continuous representation of a
curve or surface with a discrete set of points in a given space. Spline fitting is an extremely
popular form.
The following properties can be regarded as a convenient set of features against which the usefulness of various spline types can be measured:
curvature, generally referred to as shape parameters), which should allow the user to
pull the spline locally toward one or more control points in an intuitive fashion.
• Existence of Refinement Algorithms: The spline model should lend itself to the use
of refinement or subdivision techniques which serve to increase the degree of freedom
for a spline without modifying its shape.
• Conic Representation: The spline model should permit the representation of conic
sections and therefore support a wide range of curves and surfaces such as circles,
ellipses, spheres and cylinders etc.
• Approximation/Interpolation: Spline models should provide both approximation
and interpolation splines in a unified formulation.
SELF-ASSESSMENT QUESTIONS - 1
3. BEZIER CURVES
Bezier curves are a mathematical representation of smooth curves often used in computer
graphics, animation, and other related fields. They were named after French engineer Pierre
Bezier, who first used them in the design of automobile bodies for Renault in the 1960s.
A Bezier curve is defined by a set of control points that define its shape. These control points
are used to calculate the curve's position and shape by interpolating between them. The
curve itself is defined as a polynomial function of one variable, which can be represented
using a parametric equation.
Bezier curves can be of different orders, meaning they can have different numbers of control
points. For example, a quadratic Bezier curve has three control points, while a cubic Bezier
curve has four control points. Higher-order curves can also be created by adding more
control points.
Bezier curves are often used in computer graphics software to create smooth shapes and
curves, such as lines, curves, and arcs. They are also used in animation to create motion paths
for objects and characters. Additionally, they are widely used in vector graphics editors, such
as Adobe Illustrator and Inkscape, to create shapes and curves.
Bezier curves are defined using four control points, known as knots. Two of these are the
end points of the curve, while the other two effectively define the gradient at the end points.
These two points control the shape of the curve. The curve itself is a blend of the knots. This is a recurring theme of approximation curves: a curve is defined as a blend of the values of several control points. Figure 6.1 shows a Bezier curve and how the shape of the curve is affected by changing the knots.
Bezier curves are more useful than any other type mentioned so far; however, they still do
not manage much local control. Increasing the number of control points does lead to slightly
more complex curves, but as is evident from the figure 6.2, the detail suffers due to the nature
of blending of all the curve points together.
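To make the idea of blending concrete, a Bezier point can be evaluated with de Casteljau's algorithm, which repeatedly interpolates between neighbouring control points. The short Python sketch below (illustrative names) works for any number of control points, including the four-knot cubic case described above.

def bezier_point(control_points, t):
    """Evaluate a Bezier curve at parameter t in [0, 1] using de Casteljau's
    algorithm; control_points is a list of (x, y) tuples."""
    pts = [tuple(p) for p in control_points]
    while len(pts) > 1:
        # linearly interpolate between each pair of neighbouring points
        pts = [tuple((1 - t) * a + t * b for a, b in zip(p, q))
               for p, q in zip(pts[:-1], pts[1:])]
    return pts[0]

# Example: two end points and two gradient-controlling knots
curve = [bezier_point([(0, 0), (1, 2), (3, 2), (4, 0)], i / 20) for i in range(21)]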
SELF-ASSESSMENT QUESTIONS - 2
4. B-SPLINE CURVES
B-spline curves are a type of mathematical curve that is commonly used in computer
graphics, CAD (computer-aided design), and other areas where curves need to be accurately
represented. B-splines are a generalization of Bezier curves and can represent a wide variety
of curve shapes.
The "B" in B-spline stands for "basis," which refers to the set of functions that are used to
construct the curve. These basis functions are typically piecewise polynomial functions of a
certain degree, such as quadratic or cubic. The degree of the basis functions determines the
smoothness of the resulting curve.
To construct a B-spline curve, a set of control points is used to define the curve's shape. The
curve is then created by smoothly interpolating between the control points using the basis
functions. This results in a smooth and continuous curve that passes through each of the
control points.
B-spline curves have a number of advantages over other types of curves. For example, they
are more flexible than Bezier curves and can represent a wider range of curve shapes. They
are also more efficient to calculate, as they can be easily subdivided and modified without
changing the shape of the curve.
B-spline curves have a wide range of applications, including computer graphics, animation,
industrial design, and architecture. They are also used in the automotive and aerospace
industries to design curves for car bodies, airplane wings, and other complex shapes.
The main problem with Bezier curves is their lack of local control. Simply increasing the
number of control points adds little local control to the curve. This is due to the nature of the
blending used for Bezier curves. They combine all the points to create the curve. The obvious
solution is to combine only those points nearest to the current parameter. For this the points
can be defined as lying in a parametric space at equal intervals. Figure 6.3 shows the
positions of the knots.
These points are labeled internally from 0 to (number of points)-1. To calculate the curve at
any parameter t, a Gaussian curve is placed over the parameter space. This curve is actually
an approximation of a Gaussian as shown in the figure 6.4; it does not extend to infinity at
each end, just to +/- 2 by using the following equations:
This curve peaks at a value of 2/3, and at +/- 1 its value is 1/6. When this curve is placed over the array of control points, it gives the weighting of each point. As the curve is drawn, each point will in turn become the most heavily weighted, thereby gaining local control. The resulting curve is shown in the corresponding figure; notice how the curve seems to go haywire at either end.
At P0, the Gaussian curve covers points from -1 to 1 (at points -2 and 2 the Gaussian weight
is zero). The point at -1 is not defined, so the curve has an undefined value. In this example
it is being pulled towards the origin.
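The approximated "Gaussian" described above behaves exactly like the uniform cubic B-spline basis function (value 2/3 at 0, 1/6 at +/-1, zero beyond +/-2). Assuming that basis, a minimal Python sketch of the blending is:

def cubic_bspline_basis(t):
    """Uniform cubic B-spline basis: 2/3 at t = 0, 1/6 at t = +/-1, 0 for |t| >= 2."""
    t = abs(t)
    if t < 1.0:
        return 2.0 / 3.0 - t * t + 0.5 * t ** 3
    if t < 2.0:
        return (2.0 - t) ** 3 / 6.0
    return 0.0

def bspline_point(knots, u):
    """Blend the knots nearest to parameter u using the basis above."""
    i0 = int(u)
    x = y = 0.0
    for i in range(i0 - 1, i0 + 3):
        if 0 <= i < len(knots):                 # knots outside the range are undefined,
            w = cubic_bspline_basis(u - i)      # which is why the ends misbehave
            x += w * knots[i][0]
            y += w * knots[i][1]
    return x, y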
Non-Uniform Rational B-Spline (NURBS) curves are a type of B-spline curve that adds
additional flexibility by allowing for non-uniform knot vectors and weights.
In a NURBS curve, each control point is assigned a weight, which determines its influence on
the shape of the curve. The weights can be used to create curves with varying thickness or to
add more control over the shape of the curve. Additionally, the knot vector can be non-
uniform, meaning that the basis functions are not evenly spaced along the curve. This allows
for greater control over the shape of the curve and can be used to create more complex
shapes.
A NURBS curve (Non Uniform Rational B-Spline Curve) is defined by its order, a set of
weighted control points, and a knot vector. NURBS curves and surfaces are generalizations
of both B-splines and Bézier curves and surfaces, the primary difference being the weighting
of the control points which makes NURBS curves rational (non-rational B-splines are a
special case of rational B-splines). Whereas Bézier curves evolve in only one parametric direction, usually called s or u, NURBS surfaces evolve in two parametric directions, called s and t or u and v.
By evaluating a NURBS curve at various values of the parameter, the curve can be
represented in Cartesian two- or three-dimensional space. Likewise, by evaluating a NURBS
surface at various values of the two parameters, the surface can be represented in Cartesian
space.
• They are invariant under affine as well as perspective transformations: operations like
rotations and translations can be applied to NURBS curves and surfaces by applying
them to their control points.
• They offer one common mathematical form for both standard analytical shapes
(example conics) and free-form shapes.
• They provide the flexibility to design a large variety of shapes.
• They reduce the memory consumption when storing shapes (compared to simpler
methods).
• They can be evaluated reasonably quickly by numerically stable and accurate
algorithms.
In the next sections, NURBS is discussed in one dimension (curves). It should be noted that
all of this can be generalized to two or even more dimensions.
Control Points
The control points determine the shape of the curve. Typically, each point of the curve is
computed by taking a weighted sum of a number of control points. The weight of each point
varies according to the governing parameter. For a curve of degree d, the weight of any
control point is only nonzero in d+1 intervals of the parameter space. Within those intervals,
the weight changes according to a polynomial function (basis functions) of degree d. At the
boundaries of the intervals, the basis functions go smoothly to zero, the smoothness being
determined by the degree of the polynomial. For example, the basis function of degree one is a triangle function. It rises from zero to one, then falls to zero again. While it rises, the basis
function of the previous control point falls. In that way, the curve interpolates between the
two points, and the resulting curve is a polygon, which is continuous, but not differentiable
at the interval boundaries, or knots. Higher degree polynomials have correspondingly more
continuous derivatives. It must be noted that within the interval the polynomial nature of
the basis functions and the linearity of the construction make the curve perfectly smooth, so
it is only at the knots that discontinuity can arise.
The fact that a single control point only influences those intervals where it is active is a highly
desirable property, known as local support. In modeling, it allows the changing of one part
of a surface while keeping other parts equal.
Adding more control points allows better approximation to a given curve, although only a
certain class of curves can be represented exactly with a finite number of control points.
NURBS curves also feature a scalar weight for each control point. This allows for more
control over the shape of the curve without unduly raising the number of control points. In
particular, it adds conic sections like circles and ellipses to the set of curves that can be
represented exactly. The term rational in NURBS refers to these weights.
Knot Vector
The knot vector is a sequence of parameter values that determines where and how the
control points affect the NURBS curve. The number of knots is always equal to the number
of control points plus curve degree plus one. The knot vector divides the parametric space
in the intervals mentioned earlier, usually referred to as knot spans. Each time the
parameter value enters a new knot span, a new control point becomes active, while an old
control point is discarded. It follows the rule that the values in the knot vector should be in
non decreasing order, so (0, 0, 1, 2, 3, 3) is valid while (0, 0, 2, 1, 3, 3) is not.
Consecutive knots can have the same value. This then defines a knot span of zero length,
which implies that two control points are activated at the same time (and of course two
control points become deactivated). This has an impact on continuity of the resulting curve
or its higher derivatives. For instance, it allows the creation of corners in an otherwise
smooth NURBS curve. A number of coinciding knots is sometimes referred to as a knot with
a certain multiplicity. Knots with multiplicity two or three are known as double or triple
knots respectively. The multiplicity of a knot is limited to the degree of the curve, since a
higher multiplicity would split the curve into disjoint parts and it would leave control points
unused. For first-degree NURBS, each knot is paired with a control point.
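These two structural rules (non-decreasing knot values, and a knot count equal to the number of control points plus the degree plus one) are easy to check; a small illustrative Python sketch:

def is_valid_knot_vector(knots, num_control_points, degree):
    """Check the structural rules for a B-spline/NURBS knot vector."""
    non_decreasing = all(a <= b for a, b in zip(knots, knots[1:]))
    right_length = len(knots) == num_control_points + degree + 1
    return non_decreasing and right_length

print(is_valid_knot_vector([0, 0, 1, 2, 3, 3], 4, 1))   # True, as in the text
print(is_valid_knot_vector([0, 0, 2, 1, 3, 3], 4, 1))   # False: values decrease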
Using the definitions of the basis functions $N_{i,n}$, a NURBS curve takes the following form:
$$C(u) = \sum_{i=1}^{k} \frac{N_{i,n}(u)\, w_i}{\sum_{j=1}^{k} N_{j,n}(u)\, w_j}\, P_i = \frac{\sum_{i=1}^{k} N_{i,n}(u)\, w_i\, P_i}{\sum_{i=1}^{k} N_{i,n}(u)\, w_i}$$
In this, k is the number of control points Pi and wi is the corresponding weights. The
denominator is a normalizing factor that evaluates to one if all weights are one. This can be
seen from the partition of unity property of the basis functions. It is customary to write this
as,
$$C(u) = \sum_{i=1}^{k} R_{i,n}(u)\, P_i, \qquad \text{where} \qquad R_{i,n}(u) = \frac{N_{i,n}(u)\, w_i}{\sum_{j=1}^{k} N_{j,n}(u)\, w_j}$$
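Putting the pieces together, the rational blend can be evaluated directly once basis-function values are available. The Python sketch below (illustrative names) uses the standard Cox-de Boor recursion, which is one common way to compute the basis, and then forms the weighted average above; it assumes u lies inside the valid knot span so the denominator is non-zero.

def bspline_basis(i, n, u, knots):
    """Cox-de Boor recursion for the B-spline basis function N_{i,n}(u)."""
    if n == 0:
        return 1.0 if knots[i] <= u < knots[i + 1] else 0.0
    left = right = 0.0
    if knots[i + n] != knots[i]:
        left = (u - knots[i]) / (knots[i + n] - knots[i]) * bspline_basis(i, n - 1, u, knots)
    if knots[i + n + 1] != knots[i + 1]:
        right = (knots[i + n + 1] - u) / (knots[i + n + 1] - knots[i + 1]) \
                * bspline_basis(i + 1, n - 1, u, knots)
    return left + right

def nurbs_point(control_points, weights, knots, degree, u):
    """Evaluate C(u) as the rational, weighted blend of the control points."""
    k = len(control_points)
    Nw = [bspline_basis(i, degree, u, knots) * weights[i] for i in range(k)]
    denom = sum(Nw)
    x = sum(Nw[i] * control_points[i][0] for i in range(k)) / denom
    y = sum(Nw[i] * control_points[i][1] for i in range(k)) / denom
    return x, y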
SELF-ASSESSMENT QUESTIONS – 3
6. INTRODUCTION TO SURFACES
When comparing mathematics in two and three dimensions, there are many similarities.
Very often, the techniques used in the simpler two-dimensional case can be easily extended
to cover three dimensions. Some of the curve representations presented in the previous
sections easily extend to three dimensions and can therefore represent surfaces.
When creating a curve, a single parametric dimension is used; points are defined within this dimension, and these are used to create the curve. For a surface, two orthogonal parametric dimensions of points are required. These form a rectangular mesh. At any point in parametric space, two blending functions are used, one in each parametric direction. For every knot defined, the product of the two blending functions is calculated and this is the weight given to that knot. The sum of all the weights will still be one, as it was for a
curve. The most commonly used methods of representing curved surfaces in computing are
Bézier surfaces and B-spline surfaces and these are discussed here.
Bezier surfaces are a type of mathematical surface that are commonly used in computer
graphics, CAD (computer-aided design), and other areas where surfaces need to be
accurately represented. Bezier surfaces are a generalization of Bezier curves and can
represent a wide variety of surface shapes.
Like Bezier curves, Bezier surfaces are defined by a set of control points. The surface is then
created by smoothly interpolating between the control points using a set of basis functions.
The basis functions used for Bezier surfaces are typically tensor product polynomials of a
certain degree, such as quadratic or cubic.
To construct a Bezier surface, a rectangular grid of control points is used to define the
surface's shape. The surface is then created by smoothly interpolating between the control
points using the basis functions. This results in a smooth and continuous surface that passes
through each of the control points.
Bezier surfaces have a number of advantages over other surface representations. For example, they are flexible and can represent a wide range of surface shapes. They are also efficient to calculate, as they can be easily subdivided and modified without changing the shape of the surface.
Bezier surfaces have a wide range of applications, including computer graphics, animation,
industrial design, and architecture. They are also used in the automotive and aerospace
industries to design surfaces for car bodies, airplane wings, and other complex shapes.
To create a Bézier surface, a mesh of Bézier curves is blended using the blending function
$$P(u, v) = \sum_{j=0}^{m} \sum_{k=0}^{n} P_{j,k}\, B_{j,m}(u)\, B_{k,n}(v)$$
where j and k index points in parametric space and $P_{j,k}$ represents the location of the knots in real space. The Bézier blending functions specify the weighting of a particular knot; they are the Bernstein polynomials
$$B_{k,n}(u) = C(n, k)\, u^{k} (1 - u)^{n-k}$$
where C(n, k) represents the binomial coefficients. When u = 0, the function is one for k = 0 and zero for all other points. When two orthogonal parameters are combined, a Bézier curve can be found along each edge of the surface, as defined by the points along that edge. Bézier surfaces are useful for interactive design and were first applied to car body design.
B-spline surfaces are a type of mathematical surface that are commonly used in computer
graphics, CAD (computer-aided design), and other areas where surfaces need to be
accurately represented. B-spline surfaces are a generalization of B-spline curves and can
represent a wide variety of surface shapes.
To construct a B-spline surface, a rectangular grid of control points is used to define the
surface's shape. The surface is then created by smoothly interpolating between the control
points using a set of basis functions. The basis functions used for B-spline surfaces are
typically tensor product polynomials of a certain degree, such as quadratic or cubic.
Like B-spline curves, B-spline surfaces can have a non-uniform knot vector and weights
assigned to each control point. The knot vector determines the distribution of the basis
functions along the surface, while the weights determine the influence of each control point
on the shape of the surface. This added flexibility allows for more control over the shape of
the surface and can be used to create more complex shapes.
B-spline surfaces have a number of advantages over other surface representations. For example, they are more flexible than Bézier surfaces and can represent a wider range of surface shapes. They are also efficient to calculate, as they can be easily subdivided and modified without changing the shape of the surface.
B-spline surfaces have a wide range of applications, including computer graphics, animation,
industrial design, and architecture. They are also used in the automotive and aerospace
industries to design surfaces for car bodies, airplane wings, and other complex shapes. B-
spline surfaces are supported by many popular CAD and 3D modeling software packages,
including AutoCAD, SolidWorks, and Rhino.
A B-Spline surface can be created using a similar method as the Bézier surface. For B-Spline
curves, two phantom knots are used to clamp the ends of the curve. For a surface, phantom
knots will be needed all around the knots as shown below for an M+1 by N+1 knot surface.
There are two extra rows and two extra columns of knots in parametric space surrounding
the real knots, where these knots are placed determines the shape of the surface at the edges.
The method described here gives similar results to the method used for Bézier surfaces; that
is, the edges of the surface form a B-Spline curve of the edge knots. This means some of the
boundary conditions are:
for 0 <= m <= M and 0 <= n <= N. These conditions are essentially the same as in the two-
dimensional case. It means that the weighting of a sample taken at the boundary m=0 is
dependent only on knots along the m=0 boundary (the phantom knots at m=-1 balance out
the real knots at m=1). The remaining boundary conditions make the surface corners and
the corner knots coincide. The co-ordinate of the corner as set by P0,0 (and hence the
parametric knot at {-1,-1}) is
This gives us a surface that interpolates the corner knots and forms B- Spline curves down
each side.
SELF-ASSESSMENT QUESTIONS – 4
7. SUMMARY
Let us recapitulate the contents of this unit. The types of curves discussed fall into two broad categories: interpolating and approximating curves. Interpolating curves pass through the points used to describe them, whereas an approximating curve only comes near those points. Parametric equations can be used to generate curves that are more general than explicit equations of the form y = f(x). We also discussed Bezier curves, which are defined using four control points, known as knots; two of these are the end points of the curve, while the other two effectively define the gradient at the end points. B-Spline curves overcome the shortfall of Bezier curves, namely their lack of local control. NURBS (Non-Uniform Rational B-Spline) curves can be used to represent a curve in Cartesian two- or three-dimensional space, and we explored the reasons for the usefulness of NURBS curves and surfaces. We concluded this unit with a discussion of Bezier and B-Spline surfaces.
8. TERMINAL QUESTIONS
9. ANSWERS
1. Curve
2. interpolating and approximation curves. 3. y=f(x).
3. True
4. parametric equations
5. Bezier curves are defined using four control points
6. True
7. Lack of local control
8. Non Uniform Rational B-Spline Curve.
9. False, Which reduces the memory consumption
10. Curve
11. knot vector.
12. Bézier , B-spline
13. True.
Terminal Questions
1. Everyone who has ever tried to apply simple linear interpolation to find a value between pairs of data points will be aware that such attempts are unlikely to provide reliable results unless the data is broadly linear. For more details refer section 6.2.
2. Bezier curves are defined using four control points, known as knots. Two of these are
the end points of the curve, while the other two effectively define the gradient at the
end points. For more details refer section 6.3.
3. NURBS curves and surfaces are generalizations of both B-splines and Bézier curves and
surfaces, the primary difference being the weighting. For more details refer section 6.4.
4. The most commonly used methods of representing curved surfaces in computing are Bézier surfaces and B-spline surfaces. For more details refer section 6.6.
DCA3142
GRAPHICS AND MULTIMEDIA
Unit 7
Hidden Surfaces
Table of Contents
1. INTRODUCTION
In the previous unit, we discussed curve and surface representation, including Bezier curves, B-Spline curves, and rational B-Spline curves.
In this unit, we will discuss hidden surfaces, the determination of hidden surfaces, and visible surface detection methods. We will also discuss the z-buffer algorithm, the depth-sort algorithm, the back-face detection method, the BSP tree method, the scan-line method, and so on. We will conclude this unit with a discussion of fractal geometry and wire-frame methods.
1.1 Objectives:
There are many techniques for hidden surface determination. These are fundamentally an
exercise in sorting, and usually vary as per the order in which the sorting is done and manner
in which the problem is subdivided. Sorting large quantities of graphics primitives is usually
done by the divide and conquer method.
• Hidden surface removal (HSR) determines which polygons are nearest to the viewer at
a given pixel
• Key criterion: A point P occludes a point Q (and thus Q is “hidden”) if P and Q lie on the
same ray (line) from the camera or eye and P is between the camera location and Q as
shown in figure 7.1.
• Calculating this ray is tough with a frustum, but normalizing that frustum to a cube
(which the projection matrix does) transforms the oblique rays to straightforward
parallelism with the z axis.
• Thus, at the earliest, HSR happens after the projection matrix is applied, which explains the separation between the projection and viewport transformations.
SELF-ASSESSMENT QUESTIONS -1
Before moving on to visible surface detection, the following terms shall be reviewed:
1. Modeling Transformation:
In this stage, objects are transformed in their local modeling coordinate systems into a
common coordinate system called the world coordinates.
2. Perspective Transformation:
x' = x * (d/z),  y' = y * (d/z),
where (x, y, z) is the original position of a vertex, (x', y', z') is the transformed position of the vertex, and d is the distance of the image plane from the center of projection.
3. Clipping:
In 3D clipping, all objects and parts of objects which are outside of the view volume are
removed. After perspective transformation has been done, the 6 clipping planes, which
form the parallelepiped, are parallel to the 3 axes and hence clipping is straight
forward. Hence the clipping operation can be performed in 2D. For example, one may
first perform the clipping operations on the x-y plane and then on the x-z plane.
The goal is to identify the parts of a scene that are visible from a chosen viewing position; surfaces which are obscured by other opaque surfaces along the line of sight (projection) are invisible to the viewer.
Characteristics of Approaches:
Considerations:
– Available equipment
– Static or animated
1. Object-space Methods
➢ Compare objects and parts of objects to each other within the scene definition to
determine which surfaces, as a whole, should be labelled as visible:
Begin
➢ Determine those parts of the object whose view is unobstructed by other parts of it, or
any other object with respect to the viewing specification.
➢ Draw those parts in the object color.
End
➢ Compare each object with all the other objects to determine the visibility of the object
parts.
➢ If there are n objects in the scene, complexity = O(n2)
➢ Calculations are performed at the resolution in which the objects are defined (only
limited by the computation hardware).
➢ Process is unrelated to display resolution or the individual pixel in the image and the
result of the process is applicable to different display resolutions.
➢ Display is more accurate but computationally more expensive as compared to image
space methods because step 1 is typically more complex. For instance, it could be due
to the possibility of intersection between surfaces.
➢ Suitable for scenes with small number of objects and objects with simple relationship
with each other.
2. Image-Space Methods (Mostly used)
Visibility is determined point by point at each pixel position on the projection plane.
➢ Determine the object closest to the viewer that is pierced by the projector through
the pixel
➢ Draw the pixel in the object color.
End
➢ For each pixel, examine all n objects to determine the one closest to the viewer.
➢ If there are p pixels in the image, complexity depends on n and p (O (np)).
➢ Accuracy of the calculation is bound by the display resolution.
• Making use of the results calculated for one part of the scene or image for other nearby
parts.
• Coherence is the result of local similarity
• As objects have continuous spatial extent, object properties vary smoothly within a
small local region in the scene. Calculations can then be made incremental.
Types of coherence:
1. Object Coherence:
2. Face Coherence:
Surface properties computed for one part of a face can be applied to adjacent parts after
small incremental modification. (For example, if the face is small, it can sometimes be
assumed that if one part of the face is invisible to the viewer, the entire face is also
invisible).
3. Edge Coherence:
The visibility of an edge changes only when it crosses another edge, so if one segment
of a non-intersecting edge is visible, the entire edge is also visible.
4. Scan Line Coherence:
Line or surface segments visible in one scan line are also likely to be visible in adjacent
scan lines. Consequently, the image of a scan line is similar to the image of adjacent scan
lines.
5. Area Coherence:
A group of adjacent pixels in an image is often covered by the same visible object. This
coherence is based on the assumption that a small enough region of pixels will most
likely lie within a single polygon. This reduces computation effort involved in searching
for those polygons which contain a given screen area (region of pixels) as in some
subdivision algorithms.
6. Depth Coherence:
7. Frame Coherence:
Pictures of the same scene at successive points in time are likely to be similar, despite
small changes in objects and viewpoint, except near the edges of moving objects. Most
visible surface detection methods make use of one or more of these coherence
properties of a scene.
4. BACK-FACE DETECTION
In a solid object, there are surfaces which face the viewer (front faces) and surfaces which face away from the viewer (back faces), as shown in figure 7.2. These back faces contribute approximately half of the total number of surfaces. As these surfaces cannot be seen, they can be removed from the clipping process with a simple step, which saves processing time.
Each surface has a normal vector. If this vector points in the direction of the center of
projection, it is a front face and can be seen by the viewer. If it points away from the center
of projection, it is a back face and cannot be seen by the viewer. The test is very simple, if the
z component of the normal vector is positive, then, it is a back face. If the z component of the
vector is negative, it is a front face. It must be noted that this technique only works well for
non-overlapping convex polyhedra. In other cases where there are concave polyhedra or
overlapping objects, it is necessary to apply other methods to further determine where the
obscured faces are partially or completely hidden by other objects (For example, using the
Depth-Buffer Method or Depth-sort Method).
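A short Python/NumPy sketch of this test, following the convention used in the text (a positive z component of the surface normal marks a back face); the vertex order is assumed to define which side of the polygon is the front:

import numpy as np

def is_back_face(v0, v1, v2):
    """Back-face test: true when the z component of the face normal is positive."""
    v0, v1, v2 = (np.asarray(v, dtype=float) for v in (v0, v1, v2))
    normal = np.cross(v1 - v0, v2 - v0)   # normal from the polygon's vertex winding
    return normal[2] > 0.0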
5. BSP TREE METHOD
The BSP (Binary Space Partitioning) tree algorithm is based on the observation that a polygon will be scan-converted correctly (i.e., will not overlap incorrectly or be overlapped incorrectly by other polygons) if the polygons farther from the viewer are scan-converted before it and the polygons nearer to the viewer are scan-converted after it; in other words, scan conversion proceeds from the far side of the scene toward the viewer.
It must be ensured that this is so for each polygon. The algorithm makes it easy to determine
a correct order for scan conversion by building a binary tree of polygons, the BSP tree. The
BSP tree’s root is a polygon selected from those to be displayed and the algorithm works
correctly, no matter which is picked. The root polygon is used to partition the environment
into two half-spaces. One half-space contains all the remaining polygons in front of the root
polygon, relative to its surface normal; the other contains all polygons behind the root
polygon. Any polygon lying on both sides of the root polygon’s plane is split by the plane, and
its front and back pieces are assigned to the appropriate half-space. One polygon each from
the root polygon’s front and back half-spaces becomes its front and back child respectively,
and each child is recursively used to divide the remaining polygons in its half-space in the
same fashion.
Build a BSP tree. To display the scene, traverse the tree relative to the viewpoint: if the viewpoint (such as V1 in figure 7.4) is in front of a node's plane, traverse the back subtree first, draw the node's polygon, and then traverse the front subtree; if the viewpoint (such as V2) is behind the plane, traverse the front subtree first.
As per figure 7.4, the painting order from V1 is 3, 5, 1, 4b, 2, 6, 4a and the painting order from V2 is 6, 4a, 2, 4b, 1, 3, 5.
Figure 7.4
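A minimal back-to-front traversal sketch in Python (the node layout and the classify() helper, which reports on which side of a node's plane the viewpoint lies, are illustrative assumptions rather than a specific library API):

class BSPNode:
    def __init__(self, polygon, front=None, back=None):
        self.polygon, self.front, self.back = polygon, front, back

def paint(node, viewpoint, draw, classify):
    """Painter-order traversal: far subtree first, then the node, then the near one."""
    if node is None:
        return
    if classify(viewpoint, node.polygon) > 0:       # viewer in front: back side is farther
        paint(node.back, viewpoint, draw, classify)
        draw(node.polygon)
        paint(node.front, viewpoint, draw, classify)
    else:                                           # viewer behind: front side is farther
        paint(node.front, viewpoint, draw, classify)
        draw(node.polygon)
        paint(node.back, viewpoint, draw, classify)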
SELF-ASSESSMENT QUESTIONS -2
6. DEPTH-SORT ALGORITHM
The painter's algorithm is a simplified version of the depth-sort algorithm. In the depth-sort
algorithm, there are 3 steps that are performed:
Step 1: Sort all faces in order of their smallest (farthest) z coordinate.
Step 2: Resolve ambiguities, rearranging the object list and splitting faces as necessary.
Step 3: Render each face on the rearranged list in ascending order of smallest z coordinate
(from back to front).
Resolving ambiguities in step 2 is what replaces simple rendering of the whole face of the object. Let the object farthest away be called P. Before this object is rendered, it must be tested against every other object Q whose z extent overlaps the z extent of P. This is done to prove that P cannot obscure Q and that P can therefore be written before Q. Up to 5 tests are performed, in order of increasing complexity:
1. Do the x extents of P and Q not overlap?
2. Do the y extents of P and Q not overlap?
3. Is P entirely on the opposite side of Q's plane from the viewpoint?
4. Is Q entirely on the same side of P's plane as the viewpoint?
5. Do the projections of P and Q onto the screen not overlap?
If all 5 tests fail, the assumption is that P obscures Q. So to test whether Q could be rendered
before P, tests 3 and 4 are performed again, with the polygons reversed:
3'. Is Q entirely on the opposite side of P's plane from the viewpoint? 4'. Is P entirely on the
same side of Q's plane as the viewpoint?
Figure 7.5 is a top-down view, relative to the viewpoint, of objects P and Q.
In this case, test 3' succeeds. So, Q is moved to the end of the list of objects, and the old Q
becomes the new P. The next picture is a front view. The tests are inconclusive: The objects
intersect each other.
Figure 7.8 shows a more subtle case: it is possible to move each object to the end of the list to place it in the correct order relative to one, but not both, of the other objects. This would result in an infinite loop. To avoid looping, any object that is moved to the end of the list is marked. If the first 5 tests fail and the current Q is marked, tests 3' and 4' are not performed. Instead, one of the objects is split, as if tests 3' and 4' had both failed, and the pieces are reinserted into the list.
7. Z-BUFFER ALGORITHM
The basic idea is to test the z-depth of each surface to determine the closest (visible) surface.
To do this, declare an array z buffer (x, y) with one entry for
each pixel position. Initialize the array to the maximum depth. Note: if one has performed a
perspective depth transformation, then all z values satisfy 0.0 <= z(x, y) <= 1.0. So it is necessary
to initialize all values to 1.0. The algorithm is as follows:
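The original listing is not reproduced here, but the core test can be sketched as follows (a minimal Python sketch with illustrative names; depths are assumed normalized to [0, 1] as noted above, and rasterize() is an assumed helper that yields pixel samples for a polygon):

def render_with_zbuffer(width, height, polygons, rasterize):
    """z-buffer sketch: keep, at each pixel, the sample with the smallest depth."""
    z_buffer = [[1.0] * width for _ in range(height)]    # initialize to the maximum depth
    frame = [[None] * width for _ in range(height)]
    for poly in polygons:
        for x, y, z, color in rasterize(poly):           # (x, y, z, color) samples
            if z < z_buffer[y][x]:                       # nearer than what is stored?
                z_buffer[y][x] = z
                frame[y][x] = color
    return frame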
The polyscan procedure can easily be modified to add the z-buffer test. The computation of
the z_depth (x, y) is done using coherence calculations similar to the x-intersection
calculations. It is actually a bi-linear interpolation, that is, interpolation both down (y) and
across (x) scan lines.
Disadvantages:
• It may paint the same pixel several times, and computing the color of a pixel may be expensive. As a remedy, the color can be computed only if the pixel passes the z-buffer test. It is also possible to sort the polygons and scan front to back (the reverse of the painter's algorithm). This still tests all the polygons, but avoids the expense of computing the intensity and writing it to the frame buffer for hidden pixels.
• Large memory requirements: if a real (4-byte) value is used per pixel, then for 640 x 480 resolution the requirement would be 640 x 480 x 4 bytes = 1,228,800 bytes. Usually one uses a 24-bit z-buffer (921,600 bytes) or a 16-bit z-buffer (614,400 bytes). Note: for VGA mode 19 (320 x 200, using only 240 x 200) the requirement is only 96,000 bytes for a 16-bit z-buffer. However, one may need additional z-buffers for special effects, like shadows.
An alternative method for computing z depth values using plane equations is as follows:
Now recall the 2D viewing transformation in the procedure Point Viewing Transform, with scale factors Sx (VTScaleX) and Sy (VTScaleY) and constants Cx (VTConstX) and Cy (VTConstY), all of which are functions of the window, the viewport, and the PDC. In Point Viewing Transform, xp = Sx*x + Cx, so x = (xp - Cx)/Sx = xp/Sx - Cx/Sx, and Cx/Sx and Cy/Sy can be computed once. Now, from the plane equation,
Z(Xw, Yw) = (-A*Xw - B*Yw + D) / C
so stepping one pixel across the scan line (a step of 1/Sx in world coordinates) gives
Z(Xw + 1/Sx, Yw) = (-A*(Xw + 1/Sx) - B*Yw + D) / C = Z(Xw, Yw) - A/(C*Sx).
Thus one can find Xw, Yw, and Zw at the polygon vertices and use this incremental relation to compute the remaining Zw values.
SELF-ASSESSMENT QUESTIONS - 3
8. SCAN-LINE METHOD
In this method, as each scan line is processed, all polygon surfaces intersecting that line are examined to determine which ones are visible. Across each scan line, depth calculations are made for each overlapping surface to determine which is nearest to the view plane. When the visible surface has been determined, the intensity value for that position is entered into the image buffer. Figure 7.9 depicts the scan-line projection of two surfaces.
– Step 2 is not efficient because not all polygons necessarily intersect with the scan line.
– Depth calculation in 2a is not needed if only one polygon in the scene is mapped onto
a segment of the scan line.
Using a similar idea, every scan line can be filled span by span, as shown in figure 7.10.
When a polygon overlaps on a scan line, depth calculations are performed at their edges to
determine which polygon should be visible at which span. Any number of overlapping
polygon surfaces can be processed using this method. Depth calculations are performed only
when there are overlapping polygons. It is necessary to take advantage of coherence along
the scan lines as one passes from one scan line to the next. If there are no changes in the
pattern of the intersection of polygon edges with the successive scan lines, it is not necessary
to do depth calculations. This works only if surfaces do not cut through or otherwise
cyclically overlap each other. If cyclic overlap happens, the surfaces can be divided to
eliminate the overlaps.
SELF-ASSESSMENT QUESTIONS - 4
10. Depth calculation in 2a is not needed if only ____________ polygon in the scene is
mapped onto a segment of the scan line.
11. _________ method all polygon surfaces intersecting that line are examined to
determine which are visible.
12. When polygon overlaps on a scan line, we perform depth calculations. (State
True/False)
9. FRACTAL GEOMETRY
Since the 19th century, fractals have been regarded as merely a form of mathematics and
geometry with little or no practical purpose. However, in the 1970's, the mathematician
Benoit Mandelbrot adopted a more abstract definition of dimension than what is generally
used in standard Euclidean geometry. He suggested that the fractal must be handled
mathematically as though it has fractional dimensions, rather than strictly a whole number
of dimensions.
Then in 1987, Dr. Michael Barnsley discovered the Fractal Transform which can detect
fractal codes in real-world images and natural formations. This led to some practical uses,
such as fractal image compression which is widely used in multimedia computer
applications.
These images were generated using the freeware Fractint, by the Stone Soup Group, an in-
depth and versatile fractal program which has been perfected and added-to for many years.
One can use the program to zoom in on the fractals, in which case the displayed pattern is
recalculated at the higher resolution and new detail is revealed, or to rotate to different
angles, generate a 3D map of the fractal or project it onto a 3D surface, and to change the
color palette and even cycle the colors to produce some very dynamic and hypnotic effects. Literally, an infinite number of patterns is possible using even a single fractal type or function.
Figure 7.11: Beautiful lambda function with a zoomed-in view.
10. WIRE FRAME METHODS
A wire frame model is a visual presentation of a three-dimensional or physical object used in 3D computer graphics. It is created by specifying each edge of the physical object where two mathematically continuous smooth surfaces meet, or by connecting an object's constituent vertices using straight lines or curves. The object is projected onto the computer screen by drawing lines at the location of each edge. The term wireframe comes from designers using metal wire to represent the 3D shape of solid objects. 3D wireframe allows the construction and manipulation of solids and solid surfaces. Compared with conventional line drawing, 3D solid modeling techniques produce higher-quality representations of solids. Figure 7.12 exhibits a wireframe image using hidden-line removal.
Using a wire frame model allows the visualization of the underlying design structure of a 3D
model. Traditional 2-D views and drawings can be created by appropriate rotation of the
object and selection of hidden line removal via cutting planes.
Since wireframe renderings are relatively simple and fast to calculate, they are often used in
cases where a high screen frame rate is needed (for instance, when working with a
particularly complex 3D model, or in real-time systems that model exterior phenomena).
When greater graphical detail is desired, surface textures can be added automatically after
completion of the initial rendering of the wireframe. This allows the designer to quickly
review changes or rotate the object to new desired views without long delays associated
with more realistic rendering.
SELF-ASSESSMENT QUESTIONS - 5
11. SUMMARY
This unit provided information about hidden surfaces. In 3D computer graphics, hidden surface determination (also known as hidden surface removal (HSR), occlusion culling (OC), or visible surface determination (VSD)) is the process of deciding which surfaces are visible from a chosen viewpoint. We also discussed visible surface detection methods, and the advantages and disadvantages of the z-buffer algorithm were studied. In a solid object there are surfaces which face the viewer (front faces) and surfaces which face away from the viewer (back faces). The BSP tree algorithm is based on the observation that a polygon will be scan-converted correctly if a suitable order is followed. The painter's algorithm is a simplified version of the depth-sort algorithm. In the scan-line algorithm, as each scan line is processed, all polygon surfaces intersecting that line are examined to determine which are visible. A fractal is a simple mathematical expression that generates an infinitely complex geometric shape. A wire frame model is a visual presentation of a three-dimensional or physical object used in 3D computer graphics.
13. ANSWERS
Terminal Questions
1. Hidden surface removal (HSR) determines which polygons are nearest to the viewer at
a given pixel. For more details refer section 7.2.
2. Before moving on to visible surface detection, we first review the modeling transformation, the perspective transformation, and clipping. For more details refer section 7.3.
3. In a solid object, there are surfaces which are facing the viewer (front faces) and there
are surfaces which are opposite to the viewer (back faces). For more details refer
section 7.4.
4. The BSP tree algorithm is based on the observation that a polygon will be scan
converted correctly. For more details refer section 7.5.
DCA3142
GRAPHICS AND MULTIMEDIA
Unit 8
Coloring and Shading Models
Table of Contents
1. INTRODUCTION
In the previous unit, we discussed the concept of hidden surfaces, depth comparison, the z-buffer algorithm, back-face detection, the BSP tree method, the painter's algorithm, the scan-line algorithm, hidden line elimination, wire-frame methods, and fractal geometry.
In this unit we will discuss colour and shading models. A colour model is an orderly system for creating a whole range of colours from a small set of primary colours; the two well-known colour models, RGB and CMYK, are discussed here. Shading deals with the appearance of objects, which depends, among other things, on the lighting that illuminates the scene and on the interaction of light with the objects in the scene. Also discussed here is texture mapping, which covers texture resources, mapping from surfaces into texture space, texture and Phong reflectance, as well as aliasing.
1.1 Objectives:
The process of altering the color of an object, surface, or polygon in a 3D scene can vary
depending on the software or programming language being used, but in general, it involves
the following steps:
1. Access the object/surface/polygon: First, you need to identify the specific object,
surface, or polygon that you want to change the color of. This can be done by selecting
it from the 3D scene or referencing it in your code.
2. Define the new color: Next, you need to define the new color that you want to apply.
This can be done using RGB values (red, green, blue), hexadecimal codes, or other color
models.
3. Apply the new color: Finally, you need to apply the new color to the
object/surface/polygon. This can be done by setting the object's material properties,
changing the color of the polygon's vertices, or using shaders to modify the color of the
object in real-time.
In most 3D software and programming languages, there are specific functions or APIs that
allow you to perform these steps. For example, in the popular 3D modeling software Blender,
you can change the color of an object by selecting it, opening the Properties panel, and
modifying the Material properties. In the programming language Python, you can use the
PyOpenGL library to modify the color of objects in a 3D scene.
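As a package-neutral illustration of step 2 (defining the new colour), the snippet below converts a hexadecimal code to normalized RGB values and stores it in a simple material record; the Material class and its field name are purely illustrative and do not belong to any particular 3D API.

from dataclasses import dataclass

@dataclass
class Material:
    diffuse_rgb: tuple               # normalized (r, g, b), each component in [0, 1]

def hex_to_rgb(code):
    """Convert '#RRGGBB' to a normalized (r, g, b) tuple."""
    code = code.lstrip('#')
    return tuple(int(code[i:i + 2], 16) / 255.0 for i in (0, 2, 4))

brick_red = Material(diffuse_rgb=hex_to_rgb('#B22222'))   # step 3 would apply this material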
3.1 Light
So far, we have studied the geometric aspects of how objects are transformed and projected to images. We now discuss the shading of objects: the appearance of an object depends, among other things, on the lighting that illuminates the scene and on the interaction of light with the objects in the scene. Some of the basic qualitative properties of lighting and object reflectance that need to be modelled include:
Light Source – There are different types of light sources, such as point sources (example, a small light at a distance), extended sources (example, the sky on a cloudy day), and secondary reflections (example, light that bounces from one surface to another).
Reflectance – Different objects reflect light in different ways. For example, diffuse
surfaces appear the same when viewed from different directions, whereas a mirror
looks very different from different points of view.
Discussed here is a simplified model of lighting that is easy to implement and fast to
compute, and used in many real-time systems such as OpenGL. This model is an
approximation and does not fully capture all of the effects observed in the real world.
Diffuse reflection
Let us begin with the diffuse reflectance model. A diffuse surface is one that appears similarly bright from all viewing directions. That is, the emitted light appears independent of the viewing location. Let p̄ be a point on a diffuse surface with normal n⃗, lit by a point light source in direction s⃗ from the surface. The reflected intensity of light is represented as:

Ld(p̄) = rd I max(0, s⃗ · n⃗)

where I is the intensity of the light source, rd is the diffuse reflectance of the surface, and s⃗ is the direction of the light source. This equation requires the vectors to be normalized, i.e., ||s⃗|| = 1, ||n⃗|| = 1.
The s⃗ · n⃗ term is called the foreshortening term. When a light source projects light obliquely at a surface, that light is spread over a larger area, and less of the light hits any specific point. For example, imagine pointing a flashlight directly at a wall versus in a
direction nearly parallel. In the latter case, the light from the flashlight will spread over a
greater area, and individual points on the wall will not be as bright.
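A small sketch of this diffuse (Lambertian) term follows; it is only an illustration under the assumptions above (normalized vectors, a single point source), written here in Python with NumPy, and the function name diffuse_intensity is ours rather than part of any standard API:

import numpy as np

def diffuse_intensity(n, s, I, r_d):
    # Lambertian reflection: Ld = r_d * I * max(0, s . n)
    n = n / np.linalg.norm(n)          # surface normal, normalized
    s = s / np.linalg.norm(s)          # direction to the light source, normalized
    return r_d * I * max(0.0, float(np.dot(s, n)))

# Example: a light directly above a horizontal surface gives the full diffuse intensity
print(diffuse_intensity(np.array([0.0, 0.0, 1.0]), np.array([0.0, 0.0, 1.0]), I=1.0, r_d=0.8))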
For colour rendering, it is important to specify the reflectance in colour (as (rd,R, rd,G, rd,B)), and to specify the light source in colour as well (IR, IG, IB). The reflected colour of the surface is then represented as:

Ld,c(p̄) = rd,c Ic max(0, s⃗ · n⃗), for c ∈ {R, G, B}
For pure specular (mirror) surfaces, the incident light from each incident direction d⃗i is reflected toward a unique emittant direction d⃗e. The emittant direction lies in the same plane as the incident direction d⃗i and the surface normal n⃗, and the angle between n⃗ and d⃗e is equal to that between n⃗ and d⃗i, as shown in figure 8.1. One can show that the emittant direction is given by d⃗e = 2(n⃗ · d⃗i)n⃗ − d⃗i.
Many materials exhibit a significant specular component in their reflectance, but only a few
are perfect mirrors. First, most specular surfaces do not reflect all light, and that is easily
handled by introducing a scalar constant to attenuate intensity. Second, most specular
surfaces exhibit some form of off-axis specular reflection. That is, many polished and shiny
surfaces (like plastics and metals) emit light in the perfect mirror direction and in some
nearby directions as well. These off-axis specularities may however look a little blurred.
Good examples are highlights on plastics and metals. More precisely, the light from a distant point source in the direction of s⃗ is reflected into a range of directions about the perfect mirror direction m⃗ = 2(n⃗ · s⃗)n⃗ − s⃗. One common model for this is the following:

Ls(d⃗e) = rs I max(0, m⃗ · d⃗e)^α

where rs is called the specular reflection coefficient, I is the incident power from the point source, and α ≥ 0 is a constant that determines the width of the specular highlights. As α increases, the effective width of the specular reflection decreases. In the limit, as α increases, this becomes a mirror. The intensity of the specular region is proportional to max(0, cos φ)^α, where φ is the angle between m⃗ and d⃗e. One way to understand the nature of specular reflection is to plot this function, as depicted in figure 8.2.
Ambient Illumination
The diffuse and specular shading models are easy to compute, but often appear artificial. The
biggest issue is the point light source assumption, the most obvious consequence of which is
that any surface normal pointing away from the light source will have a radiance of zero. A
better approximation to the light source is a uniform ambient term plus a point light source.
This is still a remarkably crude model, but it is much better than the point source by itself.
Ambient illumination is modelled by:
La( 𝑃̅ ) = ra Ia (8-6)
where ra is often called the ambient reflection coefficient, and Ia denotes the integral of the
uniform illuminant.
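Putting the ambient, diffuse and specular terms together gives the kind of simplified local shading model used in real-time systems such as OpenGL. The sketch below is a plain Python/NumPy illustration under the assumptions of this section (a single point source plus a uniform ambient term); the function name shade_point and its parameter names are ours:

import numpy as np

def shade_point(n, s, v, I, Ia, r_a, r_d, r_s, alpha):
    # n: surface normal, s: direction to the light, v: direction to the viewer
    n, s, v = (x / np.linalg.norm(x) for x in (n, s, v))
    m = 2.0 * np.dot(n, s) * n - s                                # perfect mirror direction
    ambient  = r_a * Ia                                           # La = ra * Ia
    diffuse  = r_d * I * max(0.0, float(np.dot(s, n)))            # Ld = rd * I * max(0, s . n)
    specular = r_s * I * max(0.0, float(np.dot(m, v))) ** alpha   # Ls = rs * I * max(0, m . de)^alpha
    return ambient + diffuse + specular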
SELF-ASSESSMENT QUESTIONS - 1
A colour model is an abstract mathematical model describing the way colours can be represented as tuples of numbers, typically as three or four values or colour components. When this model is associated with a precise description of how the components are to be interpreted (viewing conditions, and so on), the resulting set of colours is called a colour space. This section describes the methods by which human colour vision can be modelled.
3.2 Color Model
The RGB color model is a color model used in digital imaging and computer graphics.
The name "RGB" stands for Red, Green, and Blue, which are the primary colors of light.
The RGB color model works by combining these three primary colors in various
proportions to produce a wide range of colors.
In this model, each color is represented by a value between 0 and 255, with 0 indicating
no color and 255 indicating the maximum amount of color. Thus, any color can be
represented by a combination of three numbers, one for each primary color.
For example, pure red would be represented as (255, 0, 0), pure green as (0, 255, 0),
and pure blue as (0, 0, 255). White would be represented as (255, 255, 255), while black
would be represented as (0, 0, 0).
The RGB color model is widely used in computer graphics, as it is the basis for the colors
displayed on computer screens and other digital devices. It is also used in digital
cameras, scanners, and other imaging devices.
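Because each primary is an 8-bit value from 0 to 255, an RGB triple maps directly to the familiar six-digit hexadecimal colour codes used on the web. A small Python sketch of that mapping (the helper names are ours, chosen only for illustration):

def rgb_to_hex(r, g, b):
    # Pack three 8-bit components (0-255) into a code such as '#FF0000'
    return "#{:02X}{:02X}{:02X}".format(r, g, b)

def hex_to_rgb(code):
    code = code.lstrip("#")
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))

print(rgb_to_hex(255, 0, 0))     # '#FF0000'  (pure red)
print(hex_to_rgb("#FFFFFF"))     # (255, 255, 255)  (white)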
The main purpose of the RGB colour model is for sensing, representation, and display
of images in electronic systems, such as televisions and computers, though it has also
been used in conventional photography. Before the electronic age, the RGB colour
model already had a solid theory behind it, based in human perception of colours. It
suggested that media that transmit light (such as television) use additive colour mixing
with primary colours of red, green, and blue, each of which stimulates one of the three
types of the eye's colour receptors with as little stimulation as possible of the other two.
This is called "RGB" colour space. Mixtures of light of these primary colours cover a
large part of the human colour space and thus produce a large part of human colour
experiences. This is why colour television sets or colour computer monitors need only
to produce mixtures of red, green and blue light. Other primary colours could be used
in principle, but with red, green and blue the largest portion of the human colour space
can be captured. Unfortunately there is no exact consensus as to what loci in the
chromaticity diagram the red, green, and blue colours should have. RGB is a device-
dependent colour model. Different devices detect or reproduce a given RGB value
differently, since the colour elements (such as phosphors or dyes) and their response
to the individual R, G, and B levels vary from manufacturer to manufacturer, or even
within the same device over time. Thus an RGB value does not define the same colour
across devices without some kind of colour management.
HSV and HSL are two alternative color models used to represent colors in digital
imaging and computer graphics.
HSV (Hue, Saturation, Value) and HSL (Hue, Saturation, Lightness) are both cylindrical
coordinate systems, which means they represent colors in three dimensions. However,
while they share some similarities, they are different in their approach to color
representation.
HSV represents colors based on their hue, saturation, and value. Hue refers to the color
itself, such as red, green, or blue. Saturation represents the intensity or purity of the
color, with 0 being a shade of gray and 100 being the most intense or pure color. Value
represents the brightness of the color, with 0 being black and 100 being the brightest
possible color.
HSL, on the other hand, represents colors based on their hue, saturation, and lightness.
Hue again refers to the color itself, while saturation represents the intensity or purity
of the color, as in the HSV model. Lightness, however, is different from value. It
represents the perceived brightness of the color, with 0 being black and 100 being
white.
Both models are often used in computer graphics and image editing software, as they
offer an alternative way to manipulate and adjust colors. For example, adjusting the hue
in HSV can shift the color spectrum, while adjusting the lightness in HSL can make an
image appear brighter or darker.
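Python's standard colorsys module can be used to move between RGB and these cylindrical models; a brief sketch (colorsys expects components in the range 0 to 1, and note that its HLS functions use the order hue, lightness, saturation):

import colorsys

r, g, b = 0.8, 0.2, 0.2                      # a reddish colour, components in [0, 1]
h, s, v = colorsys.rgb_to_hsv(r, g, b)       # HSV representation
h2, l2, s2 = colorsys.rgb_to_hls(r, g, b)    # HLS (i.e. HSL with L and S swapped)

# Darken the colour by halving V, then convert back to RGB
darker = colorsys.hsv_to_rgb(h, s, v * 0.5)
print(darker)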
Because HSL and HSV are simple transformations of device-dependent RGB models, the
physical colours they define depend on the colours of the red, green, and blue primaries
of the device or of the particular RGB space, and on the gamma correction used to
represent the amounts of those primaries. Each unique RGB device therefore, has
unique HSL and HSV spaces to accompany it, while numerical HSL or HSV values
describe the different colours for each basis RGB space. Both of these representations
are used widely in computer graphics, and one or the other is often more convenient
than RGB, but both are also criticized for not adequately separating colour-making
attributes.
CMYK colour model (process colour, four colour) is a subtractive colour model, used in
colour printing, and is also used to describe the printing process itself. CMYK refers to
the four inks used in some types of colour printing: cyan, magenta, yellow, and key
(black). Though it varies according to the print house, press operator, press
manufacturer and press run, the ink is typically applied in the order of the abbreviation.
The "K" in CMYK stands for key since in four-colour printing cyan, magenta, and yellow
printing plates are carefully keyed or aligned with the key of the black key plate. Some
sources suggest that the "K" in CMYK comes from the last letter in "black" and was
chosen because B already indicates blue. This explanation, though plausible and useful
as a mnemonic, is incorrect. The CMYK model works by partially or entirely masking
colours on a lighter, usually white, background. The ink reduces the light that would
otherwise be reflected. Such a model is called subtractive because inks "subtract"
brightness from white.
In additive colour models such as RGB, white is the "additive" combination of all
primary coloured lights, while black is the absence of light. In the CMYK model, it is the
opposite: white is the natural colour of the paper or other background, while black
results from a full combination of coloured inks. To save money on ink, and to produce
deeper black tones, unsaturated and dark colours are produced by using black ink
instead of the combination of cyan, magenta and yellow.
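The subtractive relationship between RGB and CMYK can be illustrated with the common naive conversion formula below (a sketch only; real print workflows rely on device profiles and colour management rather than this direct arithmetic):

def rgb_to_cmyk(r, g, b):
    # Naive conversion from 8-bit RGB to CMYK fractions in [0, 1]
    if (r, g, b) == (0, 0, 0):
        return 0.0, 0.0, 0.0, 1.0            # pure black: key ink only
    r_, g_, b_ = r / 255.0, g / 255.0, b / 255.0
    k = 1.0 - max(r_, g_, b_)                # black generation
    c = (1.0 - r_ - k) / (1.0 - k)
    m = (1.0 - g_ - k) / (1.0 - k)
    y = (1.0 - b_ - k) / (1.0 - k)
    return c, m, y, k

print(rgb_to_cmyk(255, 0, 0))                # red -> (0.0, 1.0, 1.0, 0.0)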
SELF-ASSESSMENT QUESTIONS - 2
4. The main purpose of the RGB colour model is for the ______________ ,
______________ and _______________ in electronic systems.
5. RGB is a device-dependent colour model. (State True/False).
6. HSL stands for ______________.
Shading is a process used in drawing for depicting levels of darkness on paper by applying
media more densely or with a darker shade for darker areas, and less densely or with a
lighter shade for lighter areas. There are various techniques of shading including cross
hatching where perpendicular lines of varying closeness are drawn in a grid pattern to shade
an area. The closer the lines are, the darker the area appears. Likewise, the farther apart the
lines are, the lighter the area appears.
Flat shading is a lighting technique used in 3D computer graphics to shade each polygon of
an object based on the angle between the polygon's surface normal and the direction of the
light source, their respective colors and the intensity of the light source. It is usually used for
high speed rendering where more advanced shading techniques are computationally
expensive. As a result of flat shading, all the polygon's vertices are coloured with one colour,
allowing differentiation between adjacent polygons. Specular highlights are rendered poorly
with flat shading. If there happens to be a large specular component at the representative
vertex, that brightness is drawn uniformly over the entire face. If a specular highlight doesn’t
fall on the representative point, it is missed entirely. Consequently, one does not include the
specular reflection component in the shading computation.
The idea of interpolative shading is to avoid computing the full lighting equation at each pixel
by interpolating quantities at the vertices of the faces.
Gouraud Shading
Gouraud shading is considered superior to flat shading, which requires significantly less
processing than Gouraud shading but usually results in a faceted look. If a mesh covers more pixels in screen space than it has vertices, interpolating colour values from samples of expensive lighting calculations at the vertices is less processor intensive than performing the lighting calculation for each pixel, as in Phong shading. However, highly localized lighting effects will
not be rendered correctly, and if a highlight lies in the middle of a polygon but does not
spread to the polygon's vertex, it will not be apparent in Gouraud rendering. Conversely, if a
highlight occurs at the vertex of a polygon, it will be rendered correctly at this vertex (as this
is where the lighting model is applied), but will be spread unnaturally across all neighboring
polygons via the interpolation method. The problem is easily spotted in a rendering which
ought to have a specular highlight moving smoothly across the surface of a model as it
rotates. Gouraud shading will instead produce a highlight continuously fading in and out
across neighboring portions of the model, peaking in intensity when the intended specular
highlight passes over a vertex of the model.
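The core of Gouraud shading is therefore just an interpolation of colours that were computed only at the vertices. A minimal sketch using barycentric weights (the function name, and the assumption that the rasterizer supplies the weights, are ours):

import numpy as np

def gouraud_pixel_colour(bary, vertex_colours):
    # The lighting model has already been evaluated at the three vertices;
    # the pixel colour is just the barycentric blend of those vertex colours.
    w0, w1, w2 = bary                                        # weights sum to 1
    c0, c1, c2 = (np.asarray(c, dtype=float) for c in vertex_colours)
    return w0 * c0 + w1 * c1 + w2 * c2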
Phong Shading
Phong shading is a technique used in computer graphics to produce a smooth shading effect on 3D surfaces by interpolating surface normals across a polygon mesh. It is also called Phong interpolation or normal-vector interpolation shading. This shading method was described by Bui Tuong Phong in 1975, and it has since become a fundamental tool for rendering realistic images in 3D graphics. Specifically, it interpolates surface normals across rasterized polygons and computes pixel colours based on the interpolated normals and a reflection model.
Phong shading improves upon Gouraud shading and provides a better approximation of the
shading of a smooth surface. Phong shading assumes a smoothly varying surface normal
vector. The Phong interpolation method works better than Gouraud shading when applied
to a reflection model that has small specular highlights such as the Phong reflection model.
The most serious problem with Gouraud shading occurs when specular highlights are found
in the middle of a large polygon. Since these specular highlights are absent from the
polygon's vertices and Gouraud shading interpolates based on the vertex colors, the specular
highlight will be missing from the polygon's interior. This problem is fixed by Phong shading.
Unlike Gouraud shading, which interpolates colours across polygons, in Phong shading a
normal vector is linearly interpolated across the surface of the polygon from the polygon's
vertex normals. The surface normal is interpolated and normalized at each pixel and then
used in a reflection model, like the Phong reflection model, to obtain the final pixel colour.
Phong shading is more computationally expensive than Gouraud shading
since the reflection model must be computed at each pixel level instead of at each vertex.
In modern graphics hardware, variants of this algorithm are implemented using pixel or
fragment shaders.
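In contrast to Gouraud shading, the quantity interpolated here is the normal itself, which is then renormalized and fed to the reflection model at every pixel. A hedged Python sketch (the shade callback stands in for a reflection model such as the one sketched earlier; both names are ours):

import numpy as np

def phong_pixel_colour(bary, vertex_normals, shade):
    # Interpolate the vertex normals with barycentric weights, renormalize,
    # and only then evaluate the reflection model for this pixel.
    w0, w1, w2 = bary
    n0, n1, n2 = (np.asarray(n, dtype=float) for n in vertex_normals)
    n = w0 * n0 + w1 * n1 + w2 * n2
    n = n / np.linalg.norm(n)              # interpolation shortens the vector
    return shade(n)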
SELF-ASSESSMENT QUESTIONS - 3
Texture Mapping
Texture mapping is the process of determining where in a particular space a texture will be
applied. A texture consists of a series of pixels (also called texels), each occupying a texture
coordinate determined by the width and height of the texture. These texture coordinates are
then mapped into values ranging from 0 to 1 along the u and v axes (u is width, v is height). This process is called UV mapping, and the resulting coordinates are UV coordinates. Texture mapping is usually used to give objects a more varied and realistic appearance through complex variations in reflectance, for example surface markings, which are variations in albedo (that is, the total light reflected from the ambient and diffuse components of reflection). Areas covered in this section are: where textures come from, how to map textures onto surfaces, how texture changes reflectance and shading, scan conversion under perspective warping, and aliasing.
A) Texture Sources
Digital Images
To map an arbitrary digital image to a surface, one can define texture coordinates (u, v) ∈ [0, 1]². For each point (u0, v0) in texture space, one gets a point in the texture image, as shown in figure 8.4. For each face of a mesh, it is necessary to specify a point (μi, νi) for each vertex p̄i. Then define a continuous mapping from the parametric form of the surface s̄(α, β) onto the texture, that is, define m such that (μ, ν) = m(α, β).
Example: For a surface of revolution, 𝑠(α, β) = (cx(α) cos(β), cx(α) sin(β), cz(α)). So let 0 ≤ α ≤
1 and 0 ≤ β ≤ 2π.
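One natural choice of m for this example simply rescales the two parameters into the unit square; this is our own illustrative assumption rather than the only possible mapping:

import math

def revolution_uv(alpha, beta):
    # Map surface-of-revolution parameters (0 <= alpha <= 1, 0 <= beta <= 2*pi)
    # into texture coordinates (u, v) in [0, 1] x [0, 1].
    return alpha, beta / (2.0 * math.pi)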
Scale texture values in the source image to be in the range 0 ≤ τ ≤ 1 and use them to
scale the reflection coefficients rd and ra. That is,
r̃d = τ rd ,
r̃a = τ ra .
One could also multiply τ by the specular reflection, in which case it would mean simply
scaling E from the Phong model.
D) Aliasing
A problem with high resolution texturing is aliasing, which occurs when adjacent
pixels in a rendered image are sampled from pixels that are far apart in a texture image.
By down-sampling or reducing the size of a texture, aliasing can be reduced for far away
or small objects, but then textured objects look blurry when close to the viewer. What
one really wants is a high resolution texture for nearby viewing, and down-sampled
textures for distant viewing. A technique called mipmapping gives us this by pre-
rendering a texture image at several different scales as shown in figure 8.6. For
example, a 256x256 image might be down-sampled to 128x128, 64x64, 32x32, 16x16,
and so on. Then it is up to the renderer to select the correct mipmap to reduce aliasing
artifacts at the scale of the rendered texture.
Figure 8.6: An aliased high resolution texture image (left) and the same texture after
mipmapping (right)
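A mipmap pyramid can be built by repeatedly averaging 2x2 blocks of texels. The sketch below (plain NumPy, assuming a square power-of-two texture; the function name is ours) shows the idea:

import numpy as np

def build_mipmaps(texture):
    # Level 0 is the full-resolution texture; each further level halves the resolution
    # by averaging every 2x2 block of texels.
    levels = [np.asarray(texture, dtype=float)]
    while levels[-1].shape[0] > 1:
        t = levels[-1]
        h, w = t.shape[0], t.shape[1]
        t2 = 0.25 * (t[0:h:2, 0:w:2] + t[1:h:2, 0:w:2] +
                     t[0:h:2, 1:w:2] + t[1:h:2, 1:w:2])
        levels.append(t2)
    return levels

# A 256x256 texture yields levels of size 256, 128, 64, 32, 16, ... down to 1x1.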
SELF-ASSESSMENT QUESTIONS – 4
5. SUMMARY
This unit provides information about the RGB and CMYK colouring models. RGB is a colour
model that uses the three primary (red, green, blue) additive colors, which can be mixed to
make all other colours. HSV and HSL (hue, saturation, value and hue, saturation, lightness),
were introduced in the late 1970s. HSV and HSL improve on the colour cube representation
of RGB by arranging colours of each hue in a radial slice, around a central axis of neutral
colours which ranges from black at the bottom to white at the top. The section on basic lighting and reflection discussed simple reflection models. Shading is a process
used in drawing for depicting levels of darkness on paper by applying media more densely
or with a darker shade for darker areas, and less densely or with a lighter shade for lighter
areas. Flat and interpolative shadings were covered in this section. Texture mapping is the
process of determining where in a particular space a texture will be applied. Also discussed here were the sources of texture. This unit concluded with a discussion of texture and Phong reflectance and aliasing.
6. TERMINAL QUESTIONS
7. ANSWERS
1. Light
2. A diffuse surface is one that appears similarly bright from all viewing directions
3. False.
4. sensing, representation, and display of images
5. True.
6. a. hue, saturation, and lightness
7. Key(black)
8. Shading
9. True.
10. Gouraud and phong
11. Texture procedure and Digital images.
12. Aliasing.
Terminal Questions
1. A color model is an abstract mathematical model describing the way colors can be
represented as tuples of numbers, typically as three or four values or color components.
When this model is associated with a precise description of how the components are to be interpreted (viewing conditions, etc.), the resulting set of colours is called a colour space. For further details refer section 8.3.
2. How the appearance of objects depends, among other things, on the lighting that
illuminates the scene, and on the interaction of light with the objects in the scene. For
more details refer section 8.3.
3. Simplified model of lighting that is easy to implement and fast to compute, and used in
many real-time systems such as OpenGL. This model will be an approximation and does
not fully capture all of the effects we observe in the real world. For more details refer
sub-section 8.4.1.
4. Shading is a process used in drawing for depicting levels of darkness on paper by
applying media more densely or with a darker shade for darker areas, and less densely
or with a lighter shade for lighter areas. For more details refer section 8.5.
Unit 9 Multimedia
Structure:
9.1 Introduction
Objectives
9.2 Introduction and Concepts of Multimedia
Definition
Medium
9.3 Uses of Multimedia
9.4 Role of Hypertext and Hypermedia
9.5 Image and Video
9.6 Standards in Multimedia
9.7 Summary
9.8 Terminal Questions
9.9 Answers
9.1 Introduction
We perceive the environment through our senses. These senses, that is, sight
and hearing are brought into play as we interact with our surroundings. Our
sensory organs send signals to the brain, which interprets this interaction.
The process of communication which is sending messages from one person
to another is dependent on the understanding abilities of our senses. In
general, the more information that is perceived by the receiver, the more
effective communication will be.
Let us take the case of a person wanting to tell a friend the details of a trip
made during the vacations. With advancements in technology, there are
different ways through which this communication can happen. Some of the
methods to communicate are:
Case 1: Assuming a letter is written to the friend describing the trip. In this
case the friend can just read the text but not see the expression and
excitement of the writer. Similarly, the writer would need to wait for the friend
to reply to know how he/she felt.
Case 2: Assuming a few photographs taken during the trip are sent along with
the letter, then the friend can visualize the fun the writer had.
Case 3: Assuming that communication was over phone, then the friend can
hear the person’s excitement over the trip in his/her voice and understand
the emotions better. Similarly the friend’s reactions are also spontaneous over
the phone.
Case 4: Assuming the case of a video chat with the friend. Here it is possible
for both participants to see each other, hear each other and share a
conversation.
In each case, the message or information is conveyed but with a different
approach. Therefore the more information is sent, the greater is the impact
of the communication and the different media discussed like letter (text),
photograph (image), telephone (voice), video chat (video) form the basic
components of multimedia.
In this unit, we will discuss the definition and basic concepts of multimedia
and where the multimedia can be used. We will also discuss the concept of
hypertext, hypermedia and its relationship with multimedia. Finally we will
conclude this unit with the discussion of standards available for images, video
and audio
Objectives:
After studying this unit, you should be able to:
• discuss the basic concepts of multimedia
• list and explain the various mediums of multimedia
• explain the uses of multimedia
• differentiate and explain hypertext and hypermedia
• discuss the image, video and audio standards
The introduction to terminology begins with the notion of multimedia, followed by the description of media and the important properties of multimedia systems. The word "Multimedia" comes from the Latin words 'multus', which means "numerous", and 'medium', which means "middle". Incidentally, the word 'media' conveys the meaning 'intermediary'; therefore, multimedia means multiple intermediaries or multiple means. The multiple
means by which the information/data is stored, transmitted or presented are:
➢ Text (example books, letters, and newspapers) includes both unformatted text comprising characters from a limited character set, and formatted text strings that are used for the structuring, access and presentation of electronic documents.
➢ Images and Graphics (example photographs, charts, maps, logos, and
sketches) include computer generated images, comprising lines, curves
and circles, and digitized images of documents and pictures.
➢ Audio/Sound (example radio, gramophone, records and audio
cassettes) includes both low-fidelity speech as used in telephony as well as high-fidelity stereophonic music as used on CDs.
➢ Video and Animation (example TV, video cassettes and motion pictures) includes short sequences of moving images (in video clips) and complete movies/films.
9.2.2 Medium
The meaning of the word media varies according to the context in which it is
used. Our definition of medium is a means to distribute and represent
information. Media can be text, graphics, pictures, voice, sound and music.
Media can be classified with respect to different criteria that is, perception,
representation, presentation, storage, transmission, and information
exchange. Each of these criteria will be discussed in detail.
Perception Medium: The perception media helps us to sense our
environment. We can perceive information mostly through seeing and hearing
the information. The perception of information through seeing or visual
media includes text, graphics, image and video. The perception of
information through hearing which is auditory media includes music, sound
and voice.
Representation Medium: Representation media refers to how the
information is represented internally by the computer. There are various
formats used for representing media information in a computer. For example,
9.3 Uses of Multimedia
Multimedia has found large applications in various areas including, but not
limited to, advertisements, art, education, entertainment, engineering,
medicine, mathematics, business, scientific research and spatial temporal
applications. Some examples are as follows:
Entertainment: Multimedia is heavily used in the entertainment industry,
especially to develop special effects in movies and animations. Computer
games are also one of the main applications of multimedia because of the
high amount of interactivity involved.
Education: In education, multimedia is used to produce computer-based
training courses (popularly called CBTs) and reference books like
encyclopedia and manuals. A CBT lets the user go through a series of
presentations, text about a particular topic, and associated illustrations in
various information formats. Edutainment is an informal term used to describe
the combination of education with entertainment, especially multimedia
entertainment.
Industry: In the industrial sector, multimedia is used as a way to help present
information to shareholders, superiors and co-workers. Multimedia is also
helpful for providing employee training, advertising and selling products all
over the world via virtually unlimited web-based technology. For example, in
case of tourism and travel industry, travel companies can market packaged
Hypertext System
A hypertext system is mainly determined through non-linear links of
information. Pointers connect the nodes. The data of different nodes can be
represented with one or several media types. In a pure text system, only text
Multimedia System
A multimedia system contains information which is coded at least in a
continuous and discrete medium. For example, if only links to text data are
present, then it is not a multimedia system, it is a hypertext. A video
conference, with simultaneous transmission of text and graphics, generated
by a document processing program, is a multimedia application, although it
does not have any relation to hypertext and hypermedia.
Hypermedia System
As the figure 9.2 shows, a hypermedia system includes the non-linear
information links of hypertext systems and continuous and discrete media of
multimedia systems. For example, if a non-linear link consists of text and
video data, then this is a hypermedia, multimedia and hypertext system.
Self Assessment Questions
8. Hypertext can be either ______________ or ______________.
9. What is hypermedia?
10. A non-interactive cinema presentation is an example of ______________.
9.5.1 Digital video: refers to a video signal that has been converted into a digital format, allowing
it to be processed, stored, and transmitted using digital technology. In contrast to analog video, which
is represented as a continuous waveform, digital video is represented as discrete numerical values,
typically in binary code, that can be manipulated and stored by computers and other digital devices.
• In the early 1980s, Sony introduced the Betacam recording system for broadcast television. It recorded video onto magnetic tape, providing higher quality and greater flexibility than earlier recording methods.
• The 1990s saw the introduction of several new digital video formats.
• In 1995 and 1996, professional digital videotape formats and DVD players were released, and WRAL-TV became the first television station in the United States to broadcast a digital (HDTV) signal.
• In the early 2000s, digital video began to be widely adopted for online streaming and distribution, and in 2009 all full-power TV stations in the US completed the transition to digital broadcasting.
Digital media is used in the fields of:
• Education
• Entertainment
• Information
• Advertising
Characteristics of video :
Analog video signals are continuous and vary in voltage or amplitude over time. Analog video signals
are typically transmitted through analog broadcast or recorded on analog media such as VHS tapes
or analog film. Analog video can degrade over time or through repeated copying, resulting in lower
quality and visual noise or distortion.
Digital video, on the other hand, represents video as a series of binary numbers. Digital video is
typically recorded on digital media such as DVDs, Blu-ray discs, or digital files such as MP4 or MOV.
Digital video is less prone to degradation over time and can be copied and distributed without loss of
quality.
Aspect Ratio
• Dimension of width to height: Aspect ratio in computer graphics refers to the proportional relationship between the width and height of an image or screen. It is usually expressed as a ratio of the width to the height, such as 4:3, 16:9, or 21:9.
Frame Rate: The speed at which video frames appear, measured in frames per second (fps). Frame rate refers to the number of frames or still images displayed per second in a video or animation. The frame rate is an important factor in determining the quality and smoothness of a video or animation. A higher frame rate generally results in smoother motion and less flicker, while a lower frame rate can result in a choppy or stuttering appearance.
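Frame rate, resolution and colour depth together determine how much raw data a video produces per second. A small back-of-the-envelope sketch (uncompressed figures only; real files are far smaller because of the compression discussed later):

def raw_video_bitrate(width, height, bits_per_pixel, fps):
    # Uncompressed bit rate in bits per second
    return width * height * bits_per_pixel * fps

# Example: 1920x1080, 24-bit colour, 30 frames per second
# 1920 * 1080 * 24 * 30 = 1,492,992,000 bits/s, roughly 1.5 Gbit/s before compression
print(raw_video_bitrate(1920, 1080, 24, 30))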
File formats :
There are several common video file formats used in computer graphics and video production. Some
of the most popular ones include:
MP4: This is a widely used video format that is compatible with most devices and platforms. It
provides good compression while maintaining high quality, making it suitable for streaming and
sharing online.
AVI: This is a popular format used for storing video files on Windows-based computers. It supports
multiple codecs and is capable of storing high-quality video files.
MOV: This is a video format developed by Apple that is widely used for video editing and production.
It supports high-quality video and audio, and is compatible with both Mac and Windows operating
systems.
WMV: This is a video format developed by Microsoft that is commonly used for streaming and sharing
online. It provides good compression and is compatible with most Windows-based devices.
FLV: This is a video format commonly used for online video streaming and sharing, especially on
websites such as YouTube. It provides good compression and is compatible with most web browsers.
9.5.2 Image:
• Image representation deals with the creation, representation and management of images on the computer display.
• It also deals with the professional and practical perspectives of virtual image synthesis.
Images can be stored in digital or physical formats and can be viewed or displayed in a variety of
ways, including on screens, printed on paper, or projected onto surfaces.
An image made by a computer can show a basic scene as well as complex scenes.
Images can be created using a variety of techniques, such as scanning a physical photograph or
drawing, rendering a 3D model, or drawing directly onto a computer using a digital tablet or stylus.
Once an image is created, it can be manipulated and edited using specialized software to adjust
color, brightness, contrast, and other visual properties.
In 1963, Ivan Sutherland created a groundbreaking computer program called Sketchpad, which
allowed users to create and manipulate simple images using a light pen and a computer display. This
marked the beginning of interactive computer graphics, and Sketchpad is considered one of the first
examples of a graphical user interface.
Throughout the 1960s and 1970s, researchers continued to develop new techniques and
technologies for computer graphics, including raster graphics, vector graphics, and 3D graphics. In
1972, Pong, one of the first commercially successful video games, was created using computer graphics, and it quickly became a cultural phenomenon.
Image files are digital files that contain visual information, typically in the form of a bitmap or a vector
graphic. There are several types of image files, each with its own characteristics and intended uses.
• An image file refers to any pictorial representation that is stored in the computer memory.
• An image file format refers to the particular format in which an image file is stored.
• A file format stores the number of rows and columns of image pixels.
• File formats are important in the processes of printing, scanning and internet use.
File Formats :
Here are some of the most common types of image files:
JPEG: JPEG (Joint Photographic Experts Group) is a popular file format for digital photos and other
images with complex color gradients. It uses lossy compression, which means that some of the
original image data is discarded to reduce the file size.
PNG: PNG (Portable Network Graphics) is a file format that supports transparent backgrounds and
is commonly used for web graphics and logos. It uses lossless compression, which means that the
original image data is preserved and can be edited without losing quality.
GIF: GIF (Graphics Interchange Format) is a file format that supports animation and is commonly
used for small, simple animations and web graphics. It uses lossless compression and a limited color
palette, which makes it less suitable for photos and other complex images.
TIFF: TIFF (Tagged Image File Format) is a high-quality file format that is often used for printing and
professional graphics applications. It supports lossless compression and preserves full image detail, which typically results in larger files than lossy formats such as JPEG.
Image compression :
There are two types of image compression: lossless and lossy.
Lossless compression algorithms preserve all the original data in the image, while still reducing its
size. Common lossless compression techniques include Run-Length Encoding (RLE), Huffman
coding, and Lempel-Ziv-Welch (LZW) compression. Lossless compression is commonly used in
medical imaging and other fields where it is important to preserve all the original data.
On the other hand, lossy compression algorithms discard some of the original data to achieve greater
compression. The degree of loss can be controlled to balance the reduction in file size with the loss
in image quality. Lossy compression algorithms are commonly used in digital photography and web
graphics, where the file size is more important than the minor loss in image quality.
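As a concrete illustration of the lossless case, the Run-Length Encoding mentioned above can be sketched in a few lines of Python (an illustrative toy, not a production codec; decoding exactly reverses encoding, so no data is lost):

def rle_encode(pixels):
    # Store each run of identical values as a [value, count] pair
    runs = []
    for p in pixels:
        if runs and runs[-1][0] == p:
            runs[-1][1] += 1
        else:
            runs.append([p, 1])
    return runs

def rle_decode(runs):
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out

row = [255, 255, 255, 0, 0, 255]
assert rle_decode(rle_encode(row)) == row      # lossless: the original row is recovered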
JPEG (Joint Photographic Experts Group): This is a standard for compressing digital images, which is widely used for photos and graphics on the web and in digital media. The JPEG standard was first introduced in 1992 and has since become one of the most widely used formats for storing and sharing digital photos and other images. JPEG uses a lossy compression algorithm to reduce the size of digital images. This means that some information is lost during compression, resulting in a reduction in image quality. However, JPEG compression is designed to minimize the loss of image quality while still achieving significant reductions in file size. JPEG files can be opened and viewed on a wide range of devices and software applications, making it a highly versatile format for sharing and storing digital images. However, the lossy compression used by JPEG means that it may not be suitable for all applications, such as professional photography or scientific imaging, where preserving image quality and accuracy is paramount.
MHEG (Multimedia and Hypermedia information coding Expert Group) is a standard for authoring interactive multimedia content, such as television programs and interactive digital TV services. It was developed by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) in the 1990s as a way to enable interactivity and multimedia content in broadcast television. MHEG is based on the concept of declarative programming, which means that instead of writing instructions for the computer to follow, the author of the content describes what the content should look like and how it should behave. This makes it easier for non-programmers to create interactive multimedia content, such as TV shows with interactive elements or quizzes, without needing to know how to write computer code. MHEG is designed to work with a range of devices, including TVs, set-top boxes, and other devices that support digital TV services.
• DC (Dublin Core) is a standard for describing and organizing digital resources, such as web
pages, images, and videos. It was developed by the Dublin Core Metadata Initiative, an
international organization focused on developing metadata standards for describing digital
resources. The Dublin Core standard consists of a set of 15 elements, such as title, creator,
and date, that can be used to describe digital resources in a standardized way. These
elements provide basic information about the resource and are intended to be used in
combination with other metadata standards to provide more detailed information about the
resource.
RDF (Resource Description Framework) is a standard for describing and representing information
on the web. It provides a way to express metadata about resources on the web, such as web pages,
images, and videos, in a machine-readable format that can be understood by computers. RDF is
based on the idea of using simple statements, or "triples," to describe relationships between
resources. Each triple consists of a subject, a predicate, and an object, which together express a
statement about the resource. For example, a triple might describe the relationship between a web
page, its author, and the date it was created. RDF is designed to be flexible and extensible, allowing
users to define their own vocabularies and ontologies for describing resources.
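A tiny sketch of the triple idea, using hypothetical example.org resources and Dublin Core-style predicate names purely for illustration:

# RDF-style statements as (subject, predicate, object) triples
triples = [
    ("http://example.org/page1", "dc:creator", "A. Author"),
    ("http://example.org/page1", "dc:date",    "2023-01-15"),
    ("http://example.org/page1", "dc:title",   "An Example Page"),
]

# Query: everything stated about a given subject
about_page1 = [(p, o) for (s, p, o) in triples if s == "http://example.org/page1"]
print(about_page1)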
9.6 Summary
In this unit we discussed the meaning of multimedia and various
developments in the field. Let us summarize the important points discussed
in the unit:
Multimedia involves multiple media like text, image, graphics, audio, video
and animation. These provide more effective ways to communicate ideas and
views when compared to the traditional textual form. Hypertext allows non-
sequential reading and writing of documents by using embedded links to
jump from one place in the document to another. Hypermedia is a computer-based information retrieval system that enables a user to gain or provide access to texts, audio and video recordings.
A multimedia system enables the user to navigate to specific portions of the
content as desired thus providing non-linearity property. Interactivity helps the
user to get involved with the system. Multimedia is used in different fields in order to improve the quality of the work.
This unit concludes with the discussion of multimedia standard, which refers
to the exchange of content. It is important to be aware of the standards and
use them wisely. The goal of the standards was to develop a Coded
Representation of Multimedia and Hypermedia Information.
The development of powerful multimedia computers and the evolution of the
Internet have led to a wide range of applications of multimedia.
9.7 Terminal Questions
1. List and explain the various criterions of multimedia.
2. Explain the uses of multimedia.
3. Discuss in detail the difference between hypertext and hypermedia.
4. Explain the types and uses of hypertext.
5. Discuss the file formats of video.
6. Explain the Standards of multimedia in detail.
9.8 Answers
Self-Assessment Questions
1. "rich media"
2. multus
3. True
4. Voice
5. a. Engineering
6. Education
7. tele - medicine
8. Static or dynamic
9. It is used as a logical extension of the term hypertext; different mediums
are intertwining to create a generally nonlinear medium of information.
10. Multimedia
11. Ivan Sutherland
12. TIFF
13. Analog
14. Multimedia and Hypermedia information coding Expert Group
15. SGML
16. RDF
Terminal Questions
1. Media can be classified with respect to different criteria i.e., perception,
representation, presentation, storage, transmission, and information
exchange. Refer sub sections 9.2.
2. Multimedia has found large applications in various areas including
advertisements, art, education, entertainment, engineering etc. Refer
section 9.3.
3. Hypertext links information based on user demand, leading the user towards the related information. Refer section 9.4.
4. Hypertext documents can either be static (prepared and stored in
advance) or dynamic (continually changing in response to user input).
Refer section 9.4.
5. File formats: There are several common video file formats used in
computer graphics and video production. Some of the most popular ones
include: MP4,AVI,MOV,WMV,FLV. Refer section 9.5
6. Standards in multimedia refer to agreed-upon specifications and
guidelines for creating and delivering multimedia content, including
audio, video, images, and interactive media. Refer section 9.6.
Unit 10 Audio
Structure:
10.1 Introduction
Objectives
10.2 Standard and the compression technique
10.3 Digital Audio
10.4 MIDI
MIDI Basic Concepts
MIDI Devices
MIDI Messages
10.5 Processing and Sampling Sound
10.6 Compression
Differential Pulse Code Modulation
Adaptive Differential PCM
Adaptive Predictive Coding
Linear Predictive Coding
10.7 Summary
10.8 Terminal Questions
10.9 Answers
10.1 Introduction
In the previous unit, we discussed the basic concepts and applications of
multimedia and explored the difference between hypertext and hypermedia
as well as its significance. We also discussed the standards of image, video
and audio. Apart from image and text, audio too plays an important role in
various multimedia applications. Audio comprises continuously varying analog signals, which are converted into digital form by the digitization process known as PCM (Pulse Code Modulation). MIDI, which stands for
“Musical Instrument Digital Interface,” is a system that allows electronic
musical instruments and computers to send instructions to each other. It
sounds simple, but MIDI provides some profound creative opportunities that
are discussed in this unit. The process of sampling is the reduction of a
continuous signal to a discrete signal. Finally we will discuss the concept of
audio compression. It is a form of data compression designed to reduce the
transmission bandwidth requirement of digital audio streams and the storage
size of audio files.
Objectives:
After studying this unit, you should be able to:
• explain the concepts of standards in audio
• explain the concepts of digital audio
• discuss the MIDI concepts
• explain the concept of processing and sampling sound
• list and discuss various audio compression techniques
Standards in audio refer to the technical specifications and guidelines that are used
to ensure interoperability, compatibility, and quality in the creation, recording,
processing, and distribution of audio signals. There are several standards in audio,
including:
1. Sampling rate and bit depth: The sampling rate and bit depth are technical specifications that define the quality and resolution of an audio signal. The most common sampling rate for audio is 44.1 kHz, while the most common bit depth is 16 bits (a short sketch after this list shows how these two parameters determine the size of uncompressed audio).
2. Digital Audio Workstation (DAW) standards: DAWs are software applications
that are used for recording, editing, and producing audio. Common DAW standards
include the Audio Engineering Society (AES) and the Broadcast Wave Format
(BWF).
3. Audio codecs: Audio codecs are software algorithms that are used to
compress and decompress audio data. Common audio codecs include MP3, AAC,
and FLAC.
4. Audio file formats: Audio file formats are used to store and distribute audio
data. Common audio file formats include WAV, AIFF, MP3, and FLAC.
5. Loudness standards: Loudness standards are used to ensure that audio
signals have consistent volume levels across different platforms and devices. The
most common loudness standards include the European Broadcasting Union (EBU)
R128 and the Advanced Television Systems Committee (ATSC) A/85.
Adhering to these standards ensures that audio signals can be recorded, processed,
and distributed with consistency, compatibility, and high quality.
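The sketch referred to above: a simple calculation of uncompressed PCM audio size from sampling rate, bit depth and channel count (plain Python; the function name is ours):

def pcm_size_bytes(duration_s, sample_rate, bit_depth, channels):
    # duration * samples per second * bytes per sample * channels
    return duration_s * sample_rate * (bit_depth // 8) * channels

# One minute of CD-quality stereo: 60 * 44100 * 2 * 2 = 10,584,000 bytes (about 10 MB)
print(pcm_size_bytes(60, 44100, 16, 2))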
10.4 MIDI
MIDI stands for Musical Instrument Digital Interface. It is a technical standard that
was developed in the 1980s to allow electronic musical instruments, computers, and
other devices to communicate with each other. MIDI uses a standardized protocol to
transmit information about musical events, such as notes played, pitch, duration,
velocity, and other performance parameters.
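To make the idea of a MIDI message concrete, the sketch below builds the raw three bytes of a Note On message (status byte 0x90 plus channel number, followed by note and velocity); the helper name is ours, and real applications would normally use a MIDI library rather than raw bytes:

def note_on(channel, note, velocity):
    # Note On: status byte 0x90 | channel, followed by note and velocity (each 0-127)
    return bytes([0x90 | (channel & 0x0F), note & 0x7F, velocity & 0x7F])

# Middle C (note number 60) on channel 0, velocity 100
print(note_on(0, 60, 100))     # b'\x90<d'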
MIDI data can be recorded and edited in a MIDI sequencer or Digital Audio
Workstation (DAW), allowing musicians to create and manipulate musical
compositions using software instruments, virtual synthesizers, and other digital tools.
MIDI files can also be played back on MIDI-compatible hardware devices, such as
synthesizers, drum machines, and sequencers.
MIDI has revolutionized the way music is created, produced, and performed. It has
enabled musicians to create complex arrangements and orchestral scores with ease,
and has made it possible to synchronize music with video, lighting, and other
multimedia elements. MIDI has also paved the way for the development of electronic
dance music (EDM), video game music, and other forms of electronic music that rely
heavily on digital instrumentation and production techniques.
instrument to the MIDI IN of another instrument, and vice versa. For example,
the following figure 10.2 shows the connection between a computer's MIDI
interface and a MIDI keyboard that has built-in sounds.
Figure 10.2: Connection between computer MIDI interface and MIDI keyboard
10.4.2MIDI devices
Any musical instrument that satisfies both components of the MIDI standard is capable of communicating with other MIDI devices through
PCM System
Two basic operations in the conversion of analog signal into the digital is time
discretization and amplitude discretization. In the context of PCM, the former
is accomplished with the sampling operation and the latter by means of
quantization. In addition, PCM involves another step, namely, conversion of
quantized amplitudes into a sequence of simpler pulse patterns (usually
binary), generally called as code words. (The word code in pulse code
modulation refers to the fact that every quantized sample is converted to an
R-bit code word.) Figure 10.5 illustrates a PCM system. The decoder reconstructs an approximation of the original analog signal, denoted by m̂(t); if there are no channel errors, m̂(t) ≈ m(t).
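A toy sketch of the two discretization steps (sampling, then uniform quantization to an R-bit code word), written in Python/NumPy under our own simplifying assumptions (signal amplitude confined to the range -1 to 1, simple uniform quantizer):

import numpy as np

def pcm_encode(signal, sample_times, n_bits):
    levels = 2 ** n_bits
    samples = np.array([signal(t) for t in sample_times])   # time discretization (sampling)
    codes = np.floor((samples + 1.0) / 2.0 * levels)         # amplitude discretization (quantization)
    return np.clip(codes, 0, levels - 1).astype(int)         # R-bit code words

# Example: a 1 kHz tone sampled at 8 kHz and quantized to 8 bits
times = np.arange(0.0, 0.01, 1.0 / 8000.0)
codes = pcm_encode(lambda t: np.sin(2 * np.pi * 1000 * t), times, 8)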
10.6 Compression
Compression involves the encoding of digital audio data to take up less
storage space and transmission bandwidth. Audio compression typically uses lossy methods, which eliminate bits that are not restored at the other end.
10.6.1 Differential Pulse Code Modulation
Differential Pulse Code Modulation (DPCM) is derived from PCM and is based
on the fact that most audio signals show significant correlation between
successive samples. As a result, encoding each sample independently incorporates redundancy in the sample values. Therefore, in DPCM only the difference between two successive samples is considered instead of the original sample signal. In DPCM, only the digitized difference signal is used to encode the waveform, thus taking fewer bits as compared to the PCM signal with the same sampling
rate. Figure 10.6 shows a DPCM encoder and decoder.
(a)
(b)
Figure 10.6: DPCM (a) Encoder (b) Decoder
The DPCM encoder as shown in figure 10.6 (a) consists of a Register (R). It
is a temporary storage where the previous digitized sample of the analog input
signal is stored. The difference signal (that is, DPCM in figure 10.6(a)) is
computed by a subtractor which subtracts the current contents of the register
from the new digitized sample or in other words the output by the ADC (PCM).
As shown in figure 10.6, the contents of register R is updated by adding the
current register contents and the computed difference signal output by the
subtractor. The DPCM values thus computed are fed to a parallel-to-serial converter and then transmitted.
The block diagram of the decoder is shown in figure 10.6 (b). The decoder
performs the reverse operation of the encoder. So, it operates by adding the
received DPCM to the previously computed signal held in the register. The
compression achieved through DPCM is typically limited to 1 bit per sample. Therefore the bit rate required for standard PCM voice signals is reduced from 64 kbps to 56 kbps.
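The encoder/decoder pair described above can be sketched in a few lines (a plain Python toy that transmits exact differences and ignores quantization of the difference signal; the function names are ours):

def dpcm_encode(samples):
    register = 0
    diffs = []
    for s in samples:
        diffs.append(s - register)   # difference between the new sample and the register contents
        register = s                 # register now holds the previous digitized sample
    return diffs

def dpcm_decode(diffs):
    register = 0
    out = []
    for d in diffs:
        register += d                # add the received DPCM value to the previous value
        out.append(register)
    return out

samples = [10, 12, 13, 13, 11]
assert dpcm_decode(dpcm_encode(samples)) == samples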
The accuracy of each computed difference signal is determined by the
accuracy of the previous signal/value held in the register. So the overall
accuracy of DPCM depends on the register value; moreover, the previous value in the register is just an approximation. Hence, a more enhanced technique has been developed to estimate a more accurate value of the previous signal. This technique is known as prediction. In this technique, the value of the previous signal is predicted by using not only the estimate of the current signal but also varying proportions of other immediately preceding estimated signals.
(a)
(b)
Figure 10.7: Predictive DPCM (a) Encoder (b) Decoder
(a)
(b)
Figure 10.8: (a) ADPCM encoder (b) ADPCM decoder
As shown in figure 10.8 (a), the encoder consists of two filters. One filter
passes frequency in the range 50Hz to 3.5 kHz while the other passes
frequency in the range 3.5 kHz to 7 kHz. Therefore, the input signal is divided
into two signals: lower sub-band and upper sub-band signal. Each sub-band
signal is then sampled and encoded independently using ADPCM. The
sampling rate for the upper sub-band signal is 16ksps (kilo samples per
seconds) as it contains the higher frequency components and the sampling
rate for lower sub-band signal is 8ksps. Therefore, different bit rates can be
used for each of them. The bit rates can be 64, 56, or 48 kbps. Considering
a bit rate of 64 kbps, then the lower sub-band is ADPCM encoded at
48 kbps and the upper sub-band is encoded at 16 kbps. The two bit streams
are then merged together using a multiplexer to produce the signal to be
transmitted.
On receiving the encoded stream the de-multiplexer in the decoder divides
the stream into two separate streams depending on the frequency range as
shown in figure 10.8 (b). The stream within the low band range is decoded by a lower sub-band ADPCM decoder. Similarly, the stream within the upper sub-band range is decoded by the upper sub-band ADPCM decoder. The decoded
stream is passed through a low pass filter which produces the speech signal.
10.6.3 Adaptive Predictive Coding
The principle of Adaptive Predictive Coding (APC) is to make the predictor coefficients adaptive, since the predictor coefficients vary continuously as they are dependent on the characteristics of the audio signal being digitized.
Therefore the input signal is divided into fixed time segments and the
characteristics are determined for each segment. The optimum set of
coefficients is then computed and these are used to predict the previous
signal more accurately. Using this method, a high level of compression is
achieved thereby reducing the bandwidth requirement to 8 kbps, while still
maintaining an acceptable perceived signal.
10.6.4 Linear Predictive Coding
The algorithms discussed in the previous sections 10.6.1 to 10.6.3 are based
on sampling and then quantizing only the difference signal. An alternative
approach also exists and is called Linear Predictive Coding (LPC), which
involves the analysis of the audio waveform to determine a selection of the perceptual features it contains.
(a)
(b)
Figure 10.9: (a) LPC encoder (b) LPC decoder
10.7 Summary
In this unit we began with a discussion on the basic concepts of audio and
about the MIDI (Musical Instrument Digital Interface). MIDI is a protocol used
to perform a direct connection between a MIDI instrument and the computer.
This protocol is widely used by composers, musicians and performers as a
tool. All sound waves including speech and music can be approximated using
audio signals. We discussed the Pulse Code Modulation technique by which
a sampled analog signal is converted into digital audio signal.
We also learnt some of the basic principles of audio compression. In DPCM,
the difference between the consecutive signals is considered for encoding
instead of the original signal. Therefore the overall accuracy of DPCM
depends on the accuracy of the previous signal. To overcome this, an
enhanced technique known as Predictive DPCM was designed in which the
value of the previous signal is predicted by using not only the estimate of the
current signal but also the varying proportions of other immediately preceding
estimated signals. Another technique similar to DPCM called Adaptive DPCM
(ADPCM) uses variable bits to represent the difference signals. We also
discussed another technique called LPC which is based on the perceptual
features of the signal.
10.9 Answers
Self Assessment Questions
1. Digitizing
2. True
3. Musical Instrument Digital Interface
4. MIDI IN and MIDI OUT
5. d. Omni On/Off
6. Pulse code Modulation
7. Band limited signal.
8. amplitudes
9. Period
10. LPC
Terminal Questions
1. Digital audio has emerged because of its usefulness in the recording,
manipulation, mass-production, and distribution of sound. Refer section
10.3
2. The MIDI is a standard protocol that enables electronic musical
instruments) and computers to communicate and synchronize with each
other. Refer Sub-sections 10.4.1,10.4.2,10.4.3.
3. Sampling the audio signal at a minimum rate which is twice that of the
maximum frequency component of the signal. Refer Section 10.5
4. Differential Pulse Code Modulation (DPCM) is derived from PCM and is
based on the fact that most audio signals show significant correlation.
Refer Sub-section 10.6.1.
5. ADPCM is based on the same principle as that of DPCM except that the
difference signal obtained is represented by variable number of bits
depending on its amplitude. Refer Sub-section 10.6.2.
6. LPC involves the analysis of the audio waveform to determine a selection
of the perceptual features it contains. Refer Sub-section 10.6.4.
Unit 11 Video
Structure:
11.1 Introduction: The concept of video in Multimedia
Objectives
11. 2 Compression Techniques
11.3 MPEG Compression Standard
MPEG-1
MPEG-2
MPEG-4
11.4 Compression Through spatial and temporal Redundancy
11.5 Inter Frame and Intra Frame Compression
11.6 Summary
11.7 Terminal Questions
11.8 Answers
Objectives:
After studying this unit, you should be able to:
• list and discuss various MPEG compression standards.
• discuss the techniques of compression through redundancy.
• list and explain the types of frames.
• explain the concept of interframe and intraframe compression.
Video data may be represented as a series of still frames, or fields for interlaced
video. The sequence of frames will almost certainly contain both spatial and
temporal redundancy that video compression algorithms can use. Most video
compression algorithms use both spatial compression, based on redundancy
within a single frame or field, and temporal compression, based on
redundancy between different video frames.
Lossy algorithms
Lossy compression is typically used when a file can afford to lose some data, or when storage space needs to be drastically reduced. Here, an algorithm scans image files and reduces their size by discarding information considered less important or undetectable to the human eye.
Lossless algorithms
With lossless compression the file data is restored and rebuilt in its original form after decompression, enabling the image to take up less space without any discernible loss in picture quality. No data is lost and, as the process can be reversed, it is also known as reversible compression.
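As a simple illustration of exploiting temporal redundancy, the sketch below is a minimal, lossless example only; real codecs combine the idea with motion compensation and transform coding. It stores the first frame in full and only the pixel differences for the frames that follow.

# Minimal sketch of temporal compression: store frame 0 in full, then only the
# pixel differences for each following frame (lossless in this toy example).
import numpy as np

def encode_sequence(frames):
    """frames: list of equally sized 2-D arrays (grey-level frames)."""
    encoded = [frames[0].copy()]                          # reference frame in full
    for prev, curr in zip(frames, frames[1:]):
        encoded.append(curr.astype(int) - prev.astype(int))  # difference frame
    return encoded

def decode_sequence(encoded):
    frames = [encoded[0].copy()]
    for diff in encoded[1:]:
        frames.append(frames[-1] + diff)                  # add the differences back
    return frames

# A static background with one moving pixel: the difference frame is almost
# entirely zeros, which an entropy coder can compress heavily.
f0 = np.zeros((4, 4), dtype=int); f0[1, 1] = 255
f1 = np.zeros((4, 4), dtype=int); f1[1, 2] = 255
enc = encode_sequence([f0, f1])
assert (decode_sequence(enc)[1] == f1).all()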
11.3.1 MPEG-1
The MPEG-1 video standard is defined in ISO Recommendation 11172. It is
basically used for VHS-quality audio and video on CD-ROM at a bit rate
of 1.5 Mbps. MPEG-1 uses a combination of I-frames only, I- and P-frames
only, or I-, P- and B-frames. It does not support D-frames. The compression
algorithm used is based on the H.261 standard with two main differences. The
first is that timestamps are inserted to enable the decoder to resynchronize
quickly when there are one or more corrupted or missing macroblocks. The
number of macroblocks between two timestamps is known as a slice, and a
slice can comprise from 1 up to the maximum number of macroblocks in a
frame, which is 22. The second difference arises because B-frames are
supported by MPEG-1, which increases the time interval between two
consecutive I- or P-frames.
11.3.2 MPEG-2
The MPEG-2 video standard is defined in ISO Recommendation 13818. It is
used in the recording and transmission of studio quality audio and video. The
basic coding structure of MPEG-2 video is the same as that of MPEG-1 with
some differences. In this case there are different levels of video resolution
possible:
Low: It is based on the SIF digitization format with a resolution of 352 x 288
pixels. It is comparable with the MPEG-1 format and produces VHS-quality
video. The audio is of CD quality and the target bit rate is up to 4 Mbps.
Main: It is based on the 4:2:0 digitization format with a resolution of 720 x 576
pixels. It produces studio-quality video and audio with a bit rate up to 15 Mbps,
or 20 Mbps with the 4:2:2 digitization format.
High 1440: It is based on the 4:2:0 digitization format with a resolution of
1440 x 1152 pixels. It is proposed for high definition television (HDTV) at bit
rates up to 60 Mbps, or 80 Mbps with the 4:2:2 digitization format.
High: It is based on the 4:2:0 digitization format with a resolution of
1920 x 1152 pixels. It is proposed for wide-screen HDTV at bit rates up to
80 Mbps, or 100 Mbps with the 4:2:2 digitization format.
For each of the above levels, MPEG-2 provides five profiles: simple, main,
spatial resolution, quantization accuracy, and high. The four levels and five
profiles collectively form a two-dimensional table which acts as a framework
for all standards activities associated with MPEG-2. Since the Low level is
compatible with MPEG-1, only the main profile at the main level (MP@ML)
is discussed here.
MP@ML: The main application of MP@ML is digital television broadcasting.
Hence interlaced scanning is used with a frame refresh rate of 30 Hz for NTSC
and 25 Hz for PAL. The 4:2:0 digitization format is used and the bit rate ranges
from 4 Mbps to 15 Mbps. The coding scheme is similar to MPEG-1 with only
a difference in the scanning method: it uses interlaced scanning instead of
progressive scanning, which results in each frame having two fields, odd and
even, each with a refresh rate of 60 Hz for NTSC or 50 Hz for PAL.
Therefore for an I-frame the DCT blocks have to be derived from each
macroblock, and there are two alternatives:
• Field Mode: In this mode the DCT blocks are encoded from each field
separately. It is used when a large amount of motion is present, since the
time difference between successive fields is shorter than that between
successive frames and the resulting compression ratio is higher. For
example, a live cricket match can be encoded using this mode as there is
a large amount of movement.
• Frame Mode: In this mode the DCT blocks are encoded from each
complete frame. It is used when there is only a small amount of movement,
so that the differences between the two fields of a frame are small. For
example, a news broadcast can be encoded using this mode since there is
relatively little movement from frame to frame.
Similarly, for encoding P- and B-frames, three different modes are possible:
field, frame and mixed. The field and frame modes work in the same way as
for I-frames, with the additional consideration of the preceding frame or field
for both P- and B-frames and, for B-frames, the immediately succeeding field.
In the mixed mode, the motion vectors for both the frame and field modes are
computed and the one with the smallest value is selected.
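The mixed-mode decision described above can be sketched as follows, interpreting "smallest value" as the shorter motion vector; the vector values below are hypothetical, whereas a real encoder would obtain them from motion estimation.

# Sketch of the mixed-mode choice: compute both candidate motion vectors and
# keep the one with the smaller magnitude (values are hypothetical).
import math

def choose_mode(frame_mv, field_mv):
    """frame_mv, field_mv: (dx, dy) motion vectors; returns the chosen mode and vector."""
    frame_len = math.hypot(*frame_mv)
    field_len = math.hypot(*field_mv)
    return ("frame", frame_mv) if frame_len <= field_len else ("field", field_mv)

print(choose_mode(frame_mv=(4, 1), field_mv=(1, 0)))   # -> ('field', (1, 0))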
11.3.3 MPEG-4
This standard is related to interactive audio and video over the Internet and
other entertainment networks. It has a feature that allows the user not only to
access a video sequence but also to influence, independently, the individual
elements that make up the video sequence. In short, using the MPEG-4
standard the user can be given not only the start, stop and pause functions
but also the ability to reposition, delete and alter the movement of individual
characters within a scene. The MPEG-4 standard has a very high coding
efficiency and therefore it can be used over low bit-rate networks such as
wireless networks and PSTNs (Public Switched Telephone Networks). The
MPEG-4 standard is a good alternative to the H.263 standard.
An important feature of MPEG-4 is content-based coding, which means that
before the video is compressed, each scene is defined in the form of a
background and one or more foreground audio-visual objects (AVOs). Each
AVO is composed of one or more video objects and/or audio objects. Taking
the example of a news broadcast, the laptop in a scene can be considered a
single video object, while the news reader can be defined using both an audio
and a video object. Similarly, each video and audio object can further be made
up of a number of sub-objects. In the news reader example, there is movement
only in his/her eyes and mouth, so the reader's face can be defined in the form
of three sub-objects: one each for the head, the eyes and the mouth. Once the
content-based coding is done, the encoding of the background and of each
AVO is carried out separately.
Each audio and video object is described by an object descriptor which
enables a user to manipulate the object. The language used to describe the
objects and to define functions for manipulating their shape, size and location
is called the Binary Format for Scenes (BIFS). In a complete scene there may
be many AVOs, and some relation may exist between them; the relation
between the AVOs is defined by a scene descriptor. Each video scene is
segmented into a number of Video Object Planes (VOPs), each of which
corresponds to an AVO of interest. For example, in the news broadcast
example, VOP0 represents the news reader, VOP1 represents the laptop on
the table, and VOP2 represents the background setting of the studio. Each
VOP is encoded separately based on its shape, motion and texture, as shown
in figure 11.3.
The resulting bit stream from the VOP encoder is encoded for transmission
by multiplexing the VOPs together with the related object and scene
descriptors, as shown in figure 11.4(a). Similarly, at the receiver, the bit
stream is first demultiplexed and the individual VOPs decoded. The individual
VOPs, together with the object and scene descriptors, are then used to create
the video frame that is played out on the terminal, as shown in figure 11.4(b).
Figure 11.4: MPEG-4 (a) Encoder (b) Decoder
The audio associated with an AVO is compressed using any one of a number
of algorithms, depending on the available bit rate of the transmission channel
and the sound quality required. For example, CELP can be used for video
telephony. The audio encoder and decoder are included inside the MPEG-4
encoder and decoder, as shown in figures 11.4(a) and (b).
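The object-based organization described above can be illustrated with the sketch below. It is an illustrative data structure only, not the actual BIFS syntax; all names and values are hypothetical.

# Illustrative sketch (not the real BIFS format) of an MPEG-4 style scene:
# a background plus separately coded AVOs, each with an object descriptor,
# tied together by a scene descriptor.
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class ObjectDescriptor:
    name: str                      # e.g. "news_reader" (hypothetical label)
    position: Tuple[int, int]      # placement of the VOP in the composed frame
    scale: float = 1.0

@dataclass
class AudioVisualObject:
    descriptor: ObjectDescriptor
    vop_bitstream: bytes           # separately encoded shape/motion/texture
    audio_bitstream: bytes = b""   # optional associated audio (e.g. CELP)

@dataclass
class SceneDescriptor:
    background: bytes              # encoded background VOP
    objects: List[AudioVisualObject] = field(default_factory=list)

# A receiver would demultiplex the stream, decode each VOP and then use the
# descriptors to composite the frame; a user could reposition or delete an
# object simply by editing its descriptor before compositing.
scene = SceneDescriptor(background=b"", objects=[
    AudioVisualObject(ObjectDescriptor("news_reader", (100, 40)), b""),
    AudioVisualObject(ObjectDescriptor("laptop", (320, 200)), b""),
])
scene.objects[1].descriptor.position = (280, 200)   # the user moves the laptop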
Self Assessment Questions
1. MPEG-1 does not support D-frames. (True/False)
2. The MPEG-2 video standard is defined in ISO Recommendation _______.
3. PSTN stands for _______.
a) Public Standard Telephone Network
b) Public Switched Telephone Network
c) Private Switched Telephone Network
d) Private Standard Telephone Network
4. In a VOP encoder, the relation between the AVOs is defined by _______.
B-frames have the highest level of compression and, because they are not
involved in the coding of other frames, they do not propagate errors.
When B-frames are received at the destination, they are first decoded and
then the resulting information is used, together with the decoded information
of the preceding I- or P-frame and of the immediately succeeding I- or P-frame,
to derive the decoded frame contents. While decoding a B-frame, if either the
preceding or the succeeding I- or P-frame is not available, the time required
to decode the B-frame increases. Therefore, to minimize the decoding time,
all the required frames should be made available and the frames are
reordered. For example, if the uncoded frame sequence is
I B B P B B P B B I ... then,
Let us number these frames for easy understanding:
0 1 2 3 4 5 6 7 8 9
I B B P B B P B B I ...
Then the reordered sequence would be:
0 3 1 2 6 4 5 9 7 8
I P B B P B B I B B ...
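The reordering rule, namely that every reference I- or P-frame is transmitted before the B-frames that depend on it, can be sketched as follows, assuming the simple dependency pattern used in the example above (each B-frame references the nearest preceding and succeeding I/P frame).

# Sketch of reordering a display-order GOP into transmission order so that the
# reference frames a B-frame depends on arrive before it.
def transmission_order(frames):
    """frames: list of (index, type) in display order, type in {'I','P','B'}."""
    out, pending_b = [], []
    for idx, ftype in frames:
        if ftype == 'B':
            pending_b.append((idx, ftype))      # hold until the next reference frame
        else:
            out.append((idx, ftype))            # send the reference frame first
            out.extend(pending_b)               # then the B-frames that need it
            pending_b = []
    return out + pending_b

display = list(enumerate("IBBPBBPBBI"))
print([i for i, _ in transmission_order(display)])
# -> [0, 3, 1, 2, 6, 4, 5, 9, 7, 8], i.e. I P B B P B B I B B

The printed order matches the reordered sequence shown above.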
The more aggressively lossy compression is applied, the farther one pushes
beyond the perception threshold, with the result being a noticeable reduction
in image quality.
Self Assessment Questions
5. The repetition of the same information many times is known as _______.
6. Additional information that indicates the small differences between the predicted and
actual positions of the moving segments is called
a. motion estimation
b. motion calculation
c. motion compensation
d. motion segmentation
7. The number of frames between a P-frame and the immediately preceding I-frame or
P-frame is called _______.
8. A spatially compressed frame is often referred to as an _______.
9. CODEC is the short form of Compressor/Decompressor. (True/False)
10. _______ identifies the differences between frames and stores only those differences.
11.6 Summary
This unit provides information about the various video compression
techniques. Let us recapitulate the unit content. The popular MPEG family of
standards has evolved from the original MPEG-1 audio and video
compression schemes into MPEG-2, now used for digital cable, satellite and
terrestrial broadcasting, DVD, high definition and many other applications,
and into MPEG-4, which includes several improvements over MPEG-2, such
as multi-directional motion vectors, ¼-pixel offsets, object coding for greater
efficiency, separate control of individual picture elements and behavior coding
for interactive use. Spatial and temporal compression are the two categories
of compression techniques that help compress video efficiently. Interframe
and intraframe compression help save large amounts of data compared with
storing or transmitting a full description of every pixel in the original image.
11.8 Answers
Self-Assessment Questions
1. True
2. 13818
3. (b) Public Switched Telephone Network
4. scene descriptor.
5. redundancy
6. (c) motion compensation
7. Prediction span
8. intraframe
9. True
10. Temporal compression
Terminal Questions
1. The MPEG-1 video standard is defined in ISO Recommendation 11172.
Refer subsection 11.3.1.
2. Each video scene is segmented into a number of Video Object Planes
(VOP), each of which corresponds to an AVO of interest. Refer sub
section 11.3.3.
3. The resulting bit stream from the VOP encoder is encoded for
transmission by multiplexing the VOPs together with the related object
and scene descriptors. Similarly, at the receiver, the bit stream is first
demultiplexed and the individual VOPs decoded. Refer Sub Section
11.3.3.
4. The repetition of same information many times is known as redundancy.
Redundancy has been categorized into two types they are spatial and
temporal redundancy. Refer Section 11.4.
5. Redundancy can be eliminated / reduced by taking into consideration only
those portions in a frame that involves some changes compared to the
previous frame. Depending on the type of redundancy exploited there are
different types of frames. Refer sub section 11.4.1.
6. Intraframe compression looks for redundant or imperceptible data within
each frame and eliminates it, but keeps each frame separate from all
others. Interframe compression looks for portions of the image which don't
change from frame to frame and encodes them only once. Refer Section
11.5.
Unit 12 Animation
Structure:
12.1 Introduction
Objectives
12.2 Role of animation in Multimedia
12.3 Types and Techniques of Animation
• Stop Motion
• Computer Animation
• Traditional Animation
12.4 Key Frame Animation
12.5 Utility
12.6 Morphing
12.7 Virtual Reality Concepts
Types of VR
12.8 Summary
12.9 Terminal Questions
12.10 Answers
12.1 Introduction
In the previous unit we discussed the MPEG standard compression. We dealt
in detail about how the compression can be done through spatial and temporal
redundancy. We also explored the techniques of interframe and intraframe
compression.
In this unit we will discuss the topic of animation. Animation is a visual
technique that provides the illusion of motion by displaying a collection of
images in rapid sequence. Each image contains a small change, for example
a leg moves slightly, or the wheel of a car turns. When the images are viewed
rapidly, the human eye fills in the details and the illusion of movement is
complete.
Technically, animation can be defined as a simulation of movement created
by displaying a series of pictures, or frames. Cartoons on television are one
example of animation. Animation on computers is one of the chief ingredients
of multimedia presentations. There are many software applications that
enable one to create animations that can be displayed on a computer monitor.
We will also discuss the concept of key frame animation and morphing, and
examine in detail the different techniques of morphing. We will conclude this
unit with a discussion of virtual reality concepts and their various types.
Objectives:
After studying this unit, you should be able to:
• explain the concept of animation
• list and explain the techniques of animation.
• describe the role of key frame animation
• explain morphing
• analyse the concepts of virtual reality.
12.2 Role of Animation in Multimedia
Animation plays a crucial role in multimedia as it allows for the creation of engaging,
interactive, and visually appealing content that can be used in various applications
such as video games, movies, educational materials, marketing campaigns, and
more.
Here are some of the key roles of animation in multimedia:
• Enhancing engagement: Animation can bring characters and stories to life,
creating a more immersive and engaging experience for the audience.
• Communicating complex ideas: Animation can be used to convey complex
ideas or concepts in a simple and easy-to-understand way. This is particularly
useful in educational materials, where animations can be used to visualize
abstract or complex concepts.
• Creating visual appeal: Animation can be used to create visually stunning
and memorable content that captures the attention of the audience. This is
particularly important in marketing and advertising, where animations can be
used to create memorable brand experiences.
• Providing interactivity: Animation can be used to create interactive
experiences that allow users to engage with content in a more meaningful
way. This is particularly useful in video games and e-learning, where
interactivity can help to enhance learning outcomes and engagement.
Overall, animation is a powerful tool that can be used to create engaging,
interactive, and visually appealing content in multimedia.
12.3 Types and Techniques of Animation
12.3.1 Traditional Animation
Full Animation refers to the process of producing high-quality, traditionally
animated films, in styles ranging from the more realistic works of the Walt
Disney Studio (Beauty and the Beast, Aladdin, Lion King) to the more
'cartoony' styles of those produced by the Warner Bros. Animation Studio.
Limited Animation is a process of making animated cartoons that does not
redraw entire frames but variably reuses common parts between frames. One
of its major trademarks is the stylized design in all forms and shapes, which
in the early days was referred to as modern design. Pioneered by the artists
at the American studio, United Productions of America, limited animation can
be used as a method of stylized artistic expression. Its primary use, however,
has been in producing cost-effective animated content for media such as
television (the work of Hanna-Barbera, Filmation, and other TV animation
studios) and later on the Internet (web cartoons).
Rotoscoping is an animation technique in which animators trace over live-
action film movement, frame by frame, for use in animated films. Originally,
recorded live-action film images were projected onto a frosted glass panel
and re-drawn by an animator. This projection equipment is called a
rotoscope, although this device has been replaced by computers in recent
years. In the visual effects industry, the term rotoscoping refers to the
technique of manually creating a matte for an element on a live-action plate
so that it can be composited over another background.
Live Action/Animation is a motion picture that features a combination of real
actors or elements (live action) and animated elements, typically interacting.
Originally, animation was combined with live action in several ways,
sometimes as simply as double-printing two negatives onto the same release
print. More sophisticated techniques used optical printers or aerial image
animation cameras, which enable more exact positioning, and better
interaction of actors and animated characters. Often, every frame of the live
action film was traced by rotoscoping, so that the animator could add his
drawing in the exact position. The combination of live action and animation
is very common in TV commercials, especially those promoting products
appealing to children.
12.3.2 Stop Motion
Stop motion (also known as stop action) is an animation technique used to
make a physically manipulated object appear to move on its own. The object
is moved in small increments between individually photographed frames,
creating the illusion of movement when the series of frames is played as a
continuous sequence. Clay figures are often used in stop motion for their ease
of repositioning. Motion animation using clay is called clay animation or
claymation.
Puppet Animation typically involves stop-motion puppet figures interacting
with each other in a constructed environment, in contrast to the real-world
interaction in model animation. The puppets generally have an armature inside
of them to keep them still and steady as well as constraining their movements
to particular joints. Examples of puppet animation include The Tale of the Fox
(France, 1937), The Nightmare Before Christmas (US, 1993), Corpse Bride
(US, 2005), Coraline (US, 2009).
Puppetoon animation is a type of replacement animation, which is itself a
type of stop-motion animation. In traditional stop-motion, the puppets are
made with movable parts which are repositioned between frames to create
the illusion of motion when the frames are played in rapid sequence. In
puppetoon animation the puppets are rigid and static pieces. Each puppet is
typically used in a single frame and then switched with a separate, near-
duplicate puppet for the next frame. Thus puppetoon animation requires many
separate figures. It is thus more analogous in a certain sense to cel animation
than is traditional stop-motion as the characters are created from scratch for
each frame.
Clay Animation: Here, each object is sculpted in clay or a similarly pliable
material such as plasticine, usually around a wire skeleton called an
armature. As in other forms of object animation, the object is arranged on the
set (background), a film frame is exposed, and the object or character is then
moved slightly by hand. Another frame is taken, and the object is moved
slightly again. This cycle is repeated until the animator has achieved the
desired amount of film.
Graphic Animation is a variation of stop motion (and possibly more
conceptually associated with traditional flat cel animation and paper drawing
animation, but still technically qualifying as stop motion) consisting of the
animation of photographs (in whole or in parts) and other non-drawn flat visual
graphic material, such as newspaper and magazine clippings.
In its simplest form, graphic animation can take the form of the animation
camera merely panning up and down and/or across individual photographs.
rotations; the robot arm can have a total of 12 degrees of freedom. The human
body, in comparison, has over 200 degrees of freedom.
Self-Assessment Questions
5. Design and control of animation sequences are handled with a set of _______.
6. What are Parameterized systems?
12.5 Utility
Multimedia utility refers to the use of multimedia elements such as text, images,
audio, video, and animation to convey information or messages through various
digital devices. It has become an essential tool in modern society due to the
increasing use of digital devices and the internet. In this article, we will explore the
various applications of multimedia utility and its importance in today's world.
One of the main uses of multimedia utility is in the field of education. With the advent
of technology, it has become easier for students to access a wide range of
multimedia resources, including online textbooks, videos, and interactive learning
materials. This has revolutionized the way in which education is delivered, making it
more engaging and interactive for students. Multimedia utility has also made it
possible for educators to personalize learning and cater to the individual needs of
students, which was previously difficult to achieve with traditional classroom
methods.
Multimedia utility is also important in the field of marketing and advertising. With the
increasing use of social media and online platforms, companies can now reach a
wider audience through various multimedia elements such as images, videos, and
interactive advertisements. The use of multimedia in marketing and advertising has
been shown to increase customer engagement and improve brand recognition,
leading to increased sales and revenue.
Multimedia utility has also been used in the field of healthcare, where it has been
used to improve patient care and education. The use of multimedia elements such
as videos, animations, and infographics has made it easier for healthcare
professionals to communicate complex medical information to patients. Additionally,
multimedia tools such as telemedicine have made it possible for patients to receive
medical care remotely, reducing the need for in-person visits and improving access
to care.
The use of multimedia utility has also had a significant impact on the field of
journalism. With the rise of digital media, journalists can now use a wide range of
multimedia elements to enhance their reporting, including images, videos, and audio
recordings. This has made it possible for journalists to provide more in-depth
coverage of events and issues, as well as to reach a wider audience through online
platforms.
In conclusion, multimedia utility has become an essential tool in today's world, with
applications in various fields such as education, entertainment, marketing,
healthcare, and journalism. Its ability to enhance communication, engagement, and
accessibility has made it a valuable resource for individuals and organizations alike.
As technology continues to advance, it is likely that the use of multimedia utility will
become even more widespread, leading to further innovation and development in
the field.
12.6 Morphing
Morphing is a special effect in motion pictures and animations that changes
one image into another through a seamless transition. It is frequently used
to depict one person turning into another through technological means or as
part of a fantasy or surreal sequence. Traditionally such a depiction would be
achieved through cross-fading techniques on film.
Transformation of object shapes from one form to another is called
morphing, which is a shortened form of metamorphosis. Morphing methods
can be applied to any motion or transition involving a change in shape. Given
two key frames for an object transformation, it is necessary to first adjust the
object specification in one of the frames so that the number of polygon edges
(or the number of vertices) is the same for the two frames. This preprocessing
step is illustrated in Figure 12.2. A straight-line segment in key frame k is
transformed into two line segments in key frame k+1. Since key frame k+1 has
an extra vertex, one needs to add a vertex between vertices 1 and 2 in key
frame k to balance the number of vertices (and edges) in the two key frames.
Figure 12.2: An edge with vertex positions 1 and 2 in key frame k evolves into
two connected edges in key frame k + 1
Using linear interpolation to generate the in-betweens, one can transition the
added vertex in key frame k into vertex 3 along the straight-line path as shown
in Figure 12.3.
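A minimal sketch of this in-betweening step is shown below. The vertex coordinates are hypothetical, and it assumes the two key frames have already been preprocessed so that they contain the same number of vertices.

# Minimal sketch of generating in-betweens by linear interpolation of matched
# vertex positions between key frame k and key frame k+1.
def in_betweens(key_k, key_k1, n_frames):
    """key_k, key_k1: lists of (x, y) vertices of equal length."""
    assert len(key_k) == len(key_k1), "balance the vertex counts first"
    frames = []
    for step in range(1, n_frames + 1):
        t = step / (n_frames + 1)               # 0 < t < 1 for the in-betweens
        frames.append([(x0 + t * (x1 - x0), y0 + t * (y1 - y0))
                       for (x0, y0), (x1, y1) in zip(key_k, key_k1)])
    return frames

# The added vertex of key frame k (midpoint of the original edge) glides along
# a straight line towards vertex 3 of key frame k+1 (coordinates are made up).
key_k  = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]   # vertex 1, added vertex, vertex 2
key_k1 = [(0.0, 0.0), (1.5, 0.5), (2.0, 2.0)]   # vertex 1, vertex 3, vertex 2
print(in_betweens(key_k, key_k1, 3))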
Pixel Manipulation
Morphing is a combination of two processes, each of which changes pixel
attributes. Cross-dissolving changes the image's colors pixel by pixel, while
warping changes the shapes of features in the image by shifting its pixels
around.
During the morphing process one needs to do the warp process first. If the
cross-dissolve is done with two arbitrary images, one gets the double-image
effect. Re-positioning of all pixels in the source images is required to avoid
or minimize the double-image effect.
Intermediate Images
In order to ensure smooth transition, each intermediate frame is created by
the combination of beginning and ending pictures. An image's position in the
sequence determines the influence that the beginning and ending frames will
have upon it.
It can be observed in figure 12.7 that the early intermediate images in the
sequence are much like the first source image. The middle image of the
sequence is the average of the first source image distorted halfway towards
the second one and the second source image distorted halfway back towards
the first one. The last images in the sequence are similar to the second source
image.
For example, in a sequence of ten total frames, the first frame is 100% of
the start image blended with 0% of the end image. The second frame is 90%
of the start image blended with 10% of the end image. Frame three is 80%
and 20%, and so forth.
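The blending schedule can be sketched as follows. This is a minimal example that assumes the two source images have already been warped into alignment; the exact weighting convention (for instance whether the last frame reaches 100% of the end image) varies between implementations.

# Sketch of the cross-dissolve step: each intermediate frame is a weighted
# blend of the start and end images, with the weight set by the frame's
# position in the sequence.
import numpy as np

def cross_dissolve(start_img, end_img, n_frames):
    """start_img, end_img: arrays of identical shape; returns n_frames blended frames."""
    frames = []
    for i in range(n_frames):
        t = i / (n_frames - 1)                  # 0.0 at the first frame, 1.0 at the last
        frames.append((1.0 - t) * start_img + t * end_img)
    return frames

# Toy 2x2 "images": the blend weights step linearly from 100%/0% to 0%/100%.
a = np.zeros((2, 2)); b = np.full((2, 2), 255.0)
seq = cross_dissolve(a, b, 10)
print([round(f[0, 0], 1) for f in seq])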
Self-Assessment Questions
7. Name the technique used to change shapes during morphing.
8. Specify the role of cross dissolving in morphing.
9. In order to make a transition smooth, each intermediate frame is created by
the combination of beginning and ending pictures.(True/False)
Currently, there are two main kinds of haptic interfaces, namely the off-body
interface and the on-body interface. The main difference is that the mass of
the on-body interface is supported by the operator while the off-body interface
rests on the floor. These days, most commercially available devices are off-
body.
Figure 12.9: (a) On-body interface (Exoskeleton) (b) Off-body interface (Phantom Desktop)
12.7.1 Types of VR
Although it is difficult to categorize all VR systems, most configurations fall
into three main categories and each category can be ranked by the sense
of immersion, or degree of presence it provides. Immersion or presence can
be regarded as how powerfully the attention of the user is focused on the task
in hand. Immersion or presence is generally believed to be the product of
several parameters, including the level of interactivity, image complexity,
stereoscopic view, field of regard and the update rate of the display.
Non-Immersive (Desktop) Systems
Non-immersive systems, as the name suggests, are the least immersive
implementation of VR techniques. Using the desktop system, the virtual
environment is viewed through a portal or window by utilizing a standard
high-resolution monitor. Interaction with the virtual environment can occur
by conventional means such as the keyboard, mouse and trackball, or may be
enhanced by using 3D interaction devices such as a SpaceBall or Data
Glove.
Non-immersive systems have the advantage that they do not require the
highest level of graphics performance or any special hardware, and they can
be implemented on high-specification PC clones. This means that these
systems can be regarded as the lowest-cost VR solution and can be used for
many applications. However, these systems are of little use where the
perception of scale is an important factor. Nevertheless, one would expect to
see an increase in the popularity of such systems for VR use in the near
future. This is because the Virtual Reality Modeling Language (VRML) is
expected to be adopted as a de facto standard for the transfer of 3D model
data and virtual worlds via the Internet. The advantage of VRML for the PC
desktop user is that this software runs relatively well on a PC, which is not
always the case for many proprietary VR authoring tools. Further, many
commercial VR software suppliers are now incorporating VRML capability
into their software and exploring the commercial possibilities of desktop VR
in general.
Self-Assessment Questions
12.8 Summary
Now let us recapitulate the content discussed in this unit. Animation is the
rapid display of a sequence of images of 2-D artwork or model positions in
order to create an illusion of movement due to the phenomenon of persistence
of vision. Also discussed were the various types and techniques of animation.
A key frame animation basically supports one by providing exact control over
the way one layers the animation. We understood from this unit that morphing
is a technology for transforming one image to another and morphing is popular
in the entertainment industry. Virtual reality is also known as artificial reality.
It represents computer interface technology that is designed to leverage our
natural human capabilities. We also discussed the types of virtual reality; the
major distinction between VR systems is the mode with which they interface
to the user.
12.9 Terminal Questions
1. List and explain the types of animation.
2. Describe the role of key frame animation.
3. Explain the techniques involved in morphing process.
4. What is virtual reality?
5. Explain the types of virtual reality.
12.10 Answers
Self-Assessment Questions
1. Limited animation
2. a. Puppets
3. True
4. An animation technique in which animators trace over live-action film
movement, frame by frame, for use in animated films.
5. Animation routines
6. It allows object-motion characteristics to be specified as part of the
object definitions
7. Warping
8. Changes the image color pixel by pixel
9. True
10. Head Mounted Display
11. Off body and on body interface
12. c. Fully Immersive
Terminal Questions
1. There are three primary types of animation: traditional, stop-motion, and
computer animation. For more details Refer section 12.3.
2. Key-frame systems are specialized animation languages designed simply
to generate the in-betweens from the user-specified key frames. For more
details Refer section 12.4.
3. Morphing is a combination of two processes, each of which changes pixel
attributes. Cross-dissolving changes the image's colors, pixel by pixel;
warping changes the shapes of features in the image by shifting its pixels
around. For more details Refer 12.6.
4. Virtual reality (VR) is a term that applies to computer-simulated
environments that can simulate physical presence in places in the real
world, as well as in imaginary worlds. For more details Refer section 12.7.
5. Non-immersive, semi-immersive and fully immersive are the three
important types of virtual reality. For more details Refer sub-section 12.7.1.