BSC IT 5th Sem Assignment Solved Answer
Refreshing on a raster-scan display is carried out at the rate of 60 to 80 frames per second.
Some displays use an interlaced refresh procedure: first, all points on the even-numbered scan
lines are displayed, then all the points along the odd-numbered lines are displayed. This is an
effective technique for avoiding flicker.
Random scan display
When operated as a random-scan display unit, a CRT has the electron beam directed only to
the parts of the screen where a picture is to be drawn. Random-scan monitors draw a picture
one line at a time, and for this reason they are also referred to as vector displays (or
stroke-writing or calligraphic displays). The component lines of a picture can be drawn and
refreshed by a random-scan system in any specified order. A pen plotter operates in a similar
way and is an example of a random-scan, hard-copy device.
Refresh rate on a random-scan system depends on the number of lines to be displayed.
Picture definition is now stored as a set of line-drawing commands in an area of memory
referred to as the refresh display file. Sometimes the refresh display file is called the display
list, display program, or simply the refresh buffer.
4. How many colors are possible if
a. 24 bits / pixel is used
b. 8 bits / pixel is used Justify your answer
a). 24-bit color provides 16.7 million colors per pixel. The 24 bits are divided into 3 bytes,
one each for the red, green, and blue components of a pixel, so the total is 2^24 = 16,777,216 colors.
b). 256 colors, since 8 bits per pixel gives 2^8 = 256 combinations.
The widely accepted industry standard uses 3 bytes, or 24 bits, per pixel; one byte for each
primary color gives 256 different intensity levels per primary. Thus a pixel can
take on a color from 256 x 256 x 256, or 16.7 million, possible choices. In bi-level image
representation, one bit per pixel is used to represent black-and-white images. A gray-level
image uses 8 bits per pixel to allow a total of 256 intensity or gray levels. Image representation
using a lookup table can be viewed as a compromise between our desire for a lower
storage requirement and our need to support a reasonably sufficient number of simultaneous
colors.
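The counts above are simply powers of two: n bits per pixel give 2^n distinct colors. A quick JavaScript check (the function name colorCount is ours, purely for illustration):

```javascript
// Number of distinct colors representable with n bits per pixel.
function colorCount(bitsPerPixel) {
  return Math.pow(2, bitsPerPixel);
}

console.log(colorCount(8));  // 256 levels, as in a gray-level or palette image
console.log(colorCount(24)); // 16777216, i.e. 256 * 256 * 256 ("16.7 million")
```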
5. List and explain different text mode built-in functions of C Programming language.
The different text mode built-in functions of the C programming language are listed below:
i). textmode(int mode);
This function sets the number of rows and columns of the screen. The mode variable can take the
values 0, 1, 2, or 3.
0: represents 40-column black and white
1: represents 40-column color
2: represents 80-column black and white
3: represents 80-column color
Example: textmode(2); // sets the screen to 80 column black and white
ii). clrscr();
This function clears the entire screen and locates the cursor on the top left corner(1,1)
Example clrscr(); // clears the screen
line(175,125,159,143);
line(175,125,193,107);
setcolor(YELLOW);
rectangle(0,0,640,43);
setfillstyle(SOLID_FILL,YELLOW);
bar(0,0,640,43);
setcolor(BLACK);
settextstyle(1,HORIZ_DIR,5);
outtextxy(150,0,"INDIAN FLAG");
getch();
}
fully supports the Kernighan and Ritchie definitions. It includes C++ class libraries, mouse support,
multiple overlapping windows, a multi-file editor, hypertext help, far objects and error
analysis. Turbo C++ comes with a complete set of graphics functions to facilitate preparation
of charts and diagrams. It supports the same graphics adapters as Turbo Pascal. The graphics
library consists of over 70 graphics functions, ranging from high-level support, such as facilities to
set a viewport, draw 3-D bar charts and draw polygons, to bit-oriented functions like getimage and
putimage. The graphics library supports numerous objects and line styles and provides several
text fonts, enabling one to justify and orient text horizontally and vertically. It may be noted
that graphics functions use far pointers, and so are not supported in the tiny memory model.
5. Define resolution.
Resolution: Image resolution refers to the pixel spacing, i.e. the distance from one pixel to the
next. A typical PC monitor displays screen images with a resolution somewhere
between 25 pixels per inch and 80 pixels per inch. A pixel is the smallest element of a displayed
image, and dots (red, green and blue) are the smallest elements of a display surface (monitor
screen). The dot pitch is a measure of screen resolution: the smaller the dot pitch, the
higher the resolution, sharpness and detail of the displayed image.
6. Define aspect ratio.
Aspect ratio: The aspect ratio of an image is the ratio of the number of X pixels to the
number of Y pixels. The standard aspect ratio for PCs is 4:3, and some use 5:4. Monitors are
calibrated to this standard so that when you draw a circle it appears as a circle and not an
ellipse.
7. Why refreshing is required in CRT?
When the electron beam strikes a dot of phosphor material, it glows for a fraction of a second
and then fades. As the brightness of the dots begins to reduce, the screen image becomes unstable
and gradually fades out. In order to maintain a stable image, the electron beam must sweep
the entire surface of the screen and then return to redraw it a number of times per second. This
process is called refreshing the screen. If the electron beam takes too long to return and
redraw a pixel, the pixel begins to fade, which results in flicker in the image. To avoid flicker,
the screen image must be redrawn sufficiently quickly that the eye cannot tell that a refresh is
going on. The refresh rate is the number of times per second that the screen is refreshed.
Some monitors use a technique called interlacing for refreshing every line of the screen: in
the first pass, odd-numbered lines are refreshed, and in the second pass, even-numbered
lines are refreshed. This allows the effective refresh rate to be doubled because only half the screen is
redrawn at a time.
8. Name the different positioning devices.
The devices discussed so far (the mouse, the tablet, the joystick) are called positioning
devices. They can position the cursor at any point on the screen (we can then operate at
that point or on a chain of points). Often, one needs devices that can point to a given position
on the screen. This becomes essential when a diagram is already on the screen but
some changes are to be made. So, instead of trying to determine its coordinates, it is easier to
simply point to that portion of the picture and ask for changes. The simplest of such
devices is the light pen. Its principle is extremely simple.
9. What are pointing devices?
A pointing device is an input interface (specifically a human interface device) that allows a
user to input spatial (i.e., continuous and multi-dimensional) data to a computer. CAD
systems and graphical user interfaces (GUIs) allow the user to control and provide data to the
computer using physical gestures (point, click, and drag), for example by moving a handheld mouse across the surface of the physical desktop and activating switches on the mouse.
Movements of the pointing device are echoed on the screen by movements of the pointer (or
cursor) and other visual changes.
10. What is multimedia?
The word multimedia seems to be everywhere nowadays. The word multimedia is a compound
of the Latin prefix multi, meaning many, and the Latin-derived word media, which is the plural
of the word medium. So multimedia simply means using more than one kind of medium.
Multimedia is the mixture of two or more media effects (hypertext, still images, sound,
animation and video) to be interacted with on a computer terminal.
11. What are sound cards?
Sound cards: The first Sound Blaster was an 8-bit card with 22 kHz sampling, and came
equipped with a number of drivers and utilities. This became a kind of model for the other
sound cards. Next came the Sound Blaster Pro, again 8-bit sound but with a higher sampling
rate of 44 kHz, which supports a wider frequency range. Then there was the Yamaha OPL3
chipset with more voices. Another development was a built-in CD-ROM interface through
which huge files could be played directly via the sound card.
12. What is sampling?
Sampling: Sampling is like breaking a sound into tiny pieces and storing each piece as a small
digital sample of sound. The rate at which a sound is sampled affects its quality: the
higher the sampling rate (the more pieces of sound that are stored), the better the quality of the
sound. Higher-quality sound occupies more space on the hard disk because of the larger number of
samples.
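The space cost mentioned here is easy to quantify: uncompressed size is sampling rate x bytes per sample x channels x duration. A small JavaScript sketch (the function name is ours, for illustration):

```javascript
// Uncompressed audio size in bytes:
// samples/second * bytes per sample * channels * seconds.
function audioBytes(sampleRate, bitsPerSample, channels, seconds) {
  return sampleRate * (bitsPerSample / 8) * channels * seconds;
}

// One minute of CD-quality stereo (44100 Hz, 16-bit, 2 channels):
console.log(audioBytes(44100, 16, 2, 60)); // 10584000 bytes, roughly 10 MB
```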
13. What is morphing?
Morphing: Morphing is making one image change into another by identifying key points, so
that the displacement of the key points, etc. is taken into consideration for the change. The
best examples would be the Kawasaki advertisement, where the motorbike changes into a
cheetah, the MRF muscle changing into a real muscle, etc.
14. What is rendering?
Rendering: The process of converting your designed objects with texturing and animation
into an image or a series of images is called rendering. Here various parameters are available
like resolution, colors type of render, etc.
15. What is warping?
Warping: Certain parts of an image can be marked for a change and made to change into a
different one. For example, if only the eyes of an owl had to morph into the eyes of a cat, the
eyes alone can be marked and warped.
16. Why we use scanner?
Photographs, illustrations, and paintings continue to be made the old-fashioned way, even by
visual artists who are otherwise immersed in digital imaging technology. Traditional
photographs, illustrations, and paintings are easily imported into computers through the use
of a device called a scanner.
A scanner scans over an image such as a photo, drawing, or logo, converting it into
an image that can be seen on the screen. Using a good paint program or image editor we
can then add or remove colors, filter, mask colors, and so on.
17. What is gamut in Photoshop?
Write yourself...
all of that. HTML is simply a markup language used to define a logical structure rather than compute
anything. For example, it can describe which text the browser should emphasize, which text should be
considered body text versus header text, and so forth.
The beauty of HTML, of course, is that it is generic enough that it can be read and interpreted
by a web browser running on any machine or operating system. This is because it only focuses on
describing the logical nature of the document, not its specific style. The web browser is responsible for
adding style. For instance, emphasized text might be bolded in one browser and italicized in another;
it is up to the browser to decide.
3. Give the different classification of HTML tags with examples for each category
LIST OF HTML TAGS:
Tags for Document Structure
HTML
HEAD
BODY
Heading Tags
TITLE
BASE
META
STYLE
LINK
Block-Level Text Elements
ADDRESS
BLOCKQUOTE
DIV
H1 through H6
P
PRE
XMP
Lists
DD
DIR
DL
DT
LI
MENU
OL
UL
Text Characteristics
B
BASEFONT
BIG
BLINK
CITE
CODE
EM
FONT
I
KBD
PLAINTEXT
S
SMALL
4. Write CGI application which accepts 3 numbers from the user and displays biggest number
using GET and POST methods
#!/usr/bin/perl
# Accepts three numbers and displays the biggest. CGI.pm reads the
# parameters the same way whether they arrive via GET (query string)
# or POST (request body).
use CGI;
my $cgi = new CGI;
print $cgi->header;
print $cgi->start_html( "Biggest of Three Numbers" );
my $one   = $cgi->param( 'one' );
my $two   = $cgi->param( 'two' );
my $three = $cgi->param( 'three' );
if( defined $one && defined $two && defined $three )
{
    my $big = $one;
    $big = $two   if $two   > $big;
    $big = $three if $three > $big;
    print "Biggest number is $big";
}
else
{
    # No input yet: show the entry form. Change METHOD to "GET" to
    # pass the values in the query string instead of the request body.
    print '<FORM METHOD="POST">';
    print 'Enter First Number <INPUT TYPE="text" NAME="one"><BR>';
    print 'Enter Second Number <INPUT TYPE="text" NAME="two"><BR>';
    print 'Enter Third Number <INPUT TYPE="text" NAME="three"><BR>';
    print '<INPUT TYPE="submit" VALUE="Find Biggest">';
    print '</FORM>';
}
print $cgi->end_html;
5. What is Javascript? Give its importance in web.
JavaScript is an easy-to-learn way to script your web pages, that is, to have them perform
actions that cannot be handled with HTML alone. With JavaScript, you can make text scroll
across the screen like ticker tape, make pictures change when you move the mouse over them,
or add any number of other dynamic enhancements.
JavaScript is generally only used inside an HTML document.
i) JavaScript controls document appearance and content.
ii) JavaScript controls the browser.
iii) JavaScript interacts with document content.
iv) JavaScript interacts with the user.
v) JavaScript reads and writes client state with cookies.
vi) JavaScript interacts with applets.
vii) JavaScript manipulates embedded images.
6. Explain briefly Cascading Style Sheets
Cascading Style Sheets (CSS) is the part of DHTML that controls the look and placement of the
elements on a page. With CSS you can set essentially any style property of any element
on an HTML page. One of the biggest advantages of CSS over the older way of
changing the look of elements is that you split content from design. You can, for instance, link
one CSS file to all the pages in your site to set their look; then, if you want to change, say,
the font size of your main text, you change it once in the CSS file and all pages are
updated.
7. What is CGI? List the different CGI environment variables
CGI, or the Common Gateway Interface, is a specification which allows web users to run
programs on a web server from their own computers. CGI is the part of the web server that can
communicate with other programs running on the server. With CGI, the web server can call up a
program while passing user-specific data to it. The program then processes that data, and the
server passes the program's response back to the web browser.
When a CGI program is called, the information made available to it can be
roughly broken into three groups:
i). Information about the client, server and user.
ii). Form data supplied by the user.
iii). Additional pathname information.
Most information about the client, server and user is placed in CGI environment variables,
such as REQUEST_METHOD, QUERY_STRING, CONTENT_TYPE, CONTENT_LENGTH, SCRIPT_NAME,
SERVER_NAME, REMOTE_ADDR and HTTP_USER_AGENT. Form data supplied by the user is also
incorporated in environment variables (for GET requests, in QUERY_STRING). Extra pathname
information is passed in the PATH_INFO variable.
Part - A
a) What is the difference between Internet and Intranet?
Internet: The Internet is a global network of networks. The Internet is a tool for collaborating
on academic research, and it has become a medium for exchanging and distributing information
of all kinds. It is an interconnection between several computers of different types belonging to
various networks all over the globe.
Intranet: An intranet is not global. It is a mini web that is limited to the machines and software
of a particular organization or company.
b) List any five HTML tags.
Five HTML tags are:
i). UL (unordered list): The UL tag displays a bulleted list. You can use the tag's TYPE
attribute to change the bullet style.
ii). TYPE: defines the type of bullet used for each list item. The value can be one of
CIRCLE, DISC, or SQUARE. (Strictly speaking, TYPE is an attribute of a list tag rather than a
tag in its own right.)
iii). LI (list item): The LI tag indicates an itemized element, which is usually preceded by a
bullet, a number, or a letter. LI is used inside list elements such as OL (ordered list) and
UL (unordered list).
iv). TABLE (table): The TABLE tag defines a table. Inside the TABLE tag, use the TR tag
to define rows, the TH tag to define row or column headings, and the TD tag
to define table cells.
v). HTML (outermost tag): The HTML tag identifies a document as an HTML document. All
HTML documents should start with the <HTML> tag and end with the </HTML> tag.
c) Write the difference between HTML and DHTML.
HTML: HTML stands for HyperText Markup Language. It is a markup language. HTML cannot
change after the page loads. HTML can be used with or without JavaScript.
DHTML: DHTML stands for Dynamic HyperText Markup Language. DHTML isn't really
a language or a thing in itself; it is just a mix of technologies. Dynamic HTML is simply
HTML that can change even after a page has been loaded into a browser. DHTML can be used
with JavaScript.
d) Explain the different types of PERL variables.
Perl has three types of variables:
i). Scalars
ii). Arrays
iii). Hashes.
i). Scalars: A scalar variable stores a single (scalar) value. Perl scalar names are prefixed with
a dollar sign ($); for example, $username and $url are both scalar variable names. A scalar can
hold data of any type, be it a string, a number, or whatnot. We can also use scalars in
double-quoted strings:
my $fnord = 23;
my $blee = "The magic number is $fnord.";
Now if you print $blee, we will get "The magic number is 23." Perl interpolates the variables
in the string, replacing the variable name with the value of that variable.
ii). Arrays: An array stores an ordered list of values. While a scalar variable can only store one
value, an array can store many. Perl array names are prefixed with an @-sign. For example:
my @colors = ("red", "green", "blue");
foreach my $i (@colors) { print "$i\n"; }
iii). Hashes: Hashes are an advanced form of array. One of the limitations of an array is that
the information contained within it can be difficult to get to. For example, imagine that you
have a list of people and their ages. The hash solves this problem very neatly by allowing us
to access that ages data not by an index but by a scalar key. For example, to look up the ages
of different people, we can use their names as the keys of a hash.
e) How are JSPs better than servlets.
With servlets, Java programming knowledge is needed to develop and maintain all aspects of the
application, since the processing code and the HTML elements are lumped together.
Changing the look and feel of the application, or adding support for a new type of client,
requires the servlet code to be updated and recompiled.
It is hard to take advantage of web-page development tools when designing the application
interface. If such tools are used to develop the web page layout, the generated HTML must
then be manually embedded into the servlet code, a process which is time consuming, error
prone, and extremely boring. Adding JSP to the puzzle solves these problems. So JSPs are better
than servlets.
Part - B
1. a) Explain GET and POST method with the help of an example.
When a client sends a request to the server, the client can also send additional information with
the URL to describe what exactly is required as output from the server, by using the GET method.
The additional sequence of characters appended to the URL is called a query string.
However, the length of the query string is limited to 240 characters. Moreover, the query
string is visible on the browser and can therefore be a security risk. To overcome these
disadvantages, the POST method can be used. The POST method sends the data as packets
through a separate socket connection. The complete transaction is invisible to the client. The
disadvantage of the POST method is that it is slower compared to the GET method
because the data is sent to the server as separate packets.
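The mechanical difference is where the name=value pairs travel: GET appends them to the URL as a query string, while POST carries the same encoding in the request body. A small JavaScript sketch of the shared encoding step (the helper name and URL are invented for illustration):

```javascript
// URL-encode a set of form fields as name=value pairs joined by "&".
function encodeForm(fields) {
  return Object.entries(fields)
    .map(([k, v]) => encodeURIComponent(k) + "=" + encodeURIComponent(v))
    .join("&");
}

const data = encodeForm({ name: "A B", age: "30" });
// GET: the pairs ride on the URL, visible in the browser and length-limited.
const getUrl = "/cgi-bin/script.pl?" + data;
// POST: the very same pairs go in the request body instead.
const postBody = data;
console.log(getUrl); // "/cgi-bin/script.pl?name=A%20B&age=30"
```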
b) Explain in detail the role played by CGI programming in web programming.
CGI opened the gates to more complex web applications. It enabled developers to write scripts
which can communicate with server applications and databases. In addition, it enables
developers to write scripts that can parse a client's input, process it, and present it in a
user-friendly way.
The Common Gateway Interface, or CGI, is a standard for external gateway
programs to interface with information servers such as HTTP servers. A plain HTML document
that the Web daemon retrieves is static, which means it exists in a constant state: a text file
that doesn't change. A CGI program, on the other hand, is executed in real time, so that it can
output dynamic information.
CGI programming allows us to automate passing information to and from web pages. It can also
be used to capture and process that information, or pass it off to other software (such as an
SQL database).
CGI programs (sometimes called scripts) can be written in any programming language, but the
two most commonly used are Perl and PHP. Despite all the flashy graphics, Internet technology
is fundamentally a text-based system. Perl was designed to be optimal for text processing, so it
quickly became a popular CGI tool. PHP is a scripting language designed specifically to make
web programming quick and easy.
2. a) With the help of an example explain the embedding of an image in an HTML tag.
<HTML>
<HEAD>
</HEAD>
<BODY>
<IMG SRC="Images/123.jpg" ALT="Image" />
</BODY>
</HTML>
b) Create a HTML page to demonstrate the usage of Anchor tags.
<HTML>
<HEAD></HEAD>
<BODY>
<A NAME="section2">
<H2>A Cold Autumn Day</H2></A>
If this anchor is in a file called "nowhere.htm," you could define a link that jumps to the
anchor as follows:
<P>Jump to the second section <A HREF="nowhere.htm#section2">
A Cold Autumn Day</A> in the mystery "A man from Nowhere."
</BODY>
</HTML>
3. a) Explain the usage of script tags.
Using the SCRIPT tag: The following example uses the SCRIPT tag to define a JavaScript
script in the HEAD tag. The script is loaded before anything else in the document.
The JavaScript code in this example defines a function, changeBGColor(), that changes the
document's background color.
The body of the document contains a form with two buttons. Each button invokes the
changeBGColor() function to change the background of the document to a different color.
<HTML>
<HEAD><TITLE>Script Example</TITLE>
</HEAD>
<SCRIPT language="JavaScript">
function changeBGColor (newcolor) {
document.bgColor=newcolor;
return false;
}
</SCRIPT>
<BODY >
<P>Select a background color:</P>
<FORM>
<INPUT TYPE="button" VALUE="blue" onClick="changeBGColor('blue');">
<INPUT TYPE="button" VALUE="red" onClick="changeBGColor('red');">
</FORM>
</BODY>
</HTML>
5. a) List the differences between web server and application server.
The main differences between web servers and application servers:
A web server is where web components are deployed and run. An application server is where
components that implement the business logic are deployed. For example, in a JSP-EJB web
application, the JSP pages will be deployed on the web server whereas the EJB components will
be deployed on the application server.
A web server usually supports only HTTP (and sometimes SMTP and FTP). However, an
application server supports HTTP as well as various other protocols such as SOAP.
In other words, the differences between an application server and a web server are:
i). A web server serves pages for viewing in a web browser; an application server exposes
business logic to client applications through various protocols.
ii). A web server exclusively handles HTTP requests; an application server serves business
logic to application programs through any number of protocols.
iii). The web server delegation model is fairly simple: when a request comes into the web
server, it simply passes the request to the program best able to handle it (a server-side
program). It may not support transactions and database connection pooling.
iv). An application server is more capable of dynamic behaviour than a web server. We can also
configure an application server to work as a web server. Simply put, an application server is a
superset of a web server.
b) What is a war file? Explain its importance.
A WAR, or Web Application Archive, file is a packaged servlet web application. Servlet
applications are usually distributed as WAR files.
A WAR file (which stands for "web application archive") is a JAR file used to distribute a
collection of JavaServer Pages, servlets, Java classes, XML files, tag libraries and
static web pages (HTML and related files) that together constitute a web application.
6. a) Explain implicit objects out, request response in a JSP page.
Following are the implicit objects in a JSP page:
out: This implicit object represents a JspWriter that provides a stream back to the requesting
client. The most common method of this object is out.println(), which prints text that will be
displayed in the client's browser.
request: This implicit object represents the javax.servlet.http.HttpServletRequest interface.
The request object is associated with every HTTP request. One common use of the request object
is to access request parameters. You can do this by calling the request object's getParameter()
method with the name of the parameter you are seeking. It will return a string with the value
matching the named parameter.
response: This implicit object represents the javax.servlet.http.HttpServletResponse object.
The response object is used to pass data back to the requesting client. A common use of this
object is writing HTML output back to the client browser.
b) With the help of an example explain JSP elements.
JSP elements are of 3 types:
Directive: Specifies information about the page itself that remains the same between requests.
For example, it can be used to specify whether session tracking is required or not, buffering
requirements, and the name of the page that should be used to report errors.
questions like: Who has scored the highest marks? In which subject have the maximum number
of students failed? Which students are weak in more than one subject? Of course, appropriate
programs have to be written to do these computations. Also, as the database becomes very
large and more and more data keeps getting included at different periods of time, there are
several other problems about maintaining the data, which will not be dealt with here.
Since handling of such databases has become one of the primary jobs of the computer in recent
years, it becomes difficult for the average user to keep writing such programs. Hence, special
languages, called database query languages, have been devised, which make such programming
easy; these languages help in getting specific queries answered easily.
4. With example explain the different views of a data.
Data is normally stored in tabular form; unless storage in other formats becomes advantageous,
we store data in what are technically called relations, or in simple terms, tables.
Views are mainly of two types:
i). Simple view
ii). Complex view
Simple view:
- It is created by selecting only one table.
- It does not contain functions.
- DML operations (SELECT, INSERT, UPDATE, DELETE, MERGE, CALL, LOCK TABLE) can be performed
through a simple view.
Complex view:
- It is created by selecting more than one table.
- It can contain functions.
- You cannot always perform DML operations through a complex view.
5. Briefly explain the concept of normalization.
Normalization is dealt with in several chapters of any book on database management
systems. Here, we will take the simplest definition, which suffices for our purpose: no
field should have subfields.
Again consider the following student table.
Here, under the field marks, we have 3 subfields: marks for subject1, marks for subject2 and
marks for subject3. However, it is preferable to split these subfields into regular fields as
shown below.
Quite often, the original table, which comes with subfields, will have to be modified suitably
by the process of normalization.
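The splitting step can be pictured as a data transformation: a record whose marks field has subfields becomes a flat record with one regular field per subject. A JavaScript sketch (the field names and sample data are invented for illustration):

```javascript
// Flatten a record with a "marks" subfield group into a normalized
// record where each subject mark is a regular top-level field.
function normalize(student) {
  return {
    name: student.name,
    subject1: student.marks.subject1,
    subject2: student.marks.subject2,
    subject3: student.marks.subject3,
  };
}

const raw = { name: "Ravi", marks: { subject1: 70, subject2: 65, subject3: 80 } };
console.log(normalize(raw));
// -> { name: 'Ravi', subject1: 70, subject2: 65, subject3: 80 }
```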
ii). The day-to-day management of a data warehouse is not to be confused with maintenance
and management of hardware and software. When large amounts of data are stored and new
data are being continually added at regular intervals, maintenance of the quality of data
becomes an important element.
iii). Ability to accommodate changes implies the system is structured in such a way as to be
able to cope with future changes without the entire system being remodeled. Based on these,
we can view the processes that a typical data warehouse scheme should support as follows.
8. Explain the extract and load process of data ware house.
Extract and Load Process: This forms the first stage of a data warehouse. External physical
systems, like the sales counters which give the sales data, the inventory systems that give
inventory levels, etc., constantly feed data to the warehouse. Needless to say, the format of
these external data is to be monitored and modified before loading into the warehouse. The
data warehouse must extract the data from the source systems, load it into its databases,
remove unwanted fields (either because they are not needed or because they are already there
in the database), add new fields / reference data, and finally reconcile with the other data.
We shall see a few more details of these broad actions in the subsequent paragraphs.
i). A mechanism should be evolved to control the extraction of data, check its consistency,
etc. For example, in some systems, the data is not authenticated until it is audited.
ii). Having a set of consistent data is equally important. This especially matters when
several online systems are feeding the data.
iii). Once data is extracted from the source systems, it is loaded into a temporary data
storage before it is cleaned and loaded into the warehouse.
9. In what ways data needs to be cleaned up and checked? Explain briefly.
Data needs to be cleaned up and checked in the following ways:
i) It should be consistent with itself.
ii) It should be consistent with other data from the same source.
iii) It should be consistent with other data from other sources.
iv) It should be consistent with the information already available in the data ware house.
While it is easy to list out the needs of clean data, it is more difficult to set up systems
that automatically clean up the data. The normal course is to suspect the quality of data if it
does not meet the normal standards of common sense, or if it contradicts the data from other
sources or the data already available in the data warehouse. Normal intuition doubts the
validity of the new data, and effective measures like rechecking, retransmission, etc. are
undertaken. When none of these is possible, one may even resort to ignoring the entire set of
data and getting on with the next set of incoming data.
10. Explain the architecture of data warehouse.
The architecture of a data warehouse is indicated below. Before we proceed further, we should
be clear about the concept of architecture: it only gives the major items that make up a data
warehouse. The size and complexity of each of these items depend on the actual size of the
warehouse itself, the specific requirements of the warehouse and the actual details of
implementation.
Let us elaborate a little on the example. Consider a customer A. If there is a situation where
the warehouse is building profiles of customers, then A becomes a fact: against the name A,
we can list his address, purchases, debts, etc. One can ask questions like how many purchases
A has made in the last 3 months, etc. Then A is a fact. On the other hand, if the data is likely
to be used to answer questions like how many customers have made more than 10 purchases in
the last 6 months, and one uses the data of A, as well as of other customers, to give the
answer, then it becomes part of a fact table. The rule is, in such cases, avoid making A a
candidate key.
16. Explain the designing of star-flake schema in detail.
A starflake schema, as we have defined previously, is a schema that uses a combination of
denormalized star and normalized snowflake schemas. They are most appropriate in
decision-support data warehouses. Generally, the detailed transactions are stored within a
central fact table, which may be partitioned horizontally or vertically. A series of
combinatory database views is created to allow the user access tools to treat the fact table
partitions as a single, large table.
The key reference data is structured into a set of dimensions. These can be referenced from
the fact table. Each dimension is stored in a series of normalized tables (snowflakes), with an
additional denormalized star dimension table.
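As a rough illustration of the star-style layout described above, the following Python sketch holds a central fact table whose rows reference a dimension table by key and answers a query by joining the two. All table names, fields and values are invented for illustration:

```python
# Dimension table: customer details, keyed by customer_id (illustrative data)
customers = {
    1: {"name": "A", "city": "Mysore"},
    2: {"name": "B", "city": "Delhi"},
}

# Fact table: one row per sales transaction, holding a dimension key plus measures
sales_facts = [
    {"customer_id": 1, "item": "phone", "amount": 12000},
    {"customer_id": 1, "item": "case", "amount": 500},
    {"customer_id": 2, "item": "phone", "amount": 15000},
]

def total_sales_by_city(facts, dim):
    """Join the facts to the customer dimension and aggregate a measure."""
    totals = {}
    for row in facts:
        city = dim[row["customer_id"]]["city"]
        totals[city] = totals.get(city, 0) + row["amount"]
    return totals

print(total_sales_by_city(sales_facts, customers))
# {'Mysore': 12500, 'Delhi': 15000}
```

A snowflaked version would further normalize the customer dimension (e.g. a separate city table), which this sketch omits for brevity.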
This is a 2-dimensional table. On the other hand, if the company wants the data of all items
sold by its outlets, this can be obtained simply by superimposing the 2-dimensional tables for
each of these items one behind the other. Then it becomes a 3-dimensional view.
Then the query, instead of looking for a 2-dimensional rectangle of data, will look for a
3-dimensional cuboid of data.
There is no reason why the dimensioning should stop at 3 dimensions. In fact, almost all
queries can be thought of as extracting a multidimensional unit of data from a
multidimensional volume of the schema.
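The 3-dimensional view above can be sketched in Python as a dict keyed by (item, outlet, month); extracting one item's data recovers the 2-dimensional "rectangle". The items, outlets and sales figures are invented for illustration:

```python
# A tiny 3-D cube: sales keyed by (item, outlet, month) -- illustrative data
cube = {
    ("soap", "outlet1", "Jan"): 100,
    ("soap", "outlet2", "Jan"): 150,
    ("soap", "outlet1", "Feb"): 120,
    ("oil",  "outlet1", "Jan"): 80,
}

def slice_by_item(cube, item):
    """Extract the 2-D rectangle for one item from the 3-D cuboid."""
    return {(o, m): v for (i, o, m), v in cube.items() if i == item}

soap_slice = slice_by_item(cube, "soap")
print(sum(soap_slice.values()))  # total soap sales across outlets and months: 370
```

Adding further dimensions (region, salesperson, ...) simply lengthens the key tuple, which is the multidimensional view the text describes.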
19. Why partitioning is needed in large data warehouse?
Partitioning is needed in any large data warehouse to ensure that performance and
manageability are improved. It can help query redirection to send queries to the
appropriate partition, thereby reducing the overall time taken for query processing.
20. Explain the types of partitioning in detail.
i). Horizontal partitioning: This essentially means that the table is partitioned into the
first few thousand entries, the next few thousand entries, and so on. This is because in most
cases not all the information in the fact table is needed all the time. Thus horizontal
partitioning helps to reduce query access time by directly cutting down the amount of data to
be scanned by the queries.
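A minimal Python sketch of horizontal partitioning, with an invented fact table and chunk size, shows how a query can scan one row-range partition instead of the whole table:

```python
# Illustrative fact table of 10,000 rows
rows = [{"id": i, "month": (i % 12) + 1} for i in range(10000)]

def partition_horizontally(table, size):
    """Split a table into consecutive row-range partitions of `size` rows."""
    return [table[i:i + size] for i in range(0, len(table), size)]

partitions = partition_horizontally(rows, 2500)
# A query now scans one partition instead of the full table:
first = partitions[0]
print(len(partitions), len(first))  # 4 2500
```

Real warehouses usually partition by a meaningful key such as the date, so that queries over recent data touch only the latest partition.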
ii). Vertical partitioning: As the name suggests, a vertical partitioning scheme divides the
table vertically, i.e. each row is divided into 2 or more partitions.
iii). Hardware partitioning: Needless to say, the data warehouse design process should try to
maximize the performance of the system. One of the ways to ensure this is to optimize the
database design with respect to a specific hardware architecture.
21. Explain the mechanism of row splitting.
Row Splitting: This method involves identifying the not-so-frequently used fields and putting
them into another table. This ensures that the frequently used fields can be accessed more
often, at much less computation time.
It can be noted that row splitting does not reduce or increase the overall storage needed,
whereas normalization may involve a change in the overall storage space needed. In row
splitting, the mapping is one-to-one, whereas normalization may produce one-to-many
relationships.
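The mechanism can be sketched in Python as splitting each row into a "hot" part and a "cold" part that share the same key, giving the one-to-one mapping described above. The column names and data are invented for illustration:

```python
# Illustrative table with frequently and rarely used fields mixed together
table = [
    {"id": 1, "name": "A", "balance": 500, "fax": "n/a", "notes": "..."},
    {"id": 2, "name": "B", "balance": 900, "fax": "123", "notes": "..."},
]

HOT = ("id", "name", "balance")   # frequently accessed fields
COLD = ("id", "fax", "notes")     # rarely accessed fields

def split_rows(rows, hot_cols, cold_cols):
    """Row splitting: two tables, one row each per original row (1-to-1)."""
    hot = [{c: r[c] for c in hot_cols} for r in rows]
    cold = [{c: r[c] for c in cold_cols} for r in rows]
    return hot, cold

hot, cold = split_rows(table, HOT, COLD)
# hot[i] and cold[i] share the same "id", so the mapping stays 1-to-1
```

Scans of the hot table now read fewer bytes per row, while the total data stored (apart from the duplicated key) stays essentially the same.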
22. Explain the guidelines used for hardware partitioning.
Guidelines used for hardware partitioning: Needless to say, the data warehouse design process
should try to maximize the performance of the system. One of the ways to ensure this is to
optimize the database design with respect to a specific hardware architecture. Obviously, the
exact details of optimization depend on the hardware platform. Normally the following
guidelines are useful:
i). Maximize the processing, disk and I/O operations.
ii). Reduce bottlenecks at the CPU and I/O.
23. What is aggregation? Explain the need of aggregation. Give example.
Aggregation: Data aggregation is an essential component of any decision-support data
warehouse. It helps to ensure cost-effective query performance, which in other words means
that the costs incurred to get the answer to a query are more than offset by the benefits of
that answer. Data aggregation attempts to achieve this by reducing the processing power
needed to process the queries. However, too much aggregation would only lead to
unacceptable levels of operational costs.
Too little aggregation may not improve the performance to the required levels. A fine
balancing of the two is essential to meet the requirements stated above. One rule of thumb
often suggested is that about three out of every four queries should be optimized by the
aggregation process, whereas the fourth will take its own time to get processed. A second,
though minor, advantage of aggregation is that it allows us to see the overall trends in the
data. Such trends may not be obvious while looking at individual data, whereas aggregated
data helps us draw certain conclusions easily.
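A small Python sketch of the idea, with invented transactions: the aggregate is computed once and stored, so trend queries read the tiny aggregate table instead of rescanning every transaction:

```python
# Illustrative transaction-level data
transactions = [
    {"month": "Jan", "amount": 100},
    {"month": "Jan", "amount": 250},
    {"month": "Feb", "amount": 300},
]

def build_monthly_aggregate(txns):
    """Precompute one total per month from the detailed transactions."""
    agg = {}
    for t in txns:
        agg[t["month"]] = agg.get(t["month"], 0) + t["amount"]
    return agg

monthly = build_monthly_aggregate(transactions)  # computed once, then stored
print(monthly["Jan"])  # answered from the aggregate, no full scan: 350
```

The trade-off in the text is visible here: each extra aggregate (per week, per outlet, ...) costs storage and refresh effort, so only aggregates that serve most queries are worth keeping.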
24. Explain the different aspects for designing the summary table.
Summary tables are designed by following the steps given below:
i). Decide the dimensions along which aggregation is to be done.
ii). Determine the aggregation of multiple facts.
iii). Aggregate multiple facts into the summary table.
iv). Determine the level of aggregation and the extent of embedding.
v). Design time into the table.
vi). Index the summary table.
25. Give the reasons for creating the data mart.
The following are the reasons for which data marts are created:
i). Since the volume of data scanned is small, they speed up query processing.
ii). Data can be structured in a form suitable for the user access tool.
iii). Data can be segmented or partitioned so that it can be used on different platforms, and
also different control strategies become applicable.
26. Explain the two stages in setting up data marts.
There are two stages in setting up data marts:
i). To decide whether data marts are needed at all. The above listed facts may help you decide
whether it is worthwhile to set up data marts or to operate from the warehouse itself. The
problem is almost similar to that of a merchant deciding whether he wants to set up retail
shops or not.
ii). If you decide that setting up data marts is desirable, then the following steps have to be
gone through before you can freeze the actual strategy of data marting.
a) Identify the natural functional splits of the organization.
b) Identify the natural splits of data.
c) Check whether the proposed access tools have any special data base structures.
d) Identify the infrastructure issues, if any, that can help in identifying the data marts.
e) Look for restrictions on access control. They can serve to demarcate the warehouse
details.
27. What are disadvantages of data mart?
There are certain disadvantages:
i). The cost of setting up and operating data marts is quite high.
ii). Once a data strategy is put in place, the data mart formats become fixed. It may be fairly
difficult to change the strategy later, because the data mart formats also have to be changed.
28. What is role of access control issue in data mart design?
Role of access control issues in data mart design: This is one of the major constraints in
data mart design. Any data warehouse, with its huge volume of data, is more often than not
subject to various access controls as to who can access which part of the data. The easiest
case is where the data is partitioned so clearly that a user of each partition cannot access
any other data. In such cases, each of these partitions can be put in a data mart, and the
user of each can access only his own data.
In the data warehouse, the data pertaining to all these marts is stored, but the partitioning
is retained. If a superuser wants an overall view of the data, suitable aggregations can
be generated.
29. Explain the purpose of using metadata in detail.
Metadata will be used for the following purposes:
i). Data transformation and loading.
ii). Data management.
iii). Query generation.
30. Explain the concept of metadata management.
Metadata should be able to describe the data as it resides in the data warehouse. This helps
the warehouse manager to control data movements. The purpose of the metadata is to
describe the objects in the database. Some of the descriptions are listed here.
Tables
- Columns
* Names
* Types
Indexes
- Columns
* Name
* Type
Views
- Columns
* Name
* Type
Constraints
- Name
- Type
- Table
* Columns
31. How the query manager uses the Meta data? Explain in detail.
Metadata is also required to generate queries. The query manager uses the metadata to build a
history of all queries run and to generate a query profile for each user, or group of users.
We simply list a few of the commonly used metadata for the query. The names are
self-explanatory.
Query
- Tables accessed
  * Column accessed (name, reference identifier)
- Restrictions applied
  * Column name
  * Table name
  * Reference identifier
  * Restrictions
- Join criteria applied
  * Column name, table name, reference identifier (for each side of the join)
- Aggregate functions used
  * Column name
  * Reference identifier
  * Aggregate function
- Group by criteria
  * Column name
  * Reference identifier
  * Sort direction
- Syntax
- Resources
  * Disk
  * Read
  * Write
  * Temporary
32. Why we need different managers to a data ware house? Explain.
Need for managers in a data warehouse: Data warehouses are not just large databases. They are
complex environments that integrate many technologies. They are not static, but change
continuously, both content-wise and structure-wise. Thus, there is a constant need for
maintenance and management. Since huge amounts of time, money and effort are involved in the
development of data warehouses, sophisticated management tools are always justified in the
case of data warehouses.
When computer systems were in their initial stages of development, there used to be an army of
human managers who went around doing all the administration and management. But such a
scheme became both unwieldy and prone to errors as the systems grew in size and complexity.
Further, most of the management principles were ad hoc in nature and were subject to human
errors and fatigue.
33. With neat diagram explain the boundaries of process managers.
A schematic diagram defines the boundaries of the three types of managers. The information
made available by the warehouse allows end users to make informed decisions, for example to
decide the amount of inventory to be held, the number of employees to be hired, the amount to
be procured on loan, etc.
The above definition may not be precise, but that is how data warehouse systems are. Different
authors give different definitions, but we keep this idea in mind and proceed: a data
warehouse is a large collection of data and a set of process managers that use this data to
make information available. The data can be metadata, facts, dimensions and aggregations. The
process managers can be load managers, warehouse managers or query managers. The information
made available is such that it allows the end users to make informed decisions.
Roles of education in a data warehousing delivery process: Education has two roles to play:
one, to make people, especially top-level policy makers, comfortable with the concept; two, to
aid the prototyping activity. To take care of the education role, an initial (usually
scaled-down) prototype is created and people are encouraged to interact with it. This helps
achieve both the activities listed above. The users become comfortable with the use of the
system, and the warehouse developer becomes aware of the limitations of his prototype, which
can then be improved upon.
c) What are process managers? What are the different types of process managers?
Process Managers: These are responsible for the smooth flow, maintenance and upkeep of
data into and out of the database.
The main types of process managers are:
i). Load manager: to take care of source interaction, data transformation and data load.
ii). Warehouse manager: to take care of data movement, metadata management and performance
monitoring.
iii). Query manager: to control query scheduling and monitoring.
We shall look into each of them briefly. Before that, we look at a schematic diagram that
defines the boundaries of the three types of managers.
d) Give the architectures of data mining systems.
e) What are the guidelines for KDD environment ?
It is customary in the computer industry to formulate rules of thumb that help information
technology (IT) specialists apply new developments. In setting up a reliable data mining
environment, we may follow these guidelines so that the KDD system works in the manner we
desire.
i). Support extremely large data sets
ii). Support hybrid learning
iii). Establish a data warehouse
iv). Introduce data cleaning facilities
v). Facilitate working with dynamic coding
vi). Integrate with decision support system
vii). Choose extendible architecture
viii). Support heterogeneous databases
ix). Introduce client/server architecture
x). Introduce cache optimization
PART - B
one should identify the facts and store them in the read-only area, with the dimensions
surrounding that area. Whereas the dimensions are liable to change, the facts are not. But
given a set of raw data from the sources, how does one identify the facts and the dimensions?
It is not always easy, but the following steps can help in that direction.
i) Look for the fundamental transactions in the entire business process. These basic entities
are the facts.
ii) Find out the important dimensions that apply to each of these facts. They are the
candidates
for dimension tables.
iii) Ensure that facts do not include those candidates that are actually dimensions with a set
of facts attached to them.
iv) Ensure that dimensions do not include those candidates that are actually facts.
3. a) What is an event in data warehousing? List any five events.
An event is defined as a measurable, observable occurrence of a defined action. If this
definition seems quite vague, it is because it encompasses a very large set of operations. The
event manager is software that continuously monitors the system for the occurrence of an
event and then takes whatever action is suitable (note that the event is a measurable and
observable occurrence). The action to be taken is also normally specific to the event.
A partial list of the common events that need to be monitored are as follows:
i). Running out of memory space.
ii). A process dying
iii). A process using excessive resources
iv). I/O errors
v). Hardware failure
b) What is summary table? Describe the aspects to be looked into while designing a summary
table.
The main purpose of using summary tables is to cut down the time taken to execute a specific
query. The main methodology involves minimizing the volume of data being scanned each time
the query is to be answered. In other words, partial answers to the query are already made
available. For example, in the above-cited example of the mobile market, if one expects that
i) citizens above 18 years of age,
ii) with salaries greater than 15,000, and
iii) with professions that involve travelling
are the potential customers, then every time the query is to be processed (maybe every month
or every quarter), one will have to look at the entire database to compute these values and
then combine them suitably to get the relevant answers. The other method is to prepare
summary tables, which hold the values pertaining to each of these sub-queries beforehand, and
then combine them as and when the query is raised.
Summary tables are designed by following the steps given below:
i) Decide the dimensions along which aggregation is to be done.
ii) Determine the aggregation of multiple facts.
iii) Aggregate multiple facts into the summary table.
iv) Determine the level of aggregation and the extent of embedding.
v) Design time into the table.
vi) Index the summary table.
functionalities together.
d) Classification according to mining techniques used: Data mining systems employ and
provide different techniques. This classification categorizes data mining systems according to
the data analysis approach used such as machine learning, neural networks, genetic
algorithms, statistics, visualization, database oriented or data warehouse-oriented, etc.
b) List and explain different kind of data that can be mined.
The different kinds of data that can be mined are listed below:
i). Flat files: Flat files are actually the most common data source for data mining
algorithms, especially at the research level.
ii). Relational Databases: A relational database consists of a set of tables containing either
values of entity attributes, or values of attributes from entity relationships.
iii). Data Warehouses: A data warehouse as a storehouse, is a repository of data collected
from multiple data sources (often heterogeneous) and is intended to be used as a whole under
the same unified schema.
iv). Multimedia Databases: Multimedia databases include video, images, audio and text
media. They can be stored on extended object-relational or object-oriented databases, or
simply on a file system.
v). Spatial Databases: Spatial databases are databases that in addition to usual data, store
geographical information like maps, and global or regional positioning.
vi). Time-Series Databases: Time-series databases contain time-related data such as stock
market data or logged activities. These databases usually have a continuous flow of new data
coming in, which sometimes creates the need for challenging real-time analysis.
vii). World Wide Web: The World Wide Web is the most heterogeneous and dynamic
repository available. A very large number of authors and publishers are continuously
contributing to its growth and metamorphosis and a massive number of users are accessing its
resources daily.
6. a) Give the syntax for task relevant data specification.
Syntax for task-relevant data specification: The first step in defining a data mining task is
the specification of the task-relevant data, that is, the data on which mining is to be
performed. This involves specifying the database and tables or data warehouse containing the
relevant data, conditions for selecting the relevant data, the relevant attributes or
dimensions for exploration, and instructions regarding the ordering or grouping of the data
retrieved. DMQL provides clauses for the specification of such information, as follows:
i). use database (database_name) or use data warehouse (data_warehouse_name): The use
clause directs the mining task to the database or data warehouse specified.
ii). from (relation(s)/cube(s)) [where(condition)]: The from and where clauses respectively
specify the database tables or data cubes involved, and the conditions defining the data to be
retrieved.
iii). in relevance to (attribute_or_dimension_list): This clause lists the attributes or
dimensions for exploration.
iv). order by (order_list): The order by clause specifies the sorting order of the task relevant
data.
v). group by (grouping_list): The group by clause specifies criteria for grouping the data.
vi). having (conditions): The having clause specifies the conditions by which groups of data
are considered relevant.
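Putting the clauses above together, a hypothetical task-relevant data specification might look like the following sketch. The database, table, attribute names and values are all invented for illustration, and actual DMQL implementations may differ in detail:

```
use database sales_db
from sales
where purchase_date >= "2004-01-01"
in relevance to customer_age, income, item_bought
group by customer_age
having count(*) >= 10
order by customer_age
```

Each line corresponds to one clause in the list above: the use clause selects the database, from/where select the data, in relevance to names the attributes to explore, and group by/having/order by control grouping, relevance of groups and sort order.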
b) Explain the designing of GUI based on data mining query language.
A data mining query language provides the necessary primitives that allow users to communicate
with data mining systems. But novice users may find a data mining query language difficult to
use and its syntax difficult to remember. Instead, users may prefer to communicate with data
mining systems through a graphical user interface (GUI). In relational database technology,
SQL serves as a standard core language for relational systems, on top of which GUIs can
easily be designed. Similarly, a data mining query language may serve as a core language for
data mining system implementations, providing a basis for the development of GUIs for
effective data mining.
A data mining GUI may consist of the following functional components:
a) Data collection and data mining query composition - This component allows the user to
specify task-relevant data sets and to compose data mining queries. It is similar to GUIs used
for the specification of relational queries.
b) Presentation of discovered patterns - This component allows the display of the discovered
patterns in various forms, including tables, graphs, charts, curves and other visualization
techniques.
c) Hierarchy specification and manipulation - This component allows for concept hierarchy
specification, either manually by the user or automatically. In addition, this component
should allow concept hierarchies to be modified by the user or adjusted automatically based
on a given data set distribution.
d) Manipulation of data mining primitives - This component may allow the dynamic
adjustment of data mining thresholds, as well as the selection, display and modification of
concept hierarchies. It may also allow the modification of previous data mining queries or
conditions.
e) Interactive multilevel mining - This component should allow roll-up or drill-down
operations on discovered patterns.
f) Other miscellaneous information - This component may include on-line help manuals,
indexed search, debugging and other interactive graphical facilities.
7. a) Explain how decision trees are useful in data mining.
Decision trees are powerful and popular tools for classification and prediction. The
attractiveness of tree-based methods is due in large part to the fact that they are simple and
that decision trees represent rules. Rules can readily be expressed so that humans can
understand them, or in a database access language like SQL, so that records falling into a
particular category may be retrieved.
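The point that a decision tree is just a set of readable rules can be sketched in Python as nested if/else tests. The attributes and thresholds here are invented for illustration, not learned from real data:

```python
def classify_customer(age, income):
    """A tiny two-level decision tree expressed directly as rules."""
    if age > 18:                      # root split on age
        if income > 15000:            # second split on income
            return "likely buyer"
        return "unlikely buyer"
    return "not eligible"

print(classify_customer(30, 20000))  # "likely buyer"
```

Each root-to-leaf path reads as a rule, e.g. "IF age > 18 AND income > 15000 THEN likely buyer", which could equally be written as a SQL WHERE clause to retrieve the matching records.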
b) Identify an application and also explain the techniques that can be incorporated in solving
the problem using data mining techniques.
Write yourself...
8. Write short notes on :
i) Data Mining Querying Language
ii) Schedule Manager
iii) Data Formatting.
i) Data Mining Querying Language
A data mining language helps in effective knowledge discovery from data mining systems.
Designing a comprehensive data mining language is challenging because data mining covers a
wide spectrum of tasks, from data characterization to mining association rules, data
classification and evolution analysis. Each task has different requirements. The design of an
effective data mining query language requires a deep understanding of the power, limitations
and underlying mechanisms of the various kinds of data mining tasks.
ii) Schedule manager
Scheduling is the key to successful warehouse management. Almost all operations in the
warehouse need some type of scheduling. Every operating system has its own scheduler and
batch control mechanism, but these schedulers may not be capable of fully meeting the
requirements of a data warehouse. Hence it is more desirable to have specially designed
schedulers to manage the operations.
iii) Data formatting
Data formatting is the final data preparation step. It represents syntactic modifications to
the data that do not change its meaning, but are required by the particular modelling tool
chosen for the DM task. These include:
a). Reordering of the attributes or records: some modelling tools require reordering of the
attributes (or records) in the dataset, e.g. putting the target attribute at the beginning or
at the end, or randomizing the order of records (required by neural networks, for example).
b). Changes related to the constraints of modelling tools: removing commas or tabs and special
characters, trimming strings to the maximum allowed number of characters, or replacing
special characters with an allowed set of special characters.
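The two reordering steps can be sketched in Python: moving the (hypothetical) target attribute to the end of each record, and randomizing record order reproducibly. Attribute names and values are invented for illustration:

```python
import random

# Illustrative records with a "target" attribute mixed in
records = [
    {"target": "yes", "age": 30, "income": 40000},
    {"target": "no", "age": 22, "income": 18000},
]

def reorder_attributes(recs, target="target"):
    """Put the target attribute last in each record (dicts keep insertion order)."""
    return [
        {**{k: r[k] for k in r if k != target}, target: r[target]}
        for r in recs
    ]

def shuffle_records(recs, seed=0):
    """Randomize record order reproducibly, without mutating the input."""
    out = list(recs)
    random.Random(seed).shuffle(out)
    return out

formatted = shuffle_records(reorder_attributes(records))
print(list(formatted[0].keys())[-1])  # "target" is now the last attribute
```

Neither step changes any value, only the layout, which is exactly what "syntactic modification" means here.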
reliability of the product means finding and removing errors. Hence one
should not test a product to show that it works; rather, one should start
with the assumption that the program contains errors and then test the
program to find as many errors as possible.
2. Explain the test information flow in a typical software test life
cycle.
Ans:- Testing is a complex process and requires effort similar to software development. A
typical test information flow is shown in the figure.
Software Configuration includes a Software Requirements Specification, a
Design Specification, and source code. A test configuration includes a Test
Plan and Procedures, test cases, and testing tools. It is difficult to predict
the time to debug the code, hence it is difficult to schedule.
Once the right software is available for testing, a proper test plan and test cases are
developed. Then the software is subjected to testing with simulated test data. After the test
execution, the test results are examined. The software may have defects, or it may pass
without any defect. Software with defects is subjected to debugging and is then tested again
for correctness. This process continues until the testing reports zero defects or we run out
of time for testing.
3.What is risk in software testing? How risk management
improves the quality of the software?
Ans:- The risks associated with a software application being developed are called software
risks. These risks can lead to errors in the code, affecting the functioning of the
application.
Following are the factors that lead to software risks:
i. Skills of software
ii. Disgruntled
iii. Poorly defined project objectives
iv. Project risks
v. Technical risks
5. Explain black box and white box testing with an example. Which method is better? List the
drawbacks of each one.
Ans:- Black box testing treats the system as a black box, so it doesn't explicitly use
knowledge of the internal structure or code. In other words, the test engineer need not know
the internal working of the black box or application.
The main focus in black box testing is on the functionality of the system as a whole. The term
behavioral testing is also used for black box testing, and white box testing is also sometimes
called structural testing. Behavioral test design is slightly different from black-box test
design because the use of internal knowledge isn't strictly forbidden, but it's still
discouraged.
Disadvantages of Black Box Testing
- The test inputs need to be drawn from a large sample space.
- It is difficult to identify all possible inputs in limited testing time, so writing test
cases is slow and difficult.
- There are chances of having unidentified paths during this testing.
White box testing involves looking at the structure of the code. When you know the internal
structure of a product, tests can be conducted to ensure that the internal operations are
performed according to the specification and that all internal components have been
adequately exercised.
Drawbacks of WBT:
It is not possible to test each and every path of the loops in a program. This means
exhaustive testing is impossible for large systems.
This does not mean that WBT is not effective. By selecting important logical paths and data
structures, testing can be made practically possible and effective.
6. What is cyclomatic complexity? Explain with an illustration.
Discuss its role in software testing and generating test cases.
Ans:- The cyclomatic complexity gives a quantitative measure of the logical complexity of a
program. This value gives the number of independent paths in the basis set, and an upper
bound on the number of tests required to ensure that each statement is executed at least
once. An independent path is any path through a program that introduces at least one new set
of processing statements or a new condition.
A cyclomatic complexity of 4 can be calculated as:
1. The number of regions of the flow graph, which is 4.
2. #Edges - #Nodes + 2, which is 11 - 9 + 2 = 4.
3. #Predicate Nodes + 1, which is 3 + 1 = 4.
The above complexity provides the upper bound on the number of test cases to be generated,
i.e. the number of independent execution paths in the program. The four independent paths for
the program are:
1. 1, 8
2. 1, 2, 3, 7b, 1, 8
3. 1, 2, 4, 5, 7a, 7b, 1, 8
4. 1, 2, 4, 6, 7a, 7b, 1, 8
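The three equivalent calculations of V(G) above can be checked with a small helper; the counts passed in (11 edges, 9 nodes, 3 predicate nodes, 4 regions) are those of the example flow graph:

```python
def cyclomatic_complexity(edges, nodes, predicates, regions):
    """Compute V(G) two ways and confirm both agree with the region count."""
    v_edges = edges - nodes + 2   # V(G) = E - N + 2
    v_preds = predicates + 1      # V(G) = P + 1
    assert v_edges == v_preds == regions, "the three measures must agree"
    return v_edges

print(cyclomatic_complexity(edges=11, nodes=9, predicates=3, regions=4))  # 4
```

The returned value of 4 matches the four independent paths listed above, which is why V(G) bounds the size of the basis path test set.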
Data coupling
Data coupling is when modules share data through, for example,
parameters. Each datum is an elementary piece, and these are the only
data shared (e.g., passing an integer to a function that computes a square
root).
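The square-root example in the text can be written out directly in Python; the function names are illustrative. The two modules share only the single integer passed as a parameter, which is what makes this data coupling:

```python
import math

def square_root(n: int) -> float:
    """The callee sees only the one elementary datum it needs."""
    return math.sqrt(n)

def caller():
    x = 16                 # the only datum shared between the two modules
    return square_root(x)

print(caller())  # 4.0
```

No globals, shared structures or control flags cross the boundary, so the coupling stays at this loosest data-only level.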
8. Compare and contrast between Verification and Validation with
examples.
Ans:- Verification testing ensures that the expressed user requirements, gathered in the
Project Initiation phase, have been met in the Project Execution phase. One way to do this is
to produce a user requirements matrix or checklist and indicate how you would test for each
requirement. For example, if the product is required to weigh no more than 15 kg (about 33
lbs), the test could be: "Weigh the object. Does it weigh 15 kg or less?", and note yes or no
on the matrix or checklist.
Validation testing ensures that any implied requirement has been met. It usually occurs in the
Project Monitoring and Control phase of project management. Using the above product as an
example, you ask the customer, "Why must it be no more than 15 kg?" One answer is, "It must be
easy to lift by hand." You could validate that requirement by having twenty different people
lift the object and asking each one, "Was the object easy to lift?" If 90% of them said it was
easy, you could conclude that the object meets the requirement.
8. What is stress testing? Where do you need this testing? Explain.
Ans:- Stress testing is a form of testing that is used to determine the stability of a given
system or entity. It involves testing beyond normal operational capacity, often to a breaking
point, in order to observe the results.
The need for stress testing:
A web server may be stress tested using scripts, bots, and various denial-of-service tools to
observe the performance of a web site during peak loads.
4) Illegal data format: Make one data set with an illegal data format. The system should not
accept data in an invalid or illegal format. Also check that proper error messages are
generated.
5) Boundary Condition data set: Data set containing out of range data.
Identify application boundary cases and prepare data set that will cover
lower as well as upper boundary conditions.
6) Data set for performance, load and stress testing: This data set
should be large in volume.
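The boundary-condition data set described in point 5 can be sketched in Python with the classic below/at/above pattern for a closed range; the example field and its limits are invented for illustration:

```python
def boundary_values(lo, hi):
    """Boundary-value set: just below, at, and just above each limit."""
    return [lo - 1, lo, lo + 1, hi - 1, hi, hi + 1]

# e.g. an age field valid from 18 to 60:
data_set = boundary_values(18, 60)
print(data_set)  # [17, 18, 19, 59, 60, 61]
```

Values just outside the range (17 and 61 here) should be rejected by the application, while the others should be accepted, covering both the lower and upper boundary conditions.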