CS614 - Data Warehousing Quiz No. 2 May 07, 2012
consideration is
Select correct option:
OLTP
OLAP
DSS
Inverted Index
Index nested-loop join
Temporary index nested-loop join
None of these
Data mining derives its name from the similarities between searching for valuable business
information in a large database, for example, finding linked products in gigabytes of store
scanner data, and mining a mountain for a _________ of valuable ore.
Select correct option:
Furrow
Streak
Trough
Vein
________ is the technique in which existing heterogeneous segments are reshuffled, relocated
into homogeneous segments.
Select correct option:
Clustering
Aggregation
Segmentation
Partitioning
The goal of ideal parallel execution is to completely parallelize those parts of a computation
that are not constrained by data dependencies. The ______ the portion of the program that
must be executed sequentially, the greater the scalability of the computation
Larger
Smaller
Unambiguous
Superior
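The relationship between the sequential portion and scalability in the question above is commonly stated as Amdahl's law. The sketch below is an illustration only: the function name `amdahl_speedup` and the sample fractions are assumptions and do not come from the quiz.

```python
def amdahl_speedup(sequential_fraction: float, processors: int) -> float:
    """Upper bound on speedup when a fixed fraction of the work must run sequentially."""
    if not 0.0 <= sequential_fraction <= 1.0:
        raise ValueError("sequential_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    parallel_fraction = 1.0 - sequential_fraction
    # The sequential part takes the same time; only the parallel part is divided.
    return 1.0 / (sequential_fraction + parallel_fraction / processors)


# Compare three hypothetical programs on 16 processors.
for fraction in (0.5, 0.1, 0.01):
    print(fraction, round(amdahl_speedup(fraction, 16), 2))
```

Shrinking the sequential fraction raises the achievable speedup, which is the behaviour the question describes.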
_______________, if it fits into memory, costs only one disk I/O access to locate a record by a given
key.
An Inverted Index
A Sparse Index
A Dense Index
None of these
If someone told you that he had a good model to predict customer usage, the first thing you
might try would be to ask him to apply his model to your customer _______, where you already
knew the answer.
Base
Drive
File
Log
The automated, prospective analyses offered by data mining move beyond the analyses of past
events provided by _____________ tools typical of decision support systems.
Introspective
Intuitive
Reminiscent
Retrospective
If every key in the data file is represented in the index file then index is
Dense Index
Sparse Index
Inverted Index
None of these
A dense index, if it fits into memory, costs only ______ disk I/O access to locate a record by a given
key.
One
Two
Linear
Quadratic
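A toy model may help picture the lookup described above. This is a minimal sketch, assuming the data file is a list of blocks and the index is an in-memory dictionary with one entry per key. The names `data_blocks`, `build_dense_index`, and `lookup`, and the sample records, are hypothetical.

```python
# Hypothetical data file: a list of blocks, each holding (key, value) records.
data_blocks = [
    [(17, "Ali"), (4, "Sara")],
    [(42, "Omar"), (8, "Hina")],
    [(23, "Zain"), (15, "Noor")],
]


def build_dense_index(blocks):
    """Map every key in the data file to the number of the block that holds it."""
    return {key: block_no for block_no, block in enumerate(blocks) for key, _ in block}


def lookup(index, blocks, key):
    """Return (value, blocks_read); the index itself is assumed to sit in memory."""
    block_no = index.get(key)
    if block_no is None:
        return None, 0            # key is absent, so no data block is read
    block = blocks[block_no]      # the single simulated disk I/O
    for record_key, value in block:
        if record_key == key:
            return value, 1
    return None, 1


dense_index = build_dense_index(data_blocks)
print(lookup(dense_index, data_blocks, 8))
```

Because every key has its own index entry, the in-memory probe identifies the exact block, and only that block has to be fetched.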
With data mining, the best way to accomplish this is by setting aside some of your data in a
vault to isolate it from the mining process; once the mining is complete, the results can be
tested against the isolated data to confirm the model's _______.
Validity
Security
Integrity
None of these
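The vault idea in the question above corresponds to what is usually called a holdout set. The following is a minimal sketch under stated assumptions: the helper names `split_holdout` and `accuracy`, the 25% holdout fraction, and the synthetic usage data are all hypothetical.

```python
import random


def split_holdout(records, holdout_fraction=0.25, seed=0):
    """Shuffle a copy of the records and set a fraction aside as the vault."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * holdout_fraction)
    vault, mining_set = shuffled[:cut], shuffled[cut:]
    return mining_set, vault


def accuracy(model, labelled_records):
    """Fraction of (features, label) pairs that the model predicts correctly."""
    hits = sum(1 for features, label in labelled_records if model(features) == label)
    return hits / len(labelled_records)


# Hypothetical data: usage figures labelled heavy (True) or light (False).
records = [(usage, usage >= 50) for usage in range(100)]
mining_set, vault = split_holdout(records)

# "Mining" step: learn a threshold from the mining set only.
threshold = min(usage for usage, heavy in mining_set if heavy)
model = lambda usage: usage >= threshold

# The vault was never seen during mining, so it gives an independent check.
print("accuracy on the vault:", accuracy(model, vault))
```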
Data mining uses _________ algorithms to discover patterns and regularities in data.
Mathematical
Computational
Statistical
None of these
The goal of ___________ is to look at as few blocks as possible to find the matching record(s).
Indexing
Partitioning
Joining
None of these
_______________, if it is too big and does not fit into memory, will be expensive when used to find
a record by a given key.
An Inverted Index
A Sparse Index
A Dense Index
None of these
There are many variants of the traditional nested-loop join. If the index is built as part of the
query plan and subsequently dropped, it is called
Naive nested-loop join
Index nested-loop join
Temporary index nested-loop join
None of these
_______________, if it fits into memory, costs only one disk I/O access to locate a record by a given
key.
An Inverted Index
A Sparse Index
A Dense Index
None of these
If ‘M’ rows from table-A match the conditions in the query then table-B is accessed ‘M’ times.
Suppose table-B has an index on the join column. If ‘a’ I/Os are required to read the index block
for each scan plus ‘b’ I/Os for each data block then the total cost of accessing table-B is
_____________ logical I/Os approximately.
(a + b)M
(a - b)M
(a + b + M)
(a * b * M)
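The cost in the question above can be checked by counting I/Os one probe at a time. This is a sketch only: the function name `simulate_table_b_cost` and the sample values are assumptions, and `a` and `b` are simply the two per-probe I/O counts named in the question.

```python
def simulate_table_b_cost(m: int, a: int, b: int) -> int:
    """Count logical I/Os when table-B is probed once per matching row of table-A."""
    total = 0
    for _ in range(m):   # one access of table-B for each of the M matching rows
        total += a       # first per-probe cost from the question
        total += b       # I/Os for the data block
    return total


# Hypothetical figures: 1000 matching rows, a = 2, b = 1.
m, a, b = 1000, 2, 1
print(simulate_table_b_cost(m, a, b), (a + b) * m)
```

Each probe costs the same `a + b` I/Os, so the loop total and the closed form agree.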
With data mining, the best way to accomplish this is by setting aside some of your data in a
________ to isolate it from the mining process; once the mining is complete, the results can be
tested against the isolated data to confirm the model's validity.
Cell
Disk
Folder
Vault
The goal of ideal parallel execution is to completely parallelize those parts of a computation
that are not constrained by data dependencies. The smaller the portion of the program that
must be executed __________, the greater the scalability of the computation.
In Parallel
Distributed
Sequentially
None of these
Data mining is a/an __________ approach, where browsing through data using data mining
techniques may reveal something that might be of interest to the user as information that was
unknown previously.
Non-Exploratory
Exploratory
Computer Science
none of these
Data mining evolved as a mechanism to cater for the limitations of _____ systems in dealing with
massive data sets with high dimensionality, new data types, multiple heterogeneous data resources,
etc.
OLTP
OLAP
DSS
DWH
To identify the __________________ required we need to perform data profiling
Degree of Transformation
Complexity
Cost
Time
Execution can be completed successfully or it may be stopped due to some error. If some error
occurs, execution will be terminated abnormally and all transactions will be ___________
Committed to the database
Rolled back
Companies collect and record their own operational data, but at the same time they also use
reference data obtained from _______ sources such as codes, prices etc.
Operational
None of these
Internal
External
Ad-hoc access means to run such queries which are known already.
True
False
____________ in agriculture extension is that pest population beyond which the benefit of
spraying outweighs its cost.
Profit Threshold Level
Economic Threshold Level
Medicine Threshold Level
None of these
People that design and build the data warehouse must be capable of working across the
organization at all levels
True
False
The _________ is only a small part in realizing the true business value buried within the
mountain of data collected and stored within organizations' business systems and operational
databases.
Independence on technology
Dependence on technology
None of these
Many data warehouse project teams waste enormous amounts of time searching in vain for a
___________________.
Silver Bullet
Golden Bullet
Suitable Hardware
Compatible Product
A dense index, if it fits into memory, costs only ______ disk I/O access to locate a record by a given
key.
One
Two
lg (n)
n
The key idea behind ___________ is to take a big task and break it into subtasks that can be
processed concurrently on a stream of data inputs in multiple, overlapping stages of execution.
Pipeline Parallelism
Overlapped Parallelism
Massive Parallelism
Distributed Parallelism
Non uniform distribution, when the data is distributed across the processors, is called ______.
Skew in Partition
Pipeline Distribution
Distributed Distribution
Uncontrolled Distribution
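One way to see non-uniform distribution across processors is to count the rows each processor receives. The sketch below assumes a simple modulo partitioning scheme. The names `partition_sizes` and `skew_ratio` and the sample key sets are hypothetical, not taken from the quiz.

```python
def partition_sizes(keys, processors):
    """Count how many rows each processor receives under key % processors."""
    sizes = [0] * processors
    for key in keys:
        sizes[key % processors] += 1
    return sizes


def skew_ratio(sizes):
    """Largest partition divided by the average partition; 1.0 means perfectly even."""
    return max(sizes) / (sum(sizes) / len(sizes))


uniform_keys = list(range(1000))              # every key value occurs once
skewed_keys = [0] * 700 + list(range(300))    # one key value dominates

for name, keys in (("uniform", uniform_keys), ("skewed", skewed_keys)):
    sizes = partition_sizes(keys, 4)
    print(name, sizes, round(skew_ratio(sizes), 2))
```

With the dominant key, one processor holds most of the rows and finishes last, so the whole parallel step is only as fast as that processor.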
The goal of ideal parallel execution is to completely parallelize those parts of a computation
that are not constrained by data dependencies. The smaller the portion of the program that
must be executed __________, the greater the scalability of the computation.
None of these
Sequentially
In Parallel
Distributed
Data mining is a/an __________ approach, where browsing through data using data mining
techniques may reveal something that might be of interest to the user as information that was
unknown previously.
Exploratory
Non-Exploratory
Computer Science
OLTP
OLAP
DSS
DWH
________ is the technique in which existing heterogeneous segments are reshuffled, relocated
into homogeneous segments.
Clustering
Aggregation
Segmentation
Partitioning
To measure or quantify similarity or dissimilarity, different techniques are available. Which
of the following options names the available techniques?
Pearson correlation is the only technique
Euclidean distance is the only technique
Both Pearson correlation and Euclidean distance
None of these
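For reference, the two measures named in the options above can be computed as follows. This is a minimal sketch using only the standard library. The function names and sample vectors are illustrative assumptions.

```python
import math


def euclidean_distance(x, y):
    """Straight-line distance between two equal-length numeric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def pearson_correlation(x, y):
    """Linear correlation in [-1, 1]; assumes neither vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / math.sqrt(spread_x * spread_y)


x = [1, 2, 3, 4]
y = [2, 4, 6, 8]
print(euclidean_distance(x, y), pearson_correlation(x, y))
```

The two measures can disagree: `x` and `y` here are far apart by distance yet perfectly correlated, because correlation looks at the shape of the pattern and ignores scale.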
For a DWH project, the key requirements are ________ and product experience.
Tools
Industry
Software
None of these
Relational databases allow you to navigate the data in ____________ that is appropriate using
the primary, foreign key structure within the data model.
Only One Direction
Any Direction
Two Direction
None of these
The lack of data integration and standardization
Missing Data
Data Stored in Heterogeneous Sources
DTS allows us to connect through any data source or destination that is supported by
____________
OLE DB
OLAP
OLTP
Data Warehouse
Data Transformation Services (DTS) provide a set of _____ that lets you extract, transform, and
consolidate data from disparate sources into single or multiple destinations supported by DTS
connectivity.
Tools
Documentations
Guidelines
If some error occurs, execution will be terminated abnormally and all transactions will be rolled
back. In this case when we will access the database we will find it in the state that was before
the ____________.
Execution of package
Creation of package
Connection of package
Taken jointly, the extract programs or naturally evolving systems formed a spider web, also
known as
Distributed Systems Architecture
Legacy Systems Architecture
Online Systems Architecture
Intranet Systems Architecture
A node of a B-Tree is stored in a memory block, and traversing a B-Tree involves ______ page faults.
O (n)
O (n^2)
O (n lg n)
O (lg n)
Which statement is true for De-Normalization?
Redundant data is a performance liability at query time, but is a performance benefit at update
time.
Redundant data is a performance benefit at both query time and update time.
Redundant data is a performance liability at both query time and update time.
Redundant data is a performance benefit at query time, but is a performance liability at update
time.
The degree of similarity between two records, often measured by a numerical value between
_______, usually depends on application characteristics.
0 and 1
0 and 10
0 and 100
0 and 99
The purpose of the House of Quality technique is to reduce ______ types of risk.
Two
Three
Four
All
There are many variants of the traditional nested-loop join. If the index is built as part of the
query plan and subsequently dropped, it is called
Naive nested-loop join
Index nested-loop join
Temporary index nested-loop join
None of these
Kimball's iterative data warehouse development approach drew on decades of experience
to develop the _____________.
Business Dimensional Lifecycle
Data Warehouse Dimension
Business Definition Lifecycle
OLAP Dimension
During the application specification activity, we also must give consideration to the organization
of the applications.
True
False
The most recent attack is the ________ attack on the cotton crop during 2003-04, resulting in a
loss of nearly 0.5 million bales.
Boll Worm
Purple Worm
Blue Worm
Cotton Worm
The users of a data warehouse are knowledge workers; in other words, they are _________ in the
organization.
Decision maker
Manager
Database Administrator
DWH Analyst
_________ breaks a table into multiple tables based upon common column values.
Horizontal splitting
Vertical splitting
As opposed to the outcome of classification, estimation deals with ____________ valued
outcomes.
Discrete
Isolated
Continuous
Distinct
The goal of ______ is to look at as few blocks as possible to find the matching records.
Indexing
Partitioning
Joining
none of these
nested loop join
none of these
The technique that is used to perform these feats in data mining is modeling, and this act of
model building is something that people have been doing for a long time, certainly before the
_______ of computers or data mining technology.
Access
Advent
Ascent
Avowal
A data warehouse may include
Legacy systems
Only internal data sources
Privacy restrictions
Small data mart
For a given data set, to get a global view in un-supervised learning we use
One-way Clustering
Bi-clustering
Pearson correlation
Euclidean distance
In a DWH project, it is ensured that the ___________ environment is similar to the production
environment.
Designing
Development
Analysis
Implementation
For good decision making, data should be integrated across the organization to cross the LoB
(Line of Business). This is to give the total view of organization from:
Owner’s Perspective
Customer’s Perspective
Decision Maker’s Perspective
Employee's Perspective
Streak
Trough
Vein
With data mining, the best way to accomplish this is by setting aside some of your data in a
________ to isolate it from the mining process; once the mining is complete, the results can be
tested against the isolated data to confirm the model's validity.
Cell
Disk
Folder
Vault
We must try to find the one access tool that will handle all the needs of their users.
True
False
Investing years in architecture and forgetting the primary purpose of solving business problems
results in an inefficient application. This is an example of the _________ mistake.
Extreme Technology Design
Extreme Architecture Design
The automated, prospective analyses offered by data mining move beyond the analyses of past
events provided by retrospective tools typical of ___________.
OLTP
OLAP
Decision Support systems
None of these
There are many variants of the traditional nested-loop join. If an index is exploited,
then it is called ______.
Naïve nested loop join index
Nested loop join temporary index
Index nested-loop joins
The performance in a MOLAP cube comes from the O(1) look-up time for the array data
structure.
True
False
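The constant-time look-up mentioned above follows from storing cube cells in a flat array and computing each cell's offset directly from its coordinates. The sketch below is one possible illustration, not a description of any particular MOLAP product. The class name `DenseCube` and the sample dimensions are hypothetical.

```python
class DenseCube:
    """Toy dense cube: one array slot per combination of dimension members."""

    def __init__(self, dimensions):
        # dimensions: one list of member names per dimension
        self.positions = [{m: i for i, m in enumerate(members)} for members in dimensions]
        self.sizes = [len(members) for members in dimensions]
        total = 1
        for size in self.sizes:
            total *= size
        self.cells = [0.0] * total

    def _offset(self, coordinates):
        # Row-major arithmetic: no searching, just one multiply-add per dimension.
        offset = 0
        for position, size, member in zip(self.positions, self.sizes, coordinates):
            offset = offset * size + position[member]
        return offset

    def set(self, coordinates, value):
        self.cells[self._offset(coordinates)] = value

    def get(self, coordinates):
        return self.cells[self._offset(coordinates)]


cube = DenseCube([["2011", "2012"], ["Lahore", "Karachi", "Multan"], ["Milk", "Tea"]])
cube.set(("2012", "Karachi", "Tea"), 125.0)
print(cube.get(("2012", "Karachi", "Tea")))
```

The cost of a look-up depends on the number of dimensions, not on the number of cells. The price is that every combination gets a slot whether or not it holds data.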
SQL
proprietary file
Object oriented
Non- proprietary file
Data warehousing and on-line analytical processing (OLAP) are _______ elements of decision
support system.
Unusual
Essential
Optional
None of the given
Virtual cube is used to query two similar cubes by creating a third “virtual” cube by a join
between two cubes.
True
False
The divide-and-conquer cube partitioning approach helps alleviate the ____________ limitations of
MOLAP implementation.
Flexibility
Maintainability
Security
Scalability
Data Warehouse provides the best support for analysis while OLAP carries out the _________
task.
Mandatory
Whole
Analysis
Prediction
DOLAP allows download of “cube” structures to a desktop platform with the need for shared
relational or cube server.
True
False
The STAR schema used for data design is a __________ consisting of fact and dimension tables.
Select correct option:
Network model
Relational model
Hierarchical data model
None of the given
Data Warehouse provides the best support for analysis while OLAP carries out the _________
task.
Select correct option:
Mandatory
Whole
Analysis
Prediction
Virtual cube is used to query two similar cubes by creating a third “virtual” cube by a join
between two cubes.
Select correct option:
True
False
Data warehousing and on-line analytical processing (OLAP) are _______ elements of decision
support system.
Select correct option:
Unusual
Essential
Optional
None of the given