Data Mining Question Bank

This document contains five units of 25 questions each on data mining, data warehousing, and related topics. The questions cover concepts such as data mining tasks and applications, the relationship between data warehousing and data mining, data preprocessing techniques, association rule mining, classification algorithms like decision trees, clustering, OLAP and multidimensional modeling, data warehouse architecture and design, and data quality.

Uploaded by

sabakhalid

DATA MINING & WAREHOUSING QUESTION BANK

UNIT-1
Q.1 What is data mining?

Q.2 Explain the differences between knowledge discovery and data mining.

Q.3 Explain different data mining tasks.

Q.4 What are the application areas of data mining?

Q.5 What is the relation between data warehousing and data mining?

Q.6 List out different sources of information.

Q.7 What types of benefit might you hope to get from data mining?

Q.8 What are the key issues in data mining?

Q.9 How can data mining help a business analyst?

Q.10 What are the limitations of data mining?

Q.11 Discuss the need for human intervention in the data mining process.

Q.12 As a bank manager, how would you decide whether or not to give a loan to an applicant?

Q.13 What steps would you follow to identify fraud for a credit card company?

Q.14 Explain the differences between “Exploratory Data Mining” and “Predictive Data
Mining” and give one example of each.

Q.15 State three different applications for which data mining techniques seem appropriate.
Informally explain each application.

Q.16 Explain briefly the differences between “classification” and “clustering” and give an
informal example of an application that would benefit from each technique.

Q.17 What do you mean by Data Processing?

Q.18 Explain data cleaning.

Q.19 Describe different data cleaning approaches.

Q.20 How can we handle missing values?

Q.21 Explain noisy data.

Q.22 Explain various normalization techniques.

Q.23 Give a brief description of the following:

(a) Binning

(b) Regression

(d) Smoothing

(e) Generalization

(f) Aggregation

Q.24 How is a data warehouse different from a database? How are they similar?

Q.25 Can you briefly describe the four stages of knowledge discovery (KDD)? Can you
describe the multi-tiered data warehouse architecture?

UNIT-2
Q.1 A data set for analysis includes only one attribute X:

X={ 7,12,5,8,5,9,13,12,19,7,12,12,13,3,4,5,13,8,7,6}

(a) What is the mean of the data set X?

(b) What is the median?
(c) Find the standard deviation for X.
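
One way to check the arithmetic for Q.1 is with Python's standard `statistics` module. This is a minimal sketch, not a model answer: the question does not say whether the population or the sample standard deviation is wanted, so both are computed, and the variable names are chosen here for illustration.

```python
import statistics

# Attribute X exactly as listed in Q.1 (20 values).
X = [7, 12, 5, 8, 5, 9, 13, 12, 19, 7, 12, 12, 13, 3, 4, 5, 13, 8, 7, 6]

mean_x = statistics.mean(X)        # arithmetic mean: sum(X) / len(X)
median_x = statistics.median(X)    # even count, so the average of the two middle values
pop_std = statistics.pstdev(X)     # population standard deviation (divide by n)
sample_std = statistics.stdev(X)   # sample standard deviation (divide by n - 1)

print(f"mean = {mean_x}, median = {median_x}")
print(f"population std = {pop_std:.4f}, sample std = {sample_std:.4f}")
```

The values sum to 180 over 20 observations, so the mean is 9, and the two middle values of the sorted list are both 8. Which standard deviation to report depends on whether X is treated as the whole population or as a sample.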

Q.2 Define frequent sets, confidence, support and association rule.

Q.3 What is meant by market basket analysis, and how can it help a supermarket?

Q.4 Explain whether association rule mining is a supervised or an unsupervised type of learning.

Q.5 Name some variants of the Apriori algorithm.

Q.6 Discuss the importance of association rule mining.

Q.7 The heights of the players on a school’s basketball team are 72", 74", 70", 78", 75" and 70".
Find the mean height.

Q.8 The batting averages for members of a basketball team are 0.234, 0.256, 0.321, 0.333,
0.290. Find the median batting average.

Q.9 Consider the data set D. Given a minimum support of 2, apply the Apriori algorithm to this data set.

Transaction ID    Items
100               A, C, D
200               B, C, E
300               A, B, C, E
400               B, E
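
The level-wise search asked for in Q.9 can be sketched in Python as below. This is an illustrative sketch under stated assumptions, not a reference implementation: minimum support is read as an absolute count of 2, and the names `support_count` and `apriori` are chosen here for the example.

```python
from itertools import combinations

# Data set D from Q.9: transaction ID -> set of items.
transactions = {
    100: {"A", "C", "D"},
    200: {"B", "C", "E"},
    300: {"A", "B", "C", "E"},
    400: {"B", "E"},
}
MIN_SUPPORT = 2  # assumed to be an absolute transaction count


def support_count(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for items in transactions.values() if itemset <= items)


def apriori(transactions, min_support):
    """Return a dict mapping each frequent itemset to its support count."""
    all_items = sorted({item for t in transactions.values() for item in t})
    candidates = [frozenset([item]) for item in all_items]
    frequent = {}
    k = 1
    while candidates:
        # Count the candidates and keep those that meet minimum support.
        counts = {c: support_count(c, transactions) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: drop any candidate with an infrequent (k-1)-subset.
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent


frequent = apriori(transactions, MIN_SUPPORT)
for itemset, count in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

With this data, item D is dropped at the first level, the candidates {A, B, C} and {A, C, E} are pruned because {A, B} and {A, E} are infrequent, and {B, C, E} is the only frequent 3-itemset.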

Q.10 Describe an example of a data set for which the Apriori check would actually increase the cost.
By "describe" I mean either show an instance of the data set or describe what it would look like.

Q.11 Same question for MaxMiner: when does MaxMiner perform worse than Apriori? How
does MaxMiner generate the frequency counts for every itemset which meets the support
constraints?

Q.12 Describe a data set for which sampling would actually increase the amount of work. In
other words, it would be faster to work on the full data set.

Q.13 Is support, as defined in the correlation rule paper, downward closed? Why?

Q.14 How large is a contingency table for an itemset of N items?

Q.15 Under what conditions would AVG(Salary) > 100K be downward closed? Upward
closed?

Q.16 Assume that each item in a supermarket is bought by 1% of transactions. Assume that
there are 10 million transactions and that items are statistically independent. Assume min-sup
= 10. What is the expected size of a frequent set? What is the expected number of frequent
sets?

Q.17 Suppose that you have data describing the closing prices of a stock you own for the
last 1000 days. Suppose you are interested in generating all rules which tell you about the
chances of your stock going up on a given day, provided you know the pattern (up or down)
on the K preceding days, with some minsup and minconf defined. How would you model this
problem as an association rule mining problem? Is there a way to represent this as transactions
with binary attributes, as in the supermarket case?
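
One possible encoding for Q.17, offered as a sketch rather than the expected answer: treat each day as a transaction whose items record the direction of each of the K preceding days plus the direction of the day itself. The function name `windows_to_transactions`, the item labels such as `lag1_up`, the sample prices, and the choice to count an unchanged price as "down" are all assumptions made for illustration.

```python
def windows_to_transactions(prices, k):
    """Turn a series of closing prices into one transaction per day.

    Each transaction holds k items describing the k preceding daily moves
    (lag1 is yesterday) and one item describing the move on the day itself.
    """
    # Daily move between consecutive closes; an unchanged price counts as "down" (assumption).
    moves = ["up" if later > earlier else "down"
             for earlier, later in zip(prices, prices[1:])]
    transactions = []
    for t in range(k, len(moves)):
        items = {f"lag{j}_{moves[t - j]}" for j in range(1, k + 1)}
        items.add(f"today_{moves[t]}")
        transactions.append(items)
    return transactions


# Hypothetical closing prices, for illustration only.
example = windows_to_transactions([10, 11, 12, 11, 13], k=2)
for items in example:
    print(sorted(items))
```

Each item is a binary attribute that is either present or absent, as in the supermarket case, so rules of the form {lag1_up, lag2_down} => today_up can be mined with the usual minsup and minconf thresholds.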

Q.18 (i) With a neat sketch, explain the architecture of a data warehouse.
(ii) Discuss the typical OLAP operations with an example.

Q.19 (i) Discuss how computations can be performed efficiently on data cubes.
(ii) Write short notes on data warehouse metadata.

Q.20 (i) Explain various methods of data cleaning in detail.

(ii) Give an account of the data mining query language.

Q.21 How is Attribute-Oriented Induction implemented? Explain in detail.

Q.22 (a) Write and explain the algorithm for mining frequent item sets without candidate
generation. Give a relevant example.

Q.23 Discuss the approaches for mining multilevel association rules from
transactional databases. Give a relevant example.

Q.24 (i) Explain the algorithm for constructing a decision tree from training samples.
(ii) Explain Bayes' theorem.

Q.25 Explain the following clustering methods in detail:

(i) BIRCH
(ii) CURE

UNIT-3
Q.1 Classification is supervised learning. Justify.

Q.2 Explain different classification techniques.

Q.3 Entropy is an important concept in information theory. Explain its significance in the mining
context.

Q.4 What are overfitted models? Explain their effects on performance.

Q.5 Explain Naive Bayes classification.

Q.6 Describe the essential features of decision trees in the context of classification.

Q.7 What are the advantages and disadvantages of decision trees over other classification
methods?

Q.8 Explain the ID3 algorithm.

Q.9 Explain the methods for computing the best split.

Q.10 What is Clustering? What are different types of clustering?

Q.11 Explain different data types used in clustering.

Q.12 Define association rule mining.

Q.13 When can we say that association rules are interesting?

Q.14 Explain association rules in mathematical notation.

Q.15 Define support and confidence in association rule mining.

Q.16 How are association rules mined from large databases?

Q.17 Describe the different classifications of association rule mining.

Q.18 What is the purpose of the Apriori algorithm?

Q.19 Define the anti-monotone property.

Q.20 How are association rules generated from frequent item sets?

Q.21 Give a few techniques to improve the efficiency of the Apriori algorithm.

Q.22 What factors degrade the performance of the Apriori candidate
generation technique?

Q.23 Describe the method of generating frequent item sets without candidate
generation.

Q.24 Mention a few approaches to mining multilevel association rules.

Q.25 What are multidimensional association rules?

UNIT-4

Q.1 Discuss the components of a data warehouse.

Q.2 List the differences between OLTP and OLAP.

Q.3 Discuss the various schematic representations in the multidimensional model.

Q.4 Explain the OLAP operations in the multidimensional model.

Q.5 Explain the design and construction of a data warehouse.

Q.6 Explain the three-tier data warehouse architecture.

Q.7 Explain indexing.

Q.8 Write notes on the metadata repository.

Q.9 Write short notes on VLDB.

Q.10 Explain the issues regarding classification and prediction.

Q.11 Explain classification by decision tree induction.

Q.12 Write short notes on patterns.

Q.13 Explain mining single-dimensional Boolean association rules from
transactional databases.

Q.14 Explain the Apriori algorithm.

Q.15 Explain how the efficiency of Apriori is improved.

Q.16 Explain frequent item set mining without candidate generation.

Q.17 Explain mining multi-dimensional Boolean association rules from transaction

Q.18 Explain constraint-based association mining.

Q.19 Specify the five criteria for the evaluation of classification and prediction.

Q.20 State two clustering methods that are used in grid-based and density-based clustering.

Q.21 Why does every data structure in the data warehouse contain the time element?

Q.22 How does a snowflake schema differ from a star schema? Name two advantages and
two disadvantages of the snowflake schema.

Q.23 What is meant by slice and dice? Give an example.

Q.24 What are the essential differences between the MOLAP and ROLAP models? Also list a
few similarities.

Q.25 Why is the entity-relationship modelling technique not suitable for the data warehouse?

UNIT-5
Q.1 How is data mining different from OLAP? Explain briefly.

Q.2 Is the data warehouse a prerequisite for data mining? Does the data warehouse help
data mining? If so, in what ways?

Q.3 List a few common provisions found in a good security policy.

Q.4 Give reasons why the data warehouse must be backed up. How is this different from an
OLTP system?

Q.5 How do statistics help in tuning the data warehouse?

Q.6 Describe the various phases of testing a data warehouse.

Q.7 List five reasons why you think data quality is critical in a data warehouse.

Q.8 Explain how data quality is much more than just data accuracy. Give an example.

Q.9 Briefly list three benefits of quality data in a data warehouse.

Q.10 Give examples of four types of data quality problems.

Q.11 How does the data warehouse differ from an operational system in uses and value?

Q.12 State Dr. Codd’s guidelines for an OLAP system, giving a brief description of each.

Q.13 Name any three advantages of the STAR schema. Can you think of any disadvantages
of the STAR schema?

Q.14 What are hierarchies and categories as applicable to a dimension table?

Q.15 Why is the dimension table wide and the fact table deep?

Q.16 Describe the composition of the primary keys for the dimension and fact tables.

Q.17 What is the STAR schema? What are fact tables?

Q.18 How is dimensional modelling different?

Q.19 Discuss the major design issues that need to be addressed before proceeding with the
data design.

Q.20 Name four distinguishing characteristics of data warehouse architecture.
Describe each briefly.

Q.21 Describe data warehouse architecture.

Q.22 What are the three major areas in the data warehouse? Is this a logical division? If so, why do
you think so? Relate the architectural components to the three major areas.

Q.23 What are the similarities and differences between a data warehouse and a database?

Q.24 What is a subject area in a data warehouse? What is the ETL process?

Q.25 Why is OLAP required in a data warehouse?
