
DMBI Sample Questions

The document contains sample questions from 3 modules related to data warehousing, data mining, and business intelligence. Module 1 covers topics like data warehousing concepts and architectures, OLAP, and data mining techniques. Module 2 covers data preprocessing, transformation, and reduction techniques. Module 3 covers classification, clustering, and association rule mining algorithms.

Uploaded by

Shubham Jha

Sample Questions:

Module 1:
1. What is DWH? Explain DWH characteristics.
2. What are the advantages and applications of DWH?
3. Why is the ER model not suitable for DWH? What are the steps in dimensional
modeling?
4. Define dimension, fact, fact table, and dimension table with an example.
5. Difference between star and snowflake schema.
6. Design star and snowflake schema for given system.
7. Difference between OLTP and OLAP.
8. What are the different OLAP operations? Explain with an example.
9. Problems on writing a sequence of OLAP operations for the given query.
10. Explain the steps of KDD.
11. State any 2 decision-making activities for which organizations use the data in a DWH.
12. What is a concept hierarchy? What are partial and total order concept hierarchies? Explain with an
example.
13. What is data mining? State applications of data mining.
14. What are the different types of patterns that can be mined?

Module 2:
1. What are the different types of attributes? Explain with examples.
2. Problems on basic statistical descriptions of data, such as finding the mean, median, midrange,
standard deviation, variance, and modes for given data. Drawing a q-q plot and boxplot for given
data.
3. What is a five number summary of data?
4. How can we compute dissimilarity between two binary attributes?
5. What is Euclidean distance, Manhattan distance, Minkowski distance? Problems on
computing these distances between given objects.
6. What is cosine similarity? Problems on finding similarity between given documents.
7. Problems based on finding dissimilarity matrices between nominal, binary, and ordinal
attributes.
8. Explain in brief the major tasks in data preprocessing.
9. What are the different ways to handle missing data?
10. What are the different ways to handle noisy data?
11. Problems on correlation analysis for categorical (Chi-square test) and numerical data.
12. What are the different data transformation strategies?
13. Problems on min-max, z-score, and decimal scaling normalization.
14. State different data reduction strategies.
15. Data transformation techniques
16. Binning: different types and problems based on binning.
17. What is noise? Explain data smoothing methods as a noise removal technique, dividing the
given data into bins of size 3.
18. Noise removal techniques
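
For the normalization problems in question 13, the three methods can be sketched as below. This is only an illustrative sketch: the function names and the sample values are assumptions made for the example and do not come from the question paper. The z-score version assumes the population standard deviation; some courses use the sample standard deviation instead.

```python
from statistics import mean, pstdev


def min_max(values, new_min=0.0, new_max=1.0):
    """Min-max normalization: map [min, max] linearly onto [new_min, new_max].

    Assumes the values are not all equal (otherwise max - min is zero).
    """
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in values]


def z_score(values):
    """Z-score normalization: (v - mean) / standard deviation.

    Assumption: population standard deviation (divide by n).
    """
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


def decimal_scaling(values):
    """Decimal scaling: divide by 10**j, where j is the smallest integer
    such that the largest absolute normalized value is below 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / 10 ** j >= 1:
        j += 1
    return [v / 10 ** j for v in values]


# Hypothetical sample data, chosen only for illustration.
data = [200, 300, 400, 600, 1000]

print(min_max(data))
print(z_score(data))
print(decimal_scaling(data))
```

With the sample data above, min-max maps 200 to 0 and 1000 to 1, and decimal scaling divides by 10^4 because 1000 / 10^3 is not strictly below 1.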
Module 3:
Classification:
Supervised and unsupervised learning
What is classification? Classification applications.
Classification model building phases
Classification algorithms:
Explain the Decision tree-building process with an example.
Decision Tree algorithm
Entropy, Information Gain, Gain Ratio, and Gini Index
Feature selection measures in building Decision Tree/splitting attribute selection measure.
Different Metrics used for Evaluating Classifier Performance

Confusion matrix:

Different Metrics used for Evaluating imbalanced Classifier.

Decision Tree Pruning


Problems on Decision Tree:
Steps involved in DT and Rules generation from Decision tree

Naive Bayes Algorithm


Problems on Naive Bayes algorithm
Where do we use linear regression? Explain linear regression.
Differentiate classification and Regression
a) State Bayes theorem. How can it be applied to data classification? b) Explain a Bayesian
belief network with an example.
Based on the following data determine the gender of a person having height 6 ft., weight 130 lbs. and
foot size 8 in. (use Naive Bayes algorithm).
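
The splitting-attribute measures listed above (entropy, information gain, gain ratio, Gini index) can be sketched as follows. The function names and the four-row dataset are hypothetical, made up only to show the arithmetic; they are not the data from any question in this paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini index of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def information_gain(rows, attr, target):
    """Entropy of the target minus the weighted entropy after splitting on attr."""
    base = entropy([r[target] for r in rows])
    partitions = {}
    for r in rows:
        partitions.setdefault(r[attr], []).append(r[target])
    weighted = sum(len(p) / len(rows) * entropy(p) for p in partitions.values())
    return base - weighted


def gain_ratio(rows, attr, target):
    """Information gain divided by split information
    (the entropy of the attribute's own value distribution)."""
    split_info = entropy([r[attr] for r in rows])
    gain = information_gain(rows, attr, target)
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical toy dataset, for illustration only.
rows = [
    {"outlook": "sunny", "play": "no"},
    {"outlook": "sunny", "play": "no"},
    {"outlook": "overcast", "play": "yes"},
    {"outlook": "rain", "play": "yes"},
]

print(entropy([r["play"] for r in rows]))
print(gini([r["play"] for r in rows]))
print(information_gain(rows, "outlook", "play"))
print(gain_ratio(rows, "outlook", "play"))
```

In a decision tree problem, the same calculation is repeated for every candidate attribute, and the attribute with the highest gain (or gain ratio, or the lowest weighted Gini index) is chosen as the split.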

Clustering:
Clustering process
Explain different types of clustering techniques
K-means algorithm and problems based on K-means.
What are the weaknesses of hierarchical clustering?
Compare k-means with k-medoids algorithms for clustering.
What is the main objective of clustering? Give the categorization of clustering approaches. Briefly discuss
them.
a) Differentiate between AGNES and DIANA algorithms. b) How to assess cluster quality?
Inter-cluster distance using single linkage, complete linkage, and average linkage measures
Hierarchical clustering:
Explain Agglomerative (AGNES) and Divisive (DIANA) algorithms
Compare Agglomerative (AGNES) and Divisive (DIANA) algorithms
Dendrogram and cluster formation from dendrogram
What is the goal of clustering? How does partitioning around medoids algorithm achieve this goal?
DBSCAN clustering, BIRCH
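
The single, complete, and average linkage measures mentioned above differ only in how the pairwise distances between two clusters are combined. A minimal sketch, assuming Euclidean distance and two hypothetical clusters of 2-D points (names and data are illustrative only):

```python
import math


def euclidean(p, q):
    """Euclidean distance between two points (math.dist needs Python 3.8+)."""
    return math.dist(p, q)


def single_linkage(a, b):
    """Smallest distance between any point in a and any point in b."""
    return min(euclidean(p, q) for p in a for q in b)


def complete_linkage(a, b):
    """Largest distance between any point in a and any point in b."""
    return max(euclidean(p, q) for p in a for q in b)


def average_linkage(a, b):
    """Mean of all pairwise distances between points in a and points in b."""
    return sum(euclidean(p, q) for p in a for q in b) / (len(a) * len(b))


# Hypothetical clusters, for illustration only.
cluster_a = [(0.0, 0.0), (1.0, 0.0)]
cluster_b = [(4.0, 0.0), (6.0, 0.0)]

print(single_linkage(cluster_a, cluster_b))    # 3.0
print(complete_linkage(cluster_a, cluster_b))  # 6.0
print(average_linkage(cluster_a, cluster_b))   # 4.5
```

In agglomerative clustering, the pair of clusters with the smallest such distance is merged at each step, and the sequence of merges forms the dendrogram.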
Association Mining:
Find all frequent item sets using Apriori algorithm. List all the strong association rules.
How is the confidence measure computed for an association rule?
Consider the transaction database given below. Set the minimum support count as 2 and the minimum
confidence threshold as 70%. Generate the strong association rules.

How to compute confidence for an association rule X ⇒ Y?


The transaction details are given in the following table. What are the confidence and support of the
association rule {Diapers} ⇒ {Coffee, Nuts}? Find all frequent itemsets using the Apriori algorithm. List all the
strong association rules.
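
Support and confidence for a rule X ⇒ Y can be computed directly from transaction counts: support is the fraction of transactions containing X ∪ Y, and confidence is count(X ∪ Y) / count(X). The sketch below uses hypothetical function names and a made-up transaction list, since the tables referred to in the questions are not reproduced here.

```python
def support_count(transactions, itemset):
    """Number of transactions that contain every item in itemset."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= set(t))


def support(transactions, itemset):
    """Fraction of transactions that contain itemset."""
    return support_count(transactions, itemset) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Confidence of antecedent => consequent:
    count(antecedent and consequent together) / count(antecedent)."""
    both = support_count(transactions, set(antecedent) | set(consequent))
    ante = support_count(transactions, antecedent)
    return both / ante if ante else 0.0


# Hypothetical transactions, for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

print(support(transactions, {"bread", "milk"}))        # 0.5
print(confidence(transactions, {"butter"}, {"bread"})) # 1.0
print(confidence(transactions, {"bread"}, {"milk"}))
```

A rule is reported as strong when its itemset meets the minimum support and its confidence meets the minimum confidence threshold.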
1) Suppose we have data on a few individuals randomly surveyed. The data gives their responses regarding
interest in promotional offers made in the areas of Finance, Travel, Reading, and Health. Sex is the
output attribute to be predicted. Apply the Naïve Bayesian classification algorithm to classify the new
instance (Finance = No, Travel = Yes, Reading = Yes, Health = No).
2) Build a Decision Tree from the following dataset, where Sex is the target/output attribute.

The following table shows the midterm and final exam grades obtained for students in a database
course.

Use the method of least squares to find an equation for the prediction of a student’s final exam grade
based on the student’s midterm grade in the course.
Predict the final exam grade of a student who received 86 marks on the midterm exam with the abo
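
The method of least squares for a single predictor fits y = w0 + w1·x with w1 = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and w0 = ȳ − w1·x̄. A minimal sketch follows; the function names and the sample points are hypothetical and are not the midterm/final grade table from the question.

```python
def least_squares_fit(xs, ys):
    """Return (slope, intercept) of the least-squares line through (xs, ys).

    Assumes at least two distinct x values.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(x, slope, intercept):
    """Predicted y for a new x using the fitted line."""
    return intercept + slope * x


# Hypothetical points lying exactly on y = 1 + 2x, for illustration only.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]

slope, intercept = least_squares_fit(xs, ys)
print(slope, intercept)               # 2.0 1.0
print(predict(5, slope, intercept))   # 11.0
```

For the exam problem, xs would hold the midterm grades and ys the final grades, and the prediction would be made for x = 86.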
Module BI
What is BI? BI applications.
Business intelligence architectures.
Development of a business intelligence system using data mining for business
applications like fraud detection, recommendation, retail, etc.
