Data Mining and Data Warehousing 2023
AY 21
b. Perform a chi-square correlation analysis for the given four entity instances. 8 CO1 K2
(OR)
c. What is normalization? Explain why normalization is performed. 7 CO1 K1
d. Suppose that a hospital tested the age and body fat data for 18 randomly selected
adults with the following results: 8 CO1 K2
(a) Calculate the mean, median, and standard deviation of age and %fat.
(b) Find the covariance and correlation between these two attributes.
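The quantities asked for in parts (a) and (b) can be sketched as below. This is a minimal illustration, assuming the sample (n - 1) forms of standard deviation and covariance and the Pearson definition of correlation. The `age` and `fat` lists are placeholder values chosen for illustration, not the hospital data referred to in the question, and the helper names `describe`, `covariance` and `correlation` are hypothetical.

```python
import statistics

def describe(values):
    """Return (mean, median, sample standard deviation) of a list of numbers."""
    return (
        statistics.mean(values),
        statistics.median(values),
        statistics.stdev(values),  # sample form, divides by n - 1
    )

def covariance(xs, ys):
    """Sample covariance: sum((x - mean_x) * (y - mean_y)) / (n - 1)."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    total = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return total / (len(xs) - 1)

def correlation(xs, ys):
    """Pearson correlation: covariance divided by the product of the two standard deviations."""
    return covariance(xs, ys) / (statistics.stdev(xs) * statistics.stdev(ys))

# Placeholder values only (assumption): NOT the 18 observations from the question.
age = [20, 30, 40, 50]
fat = [10.0, 20.0, 25.0, 35.0]

age_summary = describe(age)
fat_summary = describe(fat)
cov_age_fat = covariance(age, fat)
corr_age_fat = correlation(age, fat)
```

A correlation close to +1 would indicate that %fat tends to rise with age, while a value near 0 would indicate no linear relationship. If the population forms are wanted instead, divide by n rather than n - 1.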
3.a. Define data warehouse. Draw the architecture of a data warehouse and explain
the three tiers in detail with a case study. 8 CO2 K1
b. Differentiate between star schema, snowflake schema and fact constellation. 7 CO2 K4
(OR)
c. Distinguish between OLAP and OLTP. List the various OLAP operations carried
out in a data warehouse. 8 CO2 K4
d. What is the difference between a virtual data warehouse and an enterprise data
warehouse? 7 CO2 K1
4.a. There are five transactions (T1, T2, T3, T4, T5) with items (A, B, C, D) purchased as
T1(B, C), T2(A, C, D), T3(B, C), T4(A, B, C, D), T5(B, D). The minimum support count is min_sup = 2. Show
how the FP-growth approach can generate the association rules for the above dataset. 8 CO3 K2
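The preparatory steps of FP-growth for this dataset can be sketched as follows. The sketch only covers counting item supports and reordering each transaction by descending support, which is the order in which items would be inserted into the FP-tree. It does not build the tree or mine conditional pattern bases. The alphabetical tie-break between items of equal support is an assumption, and the final loop is a brute-force enumeration (not FP-growth) included only as a way to cross-check a hand-worked answer.

```python
from collections import Counter
from itertools import combinations

# Transactions exactly as given in the question.
transactions = {
    "T1": {"B", "C"},
    "T2": {"A", "C", "D"},
    "T3": {"B", "C"},
    "T4": {"A", "B", "C", "D"},
    "T5": {"B", "D"},
}
MIN_SUP = 2  # minimum support count from the question

# Step 1: count the support of every item and keep the frequent ones.
item_support = Counter(item for items in transactions.values() for item in items)
frequent_items = {item: n for item, n in item_support.items() if n >= MIN_SUP}

# Step 2: order each transaction by descending support.
# Ties are broken alphabetically (an assumption; any fixed order works).
def order_key(item):
    return (-frequent_items[item], item)

ordered = {
    tid: sorted((i for i in items if i in frequent_items), key=order_key)
    for tid, items in transactions.items()
}

# Cross-check by brute force (NOT FP-growth): count every candidate itemset.
def support(itemset):
    return sum(1 for items in transactions.values() if itemset <= items)

frequent_itemsets = {}
for size in range(1, len(frequent_items) + 1):
    for combo in combinations(sorted(frequent_items), size):
        count = support(set(combo))
        if count >= MIN_SUP:
            frequent_itemsets[combo] = count
```

The `ordered` lists are what would be inserted, one path per transaction, into the FP-tree. Association rules are then derived from each frequent itemset by comparing its support with the support of each candidate antecedent.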
b. Explain the decision tree induction algorithm for classification. Discuss the 7 CO3 K2
with an example.
d. Describe the KNN algorithm for data classification with an appropriate example. 7 CO3 K2
5.a. Cluster the following eight points (with (x, y) representing locations) into
three clusters: 8 CO4 K2
A1(2, 10), A2(2, 5), A3(8, 4), A4(5, 8), A5(7, 5), A6(6, 4), A7(1, 2), A8(4, 9)
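One way to approach this clustering is k-means, sketched below. The question does not name an algorithm, a distance function or starting centers, so the use of k-means, Euclidean distance, and A1, A4 and A7 as initial centers are all assumptions made for illustration. The function name `kmeans` is hypothetical. A different choice of initial centers may lead to a different result.

```python
import math

# The eight points exactly as given in the question.
points = {
    "A1": (2, 10), "A2": (2, 5), "A3": (8, 4), "A4": (5, 8),
    "A5": (7, 5), "A6": (6, 4), "A7": (1, 2), "A8": (4, 9),
}

def kmeans(points, centers, max_iter=100):
    """Plain k-means: assign each point to its nearest center, then recompute centers."""
    clusters = []
    for _ in range(max_iter):
        # Assignment step: nearest center by Euclidean distance.
        clusters = [[] for _ in centers]
        for name, p in points.items():
            nearest = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(name)
        # Update step: each center becomes the mean of its members.
        # An empty cluster keeps its previous center.
        new_centers = [
            tuple(sum(points[n][d] for n in members) / len(members) for d in range(2))
            if members else centers[i]
            for i, members in enumerate(clusters)
        ]
        if new_centers == centers:  # converged: the centers did not move
            break
        centers = new_centers
    return clusters, centers

# Assumed initial centers: A1, A4 and A7 (not specified in the question).
initial_centers = [points["A1"], points["A4"], points["A7"]]
clusters, centers = kmeans(points, initial_centers)
```

Working the same assignment and update steps by hand, one iteration at a time, gives the tabular solution usually expected in a written answer.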
b. Elaborate the various partitioning methods in detail. 7 CO4 K2
(OR)
c. Differentiate between agglomerative and divisive hierarchical clustering. 8 CO4 K2
d. How can data mining be used in the retail and telecommunication
industries? 7 CO4 K2