Data Mining
1. ___________ contains a subset of corporate-wide data that is of value to a specific group of users.
3. ________ is the estimate of the strength of the implication of the rule.
6. DM systems typically have a default generalization relation threshold value ranging from
__________________ to __________________.
7. Multidimensional association rules with no repeated predicates are called ______ association rules.
8. The Apriori algorithm employs a level-wise search, where k-itemsets use ________ itemsets.
a) k b) (k-1)
c) (k+1) d) (k+2)
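For reference, the level-wise search mentioned in question 8 can be sketched as follows. This is a minimal illustration under stated assumptions, not a tuned implementation: the function names `apriori_gen` and `frequent_itemsets`, and the use of an absolute support count for `min_support`, are choices made for this example only.

```python
from itertools import combinations

def apriori_gen(prev_frequent, k):
    """Build candidate k-itemsets from frequent (k-1)-itemsets (join, then prune)."""
    prev = set(prev_frequent)
    candidates = set()
    for a in prev:
        for b in prev:
            union = a | b
            if len(union) == k:
                # Prune step: every (k-1)-subset of a candidate must be frequent.
                if all(frozenset(s) in prev for s in combinations(union, k - 1)):
                    candidates.add(union)
    return candidates

def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}  # candidate 1-itemsets
    result = {}
    k = 1
    while current:
        # One pass over the data per level to count candidate support.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        frequent = {c: n for c, n in counts.items() if n >= min_support}
        result.update(frequent)
        k += 1
        current = apriori_gen(frequent.keys(), k)
    return result
```

Each level reuses only the frequent itemsets of the previous level, which is what keeps the candidate set small.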
10. "All nonempty subsets of a frequent itemset must also be frequent." This is the _____ property. [ b ]
11. A _________ contains a subset of corporate-wide data that is of value to a specific group of
users. [ a ]
14. The Apriori algorithm employs a level-wise search, where k-itemsets use ________ itemsets.
a) k b) (k-1)
c) (k+1) d) (k+2)
16. Anti-monotone, monotone, succinct, convertible, and inconvertible are five different
categories of ________.
a) Knowledge type constraints b) Data constraints
c) Interestingness constraints d) Rule constraints
17. The rule "IF NOT A1 AND A2 THEN NOT C1" is encoded as ________.
a) 101 b) 010
c) 001 d) 110
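For reference, one common way to encode such rules as bit strings (as used when applying genetic algorithms to rule learning) can be sketched as below. The encoding is an assumption of this example: the two leftmost bits stand for attributes A1 and A2 (1 = present, 0 = negated) and the rightmost bit stands for the class (1 = C1, 0 = NOT C1). The function name `encode_rule` is hypothetical.

```python
def encode_rule(a1, a2, c1):
    """Encode the truth values of A1, A2 and class C1 as a 3-bit string.

    Assumed layout: bit 1 = A1, bit 2 = A2, bit 3 = class C1.
    """
    return "".join("1" if flag else "0" for flag in (a1, a2, c1))

# Example: "IF A1 AND NOT A2 THEN NOT C1"
code = encode_rule(True, False, False)
```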
18. If a rule concerns associations between the presence or absence of items, it is a ________
rule.
a) Boolean association b) Quantitative association
c) Frequent association d) Transaction association
19. A set of items is referred to as a ________.
20. Multidimensional association rules with repeated predicates, which contain multiple
occurrences of some predicates, are called ________.
21. The rule "IF NOT A1 AND A2 THEN NOT C1" is encoded as ________.
a) 101 b) 010
c) 001 d) 110
22. If a rule concerns associations between the presence or absence of items, it is a ________ rule.
23. If a rule concerns associations between the presence or absence of items, it is a ____ rule. [ ]
a) Boolean association b) Quantitative association
c) Frequent association d) Transaction association
24. __________ uses concept hierarchies to generalize the data by replacing lower-level data with
higher-level concepts. [ ]
a) Analysis oriented induction b) Algorithm oriented induction
c) Attribute oriented induction d) Approach oriented induction
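For reference, the generalization step described in question 24 can be sketched as follows. This is only an illustration under stated assumptions: the concept hierarchy `CITY_TO_COUNTRY`, the function name `generalize`, and the tuple layout are hypothetical.

```python
from collections import Counter

# Hypothetical concept hierarchy: city (low level) -> country (high level).
CITY_TO_COUNTRY = {"Vancouver": "Canada", "Toronto": "Canada", "Chicago": "USA"}

def generalize(rows, attribute_index, hierarchy):
    """Replace one attribute's values with their parent concept and merge duplicates."""
    generalized = []
    for row in rows:
        row = list(row)
        # Values missing from the hierarchy are left unchanged.
        row[attribute_index] = hierarchy.get(row[attribute_index], row[attribute_index])
        generalized.append(tuple(row))
    # Identical generalized tuples are merged, keeping a count for each.
    return Counter(generalized)
```

The counts attached to the merged tuples are what later steps would use to describe the generalized relation.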
27. Percent(A, "70, 71, ..., 80") => placement(A, "Infosys"). The above rule clearly refers to a
______________ rule. [ ]
a) Boolean association b) Quantitative association
c) Single dimensional association d) Multi dimensional association
28. Multidimensional association rules with repeated predicates, which contain multiple
occurrences of some predicates, are called _________.
UNIT-4
2. In the ________ algorithm, each cluster is represented by the mean value of the objects in the
cluster. [ ]
a) k-medoids b) k-means c) CURE d) BIRCH
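For reference, a partitioning method in which each cluster is represented by the mean of its objects can be sketched as follows. This is a minimal sketch, assuming points are tuples of floats and squared Euclidean distance; the function name `kmeans` and the parameters `iterations` and `seed` are choices made for this example.

```python
import random

def kmeans(points, k, iterations=100, seed=0):
    """Assign points to the nearest center, then move each center to its cluster mean."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers are k distinct input points
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            clusters[distances.index(min(distances))].append(p)
        new_centers = []
        for cluster, old in zip(clusters, centers):
            if cluster:
                # The representative is the per-dimension mean of the members.
                new_centers.append(tuple(sum(dim) / len(cluster) for dim in zip(*cluster)))
            else:
                new_centers.append(old)  # keep the old center for an empty cluster
        if new_centers == centers:
            break  # assignments can no longer change
        centers = new_centers
    return centers, clusters
```

Note that the resulting centers need not coincide with any actual object in the data set.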
4. ________ analysis can be used to model the relationship between one or more independent or
predictor variables and a dependent or response variable.
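For reference, the simplest case of such a model, a straight line fitted by least squares to one predictor variable, can be sketched as follows. The function name `fit_line` is hypothetical, and the sketch assumes the predictor values are not all identical.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for a single predictor variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```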
5. The _________ of a classifier on a given test set is the percentage of test set tuples that
are correctly classified by the classifier.
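For reference, the measure defined in question 5 can be computed as in the short sketch below. The function name `percent_correct` and the list-based inputs are assumptions of this example.

```python
def percent_correct(true_labels, predicted_labels):
    """Percentage of test tuples whose predicted class matches the true class."""
    if len(true_labels) != len(predicted_labels) or not true_labels:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return 100.0 * correct / len(true_labels)
```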
6. In backpropagation, the weights and biases are updated after all of the tuples in the training set have
been presented. This strategy is called [ b ]
a) Terminating updating b) Epoch updating c) Case updating d) Sample updating
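For reference, the strategy described in question 6, usually called epoch updating in textbook treatments, can be sketched as follows. To stay short, the sketch trains a single linear unit with the delta rule rather than a full multilayer network; the function name `train_epoch_updating` and its parameters are assumptions of this example.

```python
def train_epoch_updating(samples, weights, bias, learning_rate=0.1, epochs=50):
    """Accumulate weight and bias changes over all tuples, then apply them once per epoch."""
    for _ in range(epochs):
        weight_deltas = [0.0] * len(weights)
        bias_delta = 0.0
        for inputs, target in samples:
            output = sum(w * x for w, x in zip(weights, inputs)) + bias
            error = target - output
            # Increments are only accumulated here, not applied yet.
            for i, x in enumerate(inputs):
                weight_deltas[i] += learning_rate * error * x
            bias_delta += learning_rate * error
        # Single update after the whole training set has been presented.
        weights = [w + d for w, d in zip(weights, weight_deltas)]
        bias += bias_delta
    return weights, bias
```

Applying the increments inside the inner loop instead, after each tuple, would give the alternative per-tuple strategy.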
13. _________ occurs when an attribute is repeatedly tested along a given branch of the tree. [ ]
a) repetition b) replication c) fragmentation d) none
UNIT-5
2. ________ hierarchy may formally express existing semantic relationships between attributes.
a) schema b) set-grouping
4. ________ hierarchy may formally express existing semantic relationships between attributes.
a) schema b) set-grouping
6. A ________________ model consists of radial lines emanating from a central point, where each line
represents a concept hierarchy for a dimension.
c) both d) none
9. _____ hierarchy may formally express existing semantic relationships between attributes. [ ]
12. In the ______ algorithm, each cluster is represented by one of the objects located
near the center of the cluster.
a) CLA b) CLAPP
c) CLARA d) CLULA
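For reference, the idea of representing a cluster by one of its own centrally located objects can be sketched as follows. This is a minimal illustration that only selects the representative object of one cluster, assuming Euclidean distance; it is not a complete clustering algorithm, and the function name `medoid` is chosen for this example.

```python
def medoid(cluster):
    """Return the member of `cluster` with the smallest total distance to all members."""
    def distance(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5
    return min(cluster, key=lambda c: sum(distance(c, other) for other in cluster))
```

Unlike a mean, the returned representative is always an actual object from the data, which makes it less sensitive to outliers.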
15. The absolute closeness between two clusters, normalized with respect to the internal closeness of the two
clusters, is [ ]
a) Relative distance b) Relative interconnectivity
c) Relative density d) Relative closeness
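For reference, the measure described in question 15 is usually written as below in textbook presentations of the Chameleon algorithm. The notation is an assumption of this note and not part of the question: the numerator is the average weight of the edges connecting clusters C_i and C_j, and each term in the denominator uses the average weight of the edges in the min-cut bisector of one cluster.

```latex
RC(C_i, C_j) =
  \frac{\bar{S}_{EC_{\{C_i, C_j\}}}}
       {\frac{|C_i|}{|C_i| + |C_j|}\,\bar{S}_{EC_{C_i}}
        + \frac{|C_j|}{|C_i| + |C_j|}\,\bar{S}_{EC_{C_j}}}
```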
16. Which method overcomes the problem of favoring clusters with spherical shapes and
similar sizes? ______________
1. ______________ describes the discovery of useful information from web contents.
2. _______________ is concerned with discovering the model underlying the link structure of the
web.
a. ranking hypertext
a. web pages
a. clustering
8. Web usage mining is the process of identifying or discovering interesting usage patterns from large
data sets.
10. Web mining is the use of data mining techniques to automatically discover and extract
information from Web documents and services.
Set no 1
Set no 2
1. Discuss the PAM algorithm and the issues in k-means.
2. Discuss web mining.
3. Explain the different data types used in cluster analysis.
4. Discuss k-nearest-neighbor classifiers and case-based reasoning.
Set no 3
1. What is meant by outlier analysis? Differentiate between agglomerative and divisive hierarchical
clustering.
2. Discuss web content mining, web structure mining and web usage mining.
3. Discuss Bayesian classification.
4. Discuss text mining.