Machine Learning
Learning
Supervised Learning:
A machine learning technique whereby a system uses a set of labeled
training examples to learn how to correctly perform a task.
Clustering in Machine Learning
Clustering is the assignment of a set of observations into subsets
(called clusters) so that observations in the same cluster are similar
in some sense. Clustering is a method of unsupervised learning and a
common technique for statistical data analysis used in many fields.
K-means Clustering
We also know beforehand that these objects belong to two groups of medicine
(cluster 1 and cluster 2). The problem now is to determine which medicines
belong to cluster 1 and which belong to cluster 2.
K-means Clustering - Example
Group 1 and group 2 each have two members, so the new centroids are
c1 = ((1 + 2)/2, (1 + 1)/2) = (1.5, 1) and
c2 = ((4 + 5)/2, (3 + 4)/2) = (4.5, 3.5).
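The centroid update can be sketched in Python as follows. This is a minimal illustration, assuming the coordinates (weight index, pH) from the final grouping table and assuming that group 1 holds Medicines A and B while group 2 holds Medicines C and D. The function name `centroid` is hypothetical.

```python
def centroid(points):
    """Return the coordinate-wise mean of a list of (x, y) points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

# Coordinates (weight index, pH) taken from the result table.
group1 = [(1, 1), (2, 1)]  # Medicine A, Medicine B (assumed membership)
group2 = [(4, 3), (5, 4)]  # Medicine C, Medicine D (assumed membership)

c1 = centroid(group1)
c2 = centroid(group2)
print(c1, c2)  # (1.5, 1.0) (4.5, 3.5)
```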
8. Iteration-2, objects-centroids distances: Repeating step 2 with the
new centroids gives the distance matrix for iteration 2.
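A distance matrix of this kind might be computed as in the sketch below. It assumes Euclidean distance (the metric is not restated on this slide), the coordinates from the final grouping table, and centroids equal to the means of {A, B} and {C, D}. The variable names are illustrative only.

```python
import math

# Coordinates (weight index, pH) from the result table.
objects = {"A": (1, 1), "B": (2, 1), "C": (4, 3), "D": (5, 4)}
# Assumed centroids: the means of {A, B} and {C, D}.
centroids = [(1.5, 1.0), (4.5, 3.5)]

# One row per centroid, one column per object (A, B, C, D).
distance_matrix = [
    [math.dist(c, p) for p in objects.values()]
    for c in centroids
]

for row in distance_matrix:
    print([round(d, 2) for d in row])
# [0.5, 0.5, 3.2, 4.61]
# [4.3, 3.54, 0.71, 0.71]
```

Each column shows how far one medicine lies from each centroid; the smaller entry in a column decides the group in the next step.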
9. Iteration-2, objects clustering: Again, we assign each object to the
group with the minimum distance. We obtain the same grouping as before:
comparing the grouping of the last iteration with this iteration reveals
that no object changes group anymore. Thus, the k-means computation has
reached stability and no further iteration is needed. We get the final
grouping as the result.
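The assignment step and the stopping test can be sketched as below. This is an illustration under stated assumptions: Euclidean distance, centroids (1.5, 1) and (4.5, 3.5), and a previous grouping of {A, B} and {C, D}. The helper `assign` is a hypothetical name.

```python
import math

# Coordinates (weight index, pH) from the result table.
objects = {"A": (1, 1), "B": (2, 1), "C": (4, 3), "D": (5, 4)}
centroids = [(1.5, 1.0), (4.5, 3.5)]          # assumed current centroids
previous = {"A": 1, "B": 1, "C": 2, "D": 2}   # assumed grouping from the last iteration

def assign(objects, centroids):
    """Map each object to the 1-based index of its nearest centroid."""
    return {
        name: min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k])) + 1
        for name, p in objects.items()
    }

current = assign(objects, centroids)
converged = current == previous  # stable when no object changed group
print(current, converged)  # {'A': 1, 'B': 1, 'C': 2, 'D': 2} True
```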
Final Grouping As a Result
Object       Attribute 1 (X): weight index   Attribute 2 (Y): pH   Group (result)
Medicine A   1                               1                     1
Medicine B   2                               1                     1
Medicine C   4                               3                     2
Medicine D   5                               4                     2
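For readers who want to reproduce the table, the whole procedure can be sketched as one short loop. This is not necessarily the exact procedure used on the slides: it assumes Euclidean distance and, as a hypothetical starting point, uses the coordinates of Medicines A and B as the initial centroids. The function name `k_means` and its parameters are illustrative.

```python
import math

def k_means(points, initial_centroids, max_iter=100):
    """Plain k-means: alternate assignment and centroid update until stable."""
    centroids = list(initial_centroids)  # copy so the caller's list is untouched
    groups = None
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for every point.
        new_groups = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        if new_groups == groups:  # no object changed group: stable
            break
        groups = new_groups
        # Update step: each centroid becomes the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, g in zip(points, groups) if g == k]
            if members:  # keep the old centroid if a group is empty
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return groups, centroids

points = [(1, 1), (2, 1), (4, 3), (5, 4)]  # Medicines A, B, C, D
groups, centroids = k_means(points, [(1, 1), (2, 1)])  # assumed starting centroids
print([g + 1 for g in groups], centroids)
# [1, 1, 2, 2] [(1.5, 1.0), (4.5, 3.5)]
```

With these assumptions the loop stops once the assignment no longer changes, and the printed groups match the Group (result) column above.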