QMM: Exercise Sheet 5 - Clustering: Fabien Baeriswyl, Jérôme Reboulleau, Tom Ruszkiewicz

This document contains instructions for three exercises using clustering methods on different datasets. Exercise 1 involves hierarchical clustering of car data, comparing single, complete, and average linkage approaches. Exercise 2 uses k-means clustering on country data, exploring the number of clusters and effects of rescaling and multiple runs. Exercise 3 has the student cluster US state data and discuss any patterns in the resulting clusters.

Uploaded by

laurine.hoyo

QMM: Exercise Sheet 5 - Clustering

Fabien Baeriswyl, Jérôme Reboulleau, Tom Ruszkiewicz

Exercise 1. For this exercise, use the Cars.csv datafile. This dataset consists of 32 cars from different
manufacturers, for which we record, among other things, their price, fuel consumption, and power. We want
to cluster the data. In this exercise, do not rescale the dataset unless explicitly asked to do so.
a) First, provide a quick exploratory analysis of the data. Use visuals.
b) Create a function that computes the Euclidean distance between two vectors of the same length (the length
itself can be arbitrary). Use it to find the distance between the Ford Mondeo LX and the Ford Galaxy LX (the
first two observations of the dataset).
c) Compute the distance matrix between all 32 observations.
d) Run a single-linkage hierarchical clustering on the data. Plot the result. Discuss any specific feature.
e) Run a complete-linkage hierarchical clustering on the data. Plot the result. Discuss any specific feature.
f) Run an average-linkage hierarchical clustering on the data. Plot the result. Discuss any specific feature.
Comparing it to the two previous models, which of the three is best? Briefly explain.
g) Under the complete-linkage and the average-linkage cluster models, using 4 clusters, report how the
observations are distributed across these 4 clusters.
h) Rescale your dataset. Then, run the average-linkage model again and discuss any change.
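
As an illustration of parts b) to g), here is a minimal sketch in plain Python. The sheet does not prescribe a language (the nstart argument in Exercise 2 suggests R), so this is an illustration of the ideas, not the expected solution. The function names euclidean, distance_matrix, linkage_distance and agglomerate are hypothetical, the toy rows are made up and do not come from Cars.csv, and the naive merge loop favours readability over speed.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def distance_matrix(rows):
    """Symmetric matrix of pairwise Euclidean distances between rows."""
    n = len(rows)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = euclidean(rows[i], rows[j])
    return d

def linkage_distance(a, b, d, method):
    """Distance between clusters a and b (lists of row indices)."""
    pairs = [d[i][j] for i in a for j in b]
    if method == "single":      # closest pair of members
        return min(pairs)
    if method == "complete":    # farthest pair of members
        return max(pairs)
    if method == "average":     # mean over all pairs of members
        return sum(pairs) / len(pairs)
    raise ValueError("unknown linkage method")

def agglomerate(rows, k, method="average"):
    """Repeatedly merge the two closest clusters until only k remain."""
    d = distance_matrix(rows)
    clusters = [[i] for i in range(len(rows))]
    while len(clusters) > k:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                dist = linkage_distance(clusters[x], clusters[y], d, method)
                if best is None or dist < best[0]:
                    best = (dist, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]

# Made-up toy rows for illustration only (NOT values from Cars.csv)
toy = [[0.0, 0.0], [3.0, 4.0], [10.0, 0.0], [10.0, 1.0]]
print(euclidean(toy[0], toy[1]))      # distance between the first two rows
print(agglomerate(toy, 2, "single"))  # row indices grouped into 2 clusters
```

The three linkage rules differ only in how the distance between two clusters is derived from the pairwise distances of their members, which is why one function with a method switch is enough here. Because the distances are computed on the raw values, a variable measured on a large scale can dominate the result, which is what part h) asks you to examine.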

Exercise 2. For this exercise, use the Country.csv datafile. This dataset consists of 109 countries for which we
have indicators such as GDP, literacy, and urban population, among others.
a) Rescale the data.
b) Use a visual approach to determine how many clusters you should use in a k-means partitioning method.
c) Run k-means clustering without setting the nstart argument. Run the function several times and observe how
the results change from one run to the next. Comment briefly.
d) Run k-means clustering setting nstart to 25. Plot the result and discuss briefly.
e) Run k-means clustering using 3 clusters. Discuss the changes.
f) Plot GDP against literacy and highlight the clusters from your 3-cluster model from point e).
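
To make parts a) to d) concrete, here is a hedged sketch in plain Python. It standardizes each column, runs a simple Lloyd-style k-means from random starting rows, and keeps the best of several starts. The parameter name nstart mirrors the argument mentioned in c) and d); everything else (rescale, kmeans_once, kmeans, the seed argument, the toy rows) is hypothetical and made up for illustration, and the iteration shown is not necessarily the algorithm your software uses by default. The sketch assumes no column is constant, since a zero standard deviation would cause a division by zero.

```python
import math
import random

def rescale(rows):
    """Standardize each column to mean 0 and sample standard deviation 1."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / (len(c) - 1))
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(r, means, sds)] for r in rows]

def sqdist(u, v):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def kmeans_once(rows, k, rng, iters=100):
    """One run: start from k random rows, alternate assignment and update."""
    centers = [list(r) for r in rng.sample(rows, k)]
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: sqdist(r, centers[c]))
                      for r in rows]
        if new_labels == labels:   # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:            # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    wss = sum(sqdist(r, centers[lab]) for r, lab in zip(rows, labels))
    return wss, labels

def kmeans(rows, k, nstart=1, seed=0):
    """Best of nstart random starts, judged by total within-cluster SS."""
    rng = random.Random(seed)
    runs = (kmeans_once(rows, k, rng) for _ in range(nstart))
    return min(runs, key=lambda run: run[0])

# Made-up toy rows for illustration only (NOT values from Country.csv)
toy = [[1.0, 10.0], [2.0, 12.0], [1.5, 11.0],
       [8.0, 50.0], [9.0, 52.0], [8.5, 51.0]]
scaled = rescale(toy)
for k in range(1, 5):
    wss, _ = kmeans(scaled, k, nstart=25)
    print(k, round(wss, 3))  # plot these against k and look for an "elbow"
```

A single run depends on its random starting centers, so repeated runs can return different partitions; keeping the run with the lowest total within-cluster sum of squares over many starts makes the result more stable. Plotting that sum against k is one common visual approach for b), though not the only one.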

Exercise 3. For this exercise, use the South.csv datafile. This dataset consists of 16 US states and their
characteristics in 1965, among them mean temperature and precipitation, income, the share of African Americans,
and the GOP vote share (GOP stands for Grand Old Party, i.e. the Republican Party) in various elections. Cluster
the states with methods of your choice, using the tools from the previous exercises. Use the resulting clusters
to discuss the particularities of each cluster.
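
One simple way to start discussing cluster particularities is to compare the mean of each variable across clusters. The sketch below, in plain Python, assumes you already have one cluster label per observation from any method; the function name cluster_profiles, the column names, and the values are all hypothetical and are not taken from South.csv.

```python
from collections import defaultdict

def cluster_profiles(rows, labels, names):
    """Mean of every variable within each cluster, keyed by cluster label."""
    groups = defaultdict(list)
    for row, lab in zip(rows, labels):
        groups[lab].append(row)
    return {
        lab: {name: sum(col) / len(col)
              for name, col in zip(names, zip(*members))}
        for lab, members in sorted(groups.items())
    }

# Made-up values and hypothetical column names (NOT from South.csv)
names = ["temperature", "income"]
rows = [[60.0, 2000.0], [62.0, 2200.0], [70.0, 1500.0]]
labels = [0, 0, 1]  # one cluster label per row, from any clustering method

profiles = cluster_profiles(rows, labels, names)
for lab, means in profiles.items():
    print(lab, means)
```

Compute the profiles on the original, unscaled values so that the means stay interpretable in their own units, even if the clustering itself was run on rescaled data.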
