
Euclidean Distance Matrix

The document discusses Euclidean distance matrices (EDMs), which are matrices that show the distances between points in Euclidean space. An EDM is a symmetric matrix where the diagonal entries are 0 and the off-diagonal entries represent squared distances calculated using the Euclidean distance formula. The document provides examples of computing distances between points and constructing an EDM from travel time data between three hypothetical towns to produce a 2D map approximation. Multidimensional scaling is also discussed as a technique for translating pairwise distance information into a spatial representation.

Uploaded by

Manahil Shaikh
Copyright
© © All Rights Reserved

Graph Theory

Euclidean Distance Matrix


Essa Sheikh – 18630

Abstract
Fundamentally, information about points within Euclidean space was incomplete. As the
complexities of the world developed, we started looking for methods to retrieve more
information about distances within a Euclidean space. A Euclidean distance matrix (EDM) is
a matrix that shows the spacing of n points in Euclidean space, where a Euclidean space is a
finite-dimensional real vector space. The matrix is an n × n matrix representing n points in that
space. It was soon realized that interpoint distance calculations, which are conveniently
represented by an EDM, have many applications. EDMs are frequently used in self-localization,
3D animation, network topology and molecular conformation. In this paper, we will look at how
to construct a map of a geographic area to a certain degree of accuracy. We will attempt to
convert travelling times into a distance problem using EDMs, which requires an eigenvalue
decomposition of a symmetric matrix. We will use the travelling time, in minutes, between
points to approximate the distance between them and so obtain a rough 2D impression of the
area.

Key words: Euclidean distance matrix, Multidimensional scaling, maps

What is a matrix?
A set of numbers arranged in rows and columns is known as a matrix (plural: matrices). It can
represent any type of related data. Matrices are used to obtain quick approximations to problems
that would otherwise take hours of calculation. They are supported as a data type in several
programming languages and are comparatively more flexible than a static array.
Distance Matrix
A distance matrix is a matrix that shows the distance between each pair of points in a set. There
are two types of distance matrices, metric and non-metric; this paper focuses on metric distance
matrices. Each entry on the main diagonal is the distance of a point from itself, so the main
diagonal is 0. For example,

        A      B      C      D
A       0     43     21    135
B      43      0     56    180
C      21     56      0    212
D     135    180    212      0

There are other ways of displaying a distance matrix, for example,

[Figure: an alternative display of a distance matrix. Photo by Unknown Author, licensed under CC BY-SA]

A metric distance matrix has the following general properties:
Hollow matrix
the entries on the main diagonal are all zero, i.e. $x_{ii} = 0$ for all $1 \le i \le N$
Non-negative matrix
all the off-diagonal entries are positive ($x_{ij} > 0$ if $i \ne j$)
Symmetric matrix
the matrix is symmetric ($x_{ij} = x_{ji}$)
Triangle inequality
for any i and j, $x_{ij} \le x_{ik} + x_{kj}$ for all k. This can be stated in terms of tropical
matrix multiplication.
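
The four properties above can be checked mechanically. The following is a minimal sketch in Python with NumPy; the function name `is_metric_distance_matrix`, the tolerance `tol` and the two small test matrices are assumptions made for illustration and do not come from the paper. The triangle inequality is tested through the min-plus ("tropical") product of the matrix with itself.

```python
import numpy as np

def is_metric_distance_matrix(X, tol=1e-9):
    """Return True if X is hollow, positive off the diagonal, symmetric,
    and satisfies the triangle inequality (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    if X.ndim != 2 or X.shape[0] != X.shape[1]:
        return False
    n = X.shape[0]
    hollow = np.all(np.abs(np.diag(X)) <= tol)
    positive = np.all(X[~np.eye(n, dtype=bool)] > 0)
    symmetric = np.allclose(X, X.T, atol=tol)
    # Min-plus product of X with itself: entry (i, j) is min over k of x_ik + x_kj.
    min_plus = (X[:, :, None] + X[None, :, :]).min(axis=1)
    triangle = np.all(X <= min_plus + tol)
    return bool(hollow and positive and symmetric and triangle)

# Illustrative matrices: distances between the points 0, 3 and 7 on a line,
# and a matrix in which the direct distance 5 exceeds the detour 1 + 1.
good = [[0, 3, 7], [3, 0, 4], [7, 4, 0]]
bad = [[0, 1, 5], [1, 0, 1], [5, 1, 0]]
print(is_metric_distance_matrix(good))
print(is_metric_distance_matrix(bad))
```

The min-plus product builds an n × n × n intermediate array, so this sketch is only suitable for small matrices.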

Development of EDM
The contributions of Carl Menger, Schoenberg, Blumenthal, and Young and Householder
to the development of EDMs are commendable. Their work contributed significantly to the
applications and tools that rely on EDMs for pivotal parts of their process. An important class
of EDM tools was developed for data visualization: in 1952, Torgerson introduced the concept
of multidimensional scaling.

Euclidean Distance Matrix


Now that we have established what a distance matrix is, we need to understand what a
Euclidean distance matrix is. A Euclidean space is most familiar as the two-dimensional plane or
three-dimensional space (Ivan Dokmanic, 2015), but it can have any finite number of dimensions,
with each point represented by a specific set of coordinates. A matrix must meet certain
conditions to qualify as an EDM, which are stated in terms of the following two sets:
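• Symmetric Hollow Subspace
Denoted by $S^n_h$, the symmetric hollow subspace is the proper subspace of the
symmetric matrices $S^n$ consisting of those with a zero diagonal:
$$S^n_h \triangleq \{A \in S^n \mid \operatorname{diag}(A) = 0\},$$
where diag(·) denotes a column vector with the diagonal entries of its input
matrix.
• Positive Semi-Definite Cone
Denoted by $S^n_+$, the positive semi-definite cone is the set of all symmetric
positive semi-definite matrices of dimension n × n:
$$S^n_+ = \{A \in S^n \mid A \succeq 0\}$$
A matrix D qualifies as an EDM exactly when D lies in $S^n_h$ and $-JDJ$ lies in $S^n_+$,
where $J = I - \frac{1}{n}\mathbf{1}\mathbf{1}^T$ is the geometric centering matrix.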
Basically, the matrix is a table whose rows are the source entities and whose columns are the
target entities; each cell holds the distance from the row's source to the column's target.
For this paper, the Euclidean matrix will be a distance table with the towns as both rows and
columns, and the travelling time in each crossing cell standing in for a distance. An EDM holds
squared distances, i.e. if the distance is x, the element in the matrix is $x^2$.
The distance can be calculated by using the following distance formula:

$$d(p, q) = d(q, p) = \left[(q_1 - p_1)^2 + (q_2 - p_2)^2 + \dots + (q_n - p_n)^2\right]^{1/2}$$

This formula is central to every application of EDMs, since each algorithm built on them
depends on it.
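
As a small illustration, the distance formula can be written directly in Python. This is only a sketch; the function name `euclidean_distance` and the sample points are chosen here for illustration.

```python
import math

def euclidean_distance(p, q):
    """Euclidean distance between two points given as equal-length sequences."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of coordinates")
    return math.sqrt(sum((qi - pi) ** 2 for pi, qi in zip(p, q)))

# A 3-4-5 right triangle: the distance between (0, 0) and (3, 4) is 5.
print(euclidean_distance((0, 0), (3, 4)))  # 5.0
```

Swapping the arguments gives the same value, which reflects d(p, q) = d(q, p).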

Multidimensional Scaling
Multidimensional scaling (MDS) is a method for visually representing how similar the individual
items of a data set are. MDS translates information about the pairwise distances among a set of
n objects into a configuration of points in a Cartesian space. Simply put, multidimensional
scaling displays any kind of data spatially using a map, generally a 2-dimensional one.
(WISH, 1978)
Hypothetically, suppose a person is shown a map and asked to construct a matrix of distances
between cities. They can easily measure each distance on the map, convert it using the map
scale, such as 2 cm : 15 km, and record it. E.g., 7 cm on the map means the cities are 52.5 km
apart. The problem arises when one must go in reverse: given only the matrix of distances,
reconstruct the map. Although many geometrical methods are available, the most convenient of
them is multidimensional scaling. The easiest output to produce is a 2-dimensional map, but the
number of dimensions is not limited to two.

Computing the distance


Suppose we have the vectors $\{x_i \in \mathbb{R}^d : i \in \{1, \dots, n\}\}$ and we need the
n × n matrix D of all pairwise distances between them. Each element of D is a squared
Euclidean distance, defined as follows:
$$D_{ij} = \|x_i - x_j\|_2^2$$
or
$$D_{ij} = (x_i - x_j)^T (x_i - x_j) = \|x_i\|_2^2 - 2 x_i^T x_j + \|x_j\|_2^2$$

We can use the formulas above to calculate the squared distance between two points. Next, we
stack all the vectors as the columns of a matrix and apply the expanded formula to every pair
at once, which yields the full two-dimensional array D. Tools such as NumPy, PyTorch and
TensorFlow can carry out this computation efficiently. (Albani, 2019) (Grainti)
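
The expanded form lends itself to a vectorized computation. The following NumPy sketch assumes the points are stored as the columns of a d × n array; the function name `squared_edm` and the three sample points are illustrative choices, not taken from the cited sources.

```python
import numpy as np

def squared_edm(X):
    """Squared Euclidean distance matrix for points stored as columns of X (d x n)."""
    X = np.asarray(X, dtype=float)
    gram = X.T @ X                      # gram[i, j] = x_i^T x_j
    sq_norms = np.diag(gram)            # sq_norms[i] = ||x_i||^2
    # D_ij = ||x_i||^2 - 2 x_i^T x_j + ||x_j||^2, via broadcasting
    D = sq_norms[:, None] - 2.0 * gram + sq_norms[None, :]
    # Rounding can leave tiny negative values; clip them to zero.
    return np.maximum(D, 0.0)

# Three illustrative points in the plane: (0, 0), (3, 4) and (3, 0).
X = np.array([[0.0, 3.0, 3.0],
              [0.0, 4.0, 0.0]])
D = squared_edm(X)
print(D)
```

For these points the squared distances are 25, 9 and 16, so D should have those values off the diagonal and zeros on it.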
Converting the Minute matrix into a Map
For our calculations we assume three imaginary towns, namely Hurbes, Tanes and Galvat,
and some values for the travelling time between them.
The travelling times between the towns are:
From Hurbes to Hurbes: 0 minutes
From Hurbes to Tanes: 33 minutes
From Hurbes to Galvat: 128 minutes
From Tanes to Galvat: 158 minutes

The matrix representing the travelling minutes is given by:

            H      T      G
T =   H     0     33    128
      T    33      0    158
      G   128    158      0

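Before we begin our calculations, let us establish that, out of the many ways of doing this,
we will use the classical MDS method. The first step of the algorithm is to geometrically center
the matrix, for which we use the centering matrix:
$$J = I - \frac{1}{n}\mathbf{1}\mathbf{1}^T$$

With n = 3:
$$J = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} - \frac{1}{3} \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \begin{bmatrix} 1 & 1 & 1 \end{bmatrix}$$

Which gives us:

$$J = \begin{bmatrix} 0.67 & -0.33 & -0.33 \\ -0.33 & 0.67 & -0.33 \\ -0.33 & -0.33 & 0.67 \end{bmatrix}$$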
Now we calculate the Gram matrix:
$$G = -\frac{1}{2} J D J$$

$$G = \begin{bmatrix} -52.8 & -56.39 & -67.21 \\ -56.39 & -56.16 & -70.63 \\ -67.21 & -70.63 & -66.82 \end{bmatrix}$$

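We use the eigenvalue decomposition to compute its eigenvalues λ and eigenvectors v:

$$\Lambda = \begin{bmatrix} -188.81 & 0 & 0 \\ 0 & 1.87 & 0 \\ 0 & 0 & 11.15 \end{bmatrix}$$

$$v_1 = \begin{bmatrix} 0.86 \\ 0.9 \\ 1 \end{bmatrix}, \quad v_2 = \begin{bmatrix} -12.17 \\ 10.61 \\ 1 \end{bmatrix}, \quad v_3 = \begin{bmatrix} -0.48 \\ -0.64 \\ 1 \end{bmatrix}$$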

Lastly, we normalize the vectors:

$$\mu = \begin{bmatrix} 0.53 & -0.75 & -0.37 \\ 0.56 & 0.65 & -0.49 \\ 0.62 & 0.06 & 0.78 \end{bmatrix}$$

Now we can finally reconstruct the map: we use the normalized vectors as anchors and scale
them up or down to match the actual map.
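
For readers who want to reproduce these steps numerically, the following is a minimal NumPy sketch of classical MDS applied to the travelling-time matrix. It rests on stated assumptions: D is taken to be the element-wise square of the travelling times, the unit eigenvectors are scaled by the square roots of their eigenvalues (the usual classical MDS scaling), and any negative eigenvalues are clipped to zero. The function name `classical_mds` is hypothetical, and the numbers it produces are not guaranteed to match the rounded figures printed above.

```python
import numpy as np

def classical_mds(times, dims=2):
    """Classical MDS sketch: embed n items in `dims` dimensions from an
    n x n matrix of travelling times treated as distances."""
    times = np.asarray(times, dtype=float)
    n = times.shape[0]
    D = times ** 2                                # assumed: squared distances
    J = np.eye(n) - np.ones((n, n)) / n           # geometric centering matrix
    G = -0.5 * J @ D @ J                          # Gram matrix
    eigvals, eigvecs = np.linalg.eigh(G)          # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:dims]      # indices of the largest ones
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale              # one row of coordinates per town

# Travelling times in minutes between Hurbes, Tanes and Galvat.
T = np.array([[0.0, 33.0, 128.0],
              [33.0, 0.0, 158.0],
              [128.0, 158.0, 0.0]])
coords = classical_mds(T)

# Distances between the recovered map positions, for comparison with T.
diff = coords[:, None, :] - coords[None, :, :]
recovered = np.sqrt((diff ** 2).sum(axis=-1))
print(np.round(coords, 2))
print(np.round(recovered, 1))
```

The recovered positions are determined only up to rotation, reflection and translation, so the signs of the coordinates may differ between runs or libraries while the pairwise distances stay the same.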

Cluster Analysis using Euclidean Distance Matrix


Using our understanding of the Euclidean distance matrix, we will analyze a data set by
clustering similar data points together, and look at applications of this within data science.
We will use the single linkage method in our example and limit the number of clusters to 2.
We start off by defining a set of data and calculating the distance between each pair of data
points to build the distance matrix:
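$$d_{ij} = \left[(x_j - x_i)^2 + (y_j - y_i)^2\right]^{1/2}$$

[Figure: plot of the data points]

$$\begin{bmatrix} 0 & \sqrt{2} & 3 & 2\sqrt{10} & \sqrt{34} \\ \sqrt{2} & 0 & \sqrt{5} & \sqrt{26} & 2\sqrt{5} \\ 3 & \sqrt{5} & 0 & \sqrt{13} & \sqrt{13} \\ 2\sqrt{10} & \sqrt{26} & \sqrt{13} & 0 & \sqrt{2} \\ \sqrt{34} & 2\sqrt{5} & \sqrt{13} & \sqrt{2} & 0 \end{bmatrix}$$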
Once we have completed our distance matrix, we find the smallest distance between two data
points (d1 and d2) and cluster them together. We then form a new distance matrix by
substituting the two data points with a cluster 'A' and repeat this process to form another
cluster 'B'. We examine the remaining data points and add them to the existing clusters until no
further data points are left and only the clusters remain. We can represent our results in the
form of a dendrogram to compare similarities between the data points.
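
The merging procedure described above can be sketched in a few lines of Python. This is an illustrative implementation of single linkage rather than the authors' own code: the function name `single_linkage` is hypothetical, the matrix entries are those of the 5 × 5 distance matrix given earlier, and the data points are indexed from 0 to 4.

```python
import numpy as np

def single_linkage(D, num_clusters):
    """Merge the two closest clusters repeatedly until num_clusters remain.
    The distance between clusters is the smallest point-to-point distance."""
    D = np.asarray(D, dtype=float)
    clusters = [{i} for i in range(len(D))]
    while len(clusters) > num_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(D[i, j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] |= clusters[b]   # absorb cluster b into cluster a
        del clusters[b]
    return [sorted(c) for c in clusters]

s = np.sqrt
D = np.array([
    [0.0,         s(2),  3.0,   2 * s(10), s(34)],
    [s(2),        0.0,   s(5),  s(26),     2 * s(5)],
    [3.0,         s(5),  0.0,   s(13),     s(13)],
    [2 * s(10),   s(26), s(13), 0.0,       s(2)],
    [s(34),       2 * s(5), s(13), s(2),   0.0],
])
clusters = single_linkage(D, 2)
print(clusters)
```

Recording the distance at which each merge happens would give the heights needed to draw the dendrogram.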
While this process might look trivial, cluster analysis is quite an intuitive tool in machine
learning. A good example of its use is teaching a computer to differentiate between two kinds
of objects, such as cats and dogs. If you plot them as data points on a graph with weight and
height on the x and y axes, you can observe that the data points representing cats are clustered
in one corner and those representing dogs in the other. While a computer might not be able to
differentiate between the two on its own, using cluster analysis we can teach it to look for the
cluster nearest to a new data point to make a decision.
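
One simple way to make the "nearest cluster" decision concrete is to compare a new point with the centre of each cluster. The sketch below assumes this nearest-centroid rule, which is only one possible reading of the idea, and all weights and heights in it are invented purely for illustration.

```python
import numpy as np

# Hypothetical (weight in kg, height in cm) measurements, invented for illustration.
cats = np.array([[3.5, 23.0], [4.0, 25.0], [4.5, 24.0]])
dogs = np.array([[20.0, 55.0], [25.0, 60.0], [30.0, 58.0]])

# Represent each cluster by the mean of its points.
centroids = {"cat": cats.mean(axis=0), "dog": dogs.mean(axis=0)}

def nearest_cluster(point, centroids):
    """Return the label whose centroid has the smallest Euclidean distance to point."""
    point = np.asarray(point, dtype=float)
    return min(centroids, key=lambda label: np.linalg.norm(point - centroids[label]))

print(nearest_cluster([5.0, 26.0], centroids))
print(nearest_cluster([22.0, 57.0], centroids))
```

Because weight and height are measured in different units, rescaling the two axes before computing distances would usually be advisable in practice.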
Bibliography
Albani, S. (2019). Euclidean Distance Matrix Trick. University of Oxford.

Grainti, A. (n.d.). Euclidean Distance Matrix. The StartUP.

Ivan Dokmanic, R. P. (2015). Euclidean Distance Matrices - Essential Theory, Algorithms and Applications.

WISH, J. B. (1978). Multidimensional Scaling. Chicago: Bell Publishers.
