1. Vectors
➢ Vector
➢ Linear independence
➢ Applications
Outline
➢ Vector
➢ Notation
➢ Examples
➢ Inner product
➢ Complexity
Vectors
► A vector is an ordered list of numbers
► written as a = (a1, a2, . . . , an); ai is the i-th entry
► addition: a + b = (a1 + b1, . . . , an + bn); subtraction is similar
Properties of vector addition
► commutative: a + b = b + a
► associative: (a + b) + c = a + (b + c)
(so we can write both as a + b + c)
► a + 0 = 0 + a = a
► a − a = 0
[Figure: vectors a and b added head-to-tail to form a + b]
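A minimal NumPy sketch of elementwise addition and subtraction; the vectors here are made-up examples:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])   # example 3-vectors (made up)
b = np.array([4.0, 5.0, 6.0])

print(a + b)                      # elementwise sum: [5. 7. 9.]
print(a - b)                      # elementwise difference: [-3. -3. -3.]
print(np.allclose(a + b, b + a))  # commutativity holds: True
```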
Scalar-vector multiplication
► scalar β and n-vector a can be multiplied:
βa = (βa1, . . . , βan)
► also denoted aβ
► example: (−2)(1, 9, 6) = (−2, −18, −12)
Properties of scalar-vector multiplication
► associative: β(γa) = (βγ)a
► distributive: β(a + b) = βa + βb and (β + γ)a = βa + γa
[Figure: scalar multiples 0.75a1 and 1.5a2 of vectors a1 and a2]
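A short NumPy check of the example and these properties; β and γ are arbitrary made-up scalars:

```python
import numpy as np

a = np.array([1.0, 9.0, 6.0])
beta, gamma = -2.0, 0.5

print(beta * a)                                               # [ -2. -18. -12.]
print(np.allclose((beta * gamma) * a, beta * (gamma * a)))    # associative: True
print(np.allclose((beta + gamma) * a, beta * a + gamma * a))  # distributive: True
```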
Inner product
► the inner product (or dot product) of n-vectors a and b is
aᵀb = a1b1 + a2b2 + · · · + anbn
► example: (−1, 2, 2)ᵀ(1, 0, −3) = (−1)(1) + (2)(0) + (2)(−3) = −7
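The same computation in NumPy; for 1-D arrays both `np.inner` and the `@` operator compute aᵀb:

```python
import numpy as np

a = np.array([-1.0, 2.0, 2.0])
b = np.array([1.0, 0.0, -3.0])

# inner product a^T b = a1*b1 + ... + an*bn
print(np.inner(a, b))   # (-1)(1) + (2)(0) + (2)(-3) = -7.0
print(a @ b)            # same value via the @ operator
```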
General examples
Net present value (NPV) is a method for determining the current value of all future cash flows generated by a project, including the initial capital investment. It is widely used in capital budgeting to establish which projects are likely to turn the greatest profit. The formula for NPV varies with the number and consistency of future cash flows. With a fixed per-period interest rate r, NPV is an inner product: NPV = cᵀd, where c is the cash-flow vector and d is the discount vector with di = 1/(1 + r)^i.
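A sketch of the inner-product form; the cash flows and the 5% per-period rate below are made up:

```python
import numpy as np

r = 0.05                                      # assumed per-period interest rate
c = np.array([-1000.0, 400.0, 400.0, 400.0])  # made-up cash flows; c[0] is the initial investment
d = (1 + r) ** -np.arange(len(c))             # discount factors d_i = 1/(1+r)^i

npv = c @ d                                   # NPV as the inner product c^T d
print(round(npv, 2))                          # ~89.3: positive, so the project adds value
```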
Regression model
► predicted value: ŷ = xᵀβ + v, where x is a feature vector, β a coefficient (weight) vector, and v an offset
[Figure: predicted price ŷ versus actual price y (thousand dollars) for houses 1–4]
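A sketch of evaluating the regression prediction; the coefficients, offset, and feature values below are made-up numbers, with x = (house area in 1000 sq. ft., number of bedrooms) assumed as the features:

```python
import numpy as np

# hypothetical model: x = (house area in 1000 sq. ft., number of bedrooms)
beta = np.array([150.0, -20.0])   # made-up coefficient vector (thousand dollars per unit)
v = 55.0                          # made-up offset (thousand dollars)

x = np.array([0.85, 2.0])         # made-up features of one house
y_hat = x @ beta + v              # predicted price ŷ = xᵀβ + v
print(y_hat)                      # 142.5 (thousand dollars)
```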
2. Norm and distance
Norm
► the (Euclidean) norm of an n-vector x measures its magnitude:
ǁxǁ = √(x1² + x2² + · · · + xn²)
► so we have ǁxǁ = √(xᵀx), i.e., ǁxǁ² = xᵀx
Distance
► the (Euclidean) distance between n-vectors a and b: dist(a, b) = ǁa − bǁ
▪ Example: find the distance between u = (7, 1) and v = (3, 2).
▪ Solution: calculate
u − v = (7, 1) − (3, 2) = (4, −1)
ǁu − vǁ = √(4² + (−1)²) = √17
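The same calculation in NumPy (`np.linalg.norm` computes the Euclidean norm):

```python
import numpy as np

u = np.array([7.0, 1.0])
v = np.array([3.0, 2.0])

d = np.linalg.norm(u - v)   # Euclidean distance ǁu − vǁ
print(d, np.sqrt(17))       # both ~4.1231
```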
Triangle inequality
► by the triangle inequality,
ǁa − cǁ = ǁ(a − b) + (b − c)ǁ ≤ ǁa − bǁ + ǁb − cǁ
[Figure: triangle with vertices a, b, c and side lengths ǁa − bǁ, ǁb − cǁ, ǁa − cǁ]
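A quick numerical check of the inequality on random vectors (a sketch, not a proof):

```python
import numpy as np

rng = np.random.default_rng(0)
a, b, c = rng.standard_normal((3, 5))   # three random 5-vectors

lhs = np.linalg.norm(a - c)
rhs = np.linalg.norm(a - b) + np.linalg.norm(b - c)
print(lhs <= rhs)   # True: ǁa − cǁ ≤ ǁa − bǁ + ǁb − cǁ
```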
Feature distance and nearest neighbors
► if x and y are feature vectors for two entities, ǁx − yǁ is the feature distance
► given candidate vectors z1, . . . , zm, the vector zj is the nearest neighbor of x if
ǁx − zj ǁ ≤ ǁx − zi ǁ, i = 1, . . . , m
[Figure: point x and its nearest neighbor among z1, . . . , z6]
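A minimal NumPy sketch of the nearest-neighbor computation; the query x and candidates zi are made up:

```python
import numpy as np

x = np.array([2.0, 2.0])                 # made-up query vector
Z = np.array([[0.0, 0.0], [3.0, 1.0],
              [5.0, 5.0], [2.0, 3.0]])   # made-up candidates z_1, ..., z_4

dists = np.linalg.norm(Z - x, axis=1)    # ǁx − z_iǁ for every i
j = np.argmin(dists)                     # index of the nearest neighbor
print(j, Z[j])                           # 3 [2. 3.]
```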
Angle
► the angle between nonzero vectors a and b: θ = ∠(a, b) = arccos( aᵀb / (ǁaǁ ǁbǁ) )
► L1 distance: ǁx − yǁ1 = |x1 − y1| + · · · + |xn − yn|, i.e., add up the absolute coordinate differences
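A sketch computing both quantities for two made-up vectors:

```python
import numpy as np

a = np.array([1.0, 2.0, -1.0])
b = np.array([2.0, 0.0, -3.0])

# angle: arccos of the inner product normalized by the norms
theta = np.arccos((a @ b) / (np.linalg.norm(a) * np.linalg.norm(b)))
print(np.degrees(theta))      # angle in degrees

# L1 distance: add up the absolute coordinate differences
print(np.sum(np.abs(a - b)))  # |1-2| + |2-0| + |-1+3| = 5.0
```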
Applications
K-Nearest Neighbors (KNN)
► nearest-neighbor classifier: memorize all training data and labels; to classify a new point, return the label of its closest training example
► KNN generalizes this: take the majority label among the k nearest training examples
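A minimal KNN sketch under these assumptions (Euclidean feature distance, majority vote among the k nearest); `knn_predict` and the toy data are made up for illustration:

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Label x by majority vote among its k nearest training points."""
    dists = np.linalg.norm(X_train - x, axis=1)   # feature distances ǁx − x_iǁ
    nearest = np.argsort(dists)[:k]               # indices of the k closest points
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]              # most common label wins

# toy data: two clusters with labels 0 and 1 (made up)
X = np.array([[0.0, 0.0], [0.5, 0.2], [5.0, 5.0], [5.2, 4.8]])
y = np.array([0, 0, 1, 1])
print(knn_predict(X, y, np.array([4.9, 5.1]), k=3))   # 1
```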
Outline
➢ Clustering
➢ Algorithm
➢ Examples
➢ Applications
Clustering
► goal: partition vectors x1, . . . , xN into k groups of nearby vectors
► patient clustering
– xi are patient attributes, test results, symptoms
► financial sectors
– xi are n-vectors of financial attributes of company i
Clustering objective
► clustering objective is Jclust = (1/N) ∑_{i=1}^{N} ǁxi − z_{ci}ǁ², where zj is the representative of group j and ci is the group that xi is assigned to
► for a fixed partition, Jclust is minimized by taking each representative to be the mean (or average or centroid) of the points in its group:
zj = (1/|Gj|) ∑_{i∈Gj} xi
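A sketch evaluating Jclust for a made-up partition, with the representatives taken as group means:

```python
import numpy as np

X = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [6.0, 5.0]])  # made-up data
c = np.array([0, 0, 1, 1])             # group assignment c_i for each point
Z = np.array([X[c == j].mean(axis=0)   # representatives = group means
              for j in range(2)])

J = np.mean(np.linalg.norm(X - Z[c], axis=1) ** 2)   # Jclust
print(Z)   # [[0.5 0. ] [5.5 5. ]]
print(J)   # 0.25
```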
k-means algorithm
► alternate between two steps: assign each xi to the group of its nearest representative zj, then set each zj to the mean of the points assigned to it
► Jclust goes down in each step, until the zj's stop changing
► but (in general) the k-means algorithm does not find the partition that minimizes Jclust
► the final partition (and its value of Jclust) can depend on the initial
representatives
► common approach:
– run k-means 10 times, with different (often random) initial representatives
– take as final partition the one with the smallest value of Jclust
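A minimal sketch of the algorithm as described above, assuming random initial representatives:

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    """Plain k-means: alternate assignment and mean updates."""
    rng = np.random.default_rng(seed)
    Z = X[rng.choice(len(X), k, replace=False)]   # random initial representatives
    for _ in range(iters):
        # assign each point to its nearest representative
        c = np.argmin(np.linalg.norm(X[:, None] - Z[None, :], axis=2), axis=1)
        # move each representative to the mean of its group
        Z_new = np.array([X[c == j].mean(axis=0) if np.any(c == j) else Z[j]
                          for j in range(k)])
        if np.allclose(Z_new, Z):                 # stop when the zj's stop changing
            break
        Z = Z_new
    return c, Z
```

Running it several times with different seeds and keeping the result with the smallest Jclust implements the common approach described above.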
[Figure: k-means on example data; panels show the data, iterations 1, 2, 3, and 10, and the final clustering]
Convergence
[Figure: Jclust versus iteration (1–15), decreasing and then leveling off]
Outline
➢ Clustering
➢ Algorithm
➢ Examples
➢ Applications
Handwritten digit image set
[Figure: Jclust versus iteration (1–25) for k-means on the handwritten digit image set]
Group representatives, best clustering
Topic discovery
[Figure: Jclust (×10⁻³) versus iteration (1–20) for k-means topic discovery]
Topics discovered (clusters 1–3)