L21 Mining Social Network Graphs

The document discusses techniques for mining social-network graphs: representing social networks as graphs, clustering with the Girvan-Newman algorithm (which uses edge betweenness as its distance measure), and spectral analysis, which models a network as a matrix to identify communities. It gives examples of social networks and illustrates locality and betweenness on a worked example.

Uploaded by

Tariq Saeed

CS6220 – Data Mining Techniques · Fall 2017 · Derbinsky

Mining Social-Network Graphs

Lecture 8

December 4, 2017

Outline
1. Social Networks as Graphs
a) Examples
b) Representation
c) Properties

2. Clustering
a) Distance Measures
b) Girvan-Newman

3. Spectral Analysis
a) Networks as Matrices
b) Connectivity
c) Partitioning
d) Clustering


Social Networks (obvious)


Less Obvious
• Phone/E-mail (within K units of time)
• Collaboration (academic papers, patents)
• Biological (proteins, genes)


Representation
• Nodes = entities
– People, papers, …
– Single vs. multiple
• del.icio.us: people, websites, tags

• Edges = relationship
– Discrete vs continuous (i.e. weighted)
– Directed (e.g. following) vs undirected


Checkup
• Given a graph of 7 nodes (A-G), how
many undirected edges are possible?


Answer
• Given a graph of 7 nodes (A-G), how
many undirected edges are possible?
$$\binom{7}{2} = \frac{7!}{2!\,(7-2)!} = \frac{7(7-1)}{2} = 21$$

Locality
• Typically assumed that if node A is
connected to both B and C, then it is
more likely than random that B and C are
connected

• Our focus: Community Detection


– Finding groups of densely connected nodes
– (Tightly related to the locality assumption –
i.e. would not work well on random graphs)

Example Graph
[Figure: undirected example graph on nodes A-G with edges A-B, A-C, B-C, B-D, D-E, D-F, D-G, E-F, F-G]

Communities to Be Detected
[Figure: the example graph with the communities to be detected highlighted]

Nodes, Edges, Average Connectivity?


• Nodes: 7
• Edges: 9
• Avg Connectivity: 9/21 ≈ 0.43

Given A-B, A-C; Expected B-C?


• Nodes: 7
• Edges: 9
• Avg Connectivity: 9/21 ≈ 0.43
• Expected Locality: 7/19 ≈ 0.37
  – With A-B and A-C given, 7 of the 9 edges remain to fill the other 19 of the 21 node pairs

Actual?
• Nodes: 7
• Edges: 9
• Avg Connectivity: 9/21 ≈ 0.43
• Expected Locality: 7/19 ≈ 0.37
• What are all the triples?

Evaluate Node Triples


Each node is listed with every pair of its neighbors; ✅ means the pair is itself connected.

• A
  – BC ✅
• B
  – AC ✅, AD ❌
  – CD ❌
• C
  – AB ✅
• D
  – BE ❌, BF ❌, BG ❌
  – EF ✅, EG ❌
  – FG ✅
• E
  – DF ✅
• F
  – DE ✅, DG ✅
  – EG ❌
• G
  – DF ✅

• ✅ = 9, ❌ = 7
• ✅ / (✅ + ❌) = 9/16 ≈ 0.56
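
The triple count can be reproduced mechanically. The sketch below is only an illustration: the edge list is read off the adjacency matrix given later in the deck, and the helper names (`neighbors`, `count_triples`) are made up for this example.

```python
from itertools import combinations
from math import comb

# Edges of the 7-node example graph (assumed from the deck's adjacency matrix).
EDGES = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("D", "E"),
         ("D", "F"), ("D", "G"), ("E", "F"), ("F", "G")]


def neighbors(edges):
    """Map each node to the set of nodes it shares an edge with."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def count_triples(edges):
    """Count (closed, open) triples: a center node plus a pair of its neighbors."""
    adj = neighbors(edges)
    closed = opened = 0
    for center in adj:
        for x, y in combinations(sorted(adj[center]), 2):
            if y in adj[x]:
                closed += 1   # the two neighbors are themselves connected
            else:
                opened += 1
    return closed, opened


n_nodes, n_edges = 7, len(EDGES)
possible = comb(n_nodes, 2)                           # 21 possible undirected edges
avg_connectivity = n_edges / possible                 # 9/21
expected_locality = (n_edges - 2) / (possible - 2)    # 7/19
closed, opened = count_triples(EDGES)
actual_locality = closed / (closed + opened)          # 9/16
print(avg_connectivity, expected_locality, actual_locality)
```

Each triangle is counted once per corner, which matches the slide's per-node listing.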


Local!
• Nodes: 7
• Edges: 9
• Avg Connectivity: 9/21 ≈ 0.43
• Expected Locality: 7/19 ≈ 0.37
• Actual Locality: 9/16 ≈ 0.56 (>> Expected!)

Let’s Cluster!

Distance Measure?
• Distance measures on social-network graphs can be tricky

• KISS ex: D(x,y)=0 if edge, 1 otherwise
  – D(A, B)=0
  – D(A, C)=0
  – D(B, C) = 1 > D(A, B) + D(A, C), which violates the triangle inequality

[Figure: nodes A, B, C with edges A-B and A-C but no edge B-C]

• Also produces many ties, which lead to unfortunate results in typical clustering algorithms

Betweenness
• One of the simplest measures, based on finding the edges that are the least likely to be inside a community
  – Clustering: remove high betweenness first!

• B(a, b) = sum, over all pairs of nodes (x, y), of the fraction of shortest paths between x and y that include edge ab

Betweenness Intuition (1)


Betweenness Intuition (2)

[Figure: example graph with each edge labeled by its betweenness: AB 5, AC 1, BC 5, BD 12, DE 4.5, DG 4.5, DF 4, EF 1.5, GF 1.5]

Betweenness Intuition (3)

[Figure: example graph with only the edge labels 1, 1.5, and 1.5 remaining]

Computing Graph Betweenness


1. For each node
   a) Breadth-First Search (BFS)
      • WHY?
   b) For each edge, compute betweenness

2. For each edge
   a) Sum contributions from trees above
   b) Divide by 2 (each shortest path is counted once from each of its two endpoints)

Example (E, BFS.0 to BFS.Done)

[Figure: breadth-first search of the example graph starting from E, adding one level per step]
• Level 0: E
• Level 1: D, F
• Level 2: B, G
• Level 3: A, C

Computing BFS Betweenness


• Recursive credit definition
  – Node gets 1 + sum of edge credits below
  – Edge gets the credit of the node below it, divided among that node's edges to the level above
    • Ignore edges between nodes at the same level (never used for shortest paths)

• Start from leaves, move up
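
The BFS-plus-credit procedure can be sketched as follows. This is an illustration under assumptions, not the lecture's code: the edge list is read off the deck's adjacency matrix, and a node's credit is split among its upward edges in proportion to the number of shortest paths reaching each parent, which is the usual general rule and reduces to an equal split on this graph.

```python
from collections import defaultdict, deque

# Edges of the 7-node example graph (assumed from the deck's adjacency matrix).
EDGES = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("D", "E"),
         ("D", "F"), ("D", "G"), ("E", "F"), ("F", "G")]


def edge_betweenness(edges):
    """Return {(u, v): betweenness} with each edge keyed as a sorted tuple."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    total = defaultdict(float)
    for root in list(adj):
        # Step 1: BFS from root, recording levels and shortest-path counts.
        level = {root: 0}
        paths = defaultdict(int)
        paths[root] = 1
        order = []
        queue = deque([root])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in level:
                    level[v] = level[u] + 1
                    queue.append(v)
                if level[v] == level[u] + 1:
                    paths[v] += paths[u]

        # Step 2: credits, from the leaves up. Every node starts with 1.
        credit = {u: 1.0 for u in order}
        for v in reversed(order):
            for u in adj[v]:
                if level[u] == level[v] - 1:        # u is one level above v
                    share = credit[v] * paths[u] / paths[v]
                    credit[u] += share
                    total[tuple(sorted((u, v)))] += share
        # Same-level edges never satisfy the test above, so they get nothing.

    # Each shortest path was counted from both endpoints, so halve the sums.
    return {edge: value / 2 for edge, value in total.items()}


betweenness = edge_betweenness(EDGES)
for edge, value in sorted(betweenness.items()):
    print(edge, value)
```

Edge B-D should come out highest, which is why it is the first edge removed when clustering.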


Example (E, Credits.1 to Credits.6)

[Figure: BFS tree rooted at E, credited from the leaves up, one step per slide]
• Credits.1: leaves A, C, G get 1 each
• Credits.2: edges B-A 1, B-C 1; G's credit is split over its two upward edges: D-G 0.5, F-G 0.5
• Credits.3: nodes B = 1 + 1 + 1 = 3, F = 1 + 0.5 = 1.5
• Credits.4: edges D-B 3, E-F 1.5
• Credits.5: node D = 1 + 3 + 0.5 = 4.5
• Credits.6: edge E-D 4.5

Examples (A, B, C, D, F, G Credits)

[Figures: the same credit computation on the BFS trees rooted at A, B, C, D, F, and G]
• A: A-B 5, A-C 1, B-D 4, D-E 1, D-F 1, D-G 1
• B: B-A 1, B-C 1, B-D 4, D-E 1, D-F 1, D-G 1
• C: C-A 1, C-B 5, B-D 4, D-E 1, D-F 1, D-G 1
• D: D-B 3, B-A 1, B-C 1, D-E 1, D-F 1, D-G 1
• F: F-D 4, F-E 1, F-G 1, D-B 3, B-A 1, B-C 1
• G: G-D 4.5, G-F 1.5, D-E 0.5, F-E 0.5, D-B 3, B-A 1, B-C 1

Sum Contributions (A through G, then + and /2)

Rows are BFS roots, columns are edges; "-" marks an edge that receives no credit from that root.

      AB    AC    BC    BD    DE    DG    DF    EF    GF
A      5     1     -     4     1     1     1     -     -
B      1     -     1     4     1     1     1     -     -
C      -     1     5     4     1     1     1     -     -
D      1     -     1     3     1     1     1     -     -
E      1     -     1     3   4.5   0.5     -   1.5   0.5
F      1     -     1     3     -     -     4     1     1
G      1     -     1     3   0.5   4.5     -   0.5   1.5
+     10     2    10    24     9     9     8     3     3
/2     5     1     5    12   4.5   4.5     4   1.5   1.5

Edge Contributions
AB    AC    BC    BD    DE    DG    DF    EF    GF
 5     1     5    12   4.5   4.5     4   1.5   1.5

[Figure: example graph with each edge labeled by the betweenness value above]

Girvan-Newman
1. Repeat until no edges left
a) Calculate betweenness of edges
b) Remove edge(s) with highest betweenness
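
The loop above can be sketched as below. This is a minimal illustration, not a reference implementation: it assumes communities are read off as the connected components left after each removal, it removes all edges tied for the highest score, and the function names are invented for this example.

```python
from collections import defaultdict, deque

NODES = list("ABCDEFG")
EDGES = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("D", "E"),
         ("D", "F"), ("D", "G"), ("E", "F"), ("F", "G")]


def edge_betweenness(adj):
    """BFS from every node plus bottom-up credits; edges keyed by frozenset."""
    total = defaultdict(float)
    for root in adj:
        level, paths, order = {root: 0}, defaultdict(int), []
        paths[root] = 1
        queue = deque([root])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in level:
                    level[v] = level[u] + 1
                    queue.append(v)
                if level[v] == level[u] + 1:
                    paths[v] += paths[u]
        credit = {u: 1.0 for u in order}
        for v in reversed(order):
            for u in adj[v]:
                if level[u] == level[v] - 1:
                    share = credit[v] * paths[u] / paths[v]
                    credit[u] += share
                    total[frozenset((u, v))] += share
    return {edge: value / 2 for edge, value in total.items()}


def components(adj):
    """Connected components of the current graph, as a list of sets."""
    seen, found = set(), []
    for start in adj:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            u = stack.pop()
            if u not in comp:
                comp.add(u)
                stack.extend(adj[u] - comp)
        seen |= comp
        found.append(comp)
    return found


def girvan_newman(nodes, edges):
    """Yield the communities each time a removal splits the graph further."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    count = len(components(adj))
    while any(adj.values()):                      # repeat until no edges left
        scores = edge_betweenness(adj)
        top = max(scores.values())
        for edge, score in scores.items():
            if score == top:                      # exact ties only (a simplification)
                u, v = tuple(edge)
                adj[u].discard(v)
                adj[v].discard(u)
        comps = components(adj)
        if len(comps) > count:
            count = len(comps)
            yield comps


splits = list(girvan_newman(NODES, EDGES))
print(splits[0])
```

On the example graph the first removal is B-D, so the first split separates A, B, C from D, E, F, G. Comparing floating-point scores with `==` is fragile in general; a tolerance would be safer on larger graphs.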


Example (0)


Example (1)
[Figure: panels Step 1, Step 2, and Step 3, followed by the resulting hierarchical network decomposition]

How to Select # of Clusters?


• Similar to agglomerative, with a different metric: Modularity (Q; [-1, 1])

• Idea: compare fraction of edges within a group to the fraction that would be observed for random connections
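
The slide does not spell out Q. One widely used definition (Newman's) is Q = Σ_c [ m_c / m − (d_c / 2m)² ], where m is the number of edges, m_c the number of edges inside community c, and d_c the sum of degrees in c. The sketch below assumes that definition and evaluates it on the example graph; the function name is hypothetical.

```python
# Edges of the 7-node example graph (assumed from the deck's adjacency matrix).
EDGES = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("D", "E"),
         ("D", "F"), ("D", "G"), ("E", "F"), ("F", "G")]


def modularity(edges, communities):
    """Newman modularity of a partition given as an iterable of node sets."""
    m = len(edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1

    q = 0.0
    for community in communities:
        # Fraction of edges that fall inside this community.
        inside = sum(1 for u, v in edges if u in community and v in community)
        # Fraction expected if edge endpoints were wired at random.
        degree_sum = sum(degree.get(node, 0) for node in community)
        q += inside / m - (degree_sum / (2 * m)) ** 2
    return q


q_split = modularity(EDGES, [set("ABC"), set("DEFG")])
q_whole = modularity(EDGES, [set("ABCDEFG")])
print(q_split, q_whole)
```

Scoring every level of the Girvan-Newman hierarchy this way and keeping the level with the largest Q is one way to pick the number of clusters.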


Selecting # of Clusters


Direct Partitioning
• We now look at an approach to divide a graph into two disjoint groups, the task of bi-partitioning, via spectral analysis

• To do so, we must express the graph as a matrix

Adjacency Matrix (A)


$A_{ij} = 1$ if (i, j) is an edge, else 0 (rows and columns in the order A, B, C, D, E, F, G)

$$A = \begin{bmatrix}
0 & 1 & 1 & 0 & 0 & 0 & 0 \\
1 & 0 & 1 & 1 & 0 & 0 & 0 \\
1 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 1 & 1 & 1 \\
0 & 0 & 0 & 1 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 & 1 & 0 & 1 \\
0 & 0 & 0 & 1 & 0 & 1 & 0
\end{bmatrix}$$

Degree Matrix (D)


$D_{ii} = \deg(i)$, and $D_{ij} = 0$ for $i \neq j$ (rows and columns in the order A, B, C, D, E, F, G)

$$D = \begin{bmatrix}
2 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 3 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 2 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 4 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 2 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 3 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 2
\end{bmatrix}$$

Laplacian (L)
L=D-A

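$$L = \begin{bmatrix}
2 & -1 & -1 & 0 & 0 & 0 & 0 \\
-1 & 3 & -1 & -1 & 0 & 0 & 0 \\
-1 & -1 & 2 & 0 & 0 & 0 & 0 \\
0 & -1 & 0 & 4 & -1 & -1 & -1 \\
0 & 0 & 0 & -1 & 2 & -1 & 0 \\
0 & 0 & 0 & -1 & -1 & 3 & -1 \\
0 & 0 & 0 & -1 & 0 & -1 & 2
\end{bmatrix}$$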
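
As a quick illustration, the three matrices can be built from an edge list with plain nested lists. The function name `graph_matrices` and the node ordering A through G are choices made for this sketch.

```python
NODES = ["A", "B", "C", "D", "E", "F", "G"]
EDGES = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("D", "E"),
         ("D", "F"), ("D", "G"), ("E", "F"), ("F", "G")]


def graph_matrices(nodes, edges):
    """Return (A, D, L): adjacency, degree, and Laplacian L = D - A."""
    index = {name: i for i, name in enumerate(nodes)}
    size = len(nodes)

    A = [[0] * size for _ in range(size)]
    for u, v in edges:
        A[index[u]][index[v]] = 1
        A[index[v]][index[u]] = 1      # undirected: mirror every edge

    # Degree of node i is the sum of row i of A, placed on the diagonal.
    D = [[sum(A[i]) if i == j else 0 for j in range(size)] for i in range(size)]
    L = [[D[i][j] - A[i][j] for j in range(size)] for i in range(size)]
    return A, D, L


A, D, L = graph_matrices(NODES, EDGES)
for name, row in zip(NODES, L):
    print(name, row)
```

Every row of L sums to zero because each diagonal degree is cancelled by one -1 per incident edge.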


Checkup
• Rows sum to… ?
  0 (degree – edges)

• Columns sum to… ?
  0 (degree – edges)

• Symmetric?
  Yes! (D is diagonal, A is symmetric for undirected graph)

• Diagonally dominant?
  Yes! (weakly: each diagonal entry equals the sum of the magnitudes of the other entries in its row; edge case = lonely node with no edges)

[Matrix: L, the Laplacian of the example graph]

The Plan
• We will now analyze the eigendecomposition (Lv = 𝜆v) of the Laplacian

• It turns out that the smallest eigenvalues/eigenvectors can tell us a lot!
  – Note: L is PSD, so 𝜆 ≥ 0
  – 0: connectivity
  – Next smallest: partitioning

Checkup
• What is a non-trivial solution to the
equation Lv = 𝜆v for 𝜆=0 (i.e. Lv=0)?

[Matrix: L, the Laplacian of the example graph]

Answer
• What is a non-trivial solution to the equation Lv = 𝜆v for 𝜆=0 (i.e. Lv=0)?

• You already solved it: 1’s (column vector of constants), because every row of L sums to 0

• ACTUALLY, not the whole story…

[Matrix: L, the Laplacian of the example graph]

Checkup
• Laplacian of the following graph?

[Figure: four nodes W, X, Y, Z with edges W-X and Y-Z]

Answer
• Laplacian of the following graph?

[Figure: four nodes W, X, Y, Z with edges W-X and Y-Z]

Rows and columns in the order W, X, Y, Z:

$$L = \begin{bmatrix}
1 & -1 & 0 & 0 \\
-1 & 1 & 0 & 0 \\
0 & 0 & 1 & -1 \\
0 & 0 & -1 & 1
\end{bmatrix}$$

Checkup
• Null-space of L? (i.e. Lv=0)

[Figure: the W-X, Y-Z graph and its 4 × 4 Laplacian, rows and columns in the order W, X, Y, Z]

Answer
• Null-space of L? (i.e. Lv=0)
$[1\ 1\ 0\ 0]^\top$
$[0\ 0\ 1\ 1]^\top$
(and every linear combination of the two)

[Figure: the W-X, Y-Z graph and its 4 × 4 Laplacian, rows and columns in the order W, X, Y, Z]

Connectivity via 0th Eigenvalue


• If nodes i and j are connected, their values in the corresponding eigenvector must be equal
  – This “cancels out” across all rows

• So if the graph is connected, the 0th eigenvector is $[c\ c\ c\ \dots]^\top$
  – Transitivity!

• Otherwise, null-space of L provides connected components
  – Could be detected by a simple clustering algorithm (e.g. k-Means)
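
To make the null-space claim concrete, the sketch below checks it on the four-node W-X, Y-Z graph from the earlier checkup. It only verifies candidate vectors by multiplication; it is not an eigensolver, and the helper names are made up for this example.

```python
NODES = ["W", "X", "Y", "Z"]
EDGES = [("W", "X"), ("Y", "Z")]


def laplacian(nodes, edges):
    """Laplacian L = D - A as nested lists, in the given node order."""
    index = {name: i for i, name in enumerate(nodes)}
    L = [[0] * len(nodes) for _ in nodes]
    for u, v in edges:
        i, j = index[u], index[v]
        L[i][i] += 1
        L[j][j] += 1
        L[i][j] -= 1
        L[j][i] -= 1
    return L


def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


L = laplacian(NODES, EDGES)

# Indicator vectors of the two components {W, X} and {Y, Z}.
in_wx = [1, 1, 0, 0]
in_yz = [0, 0, 1, 1]
# A vector that mixes the components: W and X get different values.
mixed = [1, 0, 1, 0]

print(matvec(L, in_wx))   # all zeros: in the null space
print(matvec(L, in_yz))   # all zeros: in the null space
print(matvec(L, mixed))   # not all zeros: connected nodes disagree
```

Nodes that share a value in every null-space vector belong to the same connected component, which is what a clustering step on those vectors would pick up.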


Second-Smallest Eigenvalue
• Let’s call this $\lambda_1$ (with corresponding $v_1$)

• If $\lambda_1$ is 0, what do we know?
  That the graph is already disconnected, so there’s not much sense in bi-partitioning :)

• Foreshadowing: from here, we’ll construct an objective function for a bi-partitioning, and it will turn out to be minimized precisely by $v_1$

Setup
• For a given graph, we want to assign each node to one of two groups
  – Let’s represent this assignment, per node, as the value -1 or +1 for variable $n_i$
  – So, a vector, $n$, of length |V| of either -1 or 1

• And do so in a way that minimizes the number of graph edges between the groups - so let’s minimize…
  $(n_i - n_j)^2$ for all edges (i, j)

Checkup
What is $(n_i - n_j)^2$ for two connected nodes that …

• Are in the same group?
  $(1 - 1)^2 = 0$; $(-1 - (-1))^2 = 0$

• Are in different groups?
  $(1 - (-1))^2 = 4$; $(-1 - 1)^2 = 4$

So minimizing the sum across edges also minimizes cross-group edges – woohoo!

Converting to Matrices
• So we have $\sum (n_i - n_j)^2$, summed over all edges (i, j)
  – Or: $\sum (n_i^2 - 2 n_i n_j + n_j^2)$

• The way to express this in terms of the Laplacian is… $n^\top L n$
  – Each node will have an $n_i^2\, d(n_i)$ term (diagonal), so one $n_i^2$ per edge = $n_i^2 + n_j^2$
  – Each edge will have two $(-1)(n_i n_j)$ terms (symmetric) = $-2 n_i n_j$
  – Each non-existent edge will multiply by zero from $A_{ij}$ being zero
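
The identity can be checked numerically. The sketch below evaluates both sides on the 7-node example graph for a ±1 assignment; the edge list is assumed from the deck's adjacency matrix and the function names are illustrative.

```python
NODES = ["A", "B", "C", "D", "E", "F", "G"]
EDGES = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("D", "E"),
         ("D", "F"), ("D", "G"), ("E", "F"), ("F", "G")]
INDEX = {name: i for i, name in enumerate(NODES)}


def laplacian(edges, size):
    """Laplacian L = D - A as nested lists."""
    L = [[0] * size for _ in range(size)]
    for u, v in edges:
        i, j = INDEX[u], INDEX[v]
        L[i][i] += 1
        L[j][j] += 1
        L[i][j] -= 1
        L[j][i] -= 1
    return L


def quadratic_form(L, n):
    """n^T L n, written out as a double sum."""
    size = len(n)
    return sum(n[i] * L[i][j] * n[j] for i in range(size) for j in range(size))


def edge_sum(edges, n):
    """Sum of (n_i - n_j)^2 over all edges (i, j)."""
    return sum((n[INDEX[u]] - n[INDEX[v]]) ** 2 for u, v in edges)


L = laplacian(EDGES, len(NODES))

# +1 for A, B, C and -1 for D, E, F, G: only edge B-D crosses the groups.
split = [1, 1, 1, -1, -1, -1, -1]
# An alternating assignment that cuts many more edges.
alternating = [1, -1, 1, -1, 1, -1, 1]

print(quadratic_form(L, split), edge_sum(EDGES, split))
print(quadratic_form(L, alternating), edge_sum(EDGES, alternating))
```

Both expressions equal 4 times the number of cross-group edges, so the natural two-community split scores far lower than the alternating assignment.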

Checkup
• What is the easiest way to minimize the following function: $n_i^2 - 2 n_i n_j + n_j^2$

Answer
• What is the easiest way to minimize the following function: $n_i^2 - 2 n_i n_j + n_j^2$

• n = [0 0 … 0]
  – So we need to force values…

Forcing Non-Trivial Solutions


• Easy: $\sum_i n_i^2 = |V|$
  – Each assignment is -1 or +1, so the squared assignments must sum to the total number of nodes

• Or, in vector notation, $n^\top n = |V|$

Objective Function – Minimize!


Minimize, with a Lagrange multiplier $\lambda$ for the constraint $n^\top n = |V|$:

$$n^\top L n - \lambda\,(n^\top n - |V|)$$

Setting the gradient with respect to $n$ to zero:

$$L n - \lambda n = 0$$

SO, to find a good partitioning we need to...

• Find an eigenvector…
• with the second-smallest eigenvalue
  – Why doesn’t 0 work?

Example (1): Graph

[Figure: four-node graph with edges A-B, A-C, and B-D]

Example (1): Laplacian

Rows and columns in the order A, B, C, D:

$$L = \begin{bmatrix}
2 & -1 & -1 & 0 \\
-1 & 2 & 0 & -1 \\
-1 & 0 & 1 & 0 \\
0 & -1 & 0 & 1
\end{bmatrix}$$

Example (1): Eigendecomposition

Sign = Partitioning (+ vs -)
+ {B, D}
- {A, C}
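
The sign-based split can be reproduced without a linear-algebra library. The sketch below swaps the full eigendecomposition for power iteration on (cI − L) with the constant vector projected out, which converges to the eigenvector of the second-smallest eigenvalue when the graph is connected. The shift c = 2 × max degree, the iteration count, and the function name are choices made for this illustration.

```python
NODES = ["A", "B", "C", "D"]
EDGES = [("A", "B"), ("A", "C"), ("B", "D")]


def fiedler_partition(nodes, edges, iterations=500):
    """Split nodes by the sign of the second-smallest Laplacian eigenvector."""
    size = len(nodes)
    index = {name: i for i, name in enumerate(nodes)}
    L = [[0.0] * size for _ in range(size)]
    for u, v in edges:
        i, j = index[u], index[v]
        L[i][i] += 1
        L[j][j] += 1
        L[i][j] -= 1
        L[j][i] -= 1

    # No Laplacian eigenvalue exceeds 2 * max degree, so (shift*I - L) is PSD
    # and its dominant directions are the smallest eigenvectors of L.
    shift = 2 * max(L[i][i] for i in range(size))

    v = [float(i) for i in range(size)]            # deterministic start vector
    for _ in range(iterations):
        mean = sum(v) / size
        v = [x - mean for x in v]                  # remove the constant (lambda = 0) part
        v = [shift * v[i] - sum(L[i][j] * v[j] for j in range(size))
             for i in range(size)]                 # multiply by (shift*I - L)
        norm = sum(x * x for x in v) ** 0.5
        v = [x / norm for x in v]

    positive = {nodes[i] for i in range(size) if v[i] > 0}
    return positive, set(nodes) - positive


group_1, group_2 = fiedler_partition(NODES, EDGES)
print(sorted(group_1), sorted(group_2))
```

The overall sign of an eigenvector is arbitrary, so which group is labeled + can flip; only the grouping itself is meaningful.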


Example (1): Partitioning

[Figure: the graph partitioned into {A, C} and {B, D}]

Example (2): Graph


Example (2): Laplacian


Example (2): Eigendecomposition


Example (2): Partition


Many Other Topics


• Overlapping communities

• Network/Node properties
– Triangles, neighborhoods
– Centrality, influence

• Network formation

• Problems
– Link prediction