IFT6802, April 2005, by Jean Vaucher. "I read somewhere that everybody on this planet is separated by only six other people." Although social networks are clustered, short paths seem to exist between any two people.
JV Small World
Small World Networks
Jean Vaucher
IFT6802, April 2005

Contents
Pertinence of topic
Characterization of networks: regular, random or natural
Properties of networks: diameter, clustering coefficient
Watts network models (alpha and beta)
Power-law networks
Clustered networks with short paths
Can these short paths be found?

Duncan J. Watts
Six Degrees: The Science of a Connected Age, 2003, W. W. Norton.
"I read somewhere that everybody on this planet is separated by only six other people. Six degrees of separation between us and everybody on this planet."
Six Degrees of Separation, by John Guare

Networks
Networks are everywhere: Internet, neurons in brains, social networks, transportation.
Networks have been studied for a long time.
Euler (1736): the Bridges of Königsberg; the theory of graphs, which is now a major (and difficult, or almost obvious) branch of mathematics.

So what is new?
Global interconnections: Internet, power grids, mass travel, mass culture
Failures: computer viruses, power blackouts, epidemics
Modeling and analysis

Milgram's Experiment
Found short chains of acquaintances linking pairs of people in the USA who didn't know each other.
Source person in Nebraska; target person in Massachusetts.
Each person sends the message on by forwarding it to someone they know personally (who should be closer to the target).
The average length of the chains that were completed was between 5 and 6 steps.
The "six degrees of separation" principle.

Correct question
WHY are there short chains of acquaintances linking together arbitrary pairs of strangers?
Or:
Why is this surprising?

Random networks
In a random network, if everybody has 100 friends distributed randomly in the world population, this isn't strange.
In 6 hops, you can reach $100^6$ people: a million million, far more than 6,000 million (the world population).
BUT: our social networks tend to be clustered.

Social networks
Not random but clustered.
Most of our friends come from our geographical or professional neighbourhood.
Our friends tend to have the same friends.
BUT: in spite of having clustered social networks, there seem to exist short paths between any random nodes.

Social network research
Devise various classes of networks
Study their properties

Network parameters
Network type: regular, random, natural
Size: number of nodes
Number of connexions: average and distribution
Selection of neighbours

Regular network topologies
[Figure: star, tree, grid, bus and ring topologies]

Connectivity in random graphs
Nodes are connected by links in a purely random fashion.
How large is the largest connected component (as a fraction of all nodes)?
It depends on the number of links per node (Erdős, Rényi 1959).

Connecting nodes
[Figure sequence: random links are added to a set of nodes. Random Network (1): paths. Random Network (2): paths, trees. Random Network (3): paths, trees, networks. Random Network (3+): paths, trees, networks. Network Connectivity (4): paths, trees, networks, fully connected.]

Connectivity of a random graph
[Figure: fraction of all nodes in the largest component (0 to 1) versus average number of links per node (tick at 1), showing a disconnected phase followed by a connected phase.]
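
The transition from the disconnected phase to the connected phase can be reproduced with a small simulation. The following is a minimal sketch, not the method used for the original figure: it assumes links are placed between uniformly random pairs of nodes (self-links and duplicates are tolerated for simplicity), and the function name `largest_component_fraction` is hypothetical.

```python
import random


def largest_component_fraction(n, avg_links, seed=0):
    """Fraction of nodes in the largest connected component of a random graph.

    Assumption: about n * avg_links / 2 links are placed between uniformly
    random pairs of nodes, so each node has avg_links links on average.
    """
    rng = random.Random(seed)
    parent = list(range(n))   # union-find forest
    size = [1] * n            # component size, valid at each root

    def find(x):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for _ in range(int(n * avg_links / 2)):
        a, b = rng.randrange(n), rng.randrange(n)
        ra, rb = find(a), find(b)
        if ra != rb:
            # Attach the smaller component under the larger one.
            if size[ra] < size[rb]:
                ra, rb = rb, ra
            parent[rb] = ra
            size[ra] += size[rb]

    return max(size[find(x)] for x in range(n)) / n


if __name__ == "__main__":
    for avg in (0.5, 1.0, 1.5, 2.0, 4.0):
        print(avg, round(largest_component_fraction(20000, avg), 3))
```

Running the loop for increasing values of `avg_links` should show the largest component staying a small fraction of the network below about one link per node and growing quickly above it.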

Regular or ordered network
[Figure]

Network measures
Connectivity is not the main measure.
Characteristic path length (L): the average length of the shortest path connecting each pair of agents (nodes).
Clustering coefficient (C): a measure of local interconnection. If agent i has $k_i$ immediate neighbours, $C_i$ is the fraction of the $k_i(k_i-1)/2$ possible connections that are realized between i's neighbours. C is the average of the $C_i$.
Diameter: the maximum value of the path length.

Regular vs random networks
                                                Regular          Random
Average number of connections/node              few, clustered   fewer, spread
Diameter                                        large            moderate
Number of connections needed to fully connect   many             fewer (<2/3)

Natural networks
Between regular grids and totally random graphs.
Need for parametrized models: regular -> natural -> random
Watts alpha model (not intuitive)
Beta rewiring model

Clustering
Clustering measures the fraction of the neighbours of a node that are themselves connected.
Regular graphs have a high clustering coefficient but also a high diameter.
Random graphs have a low clustering coefficient but a low diameter.
Neither model matches the properties expected from real networks!
Random graph (k = 4): short path length, $L \sim \log_k N$; almost no clustering, $C \sim k/n$
Regular graph (k = 4): long paths, $L \sim n/(2k)$; highly clustered, $C \sim 3/4$
The base network is a circle.

Small-World Networks
Random rewiring of a regular graph (by Watts and Strogatz).
With probability p (or β), rewire each link in a regular graph to a randomly selected node.
The resulting graph has properties of both regular and random graphs: high clustering and short path length.
FreeNet has been shown to result in small-world graphs.
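
The two measures and the rewiring step can be tried out on a small ring. This is a minimal sketch under stated assumptions: the ring links each node to its k nearest neighbours (k even), a rewired link keeps one end and moves the other to a random node it is not already linked to, and L is averaged over reachable pairs only. All function names are hypothetical.

```python
import random
from collections import deque


def ring_lattice(n, k):
    """Ring of n nodes, each linked to its k nearest neighbours (k even)."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj


def rewire(adj, p, rng):
    """With probability p, move one end of each original link to a random node."""
    n = len(adj)
    edges = [(i, j) for i in adj for j in adj[i] if i < j]
    for i, j in edges:
        if rng.random() < p and len(adj[i]) < n - 1:
            new = rng.randrange(n)
            while new == i or new in adj[i]:
                new = rng.randrange(n)
            adj[i].remove(j)
            adj[j].remove(i)
            adj[i].add(new)
            adj[new].add(i)
    return adj


def clustering_coefficient(adj):
    """Average over nodes of the fraction of neighbour pairs that are linked."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue  # C_i is taken as 0 for nodes with fewer than 2 neighbours
        nb = list(nbrs)
        links = sum(1 for a in range(k) for b in range(a + 1, k)
                    if nb[b] in adj[nb[a]])
        total += links / (k * (k - 1) / 2)
    return total / len(adj)


def characteristic_path_length(adj):
    """Mean shortest-path length over reachable pairs (breadth-first search)."""
    total, pairs = 0, 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


if __name__ == "__main__":
    for p in (0.0, 0.01, 0.1, 1.0):
        g = rewire(ring_lattice(500, 4), p, random.Random(1))
        print(p, round(characteristic_path_length(g), 2),
              round(clustering_coefficient(g), 3))
```

For small p the printed path length should fall well below that of the plain ring while the clustering coefficient stays close to the ring's value, which is the small-world regime described above.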

Example: 4096-node ring
Regular graph: n nodes, k nearest neighbours; path length ~ n/(2k) = 4096/16 = 256 (these figures correspond to k = 8)
Random graph: path length ~ log(n)/log(k) ~ 4
Rewired graph (1% of nodes): path length ~ random graph, clustering ~ regular graph
[Figure: small-world graph, k = 4]

Small-world networks: beta network
[Figure: path length L and clustering coefficient C versus rewiring probability β, from 0 to 1]

More exactly (p = β)
[Figure: C and L versus p, with the region of small-world behaviour marked]

Effect of short-cuts
Huge effect of just a few short-cuts.
The first 5 rewirings reduce the path length by half, regardless of the size of the network.
A further 50% gain requires 50 more short-cuts.

The strength of weak ties
Granovetter (1973): effective social coordination does not arise from densely interlocking strong ties, but derives from the occasional weak ties.
This is because valuable information comes from these relations (it is valuable if, and because, it is not available to other individuals in your immediate network).

Two ways of constructing
[Figure]

Alpha model
Watts' first model (1999)
Inspired by Asimov's I, Robot novels: R. Daneel Olivaw, Elijah Baley, Caves of Steel (Earth), Solaria

Two extreme types of social networks
Caveman's world: people live in isolated communities; the probability of meeting a random person is high if you have mutual friends and very low if you don't.
Solaria: people live isolated from each other but with supreme communication capabilities; your social history is irrelevant to your future.

Alpha network
Alpha (α): distance parameter
α = 0: if A and B have a friend in common, they know each other (Caveman world)
α = ∞: A and B don't know each other, no matter how many common friends they have (Solarian world)
[Figure: likelihood that A meets B versus the number of mutual friends shared by A and B, with curves for α = 0 (Caveman world), α = 1 and α = ∞ (Solaria world)]

Alpha network
[Figure: path length L and clustering coefficient C versus α, with the critical α marked and regions labelled "fragmented networks" and "small-world networks"]
L drops because we only count nodes that are connected.

How about real networks?
All nodes in alpha and beta networks are equal, in the sense that the number of connections each node has is not very far from the average.
Watts and Strogatz had used a normal distribution.
The real world is not like that: sizes of cities, wealth of individuals in the USA, hubs in transportation systems.
Barabási and Albert (1999): scale-free networks, whose connectivity is defined by a power-law distribution.

Random Networks
Each node is connected to a few other nodes. The number of connections per node forms a Poisson distribution, with a small average number of connections per node.
This and the three following graphics are from: Linked: The New Science of Networks, by Albert-László Barabási, 2002.

Scale-Free Networks
Each node is connected to at least one other; most are connected to only one, while a few are connected to many. The number of connections per node forms a hyperbolic distribution, with no meaningful average number of connections per node.

Random vs scale-free
Scale-free networks are associated with networks that grow by natural processes, in which the number of nodes increases with time, not just the number of connections.

Power law phenomena
Average and median are far apart: whales and minnows.
The average comes from a few large nodes.
The median is governed by the majority of small nodes.

Performance
Real power-law networks also have short distances.
Existence of a central backbone of highly connected hub nodes.
Similar phenomena have been noted in linguistics (Zipf) and economics (Pareto).

Zipf's law: linguistics
Zipf, a Harvard linguistics professor, sought to determine the frequency of use of the 3rd or 8th or 100th most common words in English text.
Zipf's law states that the frequency y of a word is inversely proportional to its rank r: $y \sim r^{-b}$, with b close to unity.
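
As an illustration of how the exponent b might be estimated from a text, here is a minimal sketch. It assumes an ordinary least-squares fit of log(frequency) against log(rank), which is only one of several possible estimators; the function name `zipf_exponent` and the synthetic word list are hypothetical, not data from the presentation.

```python
import math
from collections import Counter


def zipf_exponent(words):
    """Estimate b in y ~ r**(-b) by a least-squares fit in log-log space.

    Needs at least two distinct words, otherwise the slope is undefined.
    """
    counts = sorted(Counter(words).values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(count) for count in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx   # the fitted slope is -b


if __name__ == "__main__":
    # Synthetic text: the word of rank r appears about 6000 / r times.
    words = [f"w{r}" for r in range(1, 51) for _ in range(round(6000 / r))]
    print(round(zipf_exponent(words), 3))
```

On real text the fit is usually restricted to a range of ranks, since the most frequent and the rarest words tend to deviate from a straight line on a log-log plot.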
Zipf presentations

The Pareto Income Distribution
The Pareto distribution gives the probability that a person's income is greater than or equal to x and is expressed as
$P(X \ge x) = (m/x)^k$, for $x \ge m$, where $m > 0$ is the minimum income and $k > 0$ is the shape parameter.

Vilfredo Pareto, 1848-1923
Italian economist, born in Paris.
Polytechnic Institute in Turin in 1869; worked for the railroads.
Pareto did not study economics seriously until he was 42.
In 1893 he succeeded his mentor, Walras, as chair of economics at the University of Lausanne.

Pareto's contributions
Pareto optimality. A Pareto-optimal allocation of resources is achieved when it is not possible to make anyone better off without making someone else worse off.
Pareto's law of income distribution. In 1906, Italian economist Vilfredo Pareto created a mathematical formula to describe the unequal distribution of wealth in his country, observing that 20% of the people owned 80% of the wealth.

[Figure: Pareto distribution with m = 10000, k = 1; $P(X \ge x)$ versus x on linear axes and on a log-log plot]
The Pareto distribution is said to be scale-free because it lacks a characteristic length scale.

Building Power-law networks
It is easy to create PL networks:
Build the network node by node.
Connect each new node to an existing node.
The probability of connecting to an existing node is proportional to its number of links.
The rich get richer; the poor get poorer.

Structure and dynamics: the case of centrality
Centres are in networks
by design (central control, dictatorship),
by non-design (unnoticed critical resources, informal groups),
or they emerge as a consequence of certain events ("he was at the right place at the right time"; clapping in unison).

Further applications
Search in networks: short paths are not enough
Epidemics, medical and software: danger of short-cuts; paths + infectiousness
Infection by ideas: fads and economic bubbles; individual rationality; peer pressure
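
The node-by-node construction described under "Building Power-law networks" can be sketched as follows. This is a minimal illustration, assuming the network starts from two linked nodes and each new node adds exactly one link; the function name `grow_network` and the list-of-link-ends trick are choices made here, not taken from the presentation.

```python
import random


def grow_network(n, seed=0):
    """Grow a network node by node and return the list of node degrees.

    Assumption: start from two linked nodes; each new node links to one
    existing node chosen with probability proportional to its number of links.
    """
    rng = random.Random(seed)
    degree = [1, 1]   # degree[i] is the number of links of node i
    ends = [0, 1]     # each link contributes both of its end nodes
    for new in range(2, n):
        # A uniform draw from the list of link ends selects a node
        # in proportion to its current number of links.
        target = rng.choice(ends)
        degree[target] += 1
        degree.append(1)
        ends.extend((new, target))
    return degree


if __name__ == "__main__":
    degree = grow_network(10000)
    ranked = sorted(degree)
    print("median:", ranked[len(ranked) // 2])
    print("mean:", sum(degree) / len(degree))
    print("largest hubs:", ranked[-5:])
```

Comparing the median with the largest degrees gives a feel for the "whales and minnows" picture: most nodes keep a single link while a few early nodes accumulate many.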

Getting practical: search in networks
A node may be linked to another node via a short path, but what does it matter if you cannot find the path?
In alpha and beta networks there is no notion of distance, therefore directed searches cannot recognize shortcuts.
Kleinberg's (gamma) networks (2000)

Kleinberg's Small-World Model
Embed the graph into a grid of dimension dim (2D in the examples).
A constant number p of short-range links (neighbourhood).
q long-range links: choose long-range links such that the probability of having a long-range contact at distance d is proportional to $1/d^r$.
Importance of r!
Decentralized (greedy) routing performs best iff r = dimension of the space (here 2).
[Figure: example with r = 2]

Influence of r (1)
Each peer u has a link to the peer v with probability proportional to $1/d(u,v)^r$, where d(u,v) is the distance between u and v.
Optimal value: r = dim = dimension of the space.
If r < dim, we tend to choose more far-away neighbours (a decentralized algorithm can quickly approach the neighbourhood of the target, but then slows down until it finally reaches the target itself).
If r > dim, we tend to choose closer neighbours (the algorithm quickly finds a target in its neighbourhood, but reaches it slowly if it is far away).
When r = 0, long-range contacts are chosen uniformly. Random graph theory proves that there exist short paths between every pair of vertices, BUT there is no decentralized algorithm capable of finding these paths.

[Figures: link probability p(r) versus distance r on log-log axes, for increasing γ starting from γ = 0; typical length of directed search versus γ, with the value 2 marked and regions labelled "short paths cannot be found" and "no short paths"]

Influence of r (or γ)
Given a node u, partition the remaining peers into sets $A_1, A_2, A_3, \ldots, A_{\log N}$, where $A_i$ consists of all nodes whose distance from u is between $2^i$ and $2^{i+1}$, i = 0..log N - 1.
Then, given r = dim, each long-range contact of u is nearly equally likely to belong to any of the sets $A_i$.
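
Greedy routing on such a grid can be experimented with directly. The sketch below is a simplified illustration, not Kleinberg's exact construction: it assumes an n x n grid without wrap-around, Manhattan distance, the four grid neighbours as short-range links and a single long-range contact per node (q = 1). All function names are hypothetical.

```python
import random


def dist(u, v):
    """Manhattan distance between two grid points."""
    return abs(u[0] - v[0]) + abs(u[1] - v[1])


def build_contacts(n, r, rng):
    """One long-range contact per node, with probability proportional to d**(-r)."""
    nodes = [(x, y) for x in range(n) for y in range(n)]
    contact = {}
    for u in nodes:
        others = [v for v in nodes if v != u]
        weights = [dist(u, v) ** -r for v in others]
        contact[u] = rng.choices(others, weights=weights)[0]
    return contact


def greedy_route(n, contact, source, target):
    """Hops needed when each node forwards to its contact nearest the target."""
    steps, u = 0, source
    while u != target:
        x, y = u
        candidates = [(x + dx, y + dy)
                      for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                      if 0 <= x + dx < n and 0 <= y + dy < n]
        candidates.append(contact[u])
        # Some grid neighbour is always one step closer, so this terminates.
        u = min(candidates, key=lambda v: dist(v, target))
        steps += 1
    return steps


def mean_greedy_hops(n, r, trials=200, seed=0):
    """Average greedy route length between random source and target pairs."""
    rng = random.Random(seed)
    contact = build_contacts(n, r, rng)
    total = 0
    for _ in range(trials):
        s = (rng.randrange(n), rng.randrange(n))
        t = (rng.randrange(n), rng.randrange(n))
        total += greedy_route(n, contact, s, t)
    return total / trials


if __name__ == "__main__":
    for r in (0, 1, 2, 3, 4):
        print(r, mean_greedy_hops(20, r))
```

On a grid this small the differences between values of r may be modest; the advantage of r = dim is an asymptotic statement about large networks.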
[Figure: the sets $A_1$, $A_2$, $A_3$, $A_4$]

The New Yorker View
When gamma is at its critical value of two, the resulting network has the peculiar property that nodes possess the same number of ties at all length scales (in a 2D world).

DHTs (distributed hash tables) and the Kleinberg model
P-Grid's model
Kleinberg's model
Balanced n-ary search

More hierarchy
Kleinberg's model has only one distance measure, geographical (2D).
In human society, social distance is multidimensional.
If A is close to B, and C is close to B but in a different dimension, then A and C can be very far from each other: a violation of the triangle inequality.
But multidimensionality may enable messages to be transmitted in networks very efficiently.

Watts et al. (2002): search in social networks
[Figure: region of searchable networks in the plane of H (1 to 10) and α (0 to 6), with the Kleinberg condition marked]
α = homophily, the tendency of like to associate with like
H = number of dimensions along which individuals measure similarity

Small Worlds & Epidemic diseases
Nodes are living entities.
A link is a contact.
3 states: uninfected, infected, recovered (or dead)

Epidemic diseases
The level of infectiousness needed to start an epidemic varies with the presence of shortcuts.
In a regular grid, a disease may die out due to a lack of victims.
In a small world, pandemics are facilitated: SARS, mad cow disease in England.
[Figure: threshold infectiousness versus fraction of random shortcuts, from 0 to 1]

Failures in networks
Fault propagation or viruses.
Scale-free networks are far more resistant to random failures than ordinary random networks, because most nodes are leaves.
But the failure of hubs can be catastrophic. Hubs may be vulnerable, or the targets of deliberate attacks, which may make scale-free networks more vulnerable to deliberate attacks.
Cascades of failures.

Back to Social Networks

Spread of ideas
Messages in social networks
Fads and fashions: body piercing, baseball caps, Harry Potter, Amélie Poulain
Innovation, scientific revolutions: solar-centric universe, plate tectonics
Is it like the spread of disease?

Effect of peers & pundits
People's decisions are affected by what others do and think.
Pressure to conform?
An efficient strategy when there is insufficient knowledge or expertise. Example: picking a restaurant.

Economic models
Selfish agents, individual rationality
Markets: equilibrium???
Many agents are trend followers: speculation, crashes

Social Experiments
Factors which affect decisions: Milgram, Asch

Stanley Milgram (1933-1984)
Controversial social psychologist, Yale & Harvard
Small world experiment, 1967: 6 degrees of separation
Obedience to authority, 1963

Validity of Milgram's experiment
Global connectivity? US only: Omaha to a Boston stockbroker.
Only 96 valid subjects (out of 300): 100 from Boston, 100 big investors, 96 picked at random in Nebraska.
Success? 18 out of 96.
Other experiments: 3 out of 60. Worse.

Conformity
Other presentation

Threshold models of decisions
[Figure, standard disease spreading model: probability of infection (0 to 1) versus number of infected neighbours]
[Figure, social decision making: probability of choosing option A (0 to 1) versus fraction of neighbours choosing A over B (0 to 1), with a critical threshold]

Global Cascades
The idea catches on.

End