Network Modelling and Variational Bayesian Inference For Structure Analysis of Signed Networks
Presentation on
GROUP – 5
AMEYA GUJAR – 21535002
RAMAN PATEL – 21535024
Guided By:
Dr. Pradumn K. Pandey
Assistant Professor,
Department of Computer Science & Engg,
IIT Roorkee - 247667
Signed Networks
● Signed networks consist of nodes, positive links, and negative links.
● Nodes represent individuals; positive links represent like, trust, or support relationships, and
negative links represent dislike, distrust, or opposition.
● Unsigned networks only describe whether a relationship between two individuals exists.
Signed networks carry more information by splitting that single relationship into positive
and negative relationships.
● Signed networks usually fall into two categories according to whether the links have a direction:
undirected signed networks and directed signed networks.
Signed Networks
Balance Theory
Status Theory
● Current methods can efficiently analyze only signed networks that have a community structure
alone.
● Real-world networks, in general, may contain not just community structure but a mixture of
communities and other structures, such as peripheral nodes or bipartite or multipartite
structure.
● Current methods are therefore unable to analyze signed networks in which communities coexist
with peripheral nodes, bipartite, or other structures.
Solution Approach
● The proposed method, VBS, has two key components: a network model and its learning
algorithm.
● For the network model, the paper presents a new probabilistic model that can efficiently model
signed networks with coexisting structures.
● For the learning algorithm, the authors work in the variational Bayesian framework to derive the
approximate distributions of the model parameters and the latent variable, along with a model selection criterion.
Model
The model is $X = (K, z, \omega, \pi)$, where $K$ is the number of groups.
$\omega_k$ denotes the probability that a node is assigned to group $k$, with $\sum_{k=1}^{K} \omega_k = 1$.
$\pi$ is a $K \times K \times 3$ array, where $\pi_{lq1}$, $\pi_{lq2}$ and $\pi_{lq3}$ denote the probabilities that there is a positive link, no link, or a
negative link, respectively, between a pair of nodes in groups $l$ and $q$.
In addition, the model contains an indicator (latent) variable $z$, an $n \times K$ matrix
holding the group information of the nodes: $z_{ik} = 1$ if node $i$ is assigned to group $k$, otherwise $z_{ik} = 0$. Given
the parameter $\omega$, the probability distribution of $z$ is as follows:
$p(z \mid \omega) = \prod_{i=1}^{n} \prod_{k=1}^{K} \omega_k^{z_{ik}}$
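As a rough illustration of the generative process described on this slide, the Python sketch below first draws a group for each node from ω and then draws each link type from π. It assumes an undirected network, a symmetric π (so that π[l][q] equals π[q][l]), and the ordering (positive, no link, negative) for the last index of π. The function name `sample_signed_network` and the encoding +1 / 0 / −1 are illustrative choices, not taken from the paper.

```python
import random

def sample_signed_network(omega, pi, n, seed=None):
    """Sample (z, adj) from the block model; an illustrative sketch only.

    omega: list of K group probabilities summing to 1.
    pi:    K x K x 3 nested list; pi[l][q] = (P(+ link), P(no link), P(- link)).
           Assumed symmetric in l and q because the network is undirected.
    """
    rng = random.Random(seed)
    K = len(omega)
    # Draw one group per node: z_i ~ Multinomial(1, omega).
    groups = rng.choices(range(K), weights=omega, k=n)
    z = [[1 if groups[i] == k else 0 for k in range(K)] for i in range(n)]
    # Draw each undirected link: +1 positive, 0 absent, -1 negative.
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            l, q = groups[i], groups[j]
            sign = rng.choices([1, 0, -1], weights=pi[l][q], k=1)[0]
            adj[i][j] = adj[j][i] = sign
    return z, adj
```

For example, setting `pi[l][l] = [1, 0, 0]` and `pi[l][q] = [0, 0, 1]` for `l != q` produces a perfectly balanced network in which every within-group link is positive and every between-group link is negative.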
VBS
● Proposition 1. Given the distributions $q(\omega)$ and $q(\pi)$ of the parameters $\omega$ and $\pi$, the optimal
distribution $q(z_i)$ of $z_i$ is the multinomial distribution $q(z_i) = \mathcal{M}(z_i; 1, \tau_{i1}, \ldots, \tau_{iK})$,
where $\tau_{ik}$ is the probability of node $i$ belonging to group $k$, with $\sum_{k=1}^{K} \tau_{ik} = 1$.
● Proposition 2. Given the distribution $q(z)$, the optimal distribution $q(\omega)$ of the parameter $\omega$ is a
Dirichlet distribution of the same form as its prior $p(\omega)$: $q(\omega) = \mathrm{Dir}(\omega; \rho)$, with
$\rho_q = \rho_q^0 + \sum_{i=1}^{n} \tau_{iq}$.
● Proposition 3. Given the approximate distribution $q(z)$, the optimal approximate distribution $q(\pi)$
of the parameter $\pi$ is a product of factors, each of which is a Dirichlet distribution:
$q(\pi) = \prod_{l,q} \mathrm{Dir}(\pi_{lq}; \eta_{lq})$.
● These update equations are then applied iteratively until convergence.
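The update in Proposition 2 is simple enough to sketch directly: each Dirichlet parameter is the prior value plus the total responsibility assigned to that group. The Python below is a minimal sketch of that single step only. The function names `update_rho` and `dirichlet_mean` are hypothetical, and the updates for τ and η are not shown because their formulas are not given on the slide.

```python
def update_rho(rho0, tau):
    """Proposition 2 update: rho_q = rho0_q + sum_i tau_iq.

    rho0: list of K prior Dirichlet parameters.
    tau:  n x K nested list; tau[i][q] is the probability that node i is in group q.
    """
    K = len(rho0)
    return [rho0[q] + sum(row[q] for row in tau) for q in range(K)]

def dirichlet_mean(rho):
    """Mean of Dir(rho), a point estimate of the group proportions omega."""
    total = sum(rho)
    return [r / total for r in rho]
```

With a flat prior `rho0 = [1.0, 1.0]` and three nodes whose responsibilities are `[[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]`, the updated parameters are the prior plus the column sums of `tau`, and `dirichlet_mean` turns them into estimated group proportions.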
NMI
● We use the normalized mutual information (NMI) to evaluate the performance of the algorithms.
● The range for the NMI value is from 0 to 1.
● The larger the NMI value is, the better the performance of the algorithm is.
● NMI measures how closely a detected clustering agrees with the ground-truth clustering.
● Because it is normalized, NMI values can be compared across clusterings with
different numbers of clusters.
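To make the measure concrete, here is a small pure-Python sketch of NMI between two label lists. It assumes the arithmetic-mean normalization, NMI = 2·I(A;B) / (H(A) + H(B)); other normalizations exist, and the slides do not say which one was used in the experiments. The function name `nmi` is an illustrative choice.

```python
from collections import Counter
from math import log

def nmi(labels_a, labels_b):
    """NMI with arithmetic-mean normalization: 2*I(A;B) / (H(A) + H(B))."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must have the same length")
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    count_ab = Counter(zip(labels_a, labels_b))

    def entropy(counts):
        # Shannon entropy of the empirical label distribution.
        return -sum((c / n) * log(c / n) for c in counts.values())

    h_a, h_b = entropy(count_a), entropy(count_b)
    if h_a + h_b == 0.0:
        return 1.0  # both partitions put every node in a single group
    # Mutual information from the joint and marginal label counts.
    mutual = sum(
        (c / n) * log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in count_ab.items()
    )
    return 2.0 * mutual / (h_a + h_b)
```

Because only the grouping matters, relabelled but identical partitions such as `[0, 0, 1, 1]` and `[1, 1, 0, 0]` score 1, while statistically independent partitions such as `[0, 0, 1, 1]` and `[0, 1, 0, 1]` score 0.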
Random Synthetic Signed Network Generation
● Most of the synthetic networks are generated by a generative model, which is defined as follows:
Model = M(c, n, k, pin, p−, p+)
○ where c, n and k respectively denote the number of communities, the number of nodes in
each community and the average degree of the nodes
○ pin is the probability of the node connecting to other nodes in the same community,
accordingly, 1 − pin is the probability of the node connecting to other nodes in the different
communities,
○ p− and p+ are respectively the probability of negative links within communities and positive
links between communities, which are also called the noise parameters
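The slides do not spell out how individual links are drawn, so the Python sketch below is one possible reading of M(c, n, k, pin, p−, p+), not the authors' generator. It assumes each node pair is linked independently, with per-pair probabilities chosen so that the expected degree is about k and a fraction pin of it stays inside the node's own community. The function name `generate_signed_network` and the +1 / 0 / −1 encoding are hypothetical.

```python
import random

def generate_signed_network(c, n, k, p_in, p_minus, p_plus, seed=None):
    """Sketch of a synthetic signed network with c communities of n nodes each.

    Assumption: independent per-pair links, scaled to an expected degree of about k.
    Returns (adj, labels): adj[i][j] in {+1, 0, -1}, labels[i] = community of node i.
    """
    rng = random.Random(seed)
    total = c * n
    labels = [i // n for i in range(total)]
    # Per-pair link probabilities (clipped to 1) for same / different communities.
    prob_same = min(1.0, k * p_in / max(n - 1, 1))
    prob_diff = min(1.0, k * (1 - p_in) / (n * (c - 1))) if c > 1 else 0.0
    adj = [[0] * total for _ in range(total)]
    for i in range(total):
        for j in range(i + 1, total):
            same = labels[i] == labels[j]
            if rng.random() >= (prob_same if same else prob_diff):
                continue  # no link between i and j
            if same:
                # Within a community: negative with probability p_minus (noise).
                sign = -1 if rng.random() < p_minus else 1
            else:
                # Between communities: positive with probability p_plus (noise).
                sign = 1 if rng.random() < p_plus else -1
            adj[i][j] = adj[j][i] = sign
    return adj, labels
```

With both noise parameters set to 0, every within-community link is positive and every between-community link is negative, which is the balanced baseline that the noise parameters then perturb.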
Results
NMI for randomly generated networks as p+ varies:
Generate(4, 32, 32, 0.5, 0.5, p*0.05)
p+     0       0.05    0.1     0.15    0.2     0.25    0.3     0.35    0.4     0.45
NMI    1.0000  0.8043  1.0000  0.7831  0.7638  0.7572  1.0000  0.4133  0.8077  0.4689
NMI    0.8660  1.0000  0.7166  0.7729  0.5163  0.8660  0.8338  0.4254  0.0486  0.3867
NMI    0.8519  1.0000  1.0000  0.4716  0.4715  1.0000  0.7714  0.7968  0.4593  0.5516
NMI    1.0000  0.5424  1.0000  1.0000  0.4740  1.0000  0.6738  0.5220  0.8447  0.7511
NMI    0.8660  0.8128  0.7732  0.7647  0.5164  0.8427  0.8519  0.7576  0.3943  0.0801
These networks are unbalanced: the larger the value of p+ is, the more positive links there are between the
communities. As p+ increases, the algorithm finds it harder to recover the communities.
As the p+ value changes
[Figure panels: p+ = 0, 0.1, 0.2, 0.3, 0.4]
As the p− value changes
p−     0       0.05    0.1     0.15    0.2     0.25    0.3     0.35    0.4     0.45
NMI    0.8660  0.8660  0.7508  0.8393  0.8051  0.8660  0.8427  0.8245  0.7490  0.8372
NMI    0.8660  0.8660  0.8348  1.0000  1.0000  0.8660  0.6929  0.7814  0.6849  0.5056
NMI    0.8660  1.0000  1.0000  1.0000  0.8044  1.0000  0.7986  0.7719  0.7601  0.8101
These networks are unbalanced: the noise lies not only within the communities but also between them, meaning there are
some negative links within communities and some positive links between communities. The larger the value of p− is, the more
negative links there are within communities. The algorithm can correctly find the communities when p− < 0.3; the accuracy begins to decline
when p− > 0.3.
[Figure panels: p− = 0, 0.1, 0.2, 0.3, 0.4]
Thank You!!