SMA Exp 2
EXPERIMENT 2
Libraries Used:
Pandas: pandas is an open-source Python library for working with data sets. It provides
functions for analyzing, cleaning, exploring, and manipulating data, and is a fast,
flexible, and easy-to-use data analysis tool built on top of the Python programming
language.
NetworkX: In Python, we can create, manipulate, and analyze networks or graphs with
the help of the NetworkX library. It provides functions for working with directed,
undirected, and weighted graphs, as well as multigraphs. NetworkX is widely used in
transportation, infrastructure planning, and biological networks.
SciPy: SciPy is an open-source scientific library for Python, distributed under a BSD
license. It is used to solve complex scientific and mathematical problems. It is built
on top of NumPy and operates on NumPy's N-dimensional arrays, so NumPy must be
installed alongside it; NumPy still has to be imported separately when its functions are
used directly. SciPy is pronounced "Sigh Pie". It provides many user-friendly and
efficient routines for tasks such as numerical integration and optimization.
1. Graph Creation:
Nodes in this graph represent Facebook users, and the relationships between them are
formed based on their attributes, specifically age and date of birth. When two users
have similar attributes, a connection (edge) is established between them.
This approach allows us to study the structure of the Facebook friendship network
based on user attributes.
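The construction described above can be sketched as follows. This is a minimal illustration, not the experiment's actual code: the column names (userid, age, dob_month), the sample rows, and the similarity rule (ages within one year and the same birth month) are all assumptions chosen for the example.

```python
from itertools import combinations

import networkx as nx
import pandas as pd

# Hypothetical sample data; column names and values are assumptions.
users = pd.DataFrame({
    "userid": [1, 2, 3, 4, 5],
    "age": [21, 21, 34, 35, 50],
    "dob_month": [3, 3, 7, 7, 11],
})

def build_attribute_graph(df, max_age_gap=1):
    """Connect two users when their ages differ by at most max_age_gap
    and they share a birth month (an assumed similarity rule)."""
    graph = nx.Graph()
    graph.add_nodes_from(df["userid"])
    # Compare every pair of users once; fine for small samples, O(n^2) overall.
    for a, b in combinations(df.itertuples(index=False), 2):
        if abs(a.age - b.age) <= max_age_gap and a.dob_month == b.dob_month:
            graph.add_edge(a.userid, b.userid)
    return graph

G = build_attribute_graph(users)
print(G.number_of_nodes(), G.number_of_edges())
print(sorted(G.edges()))
```

The pairwise comparison is quadratic in the number of users, so for a large dataset it may be more practical to group users by attribute first and only connect users within a group.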
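2. Degree Centrality:
Degree centrality is the number of connections a node has, normalized by the maximum
number of connections it could have.
Users with a higher degree centrality have more connections, indicating that they are
potentially more influential or well-connected within the network.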
Degree centrality is useful for identifying key individuals who may play important roles in
information spread or network dynamics.
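As a small illustration of ranking users by degree centrality, the sketch below uses nx.degree_centrality on a hand-made graph. The node names are hypothetical and do not come from the dataset.

```python
import networkx as nx

# Hypothetical toy network: A is connected to three users, D to two.
G = nx.Graph()
G.add_edges_from([("A", "B"), ("A", "C"), ("A", "D"), ("D", "E")])

# degree_centrality returns degree / (n - 1) for each node.
centrality = nx.degree_centrality(G)

# Rank users from most to least connected.
ranked = sorted(centrality.items(), key=lambda item: item[1], reverse=True)
for user, score in ranked:
    print(f"{user}: {score:.2f}")
```

With five nodes, A has three of a possible four connections and therefore ranks first.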
3. Visualization:
The code attempts to visualize the Facebook network using nx.draw(), but there appear
to be issues with sorting and visualization.
Proper visualization can help researchers and analysts identify patterns, clusters, and
important nodes within the network.
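One possible way to draw such a network with nx.draw() is sketched below. It assumes matplotlib is installed (it is not among the libraries listed above), uses a hypothetical toy graph, and writes to a hypothetical file name; it is not a reconstruction of the experiment's own plotting code.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line in a notebook
import matplotlib.pyplot as plt
import networkx as nx

# Hypothetical toy network standing in for the Facebook graph.
G = nx.Graph([(1, 2), (1, 3), (1, 4), (4, 5)])

# A fixed seed keeps the layout the same between runs.
pos = nx.spring_layout(G, seed=42)

# Scale node size by degree so well-connected users stand out.
node_sizes = [300 * G.degree(n) for n in G.nodes()]

fig, ax = plt.subplots(figsize=(5, 4))
nx.draw(G, pos, ax=ax, with_labels=True,
        node_size=node_sizes, node_color="lightblue")
fig.savefig("facebook_network.png")  # hypothetical output file name
plt.close(fig)
```

Computing the layout once and passing it to nx.draw() means the same positions can be reused if the graph is drawn again, for example with bridges highlighted.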
4. Closeness Centrality:
Closeness centrality measures how close a node is to all other nodes in the network. It
indicates the efficiency of a node in reaching other nodes.
In the context of a Facebook network, users with high closeness centrality can quickly
reach a wide range of other users, potentially making them influential in information flow
or network dynamics.
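The idea can be checked on a small example with nx.closeness_centrality, which for a connected graph returns (n - 1) divided by the sum of shortest-path distances from the node to every other node. The five-node chain below is a hypothetical stand-in for a group of users.

```python
import networkx as nx

# Hypothetical chain of five users: 0 - 1 - 2 - 3 - 4
G = nx.path_graph(5)

closeness = nx.closeness_centrality(G)

# The user in the middle of the chain reaches everyone in the fewest steps.
most_central = max(closeness, key=closeness.get)

for node, score in closeness.items():
    print(f"user {node}: {score:.3f}")
print("most central:", most_central)
```

The middle node is at distances 1, 1, 2, and 2 from the others, giving 4 / 6, while an end node is at distances 1, 2, 3, and 4, giving 4 / 10.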
5. Bridges:
A bridge is an edge whose removal increases the number of connected components, that
is, it disconnects part of the network from the rest.
Identifying bridges is crucial for understanding the network's vulnerability and ensuring
its connectivity.
The code checks for bridges and visualizes them, which can help in understanding
potential points of network vulnerability.
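A minimal sketch of the bridge check, using nx.has_bridges and nx.bridges on a hypothetical graph made of two triangles joined by a single edge:

```python
import networkx as nx

# Hypothetical network: two tightly connected groups joined by edge (3, 4).
G = nx.Graph([(1, 2), (2, 3), (1, 3),
              (3, 4),
              (4, 5), (5, 6), (4, 6)])

# has_bridges is a quick yes/no check; bridges yields the edges themselves.
bridges = list(nx.bridges(G)) if nx.has_bridges(G) else []
print("bridges:", bridges)

# Removing the bridges shows how the network falls apart.
before = nx.number_connected_components(G)
H = G.copy()
H.remove_edges_from(bridges)
after = nx.number_connected_components(H)
print(f"components before: {before}, after: {after}")
```

Edges inside a triangle are not bridges, because each has an alternative path around the triangle; only the connecting edge is reported.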
6. Clustering Coefficient:
The clustering coefficient measures the degree to which nodes in a network tend to
cluster together. For a single node, it is the fraction of pairs of its neighbors that are
themselves connected.
A higher clustering coefficient suggests that nodes are more likely to form tightly knit
groups or communities.
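The sketch below computes the per-node and average clustering coefficients with nx.clustering and nx.average_clustering on a hypothetical four-user graph (a triangle with one extra user attached).

```python
import networkx as nx

# Hypothetical network: users 1, 2, 3 form a triangle; user 4 knows only user 3.
G = nx.Graph([(1, 2), (2, 3), (1, 3), (3, 4)])

# Local coefficient: fraction of a node's neighbor pairs that are connected.
local = nx.clustering(G)

# Network-level summary: the mean of the local coefficients.
average = nx.average_clustering(G)

for node, coeff in local.items():
    print(f"user {node}: {coeff:.3f}")
print(f"average: {average:.3f}")
```

User 3 has three neighbors and only one of the three possible pairs among them is connected, so its coefficient is 1/3; user 4 has a single neighbor and gets 0.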
7. Data Quality and Error Handling:
Data quality is essential for accurate network analysis. Errors or inconsistencies in the
dataset can lead to incorrect insights.
Proper data cleaning, validation, and formatting are necessary to ensure the reliability of
the analysis.
Error handling is important to address issues that may arise during data processing or
analysis.
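A cleaning step along these lines might look like the sketch below. The column names, the sample rows, and the accepted age range (13 to 120) are assumptions made for illustration, not properties of the actual dataset.

```python
import pandas as pd

# Hypothetical raw data with a duplicate, a non-numeric age,
# an out-of-range age, and a missing birth year.
raw = pd.DataFrame({
    "userid": [1, 2, 2, 3, 4, 5],
    "age": ["21", "34", "34", "abc", "-5", "40"],
    "dob_year": [2002, 1989, 1989, 1990, 1995, None],
})

def clean_users(df, min_age=13, max_age=120):
    """Return a validated copy of df; the age limits are assumed values."""
    df = df.drop_duplicates(subset="userid").copy()
    # Unparseable ages become NaN instead of raising an exception.
    df["age"] = pd.to_numeric(df["age"], errors="coerce")
    # Drop rows that lack the attributes the graph is built from.
    df = df.dropna(subset=["age", "dob_year"])
    # Keep only plausible ages.
    df = df[df["age"].between(min_age, max_age)]
    return df.astype({"age": int, "dob_year": int}).reset_index(drop=True)

clean = clean_users(raw)
print(clean)
print(f"kept {len(clean)} of {len(raw)} rows")
```

Using errors="coerce" turns bad values into missing values that can be counted and dropped in one place, which tends to be easier to audit than catching exceptions row by row.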
8. Interpretation:
The results of these analyses can provide valuable insights into the Facebook network:
Conclusion: