batch-05-ml-ppt-1 (1)
CLUSTERING
T2 REVIEW
BATCH-05
PRESENTED BY :221FA04319-Yashwanth Reddy
221FA04330-Sravya
221FA04425-Charishma
221FA04617-Yamini
SUBMITTED TO :Mr.Sourav Mondal
T1-Question
Problem Statement
a.) How would you apply clustering algorithms to identify and categorize the different types of network traffic
patterns and system behaviors that indicate potential intrusions? Discuss innovative approaches for
clustering network data streams, system logs, and user activities to uncover hidden patterns and anomalies
in the network traffic.
b.) Apply Fuzzy C-Means clustering algorithm on the given dataset for intrusion detection.
c.) Propose novel approaches to distinguish between normal network activities and suspicious or
malicious behavior. Furthermore, explore the integration of advanced machine learning
algorithms, anomaly detection techniques, and real-time data analysis to enhance the system's
ability to adapt to emerging threats and minimize false positives. Your solution should prioritize
both detection accuracy and scalability to effectively safeguard critical network infrastructure
against evolving cyber threats.
Abstract :
Fuzzy C-Means is a soft clustering technique in which every data point is assigned to every cluster with a
membership degree between 0 and 1; for each data point, the membership degrees across all clusters sum to 1.
Soft clustering
■ Soft clustering, also known as fuzzy clustering or probabilistic clustering, assigns each data point
a membership degree (or probability) for every cluster, indicating how strongly the data point belongs
to that cluster.
■ Soft clustering can therefore represent data points that belong to multiple clusters at once.
Fuzzy C-Means and Gaussian Mixture Models are examples of soft clustering.
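3. Update cluster centroid values using a weighted average of the data points, where u_{ij} is the membership
of data point x_i in cluster j and m > 1 is the fuzziness parameter:
c_j = \frac{\sum_{i} u_{ij}^{m} x_i}{\sum_{i} u_{ij}^{m}}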
4. Keep updating the membership values and the cluster centers until the membership values and cluster centers
stop changing significantly or when a predefined number of iterations is reached.
5. Assign each data point to the cluster or multiple clusters for which it has the highest membership value.
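The steps above can be sketched in plain Python. This is a minimal illustration, not the implementation used in this review: the slide text starts at step 3, so the random membership initialization and the distance-based membership update below follow the standard Fuzzy C-Means formulation and are assumptions. The function name `fuzzy_c_means`, its parameters, and the toy two-feature records are hypothetical.

```python
import random


def fuzzy_c_means(points, n_clusters, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Minimal Fuzzy C-Means sketch. Returns (centers, memberships, labels)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Assumed initialization: random memberships, each row normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(max_iter):
        # Step 3: each centroid is the average of the points weighted by u_ij ** m.
        centers = []
        for j in range(n_clusters):
            weights = [u[i][j] ** m for i in range(n)]
            total = sum(weights)
            centers.append(
                [sum(w * p[d] for w, p in zip(weights, points)) / total
                 for d in range(dim)]
            )

        # Assumed membership update (standard FCM): closer centers get higher membership.
        new_u = []
        for p in points:
            dists = [
                max(sum((a - b) ** 2 for a, b in zip(p, c)) ** 0.5, 1e-12)
                for c in centers
            ]
            inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(inv)
            new_u.append([v / total for v in inv])

        # Step 4: stop once the memberships no longer change significantly.
        change = max(abs(a - b) for ra, rb in zip(u, new_u) for a, b in zip(ra, rb))
        u = new_u
        if change < tol:
            break

    # Step 5: assign each point to the cluster with the highest membership.
    labels = [max(range(n_clusters), key=lambda j: row[j]) for row in u]
    return centers, u, labels


# Hypothetical toy records with two numeric features each.
sample = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 5.2), (5.2, 5.1)]
centers, memberships, labels = fuzzy_c_means(sample, n_clusters=2)
for point, row, label in zip(sample, memberships, labels):
    print(point, [round(v, 3) for v in row], label)
```

For intrusion detection, the membership rows are often more useful than the hard labels: a record whose highest membership is low does not fit any cluster well and could be flagged for review, although the threshold for "low" would have to be chosen for the dataset at hand.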
Description :