batch-05-ml-ppt-1 (1)

The document discusses the application of Fuzzy C-Means (FCM) clustering for intrusion detection in network traffic, emphasizing its ability to assign varying degrees of membership to data points. It outlines the steps for implementing FCM, including data normalization, membership value calculation, and cluster centroid updates, while also proposing a hybrid AI-driven Intrusion Detection System that integrates both signature-based and anomaly-based detection methods. The document highlights the importance of advanced machine learning algorithms and real-time data analysis to enhance detection accuracy and reduce false positives in cybersecurity.

Uploaded by

elurinareshkumar505

FUZZY C-MEANS CLUSTERING
T2 REVIEW
BATCH-05
PRESENTED BY:
221FA04319 - Yashwanth Reddy
221FA04330 - Sravya
221FA04425 - Charishma
221FA04617 - Yamini
SUBMITTED TO: Mr. Sourav Mondal
T1-Question
Problem Statement
a) How would you apply clustering algorithms to identify and categorize different types of network traffic
patterns and system behaviors indicative of potential intrusions? Discuss innovative approaches for
clustering network data streams, system logs, and user activities to uncover hidden patterns and anomalies
in the network traffic.
b) Apply the Fuzzy C-Means clustering algorithm to the given dataset for intrusion detection.
c) Propose novel approaches to distinguish between normal network activities and suspicious or
malicious behavior. Furthermore, explore the integration of advanced machine learning
algorithms, anomaly detection techniques, and real-time data analysis to enhance the system's
ability to adapt to emerging threats and minimize false positives. Your solution should prioritize
both detection accuracy and scalability to effectively safeguard critical network infrastructure
against evolving cyber threats.
Abstract:

Fuzzy C-Means is a soft clustering technique in which every data point is assigned a degree of
membership in each cluster rather than a single cluster label.
Soft clustering
■ Soft clustering, also known as fuzzy clustering or probabilistic clustering, assigns each data point
membership degrees (or probabilities) that indicate how strongly the point belongs
to each cluster.
■ Soft clustering can represent data points that belong to more than one cluster.
Fuzzy C-Means and Gaussian Mixture Models are examples of soft clustering.

How to Run the FCM Algorithm

1. Initialization: Randomly choose initial cluster centroids from the data set and specify a
fuzziness parameter (m) that controls the degree of fuzziness in the clustering.
2. Membership Update: Calculate each data point's degree of membership in each cluster from
its distance to the cluster centroids, using a distance metric (e.g., Euclidean distance).
3. Centroid Update: Recalculate the cluster centroids from the
updated membership values.
4. Convergence Check: Repeat steps 2 and 3 until a specified number of iterations is reached or the
membership values and centroids converge to stable values.
The Maths Behind Fuzzy C-Means
■ For comparison, the traditional k-means algorithm proceeds through the following steps:
■ Randomly initialize k cluster centers.
■ Calculate the distance from each data point to each centroid using a distance metric (e.g., Euclidean or Manhattan distance).
■ Assign each data point to its nearest centroid, forming k clusters.
■ For each cluster, compute the mean of the data points belonging to that cluster and update the centroid
of the cluster to that mean.
■ Repeat until the centroids stop changing or a predefined number of iterations is reached.
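The k-means steps listed above can be sketched in a few lines of plain Python. This is a minimal illustration, not code from the slides: the sample points, the `seed` argument, and the choice to keep an empty cluster's centroid unchanged are all assumptions made for the example.

```python
import math
import random

def kmeans(points, k, max_iter=100, seed=0):
    """Plain k-means: each point is assigned to exactly one (nearest) centroid."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)              # random initial centers
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assign every point to its closest centroid (hard assignment)
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        # Recompute each centroid as the mean of its members
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[j])  # assumption: keep empty cluster's center
        # Stop once the centroids no longer change
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels

# Hypothetical 2-D points forming two well-separated groups
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
centroids, labels = kmeans(points, k=2)
print(labels)
```

Every point ends up with a single label here, which is the behavior Fuzzy C-Means relaxes.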
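Fuzzy C-Means replaces the hard assignment of k-means with membership values:
1. The objective is to minimize the following objective function:
$$J_m = \sum_{i=1}^{n} \sum_{j=1}^{c} w_{ij}^{m} \, \lVert x_i - v_j \rVert^{2}$$
Here:
n = number of data points
c = number of clusters
x_i = the i-th data point
v_j = centroid of the j-th cluster
w_ij = membership value of data point i in cluster j
m = fuzziness parameter (m > 1)
2. Update the membership values using the formula:
$$w_{ij} = \frac{1}{\sum_{k=1}^{c} \left( \frac{\lVert x_i - v_j \rVert}{\lVert x_i - v_k \rVert} \right)^{\frac{2}{m-1}}}$$
3. Update the cluster centroids using a weighted average of the data points:
$$v_j = \frac{\sum_{i=1}^{n} w_{ij}^{m} \, x_i}{\sum_{i=1}^{n} w_{ij}^{m}}$$
4. Keep updating the membership values and the cluster centers until the membership values and cluster centers
stop changing significantly or a predefined number of iterations is reached.

5. Assign each data point to the cluster for which it has the highest membership value, or keep the full set of memberships when a point belongs to several clusters.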
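The five steps above can be put together in a short, dependency-free sketch. It is only an illustration under stated assumptions: the function names `fcm`, `objective`, and `dist`, the tolerance `tol`, the explicit `init_centroids` argument (used instead of random initialization so the run is repeatable), and the sample points are all hypothetical and do not come from the slides.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def fcm(points, init_centroids, m=2.0, max_iter=100, tol=1e-6):
    """Fuzzy C-Means. Returns (centroids, w); w[i][j] is the membership
    of point i in cluster j."""
    centroids = [list(v) for v in init_centroids]
    c, dim = len(centroids), len(points[0])
    power = 2.0 / (m - 1.0)
    w = []
    for _ in range(max_iter):
        # Step 2: membership update from the current centroids
        w = []
        for p in points:
            d = [dist(p, v) for v in centroids]
            if min(d) == 0.0:
                # Point coincides with a centroid: full membership there
                hits = [1.0 if dj == 0.0 else 0.0 for dj in d]
                row = [h / sum(hits) for h in hits]
            else:
                row = [1.0 / sum((dj / dk) ** power for dk in d) for dj in d]
            w.append(row)
        # Step 3: centroid update, a weighted average with weights w_ij^m
        new_centroids = []
        for j in range(c):
            weights = [row[j] ** m for row in w]
            total = sum(weights)
            new_centroids.append(
                [sum(wt * p[k] for wt, p in zip(weights, points)) / total
                 for k in range(dim)])
        # Step 4: stop when no centroid moved more than tol
        shift = max(dist(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift < tol:
            break
    return centroids, w

def objective(points, centroids, w, m=2.0):
    """J_m = sum_i sum_j w_ij^m * ||x_i - v_j||^2"""
    return sum(w[i][j] ** m * dist(p, v) ** 2
               for i, p in enumerate(points)
               for j, v in enumerate(centroids))

# Hypothetical 2-D data: two tight groups plus one point in between
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (5, 5)]
centroids, w = fcm(points, init_centroids=[(0, 0), (10, 10)])
for p, row in zip(points, w):
    print(p, [round(u, 3) for u in row])
print("J_m =", round(objective(points, centroids, w), 3))
```

The in-between point (5, 5) receives a split membership, while the points inside each group are assigned almost entirely to one cluster.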
Description :

Fuzzy C-Means (FCM):

Fuzzy C-Means (FCM) is a clustering algorithm that assigns data points to multiple clusters
with varying degrees of membership rather than forcing a strict classification. Unlike K-Means,
where each data point belongs to only one cluster, FCM allows for partial
membership, making it more flexible for detecting anomalies in network traffic.
Working Mechanism:
1. Initialize cluster centroids randomly.
2. Compute the membership matrix using:
$$u_{ij} = \frac{1}{\sum_{k=1}^{C} \left( \frac{d_{ij}}{d_{ik}} \right)^{\frac{2}{m-1}}}$$
where d_ij is the distance between data point i and centroid c_j.
3. Update cluster centroids:
$$c_j = \frac{\sum_{i} u_{ij}^{m} \, x_i}{\sum_{i} u_{ij}^{m}}$$
4. Repeat steps 2 and 3 until convergence (when centroids stabilize).
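To see how partial membership differs from the one-cluster-only assignment of K-Means, the membership formula can be evaluated for a single point at different values of the fuzziness parameter m. The distances below are hypothetical and the function name `memberships` is chosen only for this sketch.

```python
def memberships(distances, m):
    """FCM membership of one point, given its distances to each centroid."""
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((dj / dk) ** power for dk in distances)
            for dj in distances]

# Hypothetical point that is twice as far from centroid 2 as from centroid 1
distances = [1.0, 2.0]
for m in (1.1, 2.0, 5.0):
    print(m, [round(u, 3) for u in memberships(distances, m)])
```

As m approaches 1 the memberships approach a hard, K-Means-like assignment; larger values of m spread the membership more evenly across clusters.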
Solution :
a) Applying Clustering Algorithms for Intrusion Detection
Intrusion detection in networks involves identifying unusual patterns in traffic that may indicate
cyber threats like hacking attempts, malware, or data breaches. Clustering algorithms can help by
grouping similar types of network behaviors, making it easier to detect and categorize normal vs.
suspicious activities.
Steps to analyze network traffic for intrusion detection:
Step 1: Understanding Network Traffic and System Behavior
Network traffic consists of different types of data moving between devices. Each communication (or
"packet") has attributes like:
• Source & destination IP addresses (who is sending/receiving data)
• Protocol used (HTTP, FTP, SSH, etc.)
• Data size & transmission speed
• Request frequency (how often a device connects)
Step 2: Using Clustering Algorithms for Intrusion Detection
1. Choosing the Right Clustering Algorithm
Different clustering algorithms can be used based on the type of data and required accuracy:
• K-Means Clustering – Groups network traffic into distinct clusters based on similarity (e.g., normal
users vs. suspicious activities).
• Fuzzy C-Means (FCM) – Allows overlapping clusters, making it useful for cases where behaviors
are not completely normal or completely malicious.
• DBSCAN (Density-Based Clustering) – Detects clusters based on data density, making it good for
spotting outliers like rare cyber attacks.
2. Steps to Apply Clustering:
1. Collect Data – Gather system logs, network packets, and user activities from servers and firewalls.
2. Preprocess Data – Remove irrelevant data, encode and normalize values (e.g., converting IP addresses into
numerical values), and extract key features like connection frequency, request type, and response
time.
3. Apply Clustering Algorithm – Use a clustering algorithm like FCM to group
activities into different clusters (e.g., normal traffic, suspicious activity, and malicious
attacks).
4. Analyze the Results – Check whether any cluster shows unusual patterns such as sudden spikes in
traffic, repeated failed login attempts, or unexpected data transfers.
5. Detect & Respond – If a cluster is labeled as suspicious, trigger an alert for network
administrators to investigate further.
3. Benefits of Clustering in Intrusion Detection
Feature                 Description
Unsupervised Learning   No need for labeled attack data
Outlier Detection       Can detect previously unseen attacks
Behavior Profiling      Models normal user/machine behavior
Adaptability            Adjusts to changing network patterns
4. Visualization & Interpretation
• Use t-SNE or PCA to visualize high-dimensional clusters.
• Use heatmaps, scatter plots, and cluster timelines to interpret behavior visually.
Example Use-Case
Cluster user login attempts by time of day, IP location, and device ID.
Anomaly: multiple login attempts from different continents within a short time indicate suspicious behavior.
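The preprocessing step described above (encoding IP addresses numerically and extracting features such as connection frequency) might look roughly like the following. The log entries, the tuple layout, and the function name `build_features` are hypothetical; a real pipeline would read these fields from firewall or server logs.

```python
import ipaddress
from collections import Counter

# Hypothetical log entries: (source IP, protocol, bytes transferred)
log = [
    ("192.168.1.10", "HTTP", 5000),
    ("192.168.1.10", "HTTP", 7000),
    ("203.0.113.2", "SSH", 2000),
    ("203.0.113.2", "SSH", 1500),
    ("203.0.113.2", "SSH", 1800),
]

def build_features(entries):
    """Per-source feature vector: [IP as integer, connection count, mean bytes]."""
    counts = Counter(ip for ip, _, _ in entries)
    total_bytes = Counter()
    for ip, _, size in entries:
        total_bytes[ip] += size
    return {
        ip: [int(ipaddress.ip_address(ip)),   # numeric encoding of the address
             counts[ip],                      # connection frequency
             total_bytes[ip] / counts[ip]]    # average transfer size
        for ip in counts
    }

features = build_features(log)
for ip, vec in features.items():
    print(ip, vec)
```

Note that treating an IP address as one integer imposes an ordering that may not be meaningful for clustering, so this encoding is only one possible choice, and the resulting vectors would still need scaling before they are passed to FCM.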
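b) Given network traffic data:

Timestamp          Source IP     Destination IP   Bytes Transferred   Packets Transferred   Intrusion Detected
2024-03-01 08:00   34567         80               5000                20                    No
2024-03-01 08:05   80            34567            10000               30                    No
2024-03-01 08:10   12345         22               2000                10                    Yes
2024-03-01         203.0.113.2   192.168.1.20     4000                15                    Yes

In the first three rows the source and destination columns contain port numbers; these are the values treated as Source Port and Destination Port in the normalization that follows.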
Normalize the Data
Since FCM works best with normalized data, we scale each feature to the range 0 to 1 using Min-Max
Normalization:
$$X_{normalized} = \frac{X - X_{min}}{X_{max} - X_{min}}$$

Feature               Min Value   Max Value
Source Port           80          34567
Destination Port      22          34567
Bytes Transferred     2000        10000
Packets Transferred   10          30

Normalized Values Calculation:

For each value, apply the Min-Max formula:

For Data Point A (34567, 80, 5000, 20)

Source Port: $\frac{34567 - 80}{34567 - 80} = 1.0$
Destination Port: $\frac{80 - 22}{34567 - 22} = 0.0017$
Bytes Transferred: $\frac{5000 - 2000}{10000 - 2000} = 0.375$
Packets Transferred: $\frac{20 - 10}{30 - 10} = 0.5$
The remaining data points are normalized in the same way.
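The same Min-Max calculation can be applied to all three rows that carry port values. This is a small sketch; the labels A, B, and C for the first three table rows and the function name `min_max_normalize` are assumptions made here for illustration.

```python
# Rows A, B, C: (source port, destination port, bytes, packets)
data = {
    "A": (34567, 80, 5000, 20),
    "B": (80, 34567, 10000, 30),
    "C": (12345, 22, 2000, 10),
}

def min_max_normalize(rows):
    """Scale every column of `rows` to [0, 1] using (x - min) / (max - min)."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(row, lows, highs))
        for row in rows
    ]

normalized = dict(zip(data, min_max_normalize(list(data.values()))))
for name, row in normalized.items():
    print(name, [round(v, 4) for v in row])
```

A constant column would make the denominator zero, so the sketch maps such a column to 0.0; that fallback is a design choice of this example rather than part of the Min-Max formula.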
Step 3: Initialize Clusters
We assume C = 2 clusters (Normal and Intrusion) and initialize cluster centers randomly:
• Cluster 1 (C1): (0.9, 0.5, 0.4, 0.6)
• Cluster 2 (C2): (0.2, 0.2, 0.1, 0.1)
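Step 4: Compute Membership Values
■ Membership values $u_{ij}$ are calculated using:
$$u_{ij} = \frac{1}{\sum_{k=1}^{C} \left( \frac{d_{ij}}{d_{ik}} \right)^{\frac{2}{m-1}}}$$
(We assume m = 2 for simplicity.)
Euclidean Distance Calculation
Distance between a data point x and a cluster centroid c:
$$D = \sqrt{\sum_{k} (x_k - c_k)^2}$$
For A (1.000, 0.0017, 0.375, 0.5):
• Distance to Cluster 1:
• $D_{A1} = \sqrt{(1.000 - 0.9)^2 + (0.0017 - 0.5)^2 + (0.375 - 0.4)^2 + (0.5 - 0.6)^2}$
• = 0.519
• Distance to Cluster 2:
• $D_{A2} = \sqrt{(1.000 - 0.2)^2 + (0.0017 - 0.2)^2 + (0.375 - 0.1)^2 + (0.5 - 0.1)^2}$
• = 0.957
Now, calculating membership values:
$$u_{A1} = \frac{1}{1 + (D_{A1}/D_{A2})^{2}} = \frac{1}{1 + (0.519/0.957)^{2}}$$
= 0.773
$u_{A2} = 1 - u_{A1} = 0.227$
Following similar calculations for B (0, 1, 1, 1) and C (0.356, 0, 0, 0), we get:

Data Point      Membership in Cluster 1   Membership in Cluster 2
A (Normal)      0.773                     0.227
B (Normal)      0.593                     0.407
C (Intrusion)   0.073                     0.927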
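The distance and membership calculations for all three points can be checked with a short script. It assumes m = 2, the initial centroids from Step 3, and normalized vectors for B and C obtained with the same Min-Max formula as for A; the helper names `euclidean` and `membership_row` are chosen for this sketch.

```python
import math

# Normalized points (B and C are normalized the same way as A)
points = {
    "A": (1.0, 0.0017, 0.375, 0.5),
    "B": (0.0, 1.0, 1.0, 1.0),
    "C": (0.3556, 0.0, 0.0, 0.0),
}
centroids = [(0.9, 0.5, 0.4, 0.6), (0.2, 0.2, 0.1, 0.1)]  # C1, C2 from Step 3
m = 2

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def membership_row(point):
    """Return (distances, memberships) of one point with respect to all centroids."""
    d = [euclidean(point, c) for c in centroids]
    power = 2 / (m - 1)
    u = [1 / sum((dj / dk) ** power for dk in d) for dj in d]
    return d, u

results = {name: membership_row(p) for name, p in points.items()}
for name, (d, u) in results.items():
    print(name, "distances:", [round(x, 3) for x in d],
          "memberships:", [round(x, 3) for x in u])
```

The script does not handle a point that lies exactly on a centroid (distance zero), which does not occur with these values.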
Step 5: Update Cluster Centers and Repeat
Using:
■ $$c_j = \frac{\sum_{i} u_{ij}^{m} \, x_i}{\sum_{i} u_{ij}^{m}}$$
■ New centroids are computed and iterations continue until convergence.
Final Result:
• A and B are classified as Normal Traffic
• C is classified as Intrusion
■ Thus, in this small example, the highest membership values place the normal traffic (A, B) and the intrusion (C) in separate clusters.
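One centroid update from the first set of memberships can be sketched as follows. The script recomputes the memberships itself (assuming m = 2 and the initial centroids from Step 3) so that it is self-contained; the function names and the normalized vectors for B and C are assumptions of this example.

```python
import math

names = ["A", "B", "C"]
points = [(1.0, 0.0017, 0.375, 0.5), (0.0, 1.0, 1.0, 1.0), (0.3556, 0.0, 0.0, 0.0)]
centroids = [(0.9, 0.5, 0.4, 0.6), (0.2, 0.2, 0.1, 0.1)]  # initial C1, C2
m = 2

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def memberships(point, centroids, m):
    d = [euclidean(point, c) for c in centroids]
    power = 2 / (m - 1)
    return [1 / sum((dj / dk) ** power for dk in d) for dj in d]

def update_centroids(points, u, m):
    """c_j = sum_i u_ij^m * x_i / sum_i u_ij^m"""
    dim = len(points[0])
    new = []
    for j in range(len(u[0])):
        weights = [row[j] ** m for row in u]
        total = sum(weights)
        new.append(tuple(sum(wt * p[k] for wt, p in zip(weights, points)) / total
                         for k in range(dim)))
    return new

u = [memberships(p, centroids, m) for p in points]
new_centroids = update_centroids(points, u, m)
# Hard label after the first pass = cluster with the highest membership
labels = {name: row.index(max(row)) + 1 for name, row in zip(names, u)}

print("labels after first pass:", labels)
for j, c in enumerate(new_centroids, start=1):
    print(f"C{j}:", [round(v, 3) for v in c])
```

With only three data points the memberships can shift noticeably over further iterations, so the labels from a single pass should be read as an illustration of the update step rather than as a converged result.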
c) To effectively distinguish between normal network activities and malicious behavior, we can
propose a hybrid AI-driven Intrusion Detection System (IDS) that integrates advanced machine
learning, anomaly detection, and real-time data analysis. Below is a structured approach:
1. Hybrid Detection Approach
A combination of signature-based detection (for known threats) and
anomaly-based detection (for new and evolving threats) ensures a
comprehensive security system.
a. Signature-Based Detection:
• Utilizes predefined attack signatures to detect known cyber threats.
• Incorporates continuously updated threat intelligence feeds to stay relevant.
b. Anomaly-Based Detection:
• Uses machine learning (ML) models to detect deviations from normal behavior.
• Identifies zero-day attacks and advanced persistent threats (APTs).
2. Advanced Machine Learning Algorithms for Threat Detection
a. Supervised Learning (For Classification of Attacks)
• Random Forest / XGBoost: Robust in handling imbalanced datasets.
• Deep Learning (CNN, LSTMs): Useful for time-series and sequential attack
patterns.
1. Hybrid Learning System (Unsupervised + Supervised + Online Learning)
Component      Purpose                                Algorithm Examples
Unsupervised   Learn hidden patterns from raw data    FCM, DBSCAN, Autoencoders
Supervised     Classify known threats accurately      XGBoost, Random Forest, SVM
Online ML      Adapt to new behaviors in real time    Hoeffding Trees, Online k-NN

2. Real-Time Anomaly Detection with Autoencoders

An autoencoder (AE) learns a compressed representation of normal traffic.
If the reconstruction error of a sample exceeds a threshold, the sample is flagged as an anomaly.
Best for: DDoS detection, port scans, insider anomalies
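The thresholding rule can be illustrated without the autoencoder itself. The sketch below covers only the decision step: the reconstruction errors are hypothetical placeholder numbers standing in for the output of a trained model, and the "mean plus three standard deviations" threshold is an assumed convention, not something specified in the slides.

```python
import statistics

# Hypothetical reconstruction errors measured on known-normal traffic
normal_errors = [0.010, 0.012, 0.009, 0.011, 0.013, 0.010]

# Assumed rule: flag anything more than 3 standard deviations above the mean
threshold = statistics.mean(normal_errors) + 3 * statistics.stdev(normal_errors)

def is_anomaly(error, limit=threshold):
    """True when the reconstruction error exceeds the threshold."""
    return error > limit

# Hypothetical errors for newly observed traffic
new_errors = [0.011, 0.050, 0.014]
flags = [is_anomaly(e) for e in new_errors]
print(round(threshold, 4), flags)
```

In practice the threshold would be tuned on held-out traffic to trade off missed attacks against false positives.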
3. Graph-Based Behavior Modeling
Represent network entities (IP, MAC, user, port, service) as nodes
Connections and interactions are edges
Use Graph Neural Networks (GNNs) or DeepWalk to detect abnormal flows or lateral movement.
Detects: Privilege escalation, lateral movement, command & control (C2) traffic.
4. Ensemble Techniques for Reduced False Positives
Combine the predictions from multiple models:
Supervised (RF, XGBoost)
Unsupervised (DBSCAN, Isolation Forest)
Neural models (LSTM, CNN, AE)
Use majority voting, stacking, or meta-classifiers.
This can improve accuracy, balance precision against recall, and reduce overfitting.
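Majority voting, the simplest of the combination rules listed above, can be sketched as follows. The three prediction lists are hypothetical outputs (1 = intrusion, 0 = normal) standing in for a supervised, an unsupervised, and a neural detector; no actual models are trained here.

```python
def majority_vote(predictions):
    """predictions: one list of 0/1 labels per model, all of equal length.
    A sample is flagged only when more than half of the models flag it."""
    n_models = len(predictions)
    return [1 if sum(votes) * 2 > n_models else 0
            for votes in zip(*predictions)]

# Hypothetical per-sample outputs from three detectors
supervised = [1, 0, 1, 0]
unsupervised = [1, 1, 0, 0]
neural = [1, 0, 0, 1]

combined = majority_vote([supervised, unsupervised, neural])
print(combined)
```

Requiring agreement between models is one way to suppress alerts that only a single detector raises, which is the false-positive reduction this section aims for.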
OUTPUT
Thank you
