Data Classification

The document discusses data mining and its significance, particularly focusing on data classification, which organizes data into categories based on factors like sensitivity and usage. It outlines sensitivity levels for data classification, including Confidential, Sensitive, and Public, and describes techniques for classification such as manual, automated, and hybrid methods. Additionally, it highlights the importance of protection measures and real-world applications in sectors like healthcare and finance.

Uploaded by

elhamerayoub53
Copyright
© All Rights Reserved

1) Introduction:

Data mining is the process of discovering patterns, trends, and useful information in large data
sets using techniques from machine learning, statistics, and database systems. It helps organizations
make data-driven decisions, improve strategies, and find hidden insights. The process involves many
tasks, and one of the most important is data classification. So what is this task, and why is it
important?

2) Data Classification:
Data classification is the process of organizing data into categories or classes based on specific
factors. The goal is to structure data in a way that makes it easier to analyze, retrieve, and use.

Data classes are often based on factors such as sensitivity, usage, source, legal restrictions,
volume, etc.

3) Data Classification Based on Sensitivity:
• Data classification based on sensitivity is a method of categorizing data into different levels
based on its importance, confidentiality, and potential impact if disclosed, altered, or
destroyed.

• This classification ensures that the most sensitive data receives the highest level of protection,
while less sensitive data can be handled with less stringent controls.

4) Sensitivity Levels in Data Classification:
The typical sensitivity levels for data classification follow a hierarchical model, where the
highest sensitivity level corresponds to the most protected data. Here are the common
sensitivity levels:
1. Confidential (high risk level): This category includes the most sensitive data. If
disclosed, modified, or destroyed, it could have catastrophic consequences for an
organization, individuals, or even national security. Examples: personal health
information, financial data in banking systems.

2. Sensitive (medium risk level): Data that is considered internal and is intended only
for use within the organization. This data is less sensitive than "Confidential" but
still needs protection to ensure it isn't exposed to unauthorized parties. Examples:
project plans, internal business strategies.

3. Public (no risk): This data is considered non-sensitive and can be freely shared
without significant consequences if disclosed. Examples: any publicly available
information.
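
The hierarchy above can be sketched as an ordered enumeration. This is only an illustration: the name Sensitivity, the numeric values, and the helper strictest are assumptions made for the example and do not come from any standard.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    """The three levels listed above; the numeric values are an assumption."""
    PUBLIC = 0        # non-sensitive, can be freely shared
    SENSITIVE = 1     # internal use only, medium risk
    CONFIDENTIAL = 2  # most sensitive, high risk


def strictest(levels):
    """Return the highest sensitivity level among the given levels."""
    return max(levels, default=Sensitivity.PUBLIC)


# A record that mixes fields of different levels takes the strictest one.
record_levels = [Sensitivity.PUBLIC, Sensitivity.CONFIDENTIAL, Sensitivity.SENSITIVE]
print(strictest(record_levels).name)
```

Using an ordered type makes "more sensitive than" a simple comparison, so a mixed record can inherit the level of its most sensitive field.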

5) Techniques for Data Sensitivity Classification:
(Note: look these algorithms up and check whether they really do the job; otherwise find
others or add your own.)
1. Manual Classification
• In manual classification, data is classified according to its sensitivity based on the
judgment of an individual or team responsible for the data.
2. Automated Classification
• Automated tools use algorithms and machine learning models to classify data based on
predefined rules and patterns. This approach is useful for processing large volumes of data
quickly and consistently.

• Natural Language Processing (NLP): Used for classifying text-based data (emails, documents,
etc.) by recognizing key terms or patterns indicative of sensitivity.

• Data Tagging: Software can automatically tag data with sensitivity labels based on content,
metadata, or context (e.g., documents with "confidential" in the title might be flagged as
sensitive).

3. Hybrid Classification
• A combination of manual and automated techniques, where automated systems classify data
based on common patterns, but human oversight is used for more complex or ambiguous
cases.
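
As a rough illustration of the automated and hybrid approaches, the sketch below tags text with a sensitivity label using keyword rules and flags ambiguous text for human review. The keyword lists, the RULES table, and the classify function are all hypothetical; a real system would define its own rules or use a trained model.

```python
import re

# Hypothetical keyword rules, ordered from most to least sensitive.
RULES = {
    "Confidential": re.compile(r"\b(confidential|diagnosis|account number)\b", re.IGNORECASE),
    "Sensitive": re.compile(r"\b(internal|project plan|strategy)\b", re.IGNORECASE),
}


def classify(text):
    """Return (label, needs_review) for a piece of text.

    Text matching no rule is labeled Public. Text matching rules from
    more than one level gets the strictest label and is flagged for
    human review, which is the hybrid step.
    """
    matched = [label for label, pattern in RULES.items() if pattern.search(text)]
    if not matched:
        return "Public", False
    # Dicts keep insertion order, so matched[0] is the strictest label.
    return matched[0], len(matched) > 1


print(classify("Patient diagnosis attached"))
print(classify("Internal strategy memo, confidential"))
print(classify("Press release for the new store"))
```

Keyword matching is the simplest form of automated tagging; it misses sensitive text that uses none of the listed terms, which is one reason to keep a human in the loop.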
6) Protection Measures:
1. Encryption

2. Access Control

3. Data Masking

4. Data Backup and Recovery
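
To make one of these measures concrete, here is a minimal sketch of data masking, assuming the common convention of showing only the last few characters of a value. The function name mask and its parameters are made up for this example.

```python
def mask(value, visible=4, fill="*"):
    """Hide all but the last `visible` characters of a string.

    Values no longer than `visible` are masked completely so that
    short values are never shown in full.
    """
    if visible <= 0 or len(value) <= visible:
        return fill * len(value)
    return fill * (len(value) - visible) + value[-visible:]


print(mask("4111111111111111"))
print(mask("abc"))
```

Masking keeps the data usable for display or testing while hiding most of the sensitive value; unlike encryption, it is not meant to be reversed.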

7) The Importance:
(Write this part yourself, with a short explanation of how classification helps here.)

8) Real-World Applications:
(These also need only a short explanation of how classification helps in each case.)

1. Healthcare

2. Financial Services

3. Government and Military

9) Conclusion (optional, your call):


Note: there is a topic called "compliance with legal regulations" that we need to
discuss before protection measures. I didn't know how to work it in, so see how you
want to handle it.
