Lecture 4

The document discusses different ways to classify data mining systems. It describes classifying systems based on the database technology used, the type of knowledge mined (such as characterization, discrimination, association), the techniques employed (degree of user interaction, analysis methods), and applications adapted to (finance, telecommunications, DNA). Major issues in data mining discussed include mining different types of knowledge, interactive mining at multiple abstraction levels, incorporating background knowledge, and handling noisy/incomplete data. Knowledge discovery in databases is defined as involving data cleaning, integration, selection, transformation, mining, evaluation, and presentation.
Data Warehouse and Data Mining – Fourth Lecture

2.6 Classification of Data Mining Systems

Data mining is an interdisciplinary field, so a data mining system can be
classified according to the disciplines it draws on:

• Database Technology
• Statistics
• Machine Learning
• Information Science
• Visualization
• Other Disciplines

Some Other Classification Criteria:

• Classification according to the kind of databases mined
• Classification according to the kind of knowledge mined
• Classification according to the kinds of techniques utilized
• Classification according to the applications adapted

Classification according to kind of databases mined

We can classify a data mining system according to the kind of databases
mined. Database systems can be classified according to different criteria,
such as data models or types of data, and the data mining system can be
classified accordingly. For example, if we classify databases according to
data model, then we may have a relational, transactional, object-relational,
or data warehouse mining system.

Data Warehouse and Data Mining 2023-2024

Classification according to kind of knowledge mined

We can classify a data mining system according to the kind of knowledge
mined. This means that data mining systems are classified on the basis of
functionalities such as:

• Characterization
• Discrimination
• Association and Correlation Analysis
• Classification
• Prediction
• Clustering
• Outlier Analysis
• Evolution Analysis
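
As an illustration of one functionality from the list above, association analysis, the following minimal Python sketch computes the support and confidence of a rule over a handful of transactions. The transactions, item names, and helper functions are invented for illustration and are not taken from the lecture.

```python
# Toy market-basket data; items and transactions are invented for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Share of transactions containing the antecedent that also contain the consequent."""
    both = support(antecedent | consequent, transactions)
    return both / support(antecedent, transactions)

# Rule "bread => milk": how often the items occur together, and how often
# milk appears given that bread is present.
rule_support = support({"bread", "milk"}, transactions)
rule_confidence = confidence({"bread"}, {"milk"}, transactions)
```

A system focused on association analysis would search for all rules whose support and confidence exceed user-chosen thresholds; this sketch only scores a single, hand-picked rule.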

Classification according to kinds of techniques utilized

We can classify a data mining system according to the kinds of techniques
used. These techniques can be described according to the degree of user
interaction involved or the methods of analysis employed.

Prepared by Dr. Dunia H. Hameed Page 35



Classification according to applications adapted

We can classify a data mining system according to the applications it is
adapted to. These applications include:

• Finance
• Telecommunications
• DNA
• Stock Markets
• E-mail

2.7 Major Issues in Data Mining

• Mining different kinds of knowledge in databases. - Different users
have different needs and may be interested in different kinds of
knowledge. Therefore, data mining must cover a broad range of
knowledge discovery tasks.
• Interactive mining of knowledge at multiple levels of abstraction. -
The data mining process needs to be interactive so that users can
focus the search for patterns, providing and refining data mining
requests based on returned results.
• Incorporation of background knowledge. - Background knowledge can
be used to guide the discovery process and to express the discovered
patterns, not only in concise terms but also at multiple levels of
abstraction.


• Data mining query languages and ad hoc data mining. - A data mining
query language that allows the user to describe ad hoc mining tasks
should be integrated with a data warehouse query language and
optimized for efficient and flexible data mining.
• Presentation and visualization of data mining results. - Once
patterns are discovered, they need to be expressed in high-level
languages and visual representations that users can easily
understand.
• Handling noisy or incomplete data. - Data cleaning methods are
required to handle noise and incomplete objects while mining data
regularities. Without them, the accuracy of the discovered patterns
will be poor.
• Pattern evaluation. - This refers to the interestingness of the
discovered patterns. Many discovered patterns may be uninteresting
because they either represent common knowledge or lack novelty, so
measures are needed to assess which patterns are worth keeping.
• Efficiency and scalability of data mining algorithms. - To extract
information effectively from huge amounts of data in databases, data
mining algorithms must be efficient and scalable.
• Parallel, distributed, and incremental mining algorithms. - Factors
such as the huge size of databases, the wide distribution of data, and
the complexity of data mining methods motivate the development of
parallel and distributed data mining algorithms. These algorithms
divide the data into partitions, which are processed in parallel, and
the results from the partitions are then merged. Incremental
algorithms incorporate database updates without having to mine the
entire data again from scratch.
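
The partition, process, and merge idea in the last item, together with incremental updating, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the "mining" step is reduced to counting item occurrences, the data and function names are hypothetical, and a thread pool stands in for what would in practice be separate processes or machines.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def count_items(partition):
    """Local mining step: count in how many transactions each item occurs."""
    counts = Counter()
    for transaction in partition:
        counts.update(set(transaction))
    return counts

def split(data, n_parts):
    """Divide the data into n_parts roughly equal partitions."""
    return [data[i::n_parts] for i in range(n_parts)]

def parallel_count(data, n_parts=2):
    """Process the partitions concurrently, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        partials = list(pool.map(count_items, split(data, n_parts)))
    merged = Counter()
    for partial in partials:
        merged.update(partial)
    return merged

def incremental_update(counts, new_transactions):
    """Fold new transactions into existing counts without rescanning old data."""
    counts.update(count_items(new_transactions))
    return counts

# Invented example data.
data = [["a", "b"], ["a", "c"], ["b", "c"], ["a"]]
counts = parallel_count(data)
counts = incremental_update(counts, [["a", "d"]])
```

Counting merges trivially because partial counts simply add up. Real mining algorithms often need a more careful merge step, for example re-checking candidate patterns that are frequent in only some partitions.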


2.8 Knowledge Discovery in Databases (KDD)

Some people treat data mining as a synonym for knowledge discovery, while
others view data mining as an essential step in the process of knowledge
discovery. The steps involved in the knowledge discovery process are:

• Data Cleaning - In this step, noise and inconsistent data are
removed.
• Data Integration - In this step, multiple data sources are combined.
• Data Selection - In this step, data relevant to the analysis task are
retrieved from the database.
• Data Transformation - In this step, data are transformed or
consolidated into forms appropriate for mining by performing
summary or aggregation operations.
• Data Mining - In this step, intelligent methods are applied in order
to extract data patterns.
• Pattern Evaluation - In this step, the extracted patterns are
evaluated to identify those that are truly interesting.
• Knowledge Presentation - In this step, the mined knowledge is
presented to the user.
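
The steps above can be sketched as a small Python pipeline in which each function stands for one KDD step. Everything here is an assumption made for illustration: the two data sources, the record fields, the cleaning rule, the toy "mining" criterion (spending above the average), and the threshold are not part of the lecture.

```python
from collections import Counter
from statistics import mean

# Two hypothetical data sources; all records are invented for illustration.
store_a = [{"customer": "c1", "item": "milk", "amount": 2.0},
           {"customer": "c2", "item": "bread", "amount": None}]   # missing value
store_b = [{"customer": "c1", "item": "bread", "amount": 1.5},
           {"customer": "c3", "item": "milk", "amount": 2.5},
           {"customer": "c3", "item": "milk", "amount": -9.0}]    # noisy value

def clean(records):
    """Data cleaning: drop records with missing or negative amounts."""
    return [r for r in records if r["amount"] is not None and r["amount"] >= 0]

def integrate(*sources):
    """Data integration: combine multiple sources into one collection."""
    return [r for source in sources for r in source]

def select(records, item):
    """Data selection: keep only records relevant to the analysis task."""
    return [r for r in records if r["item"] == item]

def transform(records):
    """Data transformation: aggregate amounts per customer."""
    totals = Counter()
    for r in records:
        totals[r["customer"]] += r["amount"]
    return dict(totals)

def mine(totals):
    """Data mining (toy stand-in): find customers who spend more than average."""
    avg = mean(totals.values())
    return {c: t for c, t in totals.items() if t > avg}

def evaluate(patterns, min_total):
    """Pattern evaluation: keep only patterns above a user-chosen threshold."""
    return {c: t for c, t in patterns.items() if t >= min_total}

def present(patterns):
    """Knowledge presentation: render the result as readable text."""
    return [f"{c} spent {t:.2f} on milk" for c, t in sorted(patterns.items())]

records = integrate(clean(store_a), clean(store_b))
totals = transform(select(records, "milk"))
report = present(evaluate(mine(totals), min_total=2.0))
```

In a real system each step is far richer, and the process is usually iterative: evaluation results often send the analyst back to selection or transformation.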


The following diagram shows the knowledge discovery process:

[Figure: Architecture of KDD]
