
Data Mining Notes C1

This document provides an introduction and overview of data mining. It defines data mining as the process of automatically discovering useful information from large data repositories, which is part of the knowledge discovery in databases (KDD) process. The document outlines two main types of data mining tasks - predictive tasks aimed at predicting attribute values and descriptive tasks aimed at deriving patterns from data. Specific tasks discussed include predictive modeling, association analysis, cluster analysis, and anomaly detection.

Uploaded by

wuziqi

Notes on Introduction to Data Mining:

Chapter 1 Introduction
wuziqing
17th October 2020

1 Data Mining Definition


Data mining is the process of automatically discovering useful information in
large data repositories.
It is one part of knowledge discovery in databases (KDD). The general
knowledge discovery process is shown in Figure 1:

Figure 1: Knowledge Discovery in Databases process

2 Data Mining Tasks


The types of data mining tasks can be generally divided into two groups:

1. Predictive Tasks: The objective of these tasks is to predict the value of
a particular attribute based on the values of other attributes.
2. Descriptive Tasks: The objective is to derive patterns (correlations,
trends, clusters, trajectories, and anomalies) that summarize the
underlying relationships in data.
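
The difference between the two groups can be sketched on a toy data set. This
is only an illustration, not a method from the notes: the attributes `hours`
and `score`, their values, and the helper names `fit_line` and `pearson` are
all assumptions made up for this example. The predictive part fits a
least-squares line to predict one attribute from the other; the descriptive
part computes a correlation that summarizes how the two attributes relate.

```python
from math import sqrt
from statistics import mean

# Hypothetical toy data: two attributes recorded for five observations.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
score = [52.0, 54.0, 56.0, 58.0, 60.0]


def fit_line(xs, ys):
    """Predictive: least-squares slope and intercept for y ~ slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def pearson(xs, ys):
    """Descriptive: Pearson correlation between two attributes."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


slope, intercept = fit_line(hours, score)
predicted = slope * 6.0 + intercept   # predict the score for an unseen value
correlation = pearson(hours, score)   # summarize the relationship in the data
print(predicted, correlation)
```

The fitted line is used to produce a value for a new observation, while the
correlation only describes the data already collected.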

Four common specific data mining tasks are:

1. Predictive Modeling: It refers to the task of building a model for the
target variable as a function of the explanatory variables. It includes
regression (for continuous target variables) and classification (for
discrete target variables).

2. Association Analysis: It is used to discover patterns that describe
strongly associated features in the data. For example, it can be used
to find which items customers usually purchase together at a
supermarket.
3. Cluster Analysis: It seeks to find groups of closely related observations
so that observations that belong to the same cluster are more similar to
each other than to observations that belong to other clusters.
4. Anomaly Detection: It is the task of identifying observations whose
characteristics are significantly different from the rest of the data.
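
Two of the tasks above can be sketched in a few lines. This is a minimal
sketch under stated assumptions, not the algorithms used in practice: the
baskets, the readings, the threshold of 2.0, and the helper names
`pair_support` and `find_anomalies` are all hypothetical. Association
analysis is reduced here to counting how often pairs of items appear in the
same basket, and anomaly detection to a simple z-score rule.

```python
from collections import Counter
from itertools import combinations
from statistics import mean, pstdev

# Hypothetical supermarket baskets (one set of items per transaction).
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]


def pair_support(transactions):
    """Fraction of transactions that contain each pair of items."""
    counts = Counter()
    for basket in transactions:
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    n = len(transactions)
    return {pair: count / n for pair, count in counts.items()}


def find_anomalies(values, threshold=2.0):
    """Values lying more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]


support = pair_support(baskets)
print(support[("beer", "diapers")])   # share of baskets with both items

# Hypothetical readings with one value far from the rest.
readings = [10.0, 11.0, 9.0, 10.0, 12.0, 10.0, 50.0]
print(find_anomalies(readings))
```

Pairs with high support are candidates for "purchased together" patterns;
values flagged by the z-score rule are candidates for anomalies. The z-score
rule assumes the bulk of the data is roughly bell-shaped, and a single extreme
value inflates the standard deviation it is measured against.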
