
1.3 What Kind of Data Can Be Mined?

This document discusses various topics related to data mining including: the types of data that can be mined from databases, data warehouses, and transactional data; the kinds of patterns that can be mined including classifications, associations, clusters, and outliers; major issues in data mining like methodology, efficiency, and diversity of database types; an overview of data preprocessing tasks like cleaning, integration, reduction, transformation, and discretization; methods for mining frequent patterns and associations; classification algorithms like decision trees, naive Bayes, rules, neural networks, and support vector machines; clustering methods such as k-means, hierarchical, and density-based; applications of data mining in domains like finance, retail, science, security, and recommender systems.

1.3 What Kind of Data Can Be Mined?

1.3.1 Database Data

1.3.2 Data Warehouses

1.3.3 Transactional Data

1.3.4 Other Kinds of Data

1.4 What Kinds of Patterns Can Be Mined?

1.4.1 Class/Concept Description: Characterization and Discrimination

1.4.2 Mining Frequent Patterns, Associations, and Correlations

1.4.3 Classification and Regression for Predictive Analysis

1.4.4 Cluster Analysis

1.4.5 Outlier Analysis

1.4.6 Are All Patterns Interesting?

1.7 Major Issues in Data Mining

1.7.1 Mining Methodology

1.7.2 User Interaction

1.7.3 Efficiency and Scalability

1.7.4 Diversity of Database Types

1.7.5 Data Mining and Society

3 Data Preprocessing

3.1 Data Preprocessing: An Overview

3.1.1 Data Quality: Why Preprocess the Data?

3.1.2 Major Tasks in Data Preprocessing

3.2 Data Cleaning

3.2.1 Missing Values

3.2.2 Noisy Data

3.2.3 Data Cleaning as a Process

3.3 Data Integration

3.3.1 Entity Identification Problem

3.3.2 Redundancy and Correlation Analysis

3.3.3 Tuple Duplication

3.3.4 Data Value Conflict Detection and Resolution

3.4 Data Reduction

3.4.1 Overview of Data Reduction Strategies

3.4.2 Wavelet Transforms

3.4.3 Principal Components Analysis

3.4.4 Attribute Subset Selection

3.4.5 Regression and Log-Linear Models: Parametric Data Reduction

3.4.6 Histograms

3.4.7 Clustering

3.4.8 Sampling

3.4.9 Data Cube Aggregation

3.5 Data Transformation and Data Discretization

3.5.1 Data Transformation Strategies Overview

3.5.2 Data Transformation by Normalization

3.5.3 Discretization by Binning

3.5.4 Discretization by Histogram Analysis

3.5.5 Discretization by Cluster, Decision Tree, and Correlation Analyses

3.5.6 Concept Hierarchy Generation for Nominal Data

6 Mining Frequent Patterns, Associations, and Correlations: Basic Concepts and Methods

6.1 Basic Concepts

6.1.1 Market Basket Analysis: A Motivating Example

6.1.2 Frequent Itemsets, Closed Itemsets, and Association Rules

6.2 Frequent Itemset Mining Methods

6.2.1 Apriori Algorithm: Finding Frequent Itemsets by Confined Candidate Generation

6.3 Which Patterns Are Interesting? Pattern Evaluation Methods

6.3.1 Strong Rules Are Not Necessarily Interesting

6.3.2 From Association Analysis to Correlation Analysis

7.2 Pattern Mining in Multilevel, Multidimensional Space

7.2.1 Mining Multilevel Associations

7.2.2 Mining Multidimensional Associations

8 Classification: Basic Concepts

8.1 Basic Concepts

8.1.1 What is Classification?

8.1.2 General Approach to Classification

8.2 Decision Tree Induction

8.2.1 Decision Tree Induction

8.2.2 Attribute Selection Measures

8.3 Bayes Classification Methods

8.3.1 Bayes’ Theorem

8.3.2 Naïve Bayesian Classification

8.4 Rule-Based Classification

8.4.1 Using IF-THEN Rules for Classification

8.4.2 Rule Extraction from a Decision Tree

9.2 Classification by Backpropagation

9.2.1 A Multilayer Feed-Forward Neural Network

9.2.2 Defining A Network Topology

9.2.3 Backpropagation

9.3 Support Vector Machines

9.3.1 The Case When the Data Are Linearly Separable

9.3.2 The Case When the Data Are Linearly Inseparable

9.4 Classification Using Frequent Patterns

9.4.1 Associative Classification


9.5 Lazy Learners (or Learning from Neighbors)

9.5.1 k-Nearest-Neighbor Classifier

9.6 Other Classification Methods

9.6.1 Genetic Algorithms

9.6.2 Rough Set Approach

9.6.3 Fuzzy Set Approaches

10.1 Cluster Analysis

10.1.1 What is Cluster Analysis?

10.1.2 Requirements for Cluster Analysis

10.1.3 Overview of Basic Clustering Methods

10.2 Partitioning Methods

10.2.1 k-Means: A Centroid-Based Technique

10.3 Hierarchical Methods

10.3.1 Agglomerative versus Divisive Hierarchical Clustering

10.4 Density-Based Methods

10.5.1 STING: STatistical Information Grid

11.1 Probabilistic Model-Based Clustering

11.1.1 Fuzzy Clusters

11.1.2 Probabilistic Model-Based Clusters

11.1.3 Expectation-Maximization Algorithm

11.2 Clustering High Dimensional Data

11.2.1 Clustering High Dimensional Data: Problems, Challenges, and Major Methodologies

11.2.2 Subspace Clustering Methods

11.2.3 Biclustering

11.2.4 Dimensionality Reduction Methods and Spectral Clustering

11.4 Clustering with Constraints

11.4.1 Categorization of Constraints

11.4.2 Methods for Clustering with Constraints


12.1 Outliers and Outlier Analysis

12.1.1 What are Outliers?

12.1.2 Types of Outliers

12.1.3 Challenges of Outlier Detection

13.3 Data Mining Applications

13.3.1 Data Mining for Financial Data Analysis

13.3.2 Data Mining for Retail and Telecommunication Industries

13.3.3 Data Mining in Science and Engineering

13.3.4 Data Mining for Intrusion Detection and Prevention

13.3.5 Data Mining and Recommender Systems
