HTCB Unit 2
Application: Retailers use data mining to understand customer purchasing patterns and
predict future buying behaviors.
Data Collection
Types of Data
𝘿𝙀𝙁𝙄𝙉𝙄𝙏𝙄𝙊𝙉: Data can be categorized into types, such as numerical, categorical, and text, based on its characteristics.
Application: Different types of data require different techniques for processing and
analysis.
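As a rough illustration of why the type matters, the sketch below labels a column as numerical or categorical and picks a matching summary. The names `infer_type` and `summarize_column` and the two-way split are assumptions made for this example; real datasets include more types (ordinal, text, dates, and so on).

```python
from statistics import mean, mode

def infer_type(values):
    """Label a column 'numerical' if every value is an int or float, else 'categorical'."""
    is_number = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    return "numerical" if is_number else "categorical"

def summarize_column(values):
    """Use a summary that suits the type: mean for numbers, mode for categories."""
    if infer_type(values) == "numerical":
        return mean(values)
    return mode(values)

# Hypothetical sample columns.
ages = [23, 35, 41, 29]
colors = ["red", "blue", "red", "green"]

print(infer_type(ages), summarize_column(ages))
print(infer_type(colors), summarize_column(colors))
```

Taking the mean of the color column would be meaningless, which is why the technique has to follow the data type.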
Data Preprocessing
𝘿𝙀𝙁𝙄𝙉𝙄𝙏𝙄𝙊𝙉: Preparing raw data for analysis by cleaning, transforming, and organizing it.
Application: Ensures data quality and improves the accuracy of mining results.
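A minimal sketch of the three steps named in the definition (cleaning, transforming, organizing), assuming records are dictionaries with hypothetical `name` and `amount` fields:

```python
def preprocess(records):
    """Clean, transform, and organize a list of {'name', 'amount'} records."""
    # Cleaning: drop records with a missing name or amount.
    cleaned = [r for r in records if r.get("name") and r.get("amount") is not None]
    # Transformation: normalize name formatting and convert amounts to float.
    transformed = [
        {"name": r["name"].strip().title(), "amount": float(r["amount"])}
        for r in cleaned
    ]
    # Organization: sort by name so later steps see a stable order.
    return sorted(transformed, key=lambda r: r["name"])

# Hypothetical raw input with inconsistent formatting and missing values.
raw = [
    {"name": "  alice ", "amount": "12.50"},
    {"name": "BOB", "amount": None},
    {"name": "", "amount": "3"},
    {"name": "carol", "amount": 7},
]
print(preprocess(raw))
```

Only the two complete records survive, with consistent name casing and numeric amounts.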
Outlier Detection
𝘿𝙀𝙁𝙄𝙉𝙄𝙏𝙄𝙊𝙉: Identifying data points that deviate significantly from the rest of the dataset.
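One common way to make "deviate significantly" concrete is the z-score: the distance of a value from the mean, measured in standard deviations. The sketch below is one possible approach, not the only one; the function name `find_outliers` and the threshold of 2.0 are assumptions chosen for this small sample.

```python
from statistics import mean, pstdev

def find_outliers(values, threshold=2.0):
    """Return values whose z-score magnitude exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values are identical, so nothing deviates
    return [v for v in values if abs(v - mu) / sigma > threshold]

# Hypothetical sensor readings with one suspicious value.
readings = [10, 12, 11, 13, 12, 11, 95]
print(find_outliers(readings))
```

Because the mean and standard deviation are themselves pulled toward extreme values, a quartile-based rule (such as the interquartile range) is often preferred for heavily skewed data.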
Data Integration
Data Transformation
Data Reduction
𝘿𝙀𝙁𝙄𝙉𝙄𝙏𝙄𝙊𝙉: Reducing the volume of data while maintaining its integrity.
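Aggregation is one way to reduce volume without losing the information that matters for the analysis. The sketch below assumes hypothetical daily sales rows of the form `(date, amount)` and collapses them into monthly totals; the function name `aggregate_by_month` is illustrative.

```python
from collections import defaultdict

def aggregate_by_month(sales):
    """Collapse (date, amount) rows into one total per 'YYYY-MM' month."""
    totals = defaultdict(float)
    for date, amount in sales:
        month = date[:7]  # 'YYYY-MM-DD' -> 'YYYY-MM'
        totals[month] += amount
    return dict(totals)

# Hypothetical daily sales records.
daily_sales = [
    ("2024-01-03", 120.0),
    ("2024-01-17", 80.0),
    ("2024-02-02", 50.0),
    ("2024-02-20", 150.0),
    ("2024-02-28", 100.0),
]
monthly = aggregate_by_month(daily_sales)
print(monthly)

# Integrity check: the grand total is unchanged by the reduction.
assert sum(monthly.values()) == sum(amount for _, amount in daily_sales)
```

Five rows become two, while the overall total stays the same.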
Data Generation
Data Summarization
Example: Creating summary statistics like mean, median, and standard deviation.
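The statistics named in the example can be computed with Python's standard `statistics` module. This is a small sketch; the function name `summary_statistics` and the sample values are assumptions, and `stdev` here is the sample standard deviation (use `pstdev` for a full population).

```python
from statistics import mean, median, stdev

def summary_statistics(values):
    """Return the mean, median, and sample standard deviation of a list of numbers."""
    return {
        "mean": mean(values),
        "median": median(values),
        "std_dev": stdev(values),
    }

# Hypothetical measurements.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
stats = summary_statistics(scores)
print(stats)
```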
Data Presentation
Data Mining Tasks
𝘿𝙀𝙁𝙄𝙉𝙄𝙏𝙄𝙊𝙉: The kinds of analysis that data mining can perform, such as classification,
clustering, and association rule mining.
Data Mining Architecture
𝘿𝙀𝙁𝙄𝙉𝙄𝙏𝙄𝙊𝙉: The framework and structure within which data mining operations are
carried out.
Example: A data mining system with components for data preprocessing, mining
algorithms, and result evaluation.
Application: Helps in organizing and managing the data mining process efficiently.
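The three components in the example above can be sketched as separate functions wired together in order. Everything here is an illustrative assumption: the function names, the item-counting "mining algorithm", and the `min_support` threshold used for evaluation are stand-ins, not a prescribed design.

```python
from collections import Counter

def preprocess(transactions):
    """Preprocessing component: normalize item names and drop empty transactions."""
    return [[item.strip().lower() for item in t] for t in transactions if t]

def mine_item_counts(transactions):
    """Mining component: count how many transactions contain each item."""
    counts = Counter()
    for t in transactions:
        counts.update(set(t))  # count each item once per transaction
    return counts

def evaluate(counts, n_transactions, min_support=0.5):
    """Evaluation component: keep items whose support meets the threshold."""
    return {
        item: count / n_transactions
        for item, count in counts.items()
        if count / n_transactions >= min_support
    }

def run_pipeline(transactions, min_support=0.5):
    """Run the three components in sequence."""
    clean = preprocess(transactions)
    counts = mine_item_counts(clean)
    return evaluate(counts, len(clean), min_support)

# Hypothetical shopping baskets.
baskets = [["Milk", "Bread"], ["milk", "Eggs"], [], ["Bread", "milk "]]
result = run_pipeline(baskets)
print(result)
```

Keeping each component behind its own function means a different mining algorithm or evaluation rule can be swapped in without touching the others.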
Application: Allows users to specify what they want to analyze and how.
Data Mining Primitives
𝘿𝙀𝙁𝙄𝙉𝙄𝙏𝙄𝙊𝙉: Basic operations and tasks that can be performed in data mining.
Integration with a Data Warehouse
𝘿𝙀𝙁𝙄𝙉𝙄𝙏𝙄𝙊𝙉: Combining data mining tools with a data warehouse to enhance data
analysis.
Diagrams
The following text diagrams summarize the main processes and tasks covered in this unit:
```
Data Selection -> Data Cleaning -> Data Transformation -> Data Mining ->
Interpretation/Evaluation
```
```
Raw Data -> Cleaning -> Transformation -> Integration -> Reduced Data
```
```
Data Mining
|-- Classification
|-- Clustering