Selective Course Assignment
DONE BY:
Table of Contents
Definition of Data mining
Some roles of data mining, as examples [2]:
Tasks in Data Mining:
Definition of Data Warehouse
Key characteristics of a data warehouse:
OLTP vs. OLAP: Powerhouses for Different Data Processing Needs [5]:
Purpose:
Data Model:
Performance:
Users:
OLTP vs. OLAP: A Tale of Two Data Processing Systems
CONCLUSION
References
Definition of Data mining
Data mining is the process of extracting meaningful patterns and trends from large datasets.
Imagine sifting through a mountain of unrefined ore to discover valuable minerals. Similarly,
data mining helps unearth valuable insights hidden within vast quantities of data [1].
From this definition, data mining means searching through, or digging into, large amounts of data
to gather the meaningful parts.
Some roles of data mining, as examples [2]:
Temporal data mining: Data may contain attributes generated and recorded at different
times. In this case, finding meaningful relationships in the data may require considering
the temporal order of the attributes. A temporal relationship may indicate a causal
relationship, or simply an association.
Sensor data mining: Wireless sensor networks can be used to collect data for spatial data
mining in a variety of applications, such as air pollution monitoring. A characteristic of
such networks is that nearby sensor nodes monitoring an environmental feature typically
register similar values. This data redundancy, which comes from the spatial correlation
between sensor observations, inspires techniques for in-network data aggregation and
mining. By measuring the spatial correlation between data sampled by different sensors,
a wide class of specialized algorithms can be designed to make spatial data mining more
efficient.
Music data mining: Data mining techniques, and in particular co-occurrence analysis, have
been used to discover relevant similarities among music corpora (radio lists, CD
databases) for purposes including classifying music into genres in a more objective
manner.
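To make the idea of co-occurrence analysis more concrete, here is a minimal sketch in Python. The
playlists and artist names are hypothetical and made up for illustration; the sketch only counts
how often two artists appear in the same list, which is one simple way to measure similarity.

```python
from collections import Counter
from itertools import combinations

# Hypothetical playlists: each inner list stands for one radio list or CD track listing.
playlists = [
    ["artist_a", "artist_b", "artist_c"],
    ["artist_a", "artist_b"],
    ["artist_b", "artist_c"],
    ["artist_a", "artist_d"],
]


def cooccurrence_counts(lists):
    """Count how often each unordered pair of items appears in the same list."""
    counts = Counter()
    for items in lists:
        # sorted(set(...)) drops duplicates and gives every pair one canonical order.
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    return counts


pair_counts = cooccurrence_counts(playlists)
for pair, count in pair_counts.most_common():
    print(pair, count)
```

Pairs with a high count are candidates for belonging to the same genre. A real system would
probably normalize the counts by how popular each artist is, which this sketch does not do.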
Tasks in Data Mining:
Data mining involves a series of steps to uncover these hidden patterns. Here is a brief look at
some key tasks [2].
o Data Preparation: This crucial step involves cleaning and pre-processing the data to
ensure its accuracy and consistency. Think of organizing your messy mining site before
you start digging!
o Data Integration: Data from various sources might need to be combined and reconciled
for a holistic view. Imagine combining data from different mine shafts to get a complete
picture.
o Data Selection: Focusing on relevant subsets of data can be more efficient for specific
tasks. Just like targeting specific areas within the mine for the minerals you seek.
o Data Transformation: Data might need to be transformed into a format suitable for
analysis techniques. This could involve scaling numerical values or converting data types.
Similar to crushing and grinding the ore to make it easier to process.
o Modeling and Pattern Discovery: Here's where the magic happens! Various algorithms
are applied to identify patterns and relationships within the data. Imagine using
specialized tools to separate the valuable minerals from the rest of the material.
o Evaluation and Interpretation: The discovered patterns need to be validated and assessed
for their business significance. Just like evaluating the quality and quantity of the
extracted minerals before you celebrate!
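The tasks above can be sketched as a small Python pipeline. This is only an illustration under
stated assumptions: the customer records, the field names, and the threshold rule used for
"pattern discovery" are all hypothetical, and a real project would use proper mining algorithms
instead of a single threshold.

```python
from statistics import mean

# Hypothetical raw records from two sources (all values are made up).
source_a = [
    {"customer": "c1", "age": 25, "spend": 120.0},
    {"customer": "c2", "age": None, "spend": 80.0},  # missing age
    {"customer": "c3", "age": 47, "spend": 300.0},
]
source_b = [
    {"customer": "c4", "age": 52, "spend": 310.0},
    {"customer": "c5", "age": 23, "spend": 90.0},
]


def prepare(rows):
    """Data preparation: drop records that contain missing values."""
    return [r for r in rows if all(v is not None for v in r.values())]


def integrate(*sources):
    """Data integration: combine records from several sources into one list."""
    return [row for src in sources for row in src]


def select(rows, fields):
    """Data selection: keep only the attributes relevant to the task."""
    return [{f: r[f] for f in fields} for r in rows]


def transform(rows, field):
    """Data transformation: min-max scale one numeric field to the range [0, 1]."""
    lo = min(r[field] for r in rows)
    hi = max(r[field] for r in rows)
    span = (hi - lo) or 1.0  # avoid division by zero when all values are equal
    return [{**r, field: (r[field] - lo) / span} for r in rows]


def discover(rows, field, threshold=0.5):
    """Pattern discovery (toy version): split records into two groups by a threshold."""
    low = [r for r in rows if r[field] < threshold]
    high = [r for r in rows if r[field] >= threshold]
    return low, high


def evaluate(low, high, field):
    """Evaluation: compare the group means to see whether the split is meaningful."""
    return mean(r[field] for r in low), mean(r[field] for r in high)


clean = integrate(prepare(source_a), prepare(source_b))
selected = select(clean, ["age", "spend"])
scaled = transform(selected, "age")
younger, older = discover(scaled, "age")
younger_spend, older_spend = evaluate(younger, older, "spend")
print("mean spend, younger group:", younger_spend)
print("mean spend, older group:", older_spend)
```

Each function corresponds to one task in the list, so the order of the calls at the bottom mirrors
the order of the steps.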
Definition of Data Warehouse
From its definition, I conclude that a data warehouse is a centralized hub for data analysis [3].
Key characteristics of a data warehouse:
Subject-oriented: Data is organized around business subjects (e.g., sales, customers, products)
rather than by application source.
Integrated: Data from multiple sources is transformed and cleansed to ensure consistency.
Read-optimized: Designed for querying and analysis, not for real-time transactions.
Data warehouses are a cornerstone of Business Intelligence (BI) systems, providing a foundation
for data exploration, reporting, and data mining activities.
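As a rough illustration of the "integrated" and "subject-oriented" characteristics, the sketch
below loads sales from two hypothetical source applications into one sales table using Python's
built-in sqlite3 module. The source formats, table name, and column names are assumptions made for
this example, not part of any real system.

```python
import sqlite3

# Hypothetical extracts from two source applications with different conventions.
web_orders = [("2024-01-05", "Widget", "12.50")]  # ISO date, amount as text
store_orders = [("05/01/2024", "widget", 2000)]   # DD/MM/YYYY date, amount in cents


def normalize_web(row):
    """Cleanse a web order: lower-case the product and convert the amount to a number."""
    date, product, amount = row
    return (date, product.lower(), float(amount), "web")


def normalize_store(row):
    """Cleanse a store order: convert the date to ISO format and cents to currency units."""
    date, product, cents = row
    day, month, year = date.split("/")
    return (f"{year}-{month}-{day}", product.lower(), cents / 100, "store")


conn = sqlite3.connect(":memory:")
# One table organized around the business subject "sales", not around a source application.
conn.execute(
    "CREATE TABLE sales (sale_date TEXT, product TEXT, amount REAL, channel TEXT)"
)
rows = [normalize_web(r) for r in web_orders] + [normalize_store(r) for r in store_orders]
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)
conn.commit()

# Read-only analytical query over the integrated data.
total_by_product = conn.execute(
    "SELECT product, SUM(amount) FROM sales GROUP BY product"
).fetchall()
print(total_by_product)
```

Because both sources are cleansed into the same format before loading, the query can treat
"Widget" and "widget" as the same product.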
OLTP vs. OLAP: Powerhouses for Different Data Processing Needs [5]:
OLTP (On-Line Transaction Processing) and OLAP (On-Line Analytical Processing) are two
fundamental data processing systems, but they serve distinct purposes.
Purpose:
OLTP: Focuses on handling a high volume of short, concurrent transactions in real-time.
Imagine processing an online bank transfer or updating inventory levels after a sale.
OLAP: Concentrates on analyzing large datasets to identify trends and patterns. Think
about analyzing sales data over years to understand customer buying habits.
Data Model:
OLTP: Employs normalized database structures to minimize data redundancy and ensure
data integrity for transactions. This avoids inconsistencies when multiple users modify
the same data.
OLAP: Often utilizes denormalized or multidimensional data models. These models may
contain some redundancy but allow faster retrieval and analysis of complex relationships
within the data.
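The difference between the two data models can be sketched with Python's built-in sqlite3 module.
The tables, names, and values below are hypothetical. The normalized tables store each customer's
city once, while the denormalized table repeats it on every order row so that analysis needs no
join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized (OLTP-style): each customer's city is stored exactly once.
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        amount REAL
    );
    INSERT INTO customers VALUES (1, 'Ana', 'North'), (2, 'Ben', 'South');
    INSERT INTO orders VALUES (1, 1, 100.0), (2, 1, 50.0), (3, 2, 70.0);

    -- Denormalized (OLAP-style): the city is repeated on every order row.
    CREATE TABLE orders_flat AS
        SELECT o.id AS order_id, c.name, c.city, o.amount
        FROM orders o JOIN customers c ON c.id = o.customer_id;
""")

# The normalized model needs a join to answer "total sales per city".
normalized = conn.execute("""
    SELECT c.city, SUM(o.amount)
    FROM orders o JOIN customers c ON c.id = o.customer_id
    GROUP BY c.city ORDER BY c.city
""").fetchall()

# The denormalized model answers the same question from a single table.
denormalized = conn.execute(
    "SELECT city, SUM(amount) FROM orders_flat GROUP BY city ORDER BY city"
).fetchall()

print(normalized)
print(denormalized)
```

Both queries give the same totals. The trade-off is that if a customer moves to another city, the
normalized model changes one row, while the flat table has to change every order row for that
customer.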
Performance:
OLTP: Prioritizes fast response times for individual transactions. This ensures users
experience minimal delays when performing tasks like placing orders or checking
account balances.
OLAP: Focuses on efficient retrieval of large datasets for analysis. Query times might
be longer compared to OLTP, but the goal is to provide comprehensive insights.
Users:
OLTP: Supports operational tasks. Users include tellers, customer service
representatives, and anyone involved in day-to-day transactions.
OLAP: Empowers data analysis. Users include business analysts, data scientists, and
managers who seek insights from historical data to make informed decisions.
Example:
Online Banking:
OLTP: When you transfer funds online, the system performs an OLTP transaction,
debiting your account and crediting the recipient's account in real time.
OLAP: Later, a data analyst might use OLAP to analyze historical transaction data to
identify trends in customer spending habits or detect potential fraudulent activity.
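The online banking example can be sketched in Python with the built-in sqlite3 module. The account
names, balances, and table layout are hypothetical. The transfer function stands for the OLTP
side, where the debit and the credit must succeed or fail together, and the final query stands for
the OLAP side, where the stored history is summarized later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance REAL)")
conn.execute("CREATE TABLE transfers (sender TEXT, recipient TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)", [("alice", 500.0), ("bob", 100.0)]
)
conn.commit()


def transfer(conn, sender, recipient, amount):
    """OLTP-style operation: the debit and the credit succeed or fail together."""
    with conn:  # commits on success, rolls back if an exception is raised
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE id = ?", (sender,)
        ).fetchone()
        if balance < amount:
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, sender)
        )
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, recipient)
        )
        conn.execute(
            "INSERT INTO transfers VALUES (?, ?, ?)", (sender, recipient, amount)
        )


transfer(conn, "alice", "bob", 200.0)
transfer(conn, "alice", "bob", 50.0)

# OLAP-style question asked later over the accumulated history.
spending = conn.execute(
    "SELECT sender, COUNT(*), SUM(amount) FROM transfers GROUP BY sender"
).fetchall()
balances = dict(conn.execute("SELECT id, balance FROM accounts"))
print(spending)
print(balances)
```

In a real organization the analytical query would normally run on a separate data warehouse, not
on the same database that handles the live transfers. A single in-memory database is used here
only to keep the sketch short.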
In essence, OLTP and OLAP work together. OLTP systems provide the raw data for everyday
operations, which is then fed into data warehouses for OLAP analysis. Understanding these
differences is crucial for designing data processing solutions tailored to an organization's specific
needs.
CONCLUSION
Finally, I conclude by summarizing data mining, data warehouses, and OLTP and OLAP.
In simple terms, data mining means searching through, or digging into, large amounts of data to
extract what is valuable. A data warehouse is a central repository that stores data collected from
sources across an organization. Last but not least, OLTP (On-Line Transaction Processing) and
OLAP (On-Line Analytical Processing) are two fundamental data processing systems: OLTP
handles day-to-day transactions, while OLAP supports the analysis of historical data.
References
[1] P.-N. Tan, M. Steinbach, and V. Kumar, Introduction to Data Mining, 2nd ed.