NAME: KISALA MICHEAL
INDEX NO. 2019-FEB-BIT-B224739-WKD
COURSE UNIT: DATA MINING AND WAREHOUSING
LECTURER: DR GOLOBA
STUDENT NO. 1800103221
QUESTION ONE
1) In the context of Uganda's Ministry of Economic Planning, discuss what is meant by the
following terms when describing the characteristics of data in a data warehouse.
A data warehouse is "a subject-oriented, integrated, time-variant and non-volatile
collection of data in support of management's decision-making process".
Subject Oriented: Data that gives information about a particular subject instead of
about a company's ongoing operations.
Integrated: Data that is gathered into the data warehouse from a variety of sources and
merged into a coherent whole.
Time-variant: All data in the data warehouse is identified with a particular time period.
Non-volatile: Data is stable in a data warehouse. More data is added, but data is never removed.
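The time-variant and non-volatile properties above can be sketched with a tiny append-only store; the subject name, figures, and dates below are illustrative only, not real ministry data:

```python
from datetime import date

# A tiny append-only fact store: every row carries a time period
# (time-variant) and rows are only ever added, never updated or
# deleted (non-volatile).
warehouse = []

def load_fact(subject, value, period):
    warehouse.append({"subject": subject, "value": value, "period": period})

load_fact("gdp_estimate", 104.2, date(2020, 12, 31))
load_fact("gdp_estimate", 109.8, date(2021, 12, 31))

# Both historical snapshots remain queryable side by side.
history = [r["value"] for r in warehouse if r["subject"] == "gdp_estimate"]
print(history)  # [104.2, 109.8]
```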
ii) If you were to develop a data warehouse for Uganda's Ministry of Economic Planning,
explain to management how online transaction processing (OLTP) systems would differ from
online analytical processing (OLAP) systems.
The major distinguishing features between OLTP and OLAP are summarized as
follows.
1. Users and system orientation: An OLTP system is customer-oriented and is used for
transaction and query processing by clerks, clients, and information technology professionals.
An OLAP system is market-oriented and is used for data analysis by knowledge workers,
including managers, executives, and analysts.
2. Data contents: An OLTP system manages current data that, typically, are too detailed to be
easily used for decision making. An OLAP system manages large amounts of historical data,
provides facilities for summarization and aggregation, and stores and manages information at
different levels of granularity. These features make the data easier for use in informed decision
making.
3. Database design: An OLTP system usually adopts an entity-relationship (ER) data model
and an application oriented database design. An OLAP system typically adopts either a
star or snowflake model and a subject-oriented database design.
4. View: An OLTP system focuses mainly on the current data within an enterprise or
department, without referring to historical data or data in different organizations. In
contrast, an OLAP
system often spans multiple versions of a database schema. OLAP systems also deal
with information that originates from different organizations, integrating information from
many data stores. Because of their huge volume, OLAP data are stored on multiple storage
media.
5. Access patterns: The access patterns of an OLTP system consist mainly of short, atomic
transactions. Such a system requires concurrency control and recovery mechanisms.
However, accesses to OLAP systems are mostly read-only operations, although many of them may be
complex queries.
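The access-pattern contrast can be sketched on one toy table: an OLTP-style lookup touches a single current row, while an OLAP-style query aggregates across history. The rows and column names below are illustrative only:

```python
# Illustrative sales records; in a real system the OLTP and OLAP
# workloads would run against separate, differently designed stores.
sales = [
    {"year": 2020, "region": "Central", "amount": 120},
    {"year": 2020, "region": "Western", "amount": 80},
    {"year": 2021, "region": "Central", "amount": 150},
]

# OLTP-style access: short and atomic, touches one row.
def lookup(year, region):
    return next(r for r in sales if r["year"] == year and r["region"] == region)

# OLAP-style access: read-only aggregation across many rows.
def total_by_year(year):
    return sum(r["amount"] for r in sales if r["year"] == year)

print(lookup(2021, "Central")["amount"])  # 150
print(total_by_year(2020))                # 200
```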
iii) Discuss the three main tasks which will be associated with the administration and
management of Uganda's Ministry of Economic Planning data warehouse.
QUESTION TWO
1) Data warehouse architecture consists of many components. Explain the role of each
component in case you were designing a data warehouse for the Ministry of Health.
1. A database, data warehouse, or other information repository, which consists of a
set of databases, data warehouses, spreadsheets, or other kinds of information
repositories containing the relevant Ministry of Health information.
2. A database or data warehouse server which fetches the relevant data based on
users’ data mining requests.
3. A knowledge base that contains the domain knowledge used to guide the search
or to evaluate the interestingness of resulting patterns. For example,
the knowledge base may contain metadata which describes data from
multiple heterogeneous sources.
4. A data mining engine, which consists of a set of functional modules for tasks
such as characterization, association, classification, cluster analysis, and
evolution and deviation analysis.
5. A pattern evaluation module that works in tandem with the data
mining modules by employing interestingness measures to help focus the
search toward interesting patterns.
6. A graphical user interface that allows the user to interact with the system.
ii) Describe the processes which will be associated with data extraction, cleansing, and
transformation when designing a data warehouse for the Ministry of Health.
EXTRACT
Some of the data elements in the operational database can reasonably be expected to be
useful in decision making, but others are of less value for that purpose. For this
reason, it is necessary to extract the relevant data from the operational database before
bringing it into the data warehouse. Many commercial tools are available to help with the
extraction process; Data Junction is one such commercial product. The user of one of these
tools typically has an easy-to-use windowed interface by which to specify the following:
(i) Which files and tables are to be accessed in the source database?
(ii) Which fields are to be extracted from them? This is often done internally by an
SQL SELECT statement.
(iii) What are those fields to be called in the resulting database?
(iv) What is the target machine and database format of the output?
(v) On what schedule should the extraction process be repeated?
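Points (i) to (iii) above can be sketched as a small extraction driven by a declarative specification. The SQLite database, table, and column names below are hypothetical, invented purely for illustration:

```python
import sqlite3

# Extraction spec: which table (i), which fields (ii), and what the
# fields are to be called in the resulting database (iii).
spec = {
    "source_table": "patients",
    "fields": ["patient_id", "district", "visit_date"],
    "rename": {"visit_date": "admission_date"},
}

# A throwaway in-memory operational database standing in for the source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (patient_id, district, visit_date, notes)")
conn.execute("INSERT INTO patients VALUES (1, 'Kampala', '2021-03-01', 'x')")

cols = ", ".join(spec["fields"])
rows = conn.execute(f"SELECT {cols} FROM {spec['source_table']}").fetchall()
target_names = [spec["rename"].get(f, f) for f in spec["fields"]]
extracted = [dict(zip(target_names, r)) for r in rows]
print(extracted)  # [{'patient_id': 1, 'district': 'Kampala', 'admission_date': '2021-03-01'}]
```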
TRANSFORM
The operational databases can be based on any set of priorities, which keep changing
with the requirements. Therefore, those who develop a data warehouse based on these
databases are typically faced with inconsistency among their data sources. The transformation
process deals with rectifying any such inconsistencies.
One of the most common transformation issues is attribute naming inconsistency. It is
common for a given data element to be referred to by different names in different
databases; Employee Name may be EMP_NAME in one database and ENAME in another.
Thus one set of data names is picked and used consistently in the data warehouse. Once all
the data elements have the right names, they must be converted to common formats. The
conversion may encompass the following:
(i) Characters must be converted from ASCII to EBCDIC or vice versa.
(ii) Mixed text may be converted to all uppercase for consistency.
(iii) Numerical data must be converted into a common format.
(iv) Data formats have to be standardized.
(v) Measurements may have to be converted (e.g., Rs to $).
(vi) Coded data (Male/Female, M/F) must be converted into a common format.
All these transformation activities can be automated, and many commercial products are available
to perform the tasks. Data MAPPER from Applied Database Technologies is one such
comprehensive tool.
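The naming and format conversions above can be sketched in a few lines. The source field names EMP_NAME and ENAME come from the text; the other field names, mappings, and sample values are illustrative assumptions:

```python
# Map inconsistent source names onto one warehouse name, and map
# coded values (M/F) onto a common format.
NAME_MAP = {"EMP_NAME": "employee_name", "ENAME": "employee_name",
            "SEX": "gender", "GENDER": "gender"}
CODE_MAP = {"M": "Male", "F": "Female", "Male": "Male", "Female": "Female"}

def transform(record):
    out = {}
    for field, value in record.items():
        name = NAME_MAP.get(field, field.lower())
        if name == "gender":
            value = CODE_MAP[value]          # coded data to a common format
        elif isinstance(value, str):
            value = value.upper()            # mixed text to uppercase
        out[name] = value
    return out

row_a = transform({"EMP_NAME": "Kisala", "SEX": "M"})
row_b = transform({"ENAME": "Achen", "GENDER": "Female"})
print(row_a)  # {'employee_name': 'KISALA', 'gender': 'Male'}
print(row_b)  # {'employee_name': 'ACHEN', 'gender': 'Female'}
```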
CLEANSING
Information quality is the key consideration in determining the value of the information. The
developer of the data warehouse is not usually in a position to change the quality of its
underlying historic data, though a data warehousing project can put a spotlight on data quality
issues and lead to improvements for the future. It is, therefore, usually necessary to go
through the data entered into the data warehouse and make it as error free as possible. This
process is known as Data Cleansing.
Data cleansing must deal with many types of possible errors. These include missing data and
incorrect data at one source, and inconsistent data and conflicting data when two or more sources are
involved. There are several algorithms followed to clean the data, which will be discussed in
the coming lecture notes.
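A minimal cleansing pass over records merged from two sources might drop superseded duplicates, flag missing values, and resolve conflicts by preferring the newer source. All record contents and the preference rule below are illustrative assumptions:

```python
# Two hypothetical source extracts; source_a is assumed to be newer.
source_a = [{"id": 1, "district": "Gulu"}, {"id": 2, "district": None}]
source_b = [{"id": 1, "district": "Gulu Town"}, {"id": 2, "district": None}]

def cleanse(primary, secondary):
    merged, problems = {}, []
    for rec in secondary + primary:   # primary (newer) overwrites conflicts
        merged[rec["id"]] = rec
    for rec in merged.values():       # flag missing data for follow-up
        if rec["district"] is None:
            problems.append(("missing district", rec["id"]))
    return list(merged.values()), problems

clean, issues = cleanse(source_a, source_b)
print(len(clean))   # 2
print(issues)       # [('missing district', 2)]
```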
iii) Describe real-time and near-real-time data warehouses in the context of the Ministry of Health.
Real-time data is data that’s collected, processed, and analyzed on a continual basis. It’s
information that’s available for use immediately after being generated. Near real-time data is a
snapshot of historical data, so teams are left viewing a situation as it existed in the recent past
rather than as it is now. Batched data is even slower and may be days old by the time it’s ready for
use.
There’s no industry-standard definition of how much time needs to elapse before real-time data
transitions to near real-time data. But as a general rule, real-time data is measured in seconds,
whereas near real-time data may be days old by the time the BI team works through their queue to
provide a report. And with spreadsheet extracts, the data may be even older by the time the
decision-maker receives it. Powered by a modern cloud data platform, with its centralized data
stores and ability to provide nearly unlimited computing power, there is no need to settle for
batched or even near-real-time data when you need to conduct analyses.
iv) Discuss how Nkumba University data marts would differ from data warehouses and identify
the main reasons for implementing data marts for Nkumba University.
Size: a data mart is typically less than 100 GB; a data warehouse is typically larger than 100 GB
and often a terabyte or more.
Range: a data mart is limited to a single focus for one line of business; a data warehouse is
typically enterprise-wide and ranges across multiple areas.
Sources: a data mart includes data from just a few sources; a data warehouse stores data from
multiple sources.
The main reasons for implementing data marts for Nkumba University would be to give each
faculty or department fast, focused access to its own data, to keep costs and build time low
compared with a full enterprise warehouse, and to let end users query a smaller, simpler data set.
QUESTION FIVE
(i) The most common applications of data mining include:
Mobile Service Providers.
Retail Sector.
Artificial Intelligence.
Ecommerce.
Science and Engineering.
Crime Prevention.
Research.
Farming.
(ii) Data mining is the process of extracting valid, previously unknown, comprehensible, and
actionable information from large databases and using it to make crucial business decisions.
There are four main operations associated with data mining techniques which include:
• Predictive modeling
• Database segmentation
• Link analysis
• Deviation detection.
Techniques are specific implementations of the data mining operations. However, each operation
has its own strengths and weaknesses. With this in mind, data mining tools sometimes offer a
choice of operations to implement a technique.
Predictive Modeling
It is designed on a similar pattern to the human learning experience of using observations to form a
model of the important characteristics of some task, and it corresponds to the real world. It is
developed using a supervised learning approach, which has two phases: training and testing.
The training phase is based on a large sample of historical data called a training set, while testing
involves trying out the model on new, previously unseen data to determine its accuracy and
performance characteristics.
It is commonly used in customer retention management, credit approval, cross-selling, and direct
marketing. There are two techniques associated with predictive modeling. These are:
• Classification
• Value prediction
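The two phases described above can be sketched with a toy threshold classifier; the records, the split into training and testing sets, and the learning rule are all illustrative assumptions:

```python
# Hypothetical historical records: (years_rented, bought 0/1).
history = [(1, 0), (2, 0), (3, 1), (4, 1), (2, 0), (5, 1)]

train, test = history[:4], history[4:]

# Training phase: learn a cutoff from the buyers in the training set.
buyers = [x for x, y in train if y == 1]
threshold = sum(buyers) / len(buyers) - 0.5

def predict(years):
    return 1 if years >= threshold else 0

# Testing phase: measure accuracy on previously unseen records.
accuracy = sum(predict(x) == y for x, y in test) / len(test)
print(accuracy)  # 1.0
```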
Classification
Classification is used to classify records into a finite set of possible class values. There are
two specializations of classification: tree induction and neural induction. Consider an example of
classification using tree induction.
In this example, we are interested in predicting whether a customer who is currently renting
property is likely to be interested in buying property. A predictive model has determined that only
two variables are of interest: the length of time the customer has rented property and the age of the
customer. The model predicts that those customers who have rented for more than two years and
are over 25 years old are the most likely to be interested in buying property. An example of
classification using neural induction is shown in the figure.
A neural network contains collections of connected nodes with input, output, and processing at
each node. Between the visible input and output layers may be a number of hidden processing
layers. Each processing unit (circle) in one layer is connected to each processing unit in the next
layer by a weighted value, expressing the strength of the relationship. This approach is an attempt
to copy the way the human brain works in recognizing patterns by arithmetically combining all the
variables associated with a given data point.
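The tree-induction example above reduces to two successive tests; a minimal sketch of that decision tree, with the cutoffs taken directly from the text:

```python
def likely_to_buy(years_rented, age):
    # Root split: has the customer rented for more than two years?
    if years_rented > 2:
        # Second split: is the customer over 25 years old?
        if age > 25:
            return True
    return False

print(likely_to_buy(3, 30))  # True
print(likely_to_buy(3, 22))  # False
print(likely_to_buy(1, 40))  # False
```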
Value prediction
It uses the traditional statistical techniques of linear regression and nonlinear regression. These
techniques are easy to use and understand. Linear regression attempts to fit a straight line through a
plot of the data, such that the line is the best representation of the average of all observations at that
point in the plot. The problem with linear regression is that the technique only works well with
linear data and is sensitive to those data values which do not conform to the expected norm.
Although nonlinear regression avoids the main problems of linear regression, it is still not flexible
enough to handle all possible shapes of the data plot. This is where the traditional statistical
analysis methods and data mining methods begin to diverge. Applications of value prediction
include credit card fraud detection and target mailing list identification.
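Value prediction by linear regression fits a line through the observations by least squares; the (x, y) points below are made up for illustration:

```python
# Illustrative observations; y is roughly 2x with a little noise.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.1, 8.0]

# Ordinary least-squares fit of y = slope * x + intercept.
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
intercept = my - slope * mx

def predict(x):
    return slope * x + intercept

print(round(predict(5.0), 1))  # 10.0
```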
Database Segmentation
A segment is a group of similar records that share a number of properties. The aim of database
segmentation is to partition a database into an unknown number of segments, or clusters.
This approach uses unsupervised learning to discover homogeneous sub-populations in a database
to improve the accuracy of the profiles. Applications of database segmentation include customer
profiling, direct marketing, and cross-selling
As shown in the figure, using database segmentation we identify the clusters that correspond to
legal tender and to forgeries. Note that there are two clusters of forgeries, which is attributed to at
least two gangs of forgers working on falsifying the banknotes.
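Unsupervised segmentation can be sketched with a tiny one-dimensional k-means that splits records into two clusters without any labels; the measurements and the crude initialization are illustrative assumptions:

```python
# Hypothetical one-dimensional measurements forming two groups.
values = [1.0, 1.2, 0.9, 8.0, 8.3, 7.8]
centers = [values[0], values[3]]  # crude initial guesses

for _ in range(10):
    # Assign each value to its nearest center, then recompute centers.
    clusters = [[], []]
    for v in values:
        nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
        clusters[nearest].append(v)
    centers = [sum(c) / len(c) for c in clusters]

print(sorted(clusters[0]))  # [0.9, 1.0, 1.2]
print(sorted(clusters[1]))  # [7.8, 8.0, 8.3]
```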
Link Analysis
Link analysis aims to establish links, called associations, between individual records, or sets of
records, in a database. There are three specializations of link analysis. These are:
• Associations discovery
• Sequential pattern discovery
• Similar time sequence discovery.
Associations discovery finds items that imply the presence of other items in the same event.
Association rules are used to define associations. For example: 'when a customer rents
property for more than two years and is more than 25 years old, in 40% of cases the customer will
buy a property. This association happens in 35% of all customers who rent properties.'
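The two percentages in such a rule are its confidence and support, which can be computed directly; the customer records below are fabricated so the arithmetic is checkable:

```python
# Hypothetical customer records for the rule
# "rents > 2 years AND age > 25  =>  buys a property".
customers = [
    {"long_rent": True,  "over_25": True,  "buys": True},
    {"long_rent": True,  "over_25": True,  "buys": False},
    {"long_rent": False, "over_25": True,  "buys": False},
    {"long_rent": True,  "over_25": False, "buys": False},
]

antecedent = [c for c in customers if c["long_rent"] and c["over_25"]]
both = [c for c in antecedent if c["buys"]]

support = len(both) / len(customers)      # rule holds in this share of all records
confidence = len(both) / len(antecedent)  # rule holds when the antecedent does

print(support)     # 0.25
print(confidence)  # 0.5
```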
Sequential pattern discovery finds patterns between events such that the presence of one set of
items is followed by another set of items in a database of events over a period of time. For example,
this approach can be used to understand long-term customer buying behavior.
Similar time sequence discovery is used in the discovery of links between two sets of data that are
time-dependent. For example, within three months of buying property, new home owners will
purchase goods such as cookers, freezers, and washing machines.
Applications of link analysis include product affinity analysis, direct marketing, and stock price
movement.
Deviation Detection
Deviation detection is a relatively new technique in terms of commercially available data mining
tools. However, deviation detection is often a source of true discovery because it identifies outliers,
which express deviation from some previously known expectation and norm. This operation can
be performed using statistics and visualization techniques.
Applications of deviation detection include fraud detection in the use of credit cards and insurance
claims, quality control, and defects tracing.
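A simple statistical form of deviation detection flags values far from the mean; the claim amounts and the two-standard-deviation cutoff below are illustrative assumptions:

```python
import statistics

# Hypothetical insurance claim amounts; one is suspiciously large.
claims = [100, 110, 95, 105, 98, 102, 500]

mean = statistics.mean(claims)
sd = statistics.stdev(claims)

# Flag claims more than two standard deviations from the mean.
outliers = [c for c in claims if abs(c - mean) > 2 * sd]
print(outliers)  # [500]
```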
(iv) Data mining benefits include:
It helps companies gather reliable information.
It's an efficient, cost-effective solution compared to other data applications.
It helps businesses make profitable production and operational adjustments.
Data mining uses both new and legacy systems.
It helps businesses make informed decisions.