Revision Lecture
Copyright University of
Outline
• Topics
•Data Integration & Visualisation Intro
•Data Visualisation Fundamentals 1
•Data Visualisation Fundamentals 2
•Building your data visualisation
•Dashboards
•ETL 1
•ETL 2
•Trends 2
Data Integration & Visualisation Intro
• Data Integration and Visualisation General Concepts:
•Data Warehouse
•ETL
•Reports and Dashboards
• Tableau
• Questions and Answers
Data Warehouse
[Diagram: source systems (CRM, HR, ERP, Database, Flat Files) feed ETL, which loads the DW; the DW serves Reporting, Dashboards and Data Mining]
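As a rough illustration of the flow this slide names (sources, ETL, DW, reporting), the following Python sketch extracts rows from a flat file and a CRM-like list, transforms them into one shape, loads them into an in-memory SQLite table standing in for the warehouse, and runs a report query. All table names, column names and sample values are hypothetical, and SQLite is only a stand-in; the lecture does not prescribe a tool for this step.

```python
import csv
import io
import sqlite3

# Hypothetical sources: a flat file (CSV text) and records exported from a CRM.
FLAT_FILE = "customer,amount\nacme ,100\nGlobex,250\n"
CRM_ROWS = [{"Customer": "ACME", "Amount": "50"}]


def extract():
    """Read raw rows from both sources without changing them."""
    flat = list(csv.DictReader(io.StringIO(FLAT_FILE)))
    return flat, CRM_ROWS


def transform(flat, crm):
    """Bring both sources to one shape: (customer, amount) with clean types."""
    rows = [(r["customer"], r["amount"]) for r in flat]
    rows += [(r["Customer"], r["Amount"]) for r in crm]
    return [(name.strip().title(), float(amount)) for name, amount in rows]


def load(conn, rows):
    """Write the cleaned rows into the warehouse table."""
    conn.execute("CREATE TABLE sales (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)


def report(conn):
    """A reporting query over the loaded data: total amount per customer."""
    query = "SELECT customer, SUM(amount) FROM sales GROUP BY customer ORDER BY customer"
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")  # in-memory stand-in for the DW
load(conn, transform(*extract()))
totals = report(conn)
print(totals)
```

Keeping extract, transform and load as separate functions mirrors the three ETL stages, so each one can be checked on its own before the report query is run.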
Data Visualisation Fundamentals 1
• Data Literacy
• What is Data?
• Quantitative and Qualitative Variables
• How is data collected?
• Raw Data
• Data Sources
• High Quality Data
• Well-structured data
• Poorly formatted data
Poorly formatted data
•Variables (fields) are not in one column each, with a column header.
•Each different observation of the variable (values) is not in a different row.
•Titles are formatted as rows above the column headers or as extra columns.
•Extra columns and rows.
•Column headers formatted as subtitles and not in the first row.
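To make these symptoms concrete, here is a minimal Python sketch that takes a small, hypothetical export showing several of them (a title row above the headers, a blank row, an empty extra column, and one column per month) and reshapes it so that each variable has one column and each observation has one row. The sample data, the `tidy` function and the field names `region`, `month` and `sales` are assumptions made for illustration only.

```python
import csv
import io

# Hypothetical export: a title row above the headers, a blank row, an empty
# extra column, and one column per month instead of one "month" variable.
RAW = """Monthly Sales Report,,,
,,,
region,Jan,Feb,
North,10,12,
South,7,9,
"""


def tidy(raw_text, header="region"):
    """Return one dict per observation: region, month, sales."""
    rows = list(csv.reader(io.StringIO(raw_text)))
    # Skip title and blank rows: the table starts at the real header row.
    start = next(i for i, row in enumerate(rows) if row and row[0] == header)
    headers = rows[start]
    # Keep only columns that have a header, which drops the empty extra column.
    keep = [i for i, name in enumerate(headers) if name.strip()]
    records = []
    for row in rows[start + 1:]:
        if not any(cell.strip() for cell in row):
            continue  # ignore blank rows inside the data
        for i in keep[1:]:
            # Each month column becomes a value of a single "month" variable.
            records.append({"region": row[0], "month": headers[i], "sales": int(row[i])})
    return records


records = tidy(RAW)
for record in records:
    print(record)
```

The result has four rows (two regions by two months), which is the well-structured shape that visualisation tools generally expect.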
Data Visualisation Fundamentals 1
• Variable Types Recap
• Types of Qualitative Data
• Types of Quantitative Data
• Variables and Visualisations
• Aggregations
• Types of Graphs
• Distributions for discrete variables
• Distributions for continuous variables
Data Visualisation Fundamentals 2
• Business Intelligence Overview
• Create your story
• Identify your audience
• KPIs
• Building storyboard
Create your story - BIDF storyboard sections
Section 1 – Current State
Section 2 – Trends
Section 3 – Forecast
Three Traits of an Effective Visual
Data is clear | Make sure that the data is clear, both in purpose and display.
Visualisation fits the data | Whether you choose a chart or a text, be sure that you’re using the right visual for the job.
Exceptions are easy to spot | Whether you’re highlighting a comparison or outliers in the data, you should make it easy for your user to identify exceptions in the data.
• The preceding table was influenced by Edward Tufte, who is considered to be the
godfather of data visualization. His book The Visual Display of Quantitative Information,
2nd Edition (Graphics Press), is one of the most highly regarded books in the data
visualization field.
Dashboards
• Dashboard definition
• Purpose
• Best Practices
• Examples
• Final Considerations and Pitfalls
ETL 1
• Data Warehouse Recap
• Data Source
• ETL
ETL 2
• Data Integration
• Data Warehouse elements
• Data Warehouse Architecture
ETL 2
• Dimensional Modelling
•Facts
•Dimensions
• Schema
•Star
•Snowflake
•Cubes
• ER vs Dimensional Modelling
• Design Process
Four-Step Dimensional Design Process
Select Business Process
• The operational activity the model describes, such as taking an order or processing a
claim.
Declare the Grain
• Establishes exactly what a single fact table row represents. It becomes a binding contract
on the design and must be declared before choosing dimensions or facts, because every
candidate dimension or fact must be consistent with the grain.
Identify Dimensions
• Provide the “who, what, where, when, why, and how” context surrounding a business
process event. Dimensions contain the descriptive attributes used by BI applications for
filtering and grouping the facts.
Identify Facts
• Are the measurements that result from a business process event and are almost
always numeric.
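The design steps can be illustrated with a very small star schema. The sketch below assumes a hypothetical retail sales process with the grain "one row per product per day", two dimensions (product and date) and two numeric facts (quantity and revenue). The table names, columns and sample rows are invented for illustration, and SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimensions: descriptive context used for filtering and grouping.
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, iso_date TEXT, month TEXT);
-- Fact table. Grain: one row per product per day.
CREATE TABLE fact_sales (
    date_key INTEGER REFERENCES dim_date(date_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    quantity INTEGER,      -- numeric fact
    revenue REAL           -- numeric fact
);
""")
conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)",
                 [(1, "Tea", "Drinks"), (2, "Coffee", "Drinks"), (3, "Scone", "Food")])
conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)",
                 [(20240101, "2024-01-01", "2024-01"), (20240102, "2024-01-02", "2024-01")])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                 [(20240101, 1, 2, 5.0), (20240101, 3, 1, 3.0),
                  (20240102, 2, 4, 12.0), (20240102, 3, 2, 6.0)])

# Facts are aggregated; dimension attributes do the grouping.
revenue_by_category = conn.execute("""
    SELECT p.category, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_key = f.product_key
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()
print(revenue_by_category)
```

Because the grain is fixed first, every fact row can be joined to exactly one product and one date, which is what keeps the grouped totals consistent.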
Trends
• Trends
•Data Integration
•DWH
•Data Literacy
Examination Organisation
• Answer any TWO out of THREE Questions.
• If a word limit is not specified next to a QUESTION then EACH
QUESTION (e.g. Q1, Q2, Q3, etc.) has a word limit of 1000 words.
This limit excludes scanned images of diagrams or hand-written
formulas but includes images with hand-written text.
• Submit your answers to EACH QUESTION SEPARATELY to the
relevant submission point on Blackboard.