The document discusses schemas for the multidimensional model used in data warehousing, including star, snowflake, and fact constellation schemas, each with distinct characteristics and applications. It differentiates between data warehouses and data marts, explaining their scopes and typical schemas used. Additionally, it covers the Data Mining Query Language (DMQL) for defining data warehouses and concept hierarchies that map low-level concepts to higher-level ones.

Uploaded by

gihanisa.singersl


6. Schemas for Multidimensional Model

Lesson Introduction

The entity-relationship data model is commonly used in the design of relational databases, where a
database schema consists of a set of entities and the relationships between them. Such a data model is
appropriate for online transaction processing. A data warehouse, however, requires a concise,
subject-oriented schema that facilitates online data analysis. The most popular data model for a data
warehouse is the multidimensional model. Such a model can exist in the form of a star schema, a snowflake
schema, or a fact constellation schema.

Learning Outcomes:
After successful completion of this lesson, students will be able to explain which schemas are
available for the multidimensional model, the difference between a data warehouse and a data mart,
how to define a data warehouse using DMQL, and what a concept hierarchy is.

Lesson Outline:
• What are the available schemas for the multidimensional model?
• What is a data mart?
• How to define a data warehouse using Data Mining Query Language (DMQL)?
• What is a concept hierarchy?

6.1 Star Schema

A star schema consists of a central fact table and a set of related dimension tables. It is the simplest of
the available schemas for multidimensional modeling. A star schema for sales is shown in Figure 6-1. Sales
are considered along four dimensions, namely, time, item, branch, and location. The schema contains a
central fact table for sales that includes keys to each of the four dimensions, along with two measures:
dollars sold and units sold.
Figure 6-1: Star Schema

Notice that in the star schema, each dimension is represented by only one table, and each table contains
a set of attributes. For example, the location dimension table lists the attribute set location key, street,
city, province or state, country. This constraint may introduce some redundancy. For example, “Vancouver”
and “Victoria” are both cities in the Canadian province of British Columbia. Entries for such cities in the
location dimension table will create redundancy among the attributes province or state and country, that
is, (..., Vancouver, British Columbia, Canada) and (..., Victoria, British Columbia, Canada).
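
The redundancy described above can be made concrete with a small sketch. The example below is only an illustration, assuming Python's built-in sqlite3 module as a stand-in for a warehouse database; the street values and sales figures are made up, and only the location dimension is built out in full.

```python
import sqlite3

# In-memory database; table and column names follow Figure 6-1.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE location (
    location_key INTEGER PRIMARY KEY,
    street TEXT, city TEXT, province_or_state TEXT, country TEXT
);
CREATE TABLE sales (
    time_key INTEGER, item_key INTEGER, branch_key INTEGER,
    location_key INTEGER REFERENCES location(location_key),
    dollars_sold REAL, units_sold INTEGER
);
""")

# Hypothetical rows: both cities repeat the province and country values.
conn.executemany(
    "INSERT INTO location VALUES (?, ?, ?, ?, ?)",
    [
        (1, "1 Main St", "Vancouver", "British Columbia", "Canada"),
        (2, "2 Oak Ave", "Victoria", "British Columbia", "Canada"),
    ],
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?, ?, ?)",
    [(1, 1, 1, 1, 100.0, 4), (1, 2, 1, 2, 50.0, 2), (2, 1, 1, 1, 25.0, 1)],
)

# A star schema needs only one join from the fact table to a dimension.
sales_by_province = conn.execute("""
    SELECT l.province_or_state, SUM(s.dollars_sold), SUM(s.units_sold)
    FROM sales AS s JOIN location AS l ON s.location_key = l.location_key
    GROUP BY l.province_or_state
""").fetchall()

# Count how often each (province, country) pair is stored in the dimension.
repeated_pairs = conn.execute("""
    SELECT province_or_state, country, COUNT(*)
    FROM location GROUP BY province_or_state, country
""").fetchall()

print(sales_by_province)
print(repeated_pairs)
```

The second query shows the pair (British Columbia, Canada) stored once per location row, which is the redundancy the single-table constraint introduces.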

6.2 Snowflake Schema

Normalizing a star schema converts it into a snowflake schema. The resulting schema graph forms a shape
similar to a snowflake. The significant difference between the snowflake and star schema models is that
the dimension tables of the snowflake model may be kept in a normalized form to reduce redundancies. Such
a table is easy to maintain and saves storage space. However, this saving of space is negligible in
comparison to the typical magnitude of the fact table. Furthermore, the snowflake structure can reduce
the effectiveness of browsing, since more joins are needed to execute a query.

A snowflake schema for sales is given in Figure 6-2. The single dimension table for the item in the star
schema is normalized in the snowflake schema, resulting in new item and supplier tables. For example,
the item dimension table now contains the attributes item key, item name, brand, type, and supplier key,
where supplier key is linked to the supplier dimension table, containing supplier key and supplier type
information. Similarly, the single dimension table for location in the star schema can be normalized into
two new tables: location and city. The city key in the new location table links to the city dimension.

Figure 6-2: Snowflake Schema
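
The extra join cost of the snowflake structure can be sketched as follows. This is a minimal illustration, again assuming Python's sqlite3 module; the rows are invented, the fact table is reduced to the columns needed here, and the city name column in the city table is an assumption added so that cities can be reported by name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- The location dimension is split in two: city attributes move to their own table.
CREATE TABLE city (
    city_key INTEGER PRIMARY KEY,
    city TEXT, province_or_state TEXT, country TEXT
);
CREATE TABLE location (
    location_key INTEGER PRIMARY KEY,
    street TEXT,
    city_key INTEGER REFERENCES city(city_key)
);
CREATE TABLE sales (
    location_key INTEGER REFERENCES location(location_key),
    dollars_sold REAL, units_sold INTEGER
);
""")

conn.executemany(
    "INSERT INTO city VALUES (?, ?, ?, ?)",
    [(1, "Vancouver", "British Columbia", "Canada"),
     (2, "Victoria", "British Columbia", "Canada")],
)
# Two hypothetical Vancouver streets share one city row instead of repeating it.
conn.executemany(
    "INSERT INTO location VALUES (?, ?, ?)",
    [(1, "1 Main St", 1), (2, "9 Pine Rd", 1), (3, "2 Oak Ave", 2)],
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, 100.0, 4), (2, 40.0, 1), (3, 50.0, 2)],
)

# Reaching a city attribute now takes two joins instead of one.
sales_by_city = conn.execute("""
    SELECT c.city, SUM(s.dollars_sold), SUM(s.units_sold)
    FROM sales AS s
    JOIN location AS l ON s.location_key = l.location_key
    JOIN city AS c ON l.city_key = c.city_key
    GROUP BY c.city
    ORDER BY c.city
""").fetchall()

print(sales_by_city)
```

Compared with the star layout, the city attributes are stored once per city rather than once per location, at the price of the second join.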

6.3 Fact Constellation Schema

A fact constellation schema consists of multiple fact tables and multiple dimensions. A fact constellation
schema is shown in Figure 6-3. This schema specifies two fact tables, sales and shipping. The sales table
definition is identical to that of the star schema (Figure 6-1). The shipping table has five dimensions, or
keys: item key, time key, shipper key, from location, and to location, and two measures: dollars cost and
units shipped. A fact constellation schema allows dimension tables to be shared between fact tables. For
example, the dimension tables for time, item, and location are shared between the sales and shipping
fact tables.

Figure 6-3: Fact Constellation Schema
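
The sharing of dimension tables can be checked mechanically. The sketch below is an illustration only: it records, for each fact table, which dimension table every key refers to, following the description of Figure 6-3. The dictionary layout and the function name shared_dimensions are assumptions made for this example.

```python
# Each fact table maps its key columns to the dimension table they reference.
# from_location and to_location are two roles played by the same location table.
fact_tables = {
    "sales": {
        "time_key": "time",
        "item_key": "item",
        "branch_key": "branch",
        "location_key": "location",
    },
    "shipping": {
        "time_key": "time",
        "item_key": "item",
        "shipper_key": "shipper",
        "from_location": "location",
        "to_location": "location",
    },
}


def shared_dimensions(schema):
    """Return the dimension tables referenced by every fact table."""
    dimension_sets = [set(keys.values()) for keys in schema.values()]
    return sorted(set.intersection(*dimension_sets))


shared = shared_dimensions(fact_tables)
print(shared)
```

Branch and shipper do not appear in the result because each is used by only one of the two fact tables.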

6.4 Data Warehouse vs. Data Mart

In data warehousing, there is a distinction between a data warehouse and a data mart. A data warehouse
collects information about subjects that span the entire organization, such as customers, items, sales,
assets, and personnel, and thus its scope is enterprise-wide. For data warehouses, the fact constellation
schema is commonly used, since it can model multiple, interrelated subjects. A data mart, on the other
hand, is a departmental subset of the data warehouse that focuses on selected subjects, and thus its scope
is department-wide. For data marts, the star and snowflake schemas are commonly used, since both are
geared toward modeling single subjects, although the star schema is more popular and efficient.
A summary of the comparison between a data warehouse and a data mart is given in Table 6-1.

Table 6-1: Data Warehouse vs Data Mart

6.5 Data Mining Query Language (DMQL)

Just as relational query languages like SQL can be used to specify relational queries, a data mining query
language can be used to specify data mining tasks. Data warehouses and data marts can be defined
using two language primitives, one for cube definition and one for dimension definition.

The cube definition statement has the following syntax:

define cube <cube_name> [<dimension_list>]: <measure_list>

The dimension definition statement has the following syntax:

define dimension <dimension_name> as (<attribute_or_subdimension_list>)

Star schema definition for Figure 6-1:

define cube sales_star [time, item, branch, location]:
    dollars_sold = sum(sales_in_dollars), avg_sales = avg(sales_in_dollars), units_sold = count(*)

define dimension time as (time_key, day, day_of_week, month, quarter, year)

define dimension item as (item_key, item_name, brand, type, supplier_type)

define dimension branch as (branch_key, branch_name, branch_type)

define dimension location as (location_key, street, city, province_or_state, country)

The define cube statement defines a data cube called sales_star, which corresponds to the central sales
fact table. This command specifies the dimensions and the measures. Besides the two measures shown in
Figure 6-1, dollars_sold and units_sold, the definition also includes a derived measure, avg_sales. The
data cube has four dimensions, namely, time, item, branch, and location. A define dimension statement
is used to define each of the dimensions.
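
To make the structure of the two statements explicit, the sketch below splits them into their parts. It is a toy parser written for this lesson, not an implementation of DMQL: the regular expressions, the function names parse_cube and parse_dimension, and the returned dictionary layout are all assumptions, and nested sub-dimension lists are deliberately not handled.

```python
import re

# define cube <cube_name> [<dimension_list>]: <measure_list>
CUBE_RE = re.compile(r"define cube (\w+) \[([^\]]*)\]:\s*(.*)", re.DOTALL)
# define dimension <dimension_name> as (<attribute_list>), flat lists only
DIMENSION_RE = re.compile(r"define dimension (\w+) as \((.*)\)")


def parse_cube(statement):
    """Split a define cube statement into name, dimensions, and measures."""
    match = CUBE_RE.fullmatch(statement.strip())
    if match is None:
        raise ValueError("not a define cube statement")
    name, dimensions, measures = match.groups()
    measure_map = {}
    for part in measures.split(","):
        measure_name, expression = part.split("=", 1)
        measure_map[measure_name.strip()] = expression.strip()
    return {
        "cube": name,
        "dimensions": [d.strip() for d in dimensions.split(",")],
        "measures": measure_map,
    }


def parse_dimension(statement):
    """Split a flat define dimension statement into name and attributes."""
    match = DIMENSION_RE.fullmatch(statement.strip())
    if match is None:
        raise ValueError("not a define dimension statement")
    name, attributes = match.groups()
    return {
        "dimension": name,
        "attributes": [a.strip() for a in attributes.split(",")],
    }


cube = parse_cube(
    "define cube sales_star [time, item, branch, location]:\n"
    "    dollars_sold = sum(sales_in_dollars), units_sold = count(*)"
)
branch = parse_dimension(
    "define dimension branch as (branch_key, branch_name, branch_type)"
)
print(cube)
print(branch)
```

The parsed result mirrors the syntax templates: the bracketed list becomes the cube's dimensions, and each name = aggregate pair becomes one measure.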

Snowflake schema definition for Figure 6-2:

define cube sales_snowflake [time, item, branch, location]:
    dollars_sold = sum(sales_in_dollars), avg_sales = avg(sales_in_dollars), units_sold = count(*)

define dimension time as (time_key, day, day_of_week, month, quarter, year)

define dimension item as (item_key, item_name, brand, type, supplier(supplier_key, supplier_type))

define dimension branch as (branch_key, branch_name, branch_type)

define dimension location as (location_key, street, city(city_key, province_or_state, country))

This definition is similar to that of sales_star given above, except that, here, the item and location
dimension tables are normalized. For instance, the item dimension of the sales_star data cube has been
normalized in the sales_snowflake cube into two dimension tables, item and supplier. Note that the
dimension definition for supplier is specified within the definition for item. Defining supplier in
this way implicitly creates a supplier key in the item dimension table definition. Similarly, the location
dimension of the sales_star data cube has been normalized in the sales_snowflake cube into two
dimension tables, location and city. The dimension definition for city is specified within the definition
of location. In this way, a city key is implicitly created in the location dimension table definition.

Fact constellation schema definition for Figure 6-3:

define cube sales [time, item, branch, location]:
    dollars_sold = sum(sales_in_dollars), avg_sales = avg(sales_in_dollars), units_sold = count(*)

define dimension time as (time_key, day, day_of_week, month, quarter, year)

define dimension item as (item_key, item_name, brand, type, supplier_type)

define dimension branch as (branch_key, branch_name, branch_type)

define dimension location as (location_key, street, city, province_or_state, country)

define cube shipping [time, item, shipper, from_location, to_location]:
    dollars_cost = sum(cost_in_dollars), units_shipped = count(*)

define dimension time as time in cube sales

define dimension item as item in cube sales

define dimension shipper as (shipper_key, shipper_name, location as location in cube sales, shipper_type)

define dimension from_location as location in cube sales

define dimension to_location as location in cube sales

A define cube statement is used to define each of the two data cubes, sales and shipping, which correspond
to the two fact tables. Note that the time, item, and location dimensions of the sales cube are shared
with the shipping cube. For the time dimension, for example, this is indicated by the statement
“define dimension time as time in cube sales”, which appears under the define cube statement for shipping.

6.6 Concept Hierarchies

A concept hierarchy defines a sequence of mappings from a set of low-level concepts to higher-level, more
general concepts. Consider a concept hierarchy for the dimension location. City values for location include
Vancouver, Toronto, New York, and Chicago. Each city, however, can be mapped to the province or state
to which it belongs. For example, Vancouver can be mapped to British Columbia, and Chicago to Illinois.
The provinces and states can, in turn, be mapped to the country to which they belong, such as Canada or
the USA. These mappings form a concept hierarchy for the dimension location, mapping a set of low-level
concepts (i.e., cities) to higher-level, more general concepts (i.e., countries).
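
The sequence of mappings can be written down directly as lookup tables. The following sketch is an illustration under stated assumptions: the dictionary names, the functions generalize and roll_up, and the sales figures are invented for this example, while the city, province or state, and country values follow the text.

```python
from collections import defaultdict

# Lowest level to middle level, then middle level to highest level.
CITY_TO_PROVINCE_OR_STATE = {
    "Vancouver": "British Columbia",
    "Toronto": "Ontario",
    "New York": "New York",
    "Chicago": "Illinois",
}
PROVINCE_OR_STATE_TO_COUNTRY = {
    "British Columbia": "Canada",
    "Ontario": "Canada",
    "New York": "USA",
    "Illinois": "USA",
}


def generalize(city, level):
    """Map a city to the requested level of the location hierarchy."""
    if level == "city":
        return city
    province_or_state = CITY_TO_PROVINCE_OR_STATE[city]
    if level == "province_or_state":
        return province_or_state
    if level == "country":
        return PROVINCE_OR_STATE_TO_COUNTRY[province_or_state]
    raise ValueError(f"unknown level: {level}")


def roll_up(sales_by_city, level):
    """Aggregate city-level totals to a higher level of the hierarchy."""
    totals = defaultdict(float)
    for city, dollars in sales_by_city.items():
        totals[generalize(city, level)] += dollars
    return dict(totals)


# Hypothetical city-level totals.
sales_by_city = {"Vancouver": 100.0, "Toronto": 200.0, "New York": 300.0, "Chicago": 50.0}
by_country = roll_up(sales_by_city, "country")
print(by_country)
```

Rolling up is the typical use of a concept hierarchy in a warehouse: a measure recorded at the city level is summarized at the province or state level, or at the country level.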

Figure 6-4: Concept hierarchy for the dimension location

The concept hierarchy described above is illustrated in Figure 6-4. Many concept hierarchies are implicit
within the database schema. For example, suppose that the dimension location is described by the
attributes number, street, city, province or state, zip code, and country. Several of these attributes
are related by a total order, forming a concept hierarchy such as “street < city < province or state <
country”. This hierarchy is shown in Figure 6-5.
Figure 6-5: Hierarchy for location
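
A total order of this kind can be represented as an ordered list of attribute names. The sketch below is a minimal illustration; the list name LOCATION_LEVELS and the two helper functions are assumptions made for this example.

```python
# Attributes listed from the most specific level to the most general one,
# following "street < city < province or state < country".
LOCATION_LEVELS = ["street", "city", "province_or_state", "country"]


def is_more_specific(lower, higher):
    """Return True if `lower` sits below `higher` in the hierarchy."""
    return LOCATION_LEVELS.index(lower) < LOCATION_LEVELS.index(higher)


def roll_up_path(level):
    """Return the levels a value at `level` can be generalized to, in order."""
    return LOCATION_LEVELS[LOCATION_LEVELS.index(level) + 1:]


print(is_more_specific("city", "country"))
print(roll_up_path("city"))
```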
