DWH Tables
DWH Tables
net/publication/340546587
CITATIONS READS
0 1,296
3 authors, including:
K Rajasekaran Saravanan P.
D B jain college Dhanraj baid jain college(Autonomous), Chennai
7 PUBLICATIONS 9 CITATIONS 28 PUBLICATIONS 34 CITATIONS
Some of the authors of this publication are also working on these related projects:
Analysis of Various Contrast Improvement Techniques for Dehazing an Image View project
All content following this page was uploaded by Saravanan P. on 10 April 2020.
Dr.K.Rajasekaran
Associate Professor, P.G. and Research Dept of Computer Science,
D.B.Jain College, Chennai, India
[email protected]
P.Saravanan
Assistant Professor, P.G. and Research Dept of Computer Science,
D.B.Jain College, Chennai, India
[email protected]
P.Prabakaran
Assistant Professor, P.G. and Research Dept of Computer Science,
D.B.Jain College, Chennai, India
[email protected]
Abstract- An Aim of focusing various types of tables and Schema in Data Warehouse. A Data Warehouse (DWH) is a
subject-oriented, integrated, time-variant, non-volatile and large amount of data in support of management’s decision-
making process. It is gaining importance day by day. The words On-Line Analytical Processing (OLAP) bring together a
set of tools that use multidimensional modeling in the extraction of information from the Data Warehouse.
Multidimensional data model is an connected part of On-Line Analytical Processing, or OLAP. It contains a
multidimensional view of huge amounts of historical data from operational sources, thus supplying useful information for
decision makers to improve their business intelligence which has become an integral part of decision making strategy.
There are two forms for data management i.e. operational databases and data warehouse. The operational databases are
where the put in data. Users of this type almost handle with one record at a time and they usually perform the same tasks.
The data warehouse is where we find the data away. Users of this kind of type almost deal with set of row at a time and
their questions require that thousands of rows be fetched into an answer set. The data stored in the warehouse is upload
from the operational system such as marketing, sales, finance etc. There are two types of Tables first one is
Dimension table and second one is Fact table.
Keywords - Dimensional modeling, Dimension table, Slowly Changing, Rapidly Changing, Junk, Inferred, Conformed,
Degenerate, Role Playing, Shrunken and Static Dimension table, Fact Table , Schema, Star, Snowflake and Fact
constellation (or) Multi-Star.
1. INTRODUCTION
The term Data Warehouse (DWH) was first coined by Bill Inmon in 1990. According to Bill Inmon, a data
warehouse is a subject-oriented, integrated, time-variant, non-volatile and large amount of data. This data (or)
information helps analyst to take informed decision in an organization.
An operational database undergoes frequent change on a daily basis on account of the communication that take
place. Assume a business executive wants to analyze earlier comment on any data such as a supplier, a product, or any
consumer data, then the executive will have no data available to study because the previous or old data has been updated
due to transactions.
A data warehouses provide us comprehensive and consolidate data in multidimensional view. Along with
comprehensive and consolidate view of data, a data warehouses also provide us Online Analytical Process (OLAP) tools.
These tools help us in interactive and efficient analysis of data in a multidimensional space. This analysis outcome in
Dimensional modeling (DM) is the design concept used by many data warehouse designers to build their data
warehouse. Dimensional model is the fundamental data model used by many of the commercial OLAP product available
today in the market. Dimensional model names a set of techniques and concepts used in design of data warehouse. It is
consider being different from entity-relationship modeling(ER). It is simpler, more communicative and easier to
understand than ER modeling. It is a technique for conceptualize and visualize data models as a set of measures that are
described by common aspects of the business. It is especially useful for summarizing and rearranging the data and
presenting views of the data to support data analysis.
Dimensional modeling focuses on numeric data such as values, counts, balances, weights, and occurrences. It does not
necessarily involve a relational database. The same model approach, at logical level, can be used for any physical form,
such as multidimensional database or even flat files. Dimensional model always uses the concept of facts (numeric
values) and dimensions (text). Facts are typically numeric values that can be aggregated and dimensions are groups of
hierarchies and descriptors that define the facts. For example, sales amount is a fact; register, timestamp, product, store,
etc. are elements of dimensions. Dimensional models are built by business process area, for example store sales,
inventory, claims, etc. Because the various business process areas share some but not all dimensions, efficiency in design,
operation, and consistency, is achieved using conformed dimensions, i.e. using one copy of the shared dimension across
subject areas. In this model, all data is contained in two types of tables called Dimension Table and Fact Table.
3. TABLES
A fact table consists of the measurements, metrics or facts of a business process. Tables mainly classified into two
types. There are
Dimensions Table
Fact Table
1) Location
2) Time
3) Product
Each dimension table contains data for one dimension. In above example get all your store location information and
put that into one single table called Location. Your store location data may be spanned across multiple tables in your
OLTP system (unlike OLAP), but you need to de-normalize all that data into one single table. A data warehouse require
concise, subject oriented schema that make possible on-line analysis. The majority of data model for data warehouse is
multidimensional model. There are nine types are Slowly Changing, Rapidly Changing, Junk, Inferred, Conformed,
Degenerate, Role Playing, Shrunken and Static Dimension table.
Slowly Changing Dimension- Attributes of a dimension that would undertake change over time. It depend on the
business requirement whether particular attribute history of change should be sealed in the data warehouse. This is called
a slowly changing attribute and a dimension contain such an attribute is called a slowly changing dimension.
Rapidly Changing Dimension- A dimension attribute that changes regularly is a rapidly altering attribute. If you
don’t need to track the change, the rapidly changing attribute is no problem, but if you do need to track the changes,
using a standard slowly changing dimension can result in a enormous increase of the size of the dimension. One solution
is to move the attribute to its own dimension, with a separate reference key in the fact table.
Junk Dimension - A junk dimension is a single table with a combination of various and dissimilar attributes to avoid
having a large number of reference keys in the fact table. Junk dimensions are often created to manage the reference keys
created by rapidly changing dimension.
Inferred Dimension- While loading fact(numeric) records, a dimension record may not yet be ready. One solution is
to generate a surrogate key with empty for all the other attributes. This should technically be called an inferred member,
but is often called an inferred dimension.
Conformed Dimension is used in various locations is called a conformed dimension. A conformed dimension may be
used with many fact tables in a single database, or across multiple data marts or data warehouses.
Degenerate Dimension is when the dimension attribute is store as part of fact table, and not in a part dimension
table. These are essentially dimension keys for which there are no other attributes. In a data warehouse, these are often
used as the result of a drill through query to examine the source of an aggregate number in a report. It can use these
Additive Facts- Additive facts can be used with any aggregation function like Sum( ), Avg( ) etc. Example is
Quantity, sales amount etc.
Semi Additive Facts- Semi-additive facts are those where only a few of aggregation function can be applied. For
example, Consider bank account details. You cannot apply the Sum( ) on the bank balance that does not give useful
results but min( ) and max( ) function may return useful information.
Non-Additive Facts- You cannot use numeric aggregation functions such as Sum( ), Avg( ) etc on Non-additive facts.
For example of non-additive fact is any kind of ratio or percentage. Non numerical values facts can also be a non-additive
facts.
Fact-less Fact Tables- A fact table that does not have any measure is a fact-less fact table. For example, a fact table
which has only product_ID and date key is a fact-less fact table.
Centipede Fact Table- Centipede fact table is a normalized fact table. Modeler may decide to normalize the fact
instead of snow flaking dimensions tables.
Conformed Fact Tables- It can be measures reused across multiple dimension model. For example, KPI such as
profit, sales revenue etc
Incident and Snapshot Facts- A fact table store some kind of measurement and captured against a specific time.
Now it might so happen that the business might unable to capture all of its measure always for every point in time. Then
those unavailable measurement can be kept empty (or) can be filled up with the last available measurements. The first
one is the example of incident fact and the second one is the example of snapshot fact.
Cumulative Fact- This type of fact table describes what has happened over a period of time. For example, this fact
table may bring out the total sales by product by store by day.
4. SCHEMA
Schema is a logical description of the entire database and Collection of Dimension and Fact table. It contain the
name and detail of records of all record types including all associated data-items and aggregate value. Similar to a database,
a data warehouse also require to maintain a schema. A database uses relational model, while a data warehouse uses Star,
Snowflake, and Fact Constellation schema or Multi-Star Schema.
4.1. Star Schema
Each dimension table directly connected to one central fact table is called. This dimension table contains the set of
attributes. The following diagram show the sales fact table data of a company with respect to the four dimension tables are
namely time, item, branch, and location. There is a fact table at the center. It contains the keys to each of four dimensions
tables like namely time_key, item_key, branh_key and location_key. The fact table also contain the attributes are namely
dollars_sold and units_sold.
Fact constellation has the entire dimension table directly connected to more than one central fact tables are called.
The following diagram show two fact tables, namely shipping and sales. Now the item dimension table contain the
attributes are item_key, item_name, brand, type, and supplier-key. The shipper_key is linked to the shipper dimension
table. The shipper dimension table contain the attributes are shipper _key , shipper _name and shipper _type.
5. CONCLUSION
This paper discuss the data warehouse types of tables such as dimension tables are Slowly Changing, Rapidly
Changing, Junk, Inferred, Conformed, Degenerate, Role Playing, Shrunken and Static Dimensions table and fact tables
are Additive Fact , Semi- Additive Fact and Non-Additive Fact table ,Multidimensional schemas such as star schema,
snowflake schema and fact constellation or multi-star schema. The advantage of using these table and schema are simpler
and communicative, easy to read then E-R models. It is useful for terse and rearranging the data and presenting views of
the data to support data analysis. Data warehouses (DWH) separate study workload from transaction workload and allow
an organization to consolidate data from several sources. Dimensional model is a logical design technique for structure
data so that it is instinctive to business users and delivers fast query presentation. Data presented to the business
intelligence tools must be grounded in simplicity to stand any chance of success. Simplicity is a fundamental requirement
because it ensures that users can easily understand databases, as well as allows software to efficiently navigate databases.
References
[1] Mandeep Kaur Sandhu, Amanjot Kaur, Ramandeep Kaur: Data Warehouse Schemas, International Journal of
Innovative Research in Advanced Engineering (IJIRAE) ISSN: 2349-2163, Issue 4, Volume 2 (April 2015), pp-47-
51.
[2] Boris Vrdoljak, Marko Banek, and Stefano Rizzi: Designing Web Warehouses from XML Schemas Y. Kambayashi,
M. Mohania, W. Wöß (Eds.): DaWaK 2003, LNCS 2737, pp. 89-98, 2003. SpringerVerlag Berlin Heidelberg 2003
[3] Wolfgang Hummer, Andreas Bauer, Gunnar Harde: XCube – XML For Data Warehouses, DOLAP’03, November
7, 003, USA.
[4] M. Golfarelli, S. Rizzi, and B. Vrdoljak, Data warehouse design from XML sources., Proc. DOLAP’01, Atlanta, pp.
40-47, 2001.
[5] Jiawei Han and Micheline Kamber , “Data Mining Concepts and Technique” ,2nd Edition, , Morgan Kaufmann
Publisher.
[6] Ralph Kimball and Margy Ross, “The Data Warehouse Toolkit: The Complete Guide to Dimensional Modelling”
2nd Edition, R, John Willy & Sons.
[7] Soumya Sen, Ranak Ghosh, Debanjali Paul, Nabendu Chaki “Integrating related XML data into multiple data
warehouse schemas” research paper 2012-2013.
[8] R. Patel, MS.Alpa, “Data Modeling techniques for data warehouse” International Journal of Multidisciplinary
Research, Vol.2 Issue 2, February 2012, ISSN 2231 5780.
[9] Keshav Dev Gupta, Jyoti Gupta, Prakati Prasoon, “Novel Architecture with Dimensional Approach of Data
Warehouse” International Journal of Advanced Research in Computer Science and Software Engineering, Volume
3, Issue 3, March 2013.
[10] Anirban Sarkar “Data Warehouse Requirements Analysis Framework: Business-Object Based Approach”
International Journal of Advanced Computer Science and Applications, Vol. 3, No. 1, 2012.
[11] https://www.folkstalk.com/2010/01/types-of-dimensions.html
[12] http://dwgeek.com/types-of-fact-tables-data-warehouse.html/