DataWarehouseInterview Part3
A snapshot is a complete view of the data as it existed at the time of extraction. It occupies
relatively little space and can be used to back up and restore data quickly.
A snapshot can also refer to a record of the activities performed. It is stored in report format
from a specific catalog, and the report is generated soon after the catalog is disconnected.
What is the main difference between Inmon and Kimball philosophies of Data Warehousing?
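The two philosophies differ in how the Data Warehouse is built. Kimball builds the data marts first
and combines them into the Data Warehouse. Inmon builds the Data Warehouse first and derives the
data marts from it.
Hence, the process will be as follows:
Kimball > First Data Marts > Combined > Data Warehouse
Inmon > First Data Warehouse > Data Marts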
What are the data validation strategies for data mart validation after the loading process?
Rapidly changing dimensions are dimensions whose attribute values change frequently, causing the
dimension to grow rapidly. This rapid growth affects both the maintenance and the performance of
the dimension.
Sometimes a dimension is defined that has no content except for its primary key. For example, when
an invoice has multiple line items, the line item fact rows inherit all the descriptive dimension
foreign keys of the invoice, and the invoice is left with no unique content. But the invoice number
remains a valid dimension key for fact tables at the line item level. This degenerate dimension is
placed in the fact table with the explicit acknowledgment that there is no associated dimension
table. Degenerate dimensions are most common with transaction and accumulating snapshot fact
tables.
For example, consider a fact table with the following keys: Date Key (FK), Product Key (FK),
Store Key (FK), Promotion Key (FK), and POS Transaction Number.
The Date Dimension corresponds to Date Key, and the Product Dimension corresponds to Product Key.
In a traditional parent-child database, POS Transaction Number would be the key to the transaction
header record that contains all the information valid for the transaction as a whole, such as the
transaction date and store identifier. But in this dimensional model, we have already extracted
this information into other dimensions. Therefore, POS Transaction Number looks like a dimension
key in the fact table but does not have a corresponding dimension table.
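The degenerate dimension described above can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumed names: the tables sales_fact and product_dim, the column sales_amount, and the sample rows are hypothetical and do not come from the original example. The point is that pos_transaction_number sits in the fact table with no dimension table behind it, yet it still groups line items into transactions.

```python
import sqlite3

# In-memory database; all table names, column names and rows are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_dim (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT
);
CREATE TABLE sales_fact (
    date_key               INTEGER,
    product_key            INTEGER REFERENCES product_dim(product_key),
    store_key              INTEGER,
    promotion_key          INTEGER,
    pos_transaction_number TEXT,   -- degenerate dimension: no dimension table
    sales_amount           REAL
);
""")

conn.executemany(
    "INSERT INTO product_dim VALUES (?, ?)",
    [(1, "Milk"), (2, "Bread")],
)
conn.executemany(
    "INSERT INTO sales_fact VALUES (?, ?, ?, ?, ?, ?)",
    [
        (20240101, 1, 10, 0, "T-1001", 2.50),
        (20240101, 2, 10, 0, "T-1001", 1.75),
        (20240101, 1, 10, 0, "T-1002", 2.50),
    ],
)

# Group line items back into transactions using only the degenerate dimension.
basket_totals = conn.execute(
    """
    SELECT pos_transaction_number, COUNT(*), SUM(sales_amount)
    FROM sales_fact
    GROUP BY pos_transaction_number
    ORDER BY pos_transaction_number
    """
).fetchall()

for txn, line_items, total in basket_totals:
    print(txn, line_items, total)
```

No join is needed to use the transaction number, because everything descriptive about the transaction already lives in the other dimensions.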
A surrogate key is a key that has no contextual or business meaning. It is manufactured
"artificially" and only for the purposes of data analysis. The most frequently used form of a
surrogate key is an increasing sequential integer or "counter" value (e.g., 1, 2, 3). Surrogate keys
can also be based on the current system date/time stamp or a random alphanumeric string.
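A minimal sketch of the sequential-counter approach, assuming an in-memory lookup from the source system's business key to the generated key. The class name SurrogateKeyGenerator, the method key_for, and the customer codes are hypothetical names chosen for illustration. A real warehouse would typically rely on a database sequence or identity column instead.

```python
from itertools import count


class SurrogateKeyGenerator:
    """Hands out increasing integer keys that carry no business meaning."""

    def __init__(self, start=1):
        self._counter = count(start)
        self._by_natural_key = {}

    def key_for(self, natural_key):
        # Reuse the surrogate key if this business key was seen before,
        # otherwise take the next value from the counter.
        if natural_key not in self._by_natural_key:
            self._by_natural_key[natural_key] = next(self._counter)
        return self._by_natural_key[natural_key]


gen = SurrogateKeyGenerator()
keys = [gen.key_for(code) for code in ["CUST-A17", "CUST-B02", "CUST-A17"]]
print(keys)
```

The repeated business key "CUST-A17" maps to the same surrogate key both times, while the key value itself reveals nothing about the customer.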
What does the level of granularity of a fact table signify?
Granularity is the lowest level of information stored in the fact table; the depth of the data
level is known as granularity. In a date dimension, for example, the level of granularity could be
year, quarter, month, period, week, or day.
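To illustrate what the grain means in practice, here is a small sketch that assumes facts are stored at the day level and rolls them up to coarser levels. The function roll_up and the sample rows are hypothetical. Data stored at a fine grain can always be summarized upward, but data stored only at month level could not be broken back down into days.

```python
from collections import defaultdict
from datetime import date

# Illustrative fact rows stored at day grain: (date, sales amount).
daily_sales = [
    (date(2024, 1, 15), 100.0),
    (date(2024, 1, 16), 50.0),
    (date(2024, 2, 1), 25.0),
]


def roll_up(rows, level):
    """Aggregate day-grain rows to the requested level of the date dimension."""
    totals = defaultdict(float)
    for day, amount in rows:
        if level == "day":
            key = day.isoformat()
        elif level == "month":
            key = f"{day.year}-{day.month:02d}"
        elif level == "year":
            key = str(day.year)
        else:
            raise ValueError(f"unsupported level: {level}")
        totals[key] += amount
    return dict(totals)


monthly = roll_up(daily_sales, "month")
yearly = roll_up(daily_sales, "year")
print(monthly)
print(yearly)
```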