Data Warehouse Lec 2
The figure shows the essential elements of a typical data warehouse. The Source Data component appears on the
left. The Data Staging component serves as the next building block. In the middle is the Data Storage
component, which handles the data warehouse's data. This component not only stores and manages the data; it also
keeps track of the data using the metadata repository. The Information Delivery component, shown on the right,
consists of all the different ways of making the information from the data warehouse available to the users.
1) Data Extraction: This function has to deal with numerous data sources. We have to employ the appropriate
technique for each data source.
2) Data Transformation: As we know, data for a data warehouse comes from many different sources. If data
extraction for a data warehouse poses big challenges, data transformation presents even greater challenges.
We perform several individual tasks as part of data transformation.
First, we clean the data extracted from each source. Cleaning may be the correction of misspellings, the
provision of default values for missing data elements, or the elimination of duplicates when we bring in the same
data from various source systems.
Standardization of data components forms a large part of data transformation. Data transformation also involves many
forms of combining pieces of data from different sources. We combine data from a single source record or related
data parts from many source records.
On the other hand, data transformation also involves purging source data that is not useful and separating
out source records into new combinations. Sorting and merging of data take place on a large scale in the data
staging area. When the data transformation function ends, we have a collection of integrated data that is cleaned,
standardized, and summarized.
3) Data Loading: Two distinct categories of tasks form the data loading function. When we complete the design
and construction of the data warehouse and go live for the first time, we do the initial loading of the data
into the data warehouse storage. The initial load moves high volumes of data and takes a substantial amount of
time. After the warehouse is in operation, we continue to load the changes from the source systems on an ongoing
basis as incremental loads.
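The three staging functions above can be sketched end to end in a few lines. This is only an illustration under stated assumptions: the source formats (CSV and JSON), the field names (customer_id, name, country), the target table customer_dim, and the use of an in-memory SQLite database as the warehouse storage are all hypothetical and are not part of the lecture.

```python
import csv
import io
import json
import sqlite3


def extract(sources):
    """Pick an extraction technique per source format and collect all records."""
    readers = {
        "csv": lambda text: list(csv.DictReader(io.StringIO(text))),
        "json": json.loads,
    }
    records = []
    for source in sources:
        records.extend(readers[source["format"]](source["payload"]))
    return records


def transform(records):
    """Clean, standardize, deduplicate, and sort the extracted records."""
    staged = {}
    for rec in records:
        customer_id = int(rec["customer_id"])  # standardize the key type
        name = (rec.get("name") or "").strip().title()  # fix spacing and casing
        country = (rec.get("country") or "UNKNOWN").strip().upper()  # default for missing values
        # Keep the first record seen per customer to eliminate duplicates.
        staged.setdefault(customer_id, (customer_id, name or "Unknown", country))
    return [staged[key] for key in sorted(staged)]


def initial_load(conn, rows):
    """Create the hypothetical target table and load all staged rows in one transaction."""
    conn.execute(
        "CREATE TABLE customer_dim ("
        "customer_id INTEGER PRIMARY KEY, name TEXT, country TEXT)"
    )
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT INTO customer_dim VALUES (?, ?, ?)", rows)
    return conn.execute("SELECT COUNT(*) FROM customer_dim").fetchone()[0]


# Hypothetical sample data from two source systems.
sources = [
    {"format": "csv", "payload": "customer_id,name,country\n2, alice SMITH ,us\n"},
    {
        "format": "json",
        "payload": '[{"customer_id": 2, "name": "Alice Smith", "country": "US"},'
                   ' {"customer_id": 1, "name": "bob jones", "country": null}]',
    },
]
staged_rows = transform(extract(sources))
conn = sqlite3.connect(":memory:")
loaded = initial_load(conn, staged_rows)
```

Here the same customer arrives from both sources, so the transformation step keeps one copy, and the missing country is replaced by a default before loading.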
Metadata Component
Metadata in a data warehouse is similar to the data dictionary or the data catalog in a database management
system. In the data dictionary, we keep the data about the logical data structures, the data about the records and
addresses, the information about the indexes, and so on.
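To make the data dictionary idea concrete, the sketch below reads the catalog that a DBMS already keeps about its own tables. It assumes SQLite as the database, and the table sales_fact and index idx_sales_customer are hypothetical names invented for the example. A real warehouse metadata repository would hold much more than this, such as source mappings and transformation rules.

```python
import sqlite3


def build_data_dictionary(conn):
    """Collect column names, column types, and index names for every table."""
    catalog = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # Table names are assumed to be simple identifiers in this sketch.
        columns = conn.execute(f"PRAGMA table_info({table})").fetchall()
        indexes = conn.execute(f"PRAGMA index_list({table})").fetchall()
        catalog[table] = {
            "columns": {col[1]: col[2] for col in columns},  # name -> declared type
            "indexes": [idx[1] for idx in indexes],          # index names
        }
    return catalog


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_fact (sale_id INTEGER, customer_id INTEGER, amount REAL)"
)
conn.execute("CREATE INDEX idx_sales_customer ON sales_fact (customer_id)")
catalog = build_data_dictionary(conn)
```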
Data Marts
A data mart includes a subset of corporate-wide data that is of value to a specific group of users. The scope is confined to
particular selected subjects. Data in a data warehouse should be fairly current, but not necessarily up to the minute,
although developments in the data warehouse industry have made standard and incremental data dumps more
achievable. Data marts are smaller than data warehouses and usually contain data for one part of the organization. The current trend in
data warehousing is to develop a data warehouse with several smaller related data marts for particular kinds
of queries and reports.
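A data mart as a subject-specific subset can be sketched as a filtered copy of warehouse data. This is a minimal illustration, assuming an in-memory SQLite database. The tables warehouse_sales and mart_sales, the department column, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE warehouse_sales ("
    "region TEXT, product TEXT, department TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO warehouse_sales VALUES (?, ?, ?, ?)",
    [
        ("North", "Laptop", "electronics", 1200.0),
        ("South", "Desk", "furniture", 300.0),
        ("North", "Phone", "electronics", 800.0),
    ],
)


def build_data_mart(conn, department):
    """Copy only the rows for one department into a smaller mart table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS mart_sales (region TEXT, product TEXT, amount REAL)"
    )
    with conn:  # commit the copy as one transaction
        conn.execute(
            "INSERT INTO mart_sales (region, product, amount) "
            "SELECT region, product, amount FROM warehouse_sales "
            "WHERE department = ?",
            (department,),
        )
    return conn.execute(
        "SELECT region, product, amount FROM mart_sales ORDER BY product"
    ).fetchall()


mart_rows = build_data_mart(conn, "electronics")
```

Queries and reports for the electronics group can then run against the smaller mart_sales table instead of the full warehouse table.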
Management and Control Component
The management and control component coordinates the services and functions within the data warehouse. It
controls the data transformation and the data transfer into the data warehouse storage. On the other
hand, it moderates the data delivery to the clients. It works with the database management system and
enables data to be correctly saved in the repositories. It monitors the movement of data into the staging
area and from there into the data warehouse storage itself.
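The coordinating and monitoring role described above can be pictured as a small controller that runs each step in order and records how much data moved through it. This is a rough sketch only: the step names, the run_pipeline helper, and the toy step functions are assumptions made for illustration, not a description of any particular product.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("warehouse_control")


def run_pipeline(steps, data):
    """Run each (name, function) step in order and record how many rows it produced."""
    audit = []
    for name, step in steps:
        data = step(data)
        audit.append((name, len(data)))
        log.info("step %s produced %d rows", name, len(data))
    return data, audit


# Hypothetical steps standing in for staging, transformation, and loading.
steps = [
    ("stage", lambda rows: [r for r in rows if r is not None]),
    ("transform", lambda rows: sorted(set(rows))),
    ("load", lambda rows: list(rows)),
]
result, audit = run_pipeline(steps, [3, None, 1, 3])
```

The audit list gives a simple trail of row counts per step, which is the kind of information a control component would monitor.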