A Fundamentals
1. Introduction
3. Assumptions:
To follow this document, the reader should have a sound knowledge of Data
Warehousing concepts and some exposure to SQL as a database language.
Knowledge of ODBC and basic networking is essential for installing Informatica,
and knowledge of Unix and shells is helpful when working with Unix-based
servers.
4. Reference:
Title Location
5. Informatica in the Data Warehousing Scenario
DWH_Material_Presentation.ppt
i. Requirement Gathering
The project team gathers the end users' reporting requirements, and the
remainder of the project is dedicated to satisfying these
requirements.
v. Reporting: Design and develop the reports and enable the end users to
view them, thereby bringing value to the Data Warehouse.
c) What are the various ETL tools that are available?
The selection of an ETL tool depends on factors such as the complexity of the
data transformation, the data cleansing needs and the volume of data
involved.
The commonly used ETL tools are:
Informatica
Ab Initio.
For information on Ab Initio as an ETL tool, refer to the link below:
http://www.abinitio.com/abinitio/ab.nsf/index-flash
For discussions on Ab Initio, refer to the link below:
http://www.datawarehouse.com/forum/read.php?f=21&i=1921&t=1921
Ascential DataStage
For information on Ascential DataStage, refer to the link below:
http://www.ascential.com/products/ds_features.html
Data Junction
Reveleus
d) What is Informatica?
Informatica provides an environment that can extract data from multiple sources,
transform the data according to the business logic that is built in the Informatica
Client application and load the transformed data into files or relational targets.
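The extract, transform and load flow described above can be illustrated with a minimal Python sketch. This is not Informatica code: the flat-file content, the transformation rule and the in-memory SQLite target are assumptions made purely for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical flat-file source (stands in for one of several possible sources).
FLAT_FILE = "ctry_cod,ctry_name\nin,india\nus,united states\n"

def extract(text):
    """Read rows from a CSV source into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Apply an illustrative business rule: upper-case codes, title-case names."""
    return [(r["ctry_cod"].upper(), r["ctry_name"].title()) for r in rows]

def load(rows, conn):
    """Load the transformed rows into a relational target."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS country (code TEXT PRIMARY KEY, name TEXT)"
    )
    conn.executemany("INSERT INTO country VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract(FLAT_FILE)), conn)
loaded = conn.execute("SELECT code, name FROM country ORDER BY code").fetchall()
```

In Informatica the three steps are not hand-written functions; they are defined graphically in the Client application and executed by the Server.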
6. Architecture:
a) Informatica Repository:
The Informatica Repository is a database with a set of metadata tables that the
Informatica Client and Server access to save and retrieve metadata.
The Repository stores the data needed for data extraction, transformation, loading, and
management.
b) Informatica Client:
The Informatica Client is used to manage users, define sources and targets, build
mappings and mapplets with the transformation logic, and create sessions to run
the mapping logic.
The Informatica Client has three main applications:
ii. Designer: The Designer has five tools that are used to analyze sources,
design target schemas and build the Source to Target mappings. These are:
Source Analyzer: This is used to either import or create the
source definitions.
Warehouse Designer: This is used to import or create target
definitions.
Mapping Designer: This is used to create mappings that will be
run by the Informatica Server to extract, transform and load data.
Transformation Developer: This is used to develop reusable
transformations that can be used in mappings.
Mapplet Designer: This is used to create sets of transformations
referred to as Mapplets which can be used across mappings.
iii. Server Manager: The Server Manager is used to create, schedule, execute
and monitor sessions.
c) Informatica Server:
The Informatica Server reads the mapping and session information from the
repository. It extracts data from the mapping sources, holds it in memory,
applies the transformation rules and loads the transformed data into the mapping
targets.
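The point that the Server is driven by metadata read from the repository, rather than by hard-coded logic, can be sketched as follows. The dictionary-based `REPOSITORY`, the names `m_demo` and `s_demo`, and the `run_session` helper are hypothetical stand-ins and do not reflect Informatica's actual repository tables or APIs.

```python
# Hypothetical repository: mapping and session metadata keyed by name.
REPOSITORY = {
    "mappings": {
        "m_demo": {
            "source": "src_rows",
            "target": "tgt_rows",
            # Each rule is (output field, function of the input row).
            "rules": [
                ("code", lambda row: row["code"].strip().upper()),
                ("name", lambda row: row["name"].strip()),
            ],
        }
    },
    "sessions": {"s_demo": {"mapping": "m_demo"}},
}

def run_session(session_name, data_stores):
    """Read session and mapping metadata, then extract, transform and load."""
    session = REPOSITORY["sessions"][session_name]
    mapping = REPOSITORY["mappings"][session["mapping"]]
    in_memory = list(data_stores[mapping["source"]])      # extract into memory
    transformed = [
        {field: rule(row) for field, rule in mapping["rules"]}
        for row in in_memory                               # apply the rules
    ]
    data_stores[mapping["target"]].extend(transformed)     # load the target
    return len(transformed)

stores = {"src_rows": [{"code": " in ", "name": " India "}], "tgt_rows": []}
rows_loaded = run_session("s_demo", stores)
```

Changing the entry under `mappings` changes what the run does without touching `run_session`, which is the essence of a metadata-driven engine.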
Connectivity:
Informatica uses a network protocol, native drivers or ODBC for
connectivity between its various components. The connectivity details are as
shown in the diagram above.
7. Setting up Informatica:
i. Go to Start > Settings > Control Panel
ii. Go to Administrative Tools > Data Sources (ODBC)
iii. Click on the System DSN tab and add an entry.
iv. Select MERANT CLOSED 3.60 32-BIT Oracle 8 driver.
v. Provide any Data Source Name.
vi. Provide the TNS entry name for the (Informatica) database as the Server
Name.
vii. Test the connection by providing the Informatica database user ID and
password.
viii. Save the settings.
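Once the System DSN is saved, the connection test can also be scripted. The sketch below builds a standard ODBC connection string that refers to a DSN by name. The DSN name `infa_repo` and the credentials are hypothetical, and opening the connection through the third-party `pyodbc` package is an assumption that the package and the ODBC driver are installed on the machine.

```python
def build_odbc_connection_string(dsn, user, password):
    """Build an ODBC connection string that refers to an existing DSN."""
    return f"DSN={dsn};UID={user};PWD={password}"

def try_connect(dsn, user, password):
    """Open and close a connection; raises an error if the DSN is unusable.

    Assumes the third-party pyodbc package is installed.
    """
    import pyodbc  # imported lazily so the string builder works without it
    conn = pyodbc.connect(build_odbc_connection_string(dsn, user, password))
    conn.close()
    return True

# Hypothetical DSN name and credentials, for illustration only.
conn_str = build_odbc_connection_string("infa_repo", "infa_user", "secret")
```

Calling `try_connect("infa_repo", "infa_user", "secret")` on a machine where the DSN exists would be the scripted equivalent of the test connect in step vii.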
a) Source Definition:
• All fields from the Source are moved into the Source Qualifier.
*For information on the Naming Standard, please refer to the document embedded below:
Informatica_ETL_Naming_Conventions.doc
Please note: The naming standards provided in the document are generic standards that
CAN be followed while designing a mapping.
For information on creating and working with Shortcuts, refer to the Informatica Designer
Help.
• The ISO_CTRY_COD field from the Source Qualifier is moved to the Lookup
transformation LKP_CTRY_COD, and all the fields, including
ISO_CTRY_COD, are moved to the Expression transformation EXP_COUNTRY.
c) Lookup Transformation (LKP_CTRY_COD)
i. The Lookup transformation is a passive transformation.
ii. A Lookup transformation is used in an Informatica mapping to
look up data in a relational table, view, or synonym.
iii. The Informatica server queries the lookup table based on the lookup ports
in the transformation. It compares Lookup transformation port values to
lookup table column values based on the lookup condition. The result of
the Lookup would then be passed on to other transformations and targets.
• In the Lookup transformation LKP_CTRY_COD, the input field
SRC_COUNTRY_CODE is looked up against the COUNTRY_CODE field
of the Lookup table. If the lookup is successful, the corresponding
COUNTRY_CODE is returned as the output.
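The lookup described above can be pictured as a keyed search against the lookup table. The following Python sketch is only an analogy: the table contents and helper names are hypothetical, and returning `None` for a failed lookup stands in for a NULL result.

```python
# Hypothetical contents of the lookup table (only the COUNTRY_CODE column).
LOOKUP_TABLE = [{"COUNTRY_CODE": "IN"}, {"COUNTRY_CODE": "US"}]

def build_lookup_cache(table, key_column):
    """Index the lookup table by the column used in the lookup condition."""
    return {row[key_column]: row for row in table}

def lkp_ctry_cod(src_country_code, cache):
    """Return COUNTRY_CODE when the lookup condition matches, else None."""
    match = cache.get(src_country_code)
    return match["COUNTRY_CODE"] if match is not None else None

cache = build_lookup_cache(LOOKUP_TABLE, "COUNTRY_CODE")
found = lkp_ctry_cod("IN", cache)      # condition matches a lookup row
missing = lkp_ctry_cod("ZZ", cache)    # no matching row in the lookup table
```

Building the dictionary once and reusing it for every input row is also a rough picture of why a lookup cache avoids querying the lookup table for each row.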
For more information on the Lookup transformation and on Lookup caches, refer to the
Informatica Designer Help and also the attached document.
Lookup_Cache.doc
iv. A session can also be configured to handle specific database operations.
This is done by setting the “Treat rows as” field in the Session Wizard
dialog box that appears during session configuration.
• Open the Server Manager.
• Click on cifSIT9i under the Repositories tab
• Click on Repository > Connect
• Provide the Username
• Expand the Ecif_Dev_map folder.
• Select s_Map_CD_Country_code in the right pane, right-click and select
Edit.
• The Properties for Sessions window opens.
• Please refer to the figure below.
v. The “Treat rows as” option determines the treatment for all rows in the
session. The options provided here are insert, delete, update or data-driven.
vi. If the mapping for the session contains an Update Strategy transformation,
this field is marked Data Driven by default. If any other option is selected,
the Informatica Server ignores all Update Strategy transformations in the
mapping.
vii. The Data Driven option is selected if records destined for the same table
need to be flagged on occasion for one operation (for example, update), or
for a different operation (for example, reject).
viii. Records can be flagged for reject only with this option.
For more information on the Update Strategy transformation and its other settings,
refer to the Informatica Designer Help.
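The interaction between the “Treat rows as” setting and the Update Strategy flag can be summarised in a small Python sketch. The function name and the string values for the setting are hypothetical. The numeric flags mirror the DD_INSERT, DD_UPDATE, DD_DELETE and DD_REJECT constants of the Update Strategy expression; treat the values as an assumption and verify them against the Informatica Designer Help.

```python
# Row-flag constants (assumed to match the DD_* constants).
DD_INSERT, DD_UPDATE, DD_DELETE, DD_REJECT = 0, 1, 2, 3

# Session-level settings that apply one operation to every row.
SESSION_LEVEL = {"insert": DD_INSERT, "update": DD_UPDATE, "delete": DD_DELETE}

def effective_operation(treat_rows_as, update_strategy_flag):
    """Return the operation applied to a row under a given session setting."""
    if treat_rows_as == "data driven":
        # Only with Data Driven is the Update Strategy flag honoured,
        # which is also why reject is available only here.
        return update_strategy_flag
    # Any other setting applies to all rows and ignores the row's flag.
    return SESSION_LEVEL[treat_rows_as]

flagged_reject = effective_operation("data driven", DD_REJECT)
forced_update = effective_operation("update", DD_REJECT)
```

The second call shows the behaviour described in point vi: with any setting other than Data Driven, the flag set by the Update Strategy transformation has no effect.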
ix. The Forward Rejected Rows option indicates whether the Update Strategy
transformation passes rejected rows to the next transformation or drops them.
x. By default, Informatica Server forwards rejected rows to the next
transformation.
xi. The Informatica Server flags the rows for reject and writes them to the
session reject files.
xii. If Forward Rejected Rows is not selected, the Informatica Server drops
rejected rows and writes them to the session log file.
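The two outcomes of the Forward Rejected Rows option can be modelled in a few lines of Python. This is a simplified sketch: the function, the three result lists and the flag value are hypothetical stand-ins for the next transformation, the session reject file and the session log file.

```python
DD_REJECT = 3  # assumed flag value for a rejected row

def apply_update_strategy(rows, forward_rejected_rows=True):
    """Split flagged rows according to the Forward Rejected Rows option.

    rows is a list of (flag, record) pairs. Returns three lists: rows passed
    downstream, records for the reject file, records for the session log.
    """
    downstream, reject_file, session_log = [], [], []
    for flag, record in rows:
        if flag != DD_REJECT:
            downstream.append((flag, record))
        elif forward_rejected_rows:
            downstream.append((flag, record))   # forwarded, still flagged
            reject_file.append(record)          # ends up in the reject file
        else:
            session_log.append(record)          # dropped, noted in the log
    return downstream, reject_file, session_log

rows = [(1, "IN"), (DD_REJECT, "??")]
forwarded = apply_update_strategy(rows, forward_rejected_rows=True)
dropped = apply_update_strategy(rows, forward_rejected_rows=False)
```

With the option on, the rejected record travels onward and is recorded in the reject file; with it off, the record stops at the transformation and appears only in the session log.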
• Update Strategy UPD_COUNTRY_CODE updates the target table
Shortcut_to_ECIF_COUNTRY which is a shortcut to the ECIF_COUNTRY
table.
g) Update Strategy Transformation (UPD_UPD_STG_COUNTRY)
• This receives the ISO_CTRY_COD and PROC_FLG fields from the filter
transformation FIL_NOTNULL_CTRY_COD when the COUNTRY_CODE
is NOT NULL.
• This updates the target table Shortcut_To_STG_COUNTRY which is a
shortcut to the STG_COUNTRY table.
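The filter followed by the update can be sketched in Python with an in-memory SQLite table standing in for STG_COUNTRY. The table layout, the choice of ISO_CTRY_COD as the update key and the sample rows are assumptions for illustration; the source does not state which columns the update touches.

```python
import sqlite3

def fil_notnull_ctry_cod(rows):
    """Pass only rows whose COUNTRY_CODE is not NULL (None in Python)."""
    return [r for r in rows if r["COUNTRY_CODE"] is not None]

def upd_stg_country(rows, conn):
    """Apply each row that passed the filter as an update on the staging table."""
    for r in rows:
        conn.execute(
            "UPDATE STG_COUNTRY SET PROC_FLG = ? WHERE ISO_CTRY_COD = ?",
            (r["PROC_FLG"], r["ISO_CTRY_COD"]),
        )
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE STG_COUNTRY (ISO_CTRY_COD TEXT PRIMARY KEY, PROC_FLG TEXT)"
)
conn.executemany("INSERT INTO STG_COUNTRY VALUES (?, ?)", [("IN", "N"), ("ZZ", "N")])

rows = [
    {"ISO_CTRY_COD": "IN", "PROC_FLG": "Y", "COUNTRY_CODE": "IN"},
    {"ISO_CTRY_COD": "ZZ", "PROC_FLG": "Y", "COUNTRY_CODE": None},
]
upd_stg_country(fil_notnull_ctry_cod(rows), conn)
result = conn.execute(
    "SELECT ISO_CTRY_COD, PROC_FLG FROM STG_COUNTRY ORDER BY ISO_CTRY_COD"
).fetchall()
```

Only the row with a non-NULL COUNTRY_CODE reaches the update, so the second staging row keeps its original PROC_FLG value.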
The document provided below highlights the Best Practices that can be taken into
consideration either while designing mappings or when running sessions.
Informatica_Tuning_Guide.doc
For information on the features in Informatica PowerCenter 6.2, refer to the link below:
http://www.itap.purdue.edu/ea/files/PMPC-62_release%20notes%20for%206.2.pdf
Please refer to the link below for enhancements related to Informatica PowerCenter 7.1:
http://www.csn.no/nyhetsbrev/0402NyhetsbrevInfa_files/whats_new_PC7_dec2003.pdf