PowerCenter 7 Advanced: New Features
Education Services
Version PC7A-20040830
Informatica Corporation, 2004. All rights reserved.
Agenda
PowerCenter 7.1 Platforms and Connectivity
PowerCenter 7.1 Options and Upgrades
Workflow Manager: Session Editor Enhancement
Workflow Monitor Enhancements (Workflow Monitor lab)
Cross-Tool Enhancements
Designer Enhancements (Client Usability, Flat File Lookup and Union, Creating XML Definitions, and Transaction-Preserving Transformations labs)
Workflow Manager: Error Logging Enhancement (Error Logging lab)
PowerCenter 7.1 Platforms and Connectivity
PowerCenter Server
64-bit AIX
64-bit HP-UX
SuSE Linux (in PowerCenter 7.1.1)
Windows NT X
AIX 4.3.3 X
PowerConnects:
Web Services
SAS
Plus in PowerCenter 7.1.1:
MSMQ
Hyperion Essbase
HTTP
Most PowerConnects on Linux
Repository Server
PowerCenter Client
Windows NT X
Windows 98 X
Legend: X = Discontinued; unmarked entries = Added
PowerCenter 7.1 Options
Data Profiling: Profile wizards, rules definitions, profile results tables, and standard reports
Data Cleansing: Name and address cleansing functionality, including directories for the US and certain other countries
Server Grid: Server group management, automatic workflow distribution across multiple heterogeneous servers
Real-Time/WebServices: ZL Engine, always-on non-stop sessions, JMS connectivity, and real-time Web Services provider
Partitioning: Data smart parallelism, pipeline and data parallelism, partitioning
Team-Based Development: Version control, deployment groups, configuration management, automatic promotion
PowerCenter: Server engine, metadata repository, unlimited designers, workflow scheduler, all APIs and SDKs, unlimited XML and flat file sourcing and targeting, object export to XML file, LDAP authentication, role-based object-level security, metadata reporter, centralized monitoring
PowerCenter 7.1 Upgrades
Install Base | Version | Upgrade
PowerCenterRT | v5 or v6 | v7.1
PowerCenter | v5 or v6 | v7.1
PowerMart | v5 or v6 | v7.1
Note: PowerMart upgrades allow the use of global repositories, but additional repositories cost extra.
New Customers
PowerCenter
Purchasable options
Data Profiling
Data Cleansing
Server Grid
Real-Time/WebServices
Partitioning
Team-Based Development
Workflow Manager: Session Editor Enhancement
v7 Session Editor
Properties and Config Object tabs have collapsible options rather than sub-tabs
New Mapping tab consolidates Sources, Targets, Transformations, and Partitions into one tab with two views:
- Transformations view
- Partitions view, with graphical display
Properties Tab
Collapsible options
Config Object Tab
Collapsible options
Mapping Tab - Transformations View
Mapping Tab - Partitions View
Flag color indicates partition type
Folders for:
- Partition Points
- Non-Partition Points
Graphical display shows mapping flow, partition points, and partition type and number
Workflow Monitor Enhancements
Improved Task view
Workflow run tree display
All workflows running on all servers at once
Status messages
Filters menu and toolbar with more options:
- Workflows that ran in a specific time frame
- Sessions that ran during the last X hours
Workflow Monitor Task View
v6
v7
Filter Toolbar
New Filter toolbar
Select type of tasks to filter
Select servers to filter
Filter tasks by specified criteria
Display recent runs
Workflow Monitor Enhancements
Standard toolbar
Print preview
Toggle Navigator window on/off
Toggle Output window on/off
Server toolbar
Resume and recover workflow
Lab NF1 - Workflow Monitor
Cross-Tool Enhancements
Cool look
Validation enhancements
Object export/import
Copying and comparing objects
Cool Look
Cool look (icons without borders) is the default
Turn off in Tools => Customize, Toolbars tab
Many icons revised in toolbars and workspace
Validation Enhancements
Invalidation
- A parent object is invalidated when changes are made to its child object
- In v6, the parent object was marked invalid but the reason was not reported
- In v7, the reason is reported in the fetch.log
Mass Validation
- In v6, the user had to fetch and validate each parent object individually
- In v7, the user can validate all the parent objects at the same time. This is useful for identifying all invalidations caused by changing a shared child object.
- Available in Repository Manager Navigator tree, List View, and (for versioned repositories) in Results View
Object Export/Import
Full export/import of repository objects to/from XML
Workflows, worklets, sessions, mappings, transformations
Multiple objects in a single XML file
Automatic handling of dependent objects
Objects can span multiple folders across a repository
Copying and Comparing Objects
Designer and Repository Manager copy conflicts now invoke the Copy Wizard
Copy Wizard has several enhancements
Workflow Manager and Repository Manager allow Compare Objects for workflows and tasks
Designer and Repository Manager Copy Conflict
v6 v7
Opens Copy Wizard
Copy Wizard Enhancements
v7
Simplified name resolution
Compare conflicting objects
Scope of resolution
Copy Wizard Enhancements (cont'd)
Compare objects before resolving a name conflict
v7
Compare (Diff) Workflows and Tasks
In Workflow Manager and Repository Manager
v7
Designer Enhancements
Port Attribute Propagation
Lookup Transformation with Flat Files
Union Transformation
Custom Transformation
XML Enhancements
Transaction-Preserving Transformations
New Functions and Datatypes
Port Attribute Propagation
When you change a port name, the Designer automatically propagates references to that port in expressions, conditions, and other ports within the transformation
You can also propagate changed port attributes forward and backward throughout the mapping
Port Attribute Propagation Steps 1-3
1. In Normal View, select one or more ports (use the Shift or Ctrl key for multiple ports). Right-click and select Propagate Attribute.
2. Dialog box opens
3. Select:
- Direction (forward or backward link path, or both)
- Attributes to propagate (name, data type, precision, scale)
- Options: implicit dependencies to include (condition and/or expression). Disabled if the Name attribute is selected.
Port Attribute Propagation Steps 4-5
4. Preview (best practice) shows links to affected ports in green and unaffected ports in red
5. Propagate updates:
- I and I/O ports in forward link path
- O and I/O ports in backward link path
- Selected attributes for all ports in the link path
- Port name in:
  - Dependent expressions or conditions (if options selected)
  - Associated port of a dynamic lookup
  - Custom transformations
Lab NF2 - Client Usability
Lookup Transformation with Flat Files
Lookup Transformation with Flat Files 1
In v7, you can use a flat file as the source for a connected or unconnected Lookup transformation
You can use any flat file definition in the repository, or you can import one
Lookup Transformation with Flat Files 2
When you import a flat file lookup source, the Designer invokes the Flat File Wizard
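Illustration (not PowerCenter code): conceptually, a flat-file lookup reads a delimited file into a cache keyed on the lookup column and returns a matching value, or null, for each input row. The Python sketch below makes that idea concrete. The file layout, the column names ITEM_ID and ITEM_NAME, and both function names are assumptions made for this example only.

```python
import csv
import io

def build_lookup_cache(flat_file, key_column, return_column):
    """Read a delimited flat file into a dict keyed on the lookup column."""
    cache = {}
    for row in csv.DictReader(flat_file):
        # Keep the first match per key. How multiple matches are handled
        # is simplified here and is an assumption of this sketch.
        cache.setdefault(row[key_column], row[return_column])
    return cache

def lookup(cache, key):
    """Return the matching value, or None when no row matches."""
    return cache.get(key)

# Hypothetical lookup source, inlined so the sketch is self-contained.
sample_file = io.StringIO(
    "ITEM_ID,ITEM_NAME\n"
    "1313,Regulator System\n"
    "1314,Second Stage Regulator\n"
)
cache = build_lookup_cache(sample_file, "ITEM_ID", "ITEM_NAME")
print(lookup(cache, "1314"))
print(lookup(cache, "9999"))
```

The first call finds a match; the second key is absent from the file, so the lookup returns None.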
Lookup Transformation Editor Flat File 1
Lookup Transformation Editor Flat File 2
Configuring a Session for Flat File Lookup
Union Transformation
Merges data from multiple pipelines into one pipeline (similar to the SQL UNION ALL statement)
Passive Transformation
Connected Mode only
Ports
- Multiple input groups
- Single output group
- Ports in all input and output groups must match
Usage
- Merging pipelines
- Does not remove duplicate rows
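Illustration (not PowerCenter code): the UNION ALL behavior described above can be sketched in Python as follows. The function name union_all and the sample rows are assumptions for this example. The sketch concatenates the input groups in order, which a real session does not necessarily guarantee.

```python
def union_all(*input_groups):
    """Merge several row streams into one, keeping duplicates (UNION ALL)."""
    merged = []
    width = None
    for group in input_groups:
        for row in group:
            # Mirrors the rule that ports in all groups must match.
            if width is None:
                width = len(row)
            elif len(row) != width:
                raise ValueError("all input groups must have matching ports")
            merged.append(row)
    return merged

# Hypothetical rows from two pipelines; ("A2", 40000) appears in both.
pipeline_a = [("A1", 80000), ("A2", 40000)]
pipeline_b = [("A2", 40000), ("A5", 30000)]
merged = union_all(pipeline_a, pipeline_b)
print(merged)
print(len(merged))
```

All four rows come through, including the duplicate, because nothing in the merge step removes repeated rows.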
Union Transformation - Example
Lab NF3 - Flat-File Lookup and Union
Custom Transformation
Custom Transformation 1
New framework for developing user defined transformations
Uses compiler-independent APIs
- C for server
- C++ for client
Native transformation look and feel
Supports:
- Active or passive transformations
- Multiple input and output groups
- Port-level metadata
- Transaction control
- Update strategy
- Partitioning
Custom Transformation 2
Calls an active or passive procedure defined in a dynamic link library (DLL) or shared library
Active or Passive Transformation
Connected Mode only
Ports: Mixed
Usage
- Perform transformation logic outside PowerCenter
- Uses Custom transformation functions
- Sorting, Aggregation
Custom Transformation 3
Custom transformation replaces the Advanced External Procedure (active) transformation
External Procedure (passive) transformation remains
It supports Microsoft COM objects, including Java and Visual Basic, as well as C and C++
XML Enhancements
XML Definition Wizard
- Import from XML schemas (XML Schema 2001 standard)
- Generate XML views (groups)
- XPath support
XML Editor
- XML workspace displays XML views and relationships graphically
- Popup windows for schema details, e.g. ComplexType hierarchies
- Data preview
Midstream XML Parser and Generator transformations
Performance options for large XML targets
Import from XML Schemas
XML schemas are much richer than DTDs:
- Written in XML
- Support multiple namespaces (a namespace is a schema location, e.g. a URL, where a group of related elements and attributes is defined)
- Support many more datatypes (44+ simple types plus user-defined complex types)
- Support substitution groups, e.g. alternative root elements
- More flexible, e.g.:
  - Child elements occurring in any order
  - Multiple elements with the same name but different content
  - Elements with no content
Generate XML Views 1
XML definitions represent the XML hierarchy as groups, called XML views
XML Source Definition
XML Views (Groups)
Generate XML Views 2
The XML Wizard can generate XML views from rules (entity relationships, hierarchy relationships), or you can create custom XML views
Generate XML Views 3
For custom views, several options let you reduce metadata explosion
XPath Support
XPaths list the path from the root element to an element or attribute with all intermediate components separated by /
XML Source Definition
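Illustration (not PowerCenter code): the path notation described above can be reproduced with Python's standard xml.etree.ElementTree module. The helper list_xpaths and the sample document are assumptions for this example. Attribute steps are written with the usual XPath @ prefix.

```python
import xml.etree.ElementTree as ET

def list_xpaths(element, parent_path=""):
    """Return the path from the root to every element and attribute."""
    path = f"{parent_path}/{element.tag}"
    paths = [path]
    for name in element.attrib:
        paths.append(f"{path}/@{name}")
    for child in element:
        paths.extend(list_xpaths(child, path))
    return paths

# Hypothetical XML document used only for this sketch.
doc = ET.fromstring(
    '<store><item id="1313"><name>Regulator System</name></item></store>'
)
for xpath in list_xpaths(doc):
    print(xpath)
```

Each printed line starts at the root element store and separates every intermediate component with a slash.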
XML Editor
To open: double-click the XML definition in the workspace, or right-click and select Edit XML Definition, or select Edit XML Definition from the Source / Targets / Transformation menus
XML Metadata Navigator
XML Workspace
Components Pane (shows selected component)
- Properties
- Actions
- Data Values, if any
Columns window (shows selected view)
XML Workspace - XML Views
The XML Editor's workspace displays the XML views (groups) as entities connected by lines and symbols indicating the relationships (parent/child, many:many, etc.)
XML Source Definition XML Workspace
XML Views
XML Workspace - View Schema Details
XML Editor has popup windows for Edit Namespace, ComplexType Hierarchy, Data Preview, etc.
Midstream XML Parser Transformation
Reads XML from a database table or message queue
In v6, you had to use a mapplet with an XML Source Qualifier
Midstream XML Generator Transformation
Creates XML in a database table or message queue
In v6, you had to use a mapplet interface
Performance Options for Large XML Targets
On Commit option allows user-defined commits to flush XML data
On Commit: Write to new document allows multiple XML output files
Target cache size for XML tree (spills to disk on overflow)
Do not output empty elements: avoids writing unnecessary elements
Lab NF4 - Creating XML Definitions
Transaction-Preserving Transformations
In v6, Aggregator, Rank, Joiner, and Sorter processed all input rows before emitting output rows
In v7, these and the new Custom transformation can process data one transaction at a time
Benefits
- Preserves transactions
- Increased performance, lower resource usage
Transformation Scope
Transformation Scope | Transformations | Output
Row | Most transformations | As each row is processed
All Input (only v6 option) | Agg, Rnk, Jnr, Srt | When all rows are processed
Transaction (added in v7) | Agg, Rnk, Jnr, Srt | When a commit is encountered
Note: Custom transformations have whatever scopes are implemented by the developer
Example: Rank with Scope = All Input
In v6, a Rank transformation always has scope = All Input, dropping any incoming transactions
Input (Name, Salary): A1 $80K, A2 $40K, A3 $50K, A4 $100K, COMMIT, A5 $30K, A6 $60K, A7 $90K
Rank on All Input (transactions are dropped)
Output (Name, Salary): A4 $100K, A7 $90K, A1 $80K, A6 $60K, A3 $50K, A2 $40K, A5 $30K
Example: Rank with Scope = Transaction
In v7, a Rank transformation with scope = Transaction preserves incoming transactions
Input (Name, Salary): A1 $80K, A2 $40K, A3 $50K, A4 $100K, COMMIT, A5 $30K, A6 $60K, A7 $90K
Rank on a set of data bounded by transactions
Output (Name, Salary): A4 $100K, A1 $80K, A3 $50K, A2 $40K, COMMIT, A7 $90K, A6 $60K, A5 $30K
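Illustration (not PowerCenter code): the two Rank examples can be reproduced with a short Python sketch. The COMMIT marker, the function names, and the representation of rows as (name, salary in $K) tuples are assumptions for this example. For simplicity the sketch ranks every row rather than keeping only a top N.

```python
COMMIT = "COMMIT"  # hypothetical marker for a transaction boundary

def by_salary_desc(rows):
    return sorted(rows, key=lambda row: row[1], reverse=True)

def rank_all_input(stream):
    """Scope = All Input: ignore commits and rank every row together."""
    return by_salary_desc([item for item in stream if item != COMMIT])

def rank_per_transaction(stream):
    """Scope = Transaction: rank each commit-bounded set separately."""
    output, current = [], []
    for item in stream:
        if item == COMMIT:
            output.extend(by_salary_desc(current))
            output.append(COMMIT)  # the transaction boundary is preserved
            current = []
        else:
            current.append(item)
    output.extend(by_salary_desc(current))  # rows after the last commit
    return output

stream = [("A1", 80), ("A2", 40), ("A3", 50), ("A4", 100), COMMIT,
          ("A5", 30), ("A6", 60), ("A7", 90)]
print(rank_all_input(stream))
print(rank_per_transaction(stream))
```

With scope = All Input the result is one ranked list from A4 down to A5. With scope = Transaction the first four rows are ranked, the commit passes through, and the remaining three rows are ranked on their own.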
Setting Transformation Scope
Transformation Scope
Lab NF5 - Transaction-Preserving Transformations
New Functions and Datatypes
Soundex and Metaphone Functions
Used in expressions
Create an index based on English pronunciation, e.g. SOUNDEX('Smith') = SOUNDEX('Smyth')
Soundex
- Encodes a string value into a four-character string (first input character plus three digits for the consonants that follow)
- Fast and standard
Metaphone
- More accurate (but needs more computational power)
- Can specify length of string
- Algorithm not standard
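Illustration (not PowerCenter code): the sketch below implements the commonly described American Soundex rules in Python to show why Smith and Smyth receive the same code. It is an assumption that the PowerCenter SOUNDEX function follows these rules in every edge case. The function name and the handling of empty input are choices made for this example.

```python
# Digit assigned to each consonant group in standard Soundex.
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit

def soundex(value):
    """Return the four-character Soundex code of an English word."""
    letters = [c for c in value.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    encoded = letters[0]
    previous = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        digit = _CODES.get(ch, "")
        if digit and digit != previous:
            encoded += digit
        previous = digit  # a vowel resets the previous code
    return (encoded + "000")[:4]  # pad with zeros, keep four characters

print(soundex("Smith"), soundex("Smyth"))
print(soundex("Robert"), soundex("Rupert"))
```

Both spellings in each pair map to one code, which is what makes the result usable as a pronunciation-based index key.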
New Datatypes
To handle Oracle, DB2, and SQL Server datatypes, PowerCenter 7 supports:
blob: Large objects containing unstructured binary data
clob: Large objects containing single-byte fixed-width character data
nclob: Large binary objects containing single-byte or multiple-byte fixed-width character data
xmltype: Structured XML data (Oracle only)
Workflow Manager: Error Logging Enhancement
Error Types
Transformation error
- Data row has only passed partway through the mapping transformation logic
- An error occurs within a transformation
Data reject
- Data row is fully transformed according to the mapping logic
- Due to a data issue, it cannot be written to the target
- A data reject can be forced by an Update Strategy
Error Logging Off/On
Error Type | Logging OFF (Default) | Logging ON
Transformation errors | Written to session log then discarded | Appended to flat file or relational tables. Only fatal errors written to session log.
Data rejects | Appended to reject file (one .bad file per target) | Written to row error tables or file
Setting Error Log Options
In Session task
Error Log Type
Log Row Data
Log Source Row Data
Error Logging Off - Specifying Reject Files
In Session task
1 file per target
Error Logging Off - Transformation Errors
Details and data are written to session log
Data row is discarded
If data flows are concatenated, corresponding rows in the parallel flow are also discarded
Transformation Error
Error Logging Off - Data Rejects
Conditions causing data to be rejected include:
- Target database constraint violations, out-of-space errors, log space errors, null values not accepted
- Data-driven records containing value 3 or DD_REJECT (the reject has been forced by an Update Strategy)
- Target table properties reject truncated/overflowed rows
Sample reject file
0,1313,Regulator System,Air Regulators,250.00,150.00
1,1314,Second Stage Regulator,Air Regulators,365.00,265.00
2,1390,First Stage Regulator,Air Regulators,170.00,70.00
3,2341,Depth/Pressure Gauge,Small Instruments,105.00,5.00
Row indicator (first column): 0 = INSERT, 1 = UPDATE, 2 = DELETE, 3 = REJECT
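Illustration (not PowerCenter code): a reject file in the simplified layout shown above can be read with a few lines of Python. The function name and the UNKNOWN fallback are assumptions for this example. Reject files produced by a real session may carry additional indicators that this sketch does not handle.

```python
import csv
import io

# Row indicator values as labeled on the sample reject file.
ROW_INDICATORS = {"0": "INSERT", "1": "UPDATE", "2": "DELETE", "3": "REJECT"}

def parse_reject_rows(reject_file):
    """Return (row indicator name, remaining column values) per row."""
    parsed = []
    for fields in csv.reader(reject_file):
        if not fields:
            continue  # skip blank lines
        indicator = ROW_INDICATORS.get(fields[0], "UNKNOWN")
        parsed.append((indicator, fields[1:]))
    return parsed

sample = io.StringIO(
    "0,1313,Regulator System,Air Regulators,250.00,150.00\n"
    "1,1314,Second Stage Regulator,Air Regulators,365.00,265.00\n"
    "2,1390,First Stage Regulator,Air Regulators,170.00,70.00\n"
    "3,2341,Depth/Pressure Gauge,Small Instruments,105.00,5.00\n"
)
rows = parse_reject_rows(sample)
for indicator, values in rows:
    print(indicator, values[0], values[1])
```

Each output line pairs the decoded row indicator with the first two data columns of the rejected row.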
Log Row Data
Logs:
- Session metadata
- Reader, transformation, writer, and user-defined errors
- For errors on input, logs row data for I and I/O ports
- For errors on output, logs row data for I/O and O ports
Logging Errors to a Relational Database 1
Relational Database Log Settings
Logging Errors to a Relational Database 2
PMERR_SESS: Stores metadata about the session run, such as workflow name, session name, repository name, etc.
PMERR_MSG: Stores the error messages for a row of data
PMERR_TRANS: Stores metadata about the transformation, such as transformation group name, source name, and port names with data types
PMERR_DATA: Stores the row data of the error row as well as the source row data. The row data is in a string format such as [indicator1: data1 | indicator2: data2]
Logging Errors to a Flat File 1
Creates a delimited flat file with || as the column delimiter
Flat File Log Settings (Defaults shown)
Logging Errors to a Flat File 2
Format: Session metadata followed by de-normalized error information
Sample session metadata
**********************************************************************
Repository GID: 510e6f02-8733-11d7-9db7-00e01823c14d
Repository: RowErrorLogging
Folder: ErrorLogging
Workflow: w_unitTests
Session: s_customers
Mapping: m_customers
Workflow Run ID: 6079
Worklet Run ID: 0
Session Instance ID: 806
Session Start Time: 10/19/2003 11:24:16
Session Start Time (UTC): 1066587856
**********************************************************************
Row data format
Transformation || Transformation Mapplet Name || Transformation Group || Partition Index || Transformation Row ID || Error Sequence || Error Timestamp || Error UTC Time || Error Code || Error Message || Error Type || Transformation Data || Source Mapplet Name || Source Name || Source Row ID || Source Row Type || Source Data
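Illustration (not PowerCenter code): a row in the ||-delimited layout above can be split into named fields with a short Python sketch. The field names come from the row data format. The function name and every value in the sample row are hypothetical placeholders, not output from a real session. The sketch assumes the data itself never contains the || delimiter.

```python
FIELD_NAMES = [
    "Transformation", "Transformation Mapplet Name", "Transformation Group",
    "Partition Index", "Transformation Row ID", "Error Sequence",
    "Error Timestamp", "Error UTC Time", "Error Code", "Error Message",
    "Error Type", "Transformation Data", "Source Mapplet Name",
    "Source Name", "Source Row ID", "Source Row Type", "Source Data",
]

def parse_error_row(line):
    """Split one error-log row on || and label each column."""
    values = [value.strip() for value in line.split("||")]
    if len(values) != len(FIELD_NAMES):
        raise ValueError(
            f"expected {len(FIELD_NAMES)} columns, got {len(values)}"
        )
    return dict(zip(FIELD_NAMES, values))

# Hypothetical row with placeholder values, built only for this sketch.
sample_row = " || ".join([
    "EXP_example", "N/A", "Input", "1", "12", "1",
    "10/19/2003 11:24:20", "1066587860", "ERR_0000", "example message",
    "3", "example data", "N/A", "SQ_example", "12", "0", "example source",
])
record = parse_error_row(sample_row)
print(record["Transformation"], record["Error Code"], record["Source Name"])
```

Returning a dictionary keeps each value tied to its column name, which is convenient when loading the log into a reporting tool.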
Log Source Row Data 1
Separate checkbox in session task
Logs the source row associated with the error row
Logs metadata about source, e.g. Source Qualifier, source row id, and source row type
Log Source Row Data 2
Source row logging is not available downstream of an Aggregator, Rank, Joiner, or Sorter (where output rows are not uniquely correlated with input rows)
Source row logging available
Source row logging not available
Lab NF6 - Error Logging