DataStage Tricks & Tips
Jim Tsimis
Advanced Technical Support
Mike Carney
Advanced Consulting Group
Michael Ruland
Field Engineering
Steven Totman
Product Manager Connectivity
Agenda
Designer Session
General Debug Tips & Tricks
Handling Complex Flat Files
Joy of the Command Line
Transaction Handling Tips & Tricks
Managing transactions in Server, Enterprise Edition, Enterprise MVS Edition and RTI
Performance Tuning
General Debug
Stage Variables are always executed and drive the transformer stage
General Debug
Sequential File stage: FILTER OPTION - use this to specify that the data is passed through a filter program before being written to a file or files on output or before being placed in a dataset on input.
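A filter program here is simply a program that reads rows on standard input and writes the rows it wants to keep on standard output. The sketch below is a minimal illustration in Python, assuming that stream-in/stream-out contract; the function name drop_blank_lines and the sample rows are hypothetical and are not part of DataStage.

```python
import io


def drop_blank_lines(instream, outstream):
    """Copy instream to outstream, skipping lines that hold only whitespace."""
    kept = 0
    for line in instream:
        if line.strip():
            outstream.write(line)
            kept += 1
    return kept


# Demonstration with in-memory streams (hypothetical sample rows).
# A real filter would pass sys.stdin and sys.stdout instead.
source = io.StringIO("1,alpha\n\n2,beta\n   \n3,gamma\n")
sink = io.StringIO()
kept_count = drop_blank_lines(source, sink)
print(sink.getvalue(), end="")
```

Any existing command that behaves the same way (reads standard input, writes standard output) could play the same role.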
Input column definitions (3 columns)
The selected complex column is decoded into individual columns
od x A x
Combine Records stage: combines records, in which particular key-column values are identical, into vectors of subrecords.
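The idea of combining records can be sketched outside DataStage. The Python below groups rows that share key-column values and collects the remaining columns of each row into a list of subrecords. It illustrates the concept only and is not the stage's implementation; the function name combine_records, the output field name subrecs and the sample rows are hypothetical.

```python
def combine_records(records, key_columns):
    """Group records with identical key values into one record holding a vector of subrecords."""
    combined = {}
    for record in records:
        key = tuple(record[column] for column in key_columns)
        # Everything that is not a key column becomes part of the subrecord.
        subrecord = {k: v for k, v in record.items() if k not in key_columns}
        if key not in combined:
            combined[key] = {column: record[column] for column in key_columns}
            combined[key]["subrecs"] = []
        combined[key]["subrecs"].append(subrecord)
    return list(combined.values())


# Hypothetical sample rows: two orders for customer A, one for customer B.
orders = [
    {"cust": "A", "item": "pen"},
    {"cust": "A", "item": "ink"},
    {"cust": "B", "item": "pad"},
]
combined_orders = combine_records(orders, ["cust"])
print(combined_orders)
```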
Column Import stage: imports data from a single column and outputs it to one or more columns.
Column Export stage: exports data from a number of columns of different data types into a single column of data type string or binary.
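As a rough illustration of the two directions, the Python sketch below packs several typed values into one delimited string and unpacks such a string back into typed columns. The function names, the "|" delimiter and the sample values are assumptions made for the example; the real stages are configured through column and format properties rather than code.

```python
def export_columns(row, delimiter="|"):
    """Pack several typed values into a single string column."""
    return delimiter.join(str(value) for value in row)


def import_column(text, converters, delimiter="|"):
    """Split a single string column into typed columns, one converter per column."""
    parts = text.split(delimiter)
    if len(parts) != len(converters):
        raise ValueError("expected %d fields, got %d" % (len(converters), len(parts)))
    return [convert(part) for convert, part in zip(converters, parts)]


packed = export_columns([7, "widget", 2.5])
unpacked = import_column(packed, [int, str, float])
print(packed)
print(unpacked)
```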
Vectors
[Diagram labels: Promote, Make, Split]
A vector is a one-dimensional array of any type except tagged. All elements of a vector have the same type and are numbered from 0. A vector can be of fixed or variable length. For a fixed-length vector the length is stated explicitly; for a variable-length vector a property names a link field that gives the length at run time.
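The difference between fixed and variable length can be shown with a small Python sketch. It models a record as a dictionary and uses a separate field as the link field that carries the run-time length. All names here (check_fixed_vector, read_variable_vector, reading_count, readings) are hypothetical and only illustrate the description above.

```python
def check_fixed_vector(values, length):
    """A fixed-length vector must hold exactly the stated number of elements."""
    if len(values) != length:
        raise ValueError("expected %d elements, got %d" % (length, len(values)))
    return list(values)


def read_variable_vector(record, link_field, vector_field):
    """A variable-length vector takes its length from the link field at run time."""
    length = record[link_field]
    values = record[vector_field]
    if length > len(values):
        raise ValueError("link field asks for more elements than are present")
    return list(values[:length])


# Hypothetical record: the link field says only the first 3 slots are in use.
record = {"reading_count": 3, "readings": [12, 15, 11, 0, 0]}
readings = read_variable_vector(record, "reading_count", "readings")
first_reading = readings[0]  # elements are numbered from 0
fixed = check_fixed_vector([1, 2, 3, 4], 4)
print(readings, first_reading, fixed)
```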
Enterprise Edition
Combining Vectors and Subrecords
This script is available from your account team and backs up all projects on an identified server
Automated Diagramming
"E:\Program Files\Ascential\DataStage\dsdesign.exe" /h=YourHost /u=UserID /p=***** YourProject YourJobName /saveasbmp=e:\Diagrams\YourProject\JobDesigns\YourJobName.bmp
A script is available from Ascential that obtains all the jobs within a selected project and creates a bmp diagram for each job in a selected folder. This can be an effective way to create files that MetaStage can later use to present a graphical representation of the DataStage job design in an HTML or XML report.
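The Ascential script itself is not reproduced here. As a rough sketch of the same idea, the Python below builds one dsdesign.exe command line per job, using only the options shown in the command above. The job names, the host, user and password values and the function name diagram_command are placeholders, and how the real script obtains the list of jobs is not covered.

```python
from pathlib import PureWindowsPath

# Path taken from the command line shown above; adjust for the local install.
DSDESIGN = r"E:\Program Files\Ascential\DataStage\dsdesign.exe"


def diagram_command(host, user, password, project, job, out_root):
    """Build the argument list that saves one job design as a bmp file."""
    bmp_path = PureWindowsPath(out_root) / project / "JobDesigns" / (job + ".bmp")
    return [
        DSDESIGN,
        "/h=" + host,
        "/u=" + user,
        "/p=" + password,
        project,
        job,
        "/saveasbmp=" + str(bmp_path),
    ]


# Hypothetical job list; a real script would query the project for its jobs.
job_names = ["YourJobName", "AnotherJobName"]
commands = [
    diagram_command("YourHost", "UserID", "secret", "YourProject", job, r"e:\Diagrams")
    for job in job_names
]
for command in commands:
    print(command)
```

Each argument list could then be passed to subprocess.run on a machine where the DataStage client is installed.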
Job Control Web Service in HTML using Web Service Behavior
Job Control Web Service in Office XP documents
Job Control Web Service in VBScript
Used to specify whether to continue or to roll back if a link is skipped due to a constraint on it not being satisfied.
Used to specify whether to continue or roll back if the SQL statement fails.
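The effect of the continue and rollback choices can be illustrated outside DataStage. The Python sketch below uses an in-memory SQLite database and loads three rows, one of which violates a primary key. It is an analogy only, not how the stage is implemented; the table name account, the function names and the on_failure values are hypothetical.

```python
import sqlite3


def new_database():
    """Create an in-memory database with a hypothetical account table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance REAL)")
    conn.commit()
    return conn


def load_rows(conn, rows, on_failure):
    """Insert rows in one transaction; on a failed statement either roll back or continue."""
    inserted = 0
    for row in rows:
        try:
            conn.execute("INSERT INTO account (id, balance) VALUES (?, ?)", row)
            inserted += 1
        except sqlite3.IntegrityError:
            if on_failure == "rollback":
                conn.rollback()  # undo every row written in this transaction
                return 0
            # on_failure == "continue": skip the failed row and keep going
    conn.commit()
    return inserted


def row_count(conn):
    return conn.execute("SELECT COUNT(*) FROM account").fetchone()[0]


rows = [(1, 10.0), (1, 20.0), (2, 30.0)]  # the second row repeats the key

rollback_db = new_database()
rollback_inserted = load_rows(rollback_db, rows, "rollback")

continue_db = new_database()
continue_inserted = load_rows(continue_db, rows, "continue")

print(rollback_inserted, row_count(rollback_db))
print(continue_inserted, row_count(continue_db))
```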
The DataStage Enterprise Edition MVS (XE/390) Business Rule Stage provides the ease of graphical construction through drag and drop facilities as well as the ability to customize the processing rules to meet specific demands.
[Figure: transaction diagram; labels T2, T3, T4, EOU, Stock, Account]
Job acts like web service via Real Time Integration Server (RTI)
Creates an XML template that can be used as a starter job. Facilities exist to allow consulting to further customize the template such that token values can be replaced during job creation.
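Token replacement of this kind can be sketched with a plain string template. The XML layout below is a made-up stand-in, not the actual DataStage template or export format, and the token names job_name and source_file are hypothetical.

```python
import xml.etree.ElementTree as ET
from string import Template
from xml.sax.saxutils import escape

# Hypothetical starter-job template with two tokens.
JOB_TEMPLATE = Template(
    '<Job name="$job_name">'
    '<Parameter name="SourceFile" default="$source_file"/>'
    "</Job>"
)


def render_job(tokens):
    """Replace every token in the template and check the result is well-formed XML."""
    # Escape values so characters such as & and " cannot break the markup.
    safe = {name: escape(value, {'"': "&quot;"}) for name, value in tokens.items()}
    xml_text = JOB_TEMPLATE.substitute(safe)  # raises KeyError if a token is missing
    ET.fromstring(xml_text)  # raises ParseError if the result is not well-formed
    return xml_text


job_xml = render_job({"job_name": "LoadCustomers", "source_file": "/data/in/a&b.csv"})
parsed_job = ET.fromstring(job_xml)
print(job_xml)
```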
Pre-configured Stages
Implemented through shared containers
Configure stage with parameters
Create empty job with parameters
When developing a new job:
Start with the empty job with parameters
Drag and drop preconfigured stages while holding the CTRL key
Minimal configuration required
APT_BUFFER_MAXIMUM_MEMORY default is 3M
Increase for large memory configurations to avoid buffering to disk
APT_BUFFER_DISK_WRITE_INCREMENT default is 1M
Increase to create larger bursts of I/O during buffering to disk
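As a small worked example, the Python sketch below turns megabyte figures into values for these two variables and merges them into a copy of the current environment. It assumes the variables are expressed in bytes, which should be checked against the documentation for the installed release; the function name buffer_overrides is hypothetical.

```python
import os

MEGABYTE = 1024 * 1024


def buffer_overrides(max_memory_mb, disk_write_increment_mb):
    """Return environment settings for the two buffer variables (values assumed to be in bytes)."""
    return {
        "APT_BUFFER_MAXIMUM_MEMORY": str(max_memory_mb * MEGABYTE),
        "APT_BUFFER_DISK_WRITE_INCREMENT": str(disk_write_increment_mb * MEGABYTE),
    }


defaults = buffer_overrides(3, 1)   # the defaults quoted above
larger = buffer_overrides(64, 8)    # hypothetical values for a large-memory system

# Merge into a copy of the environment; the real os.environ is left untouched.
job_environment = dict(os.environ)
job_environment.update(larger)
print(defaults)
print(larger)
```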
Controlling the Buffers in DataStage Server
Set BUFFERSIZE and TIMEOUT for intra/inter-partitioning; default is 128K
Set for project in administrator or in job properties for a particular job
APT_PM_PLAYER_TIMING
Used to understand the CPU characteristics of a data flow
APT_RECORD_COUNTS
Used to check for data skew across data partitions
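Once per-partition record counts are in hand, skew can be summarised with a simple ratio. The Python sketch below divides the largest partition by the average partition size, so 1.0 means perfectly even and larger values mean more skew. It takes a plain list of counts; parsing the actual job log output is not shown, and the function name partition_skew is hypothetical.

```python
def partition_skew(counts):
    """Return largest partition size divided by the mean partition size."""
    if not counts:
        raise ValueError("at least one partition count is required")
    total = sum(counts)
    if total == 0:
        return 0.0  # no records at all, so nothing to skew
    mean = total / len(counts)
    return max(counts) / mean


# Hypothetical record counts for a four-partition run.
even_skew = partition_skew([100, 100, 100, 100])
bad_skew = partition_skew([400, 0, 0, 0])
print(even_skew, bad_skew)
```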
Performance Tuning
The Configuration File
Tells DataStage how to exploit the underlying computer hardware. For any given system there is no single ideal configuration file, because jobs vary widely in how they use that system.
General hints (assumes an SMP environment):
Avoid using the disks that land input and output data as scratch or resource disks
Do not use NFS or other remotely mounted disks for scratch disk
Understand the file system underneath the mount points used by the configuration file
Separate the I/O between nodes as much as possible to provide the maximum I/O bandwidth
Run your application with various configurations during volume testing, before moving to production, to understand how it behaves
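To make the hints concrete, the Python sketch below generates a small configuration file in the usual node / fastname / resource disk / resource scratchdisk layout, giving each node its own disk paths and refusing any path that is also a landing directory. The host name, all paths and the function names are hypothetical, and a generated file should be reviewed against the Enterprise Edition documentation before use.

```python
def node_block(name, fastname, disk, scratch):
    """Render one node entry with its own resource disk and scratch disk."""
    return (
        '  node "%s"\n'
        "  {\n"
        '    fastname "%s"\n'
        '    pools ""\n'
        '    resource disk "%s" {pools ""}\n'
        '    resource scratchdisk "%s" {pools ""}\n'
        "  }\n"
    ) % (name, fastname, disk, scratch)


def build_config(fastname, node_disks, landing_dirs=()):
    """Build a configuration file, one node per (disk, scratch) pair."""
    blocks = []
    for number, (disk, scratch) in enumerate(node_disks, start=1):
        for path in (disk, scratch):
            # Hint from above: keep landing areas off resource and scratch disks.
            if path in landing_dirs:
                raise ValueError("%s is also used for landing data" % path)
        blocks.append(node_block("node%d" % number, fastname, disk, scratch))
    return "{\n" + "".join(blocks) + "}\n"


# Hypothetical two-node SMP layout with separate file systems per node.
config_text = build_config(
    "etlhost",
    [("/fs1/ds/data", "/fs2/ds/scratch"), ("/fs3/ds/data", "/fs4/ds/scratch")],
    landing_dirs=("/landing/in", "/landing/out"),
)
print(config_text)
```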
Launches In December
Tuesday 9:15am
Operator Tips & Tricks Session
Upgrades & Installs
Version Control
Production Automation
Running in a High Availability environment
EOD/EOT
E388 8195 92A2 4086 9699 4081 A3A3 8595 8489 9587 40A3 8889 A240 A285 A2A2 8996 9540 !
Please let us know if you have any comments or suggestions regarding this material.