0% found this document useful (0 votes)
49 views

What Are Other Performance Tunings You Have Done in Your Last Project To Increase The Performance of Slowly Running Jobs?

Some key ways to improve performance in DataStage jobs include: 1. Staging data from databases using hash or sequential files for faster processing and recovery in case of job failure. 2. Tuning OCI stages and project tunables for optimal performance. 3. Sorting data in the database instead of DataStage when possible and removing unnecessary data early in jobs.

Uploaded by

abhimovies
Copyright
© Attribution Non-Commercial (BY-NC)
Available Formats
Download as RTF, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
49 views

What Are Other Performance Tunings You Have Done in Your Last Project To Increase The Performance of Slowly Running Jobs?

Some key ways to improve performance in DataStage jobs include: 1. Staging data from databases using hash or sequential files for faster processing and recovery in case of job failure. 2. Tuning OCI stages and project tunables for optimal performance. 3. Sorting data in the database instead of DataStage when possible and removing unnecessary data early in jobs.

Uploaded by

abhimovies
Copyright
© Attribution Non-Commercial (BY-NC)
Available Formats
Download as RTF, PDF, TXT or read online on Scribd
You are on page 1/ 5

What are other Performance tunings you have done in your last project to

increase the performance of slowly running jobs?


1. Staged the data coming from ODBC/OCI/DB2UDB stages or any database on the server using

Hash/Sequential files for optimum performance also for data recovery in case job aborts.

2. Tuned the OCI stage for 'Array Size' and 'Rows per Transaction' numerical values for faster

inserts, updates and selects.

3. Tuned the 'Project Tunables' in Administrator for better performance.

4. Used sorted data for Aggregator.

5. Sorted the data as much as possible in DB and reduced the use of DS-Sort for better

performance of jobs

6. Removed the data not used from the source as early as possible in the job.

7. Worked with DB-admin to create appropriate Indexes on tables for better performance of DS

queries

8. Converted some of the complex joins/business in DS to Stored Procedures on DS for faster

execution of the jobs.

9. If an input file has an excessive number of rows and can be split-up then use standard logic to

run jobs in parallel.

10. Before writing a routine or a transform, make sure that there is not the functionality required

in one of the standard routines supplied in the sdk or ds utilities categories.

Constraints are generally CPU intensive and take a significant amount of time to process. This

may be the case if the constraint calls routines or external macros but if it is inline code then

the overhead will be minimal.

11. Try to have the constraints in the 'Selection' criteria of the jobs itself. This will eliminate the

unnecessary records even getting in before joins are made.

12. Tuning should occur on a job-by-job basis.

13. Use the power of DBMS.

14. Try not to use a sort stage when you can use an ORDER BY clause in the database.

15. Using a constraint to filter

16.Make every attempt to use the bulk loader for your particular database. Bulk loaders
are generally faster than using ODBC or OLE

1. By using hashfile stage we can improve the


performance.
In case of hashfile stage we can define the read cache
size
& write cache size but the default size is 128M.B.
2. By using active-to-active link performance also we can
improve the performance.
Here we can improve the performance by enabling the row
buffer, the default row buffer size is 128K.B.
3. By removing unwanted columns.
4. By selecting appropriate update actions.
5. In parallel by replacing transformer with copy or
filter
stage we can improve the performance.Because if you are
using more than 5 transformers in a stage the performance
will degrade,so to avoid transformer you can use copy or
filter.
6. In server by using linkpartitioner,linkcollectoe & IPC
stages also we can improve the performance.

DataStage Performance Tuning Tips


Some of the Key factors for the consideration

1. Staged the data coming from ODBC/OCI/DB2UDB stages or any database on the
server using Hash/Sequential files for optimum performance also for data
recovery in case job aborts.

2. Tuned the OCI stage for 'Array Size' and 'Rows per Transaction' numerical values
for faster inserts, updates and selects.

3. Tuned the 'Project Tunables' in Administrator for better performance.


4. Used sorted data for Aggregator.

5. Sorted the data as much as possible in DB and reduced the use of DS-Sort for
better performance of jobs

6. Removed the data not used from the source as early as possible in the job.
7. Worked with DB-admin to create appropriate Indexes on tables for better
performance of DS queries
8. Converted some of the complex joins/business in DS to Stored Procedures on DS
for faster execution of the jobs.
9. If an input file has an excessive number of rows and can be split-up then use
standard logic to run jobs in parallel.
10.Before writing a routine or a transform, make sure that there is not the
functionality required in one of the standard routines supplied in the sdk or ds
utilities categories.
Constraints are generally CPU intensive and take a significant amount of time
to process. This may be the case if the constraint calls routines or external
macros but if it is inline code then the overhead will be minimal.
11.Try to have the constraints in the 'Selection' criteria of the jobs itself. This will
eliminate the unnecessary records even getting in before joins are made.
12.Tuning should occur on a job-by-job basis.

13.Use the power of DBMS.


14.Try not to use a sort stage when you can use an ORDER BY clause in the
database.
15.Using a constraint to filter a record set is much slower than performing a
SELECT … WHERE….
16.Make every attempt to use the bulk loader for your particular database. Bulk
loaders are generally faster than using ODBC or OLE.
17.

18.Minimise the usage of Transformer (Instead of this use Copy modify Filter Row
Generator)

19.Use SQL Code while extracting the data


20.Handle the nulls

21.Minimise the warnings


22.Reduce the number of lookups in a job design

23.Use not more than 20stages in a job


24.Use IPC stage between two passive stages Reduces processing time

25.Drop indexes before data loading and recreate after loading data into tables
26.Gen\'ll we cannot avoid no of lookups if our requirements to do lookups
compulsory.
27.There is no limit for no of stages like 20 or 30 but we can break the job into
small jobs then we use dataset Stages to store the data.
28.IPC Stage that is provided in Server Jobs not in Parallel Jobs
29.Check the write cache of Hash file. If the same hash file is used for Look up and
as well as target disable this Option.
30.If the hash file is used only for lookup then \ enable Preload to memory\ . This
will improve the performance. Also check the order of execution of the
routines.

31.Don\'t use more than 7 lookups in the same transformer; introduce new
transformers if it exceeds 7 lookups.

32.Use Preload to memory option in the hash file output.


33.Use Write to cache in the hash file input.

34.Write into the error tables only after all the transformer stages.
35.Reduce the width of the input record - remove the columns that you would not
use.
36.

37.Cache the hash files you are reading from and writting into. Make sure your
cache is big enough to hold the hash files.

38.Use ANALYZE.FILE or HASH.HELP to determine the optimal settings for your


hash files

This would also minimize overflow on the hash file. (Need for Server Jobs)

1. If possible break the input into multiple threads and run multiple instances of
the job.

2. Staged the data coming from ODBC/OCI/DB2UDB stages or any database on the
server using Hash/Sequential files for optimum performance also for data
recovery in case job aborts.
3. Tuned the OCI stage for 'Array Size' and 'Rows per Transaction' numerical values
for faster inserts updates and selects.
4. Tuned the 'Project Tunables' in Administrator for better performance.

5. Used sorted data for Aggregator.


6. Sorted the data as much as possible in DB and reduced the use of DS-Sort for
better performance of jobs
7. Removed the data not used from the source as early as possible in the job.
8. Worked with DB-admin to create appropriate Indexes on tables for better
performance of DS queries
9. Converted some of the complex joins/business in DS to Stored Procedures on DS
for faster execution of the jobs.
10.If an input file has an excessive number of rows and can be split-up then use
standard logic to run jobs in parallel.
11.Before writing a routine or a transform make sure that there is not the
functionality required in one of the standard routines supplied in the sdk or ds
utilities categories.
Constraints are generally CPU intensive and take a significant amount of time
to process. This may be the case if the constraint calls routines or external
macros but if it is inline code then the overhead will be minimal.
12.Try to have the constraints in the 'Selection' criteria of the jobs itself. This will
eliminate the unnecessary records even getting in before joins are made.
13.Tuning should occur on a job-by-job basis.

14.Use the power of DBMS.


15.Try not to use a sort stage when you can use an ORDER BY clause in the
database.
16.Using a constraint to filter a record set is much slower than performing a
SELECT WHERE .
17.Make every attempt to use the bulk loader for your particular database. Bulk
loaders are generally faster than using ODBC or OLE.

You might also like