Connectivity Guide For Oracle Databases
Version 8 Release 7
SC19-3441-00
Note: Before using this information and the product that it supports, read the information in Notices and trademarks on page 171.
Copyright IBM Corporation 2008, 2011. US Government Users Restricted Rights Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
Contents
Chapter 1. Migrating jobs to use connectors
Using the user interface to migrate jobs
Using the command line to migrate jobs
Oracle environment logging
Properties that control job failure
Messages
Reference
Data type mapping and Oracle data types
Dictionary views
Environment variables
Functionality of Oracle OCI stages
Configuration requirements of Oracle OCI stages
The Oracle Connection
Defining the Oracle Connection
Connecting to an Oracle Database
Defining Character Set Mapping
Defining Input Data
About the Input Page
Reject Row Handling
Writing Data to Oracle
SQL statements and the Oracle OCI stage
Accessing the SQL builder from a server stage
Using Generated SQL Statements
Using User-Defined SQL Statements
Defining Output Data
About the Output Page
Reading Data from Oracle
Using Generated Queries
Example of a SQL Select Statement
Using User-Defined Queries
DATE Data Type Considerations
Oracle Data Type Support
Character Data Types
Numeric Data Types
Additional Numeric Data Types for Oracle
Date Data Types
Miscellaneous Data Types
Handling $ and # Characters
Filter Expression Panel
Insert Page
Insert Columns Grid
Update Page
Update Column Grid
Filter Panel
Filter Expression Panel
Delete Page
Filter Panel
Filter Expression Panel
Sql Page
Resolve Columns Grid
Expression Editor
Main Expression Editor
Calculation/Function/Case Expression Editor
Expression Editor Menus
Joining Tables
Specifying Joins
Join Properties Dialog Box
Alternate Relation Dialog Box
Properties Dialogs
Table Properties Dialog Box
SQL Properties Dialog Box
Reading command-line syntax
Product accessibility
Contacting IBM
Notices and trademarks
Links to non-IBM Web sites
Index
Procedure
1. Choose Start > Programs > IBM InfoSphere Information Server > Connector Migration Tool.
2. In the Log on window, complete these fields:
   a. In the Host field, enter the host name of the services tier. You can specify an optional port by separating it from the host name with a colon. The host name that you specify here is the same one that you specify when you start the Designer client (for example, mymachine:9080).
   b. In the User name field, enter your InfoSphere DataStage user name.
   c. In the Password field, enter your InfoSphere DataStage password.
   d. In the Project field, enter the name of the project. To access an InfoSphere DataStage server that is remote from the domain server, specify the project name in full as server:[port]/project. As an alternative, you can press the button adjacent to the Project field to display a dialog box from which you can select the fully-qualified project name.
   e. Click OK. An icon indicates the status of each job. A gray icon indicates that the job cannot be migrated. A gray icon with a question mark indicates that the job might be successfully migrated.
3. Display the jobs and stages to consider for migration:
   • Choose View > View all jobs to display all of the jobs in the project. This is the default view.
   • Choose View > View all migratable jobs to display all of the jobs that are in the project and that can be migrated to use connectors. Jobs that do not contain any stages that can be migrated are excluded from the job list.
   • Choose View > View jobs by stage types to open the Filter by stage type window.
4. Perform the following steps to analyze jobs:
   a. Highlight the job in the job list.
   b. Expand the job in the job list to view the stages in the job.
   c. Select one or more jobs, and click Analyze.
   After analysis, the color of the job, stage, or property icon indicates whether it can be migrated:
   • A green icon indicates that the job, stage, or property can be migrated.
   • A red icon indicates that the job or stage cannot be migrated.
   • An orange icon indicates that a job or stage can be partially migrated and that a property in a stage has no equivalent in a connector.
   • A gray icon indicates that the job or stage is not eligible for migration.
   Note: The Connector Migration Tool displays internal property names, rather than the names that the stages display. To view a table that contains the internal name and the corresponding display name for each property, from the IBM InfoSphere DataStage and QualityStage Designer client, open the Stage Types folder in the repository tree. Double-click the stage icon, and then click the Properties tab to view the stage properties.
5. Click Preferences and choose how to migrate the job:
   • Choose Clone and migrate cloned job to make a copy of the job and then migrate the copy. The original job remains intact.
   • Choose Back up job and migrate original job to make a copy of the job and then migrate the original job.
   • Choose Migrate original job to migrate the job without making a backup.
6. Select the jobs and stages to migrate, and then click Migrate. The jobs and stages are migrated and are placed in the same folder as the original job. If logging is enabled, a log file that contains a report of the migration task is created. After a job is successfully migrated, a green checkmark displays beside the job name in the Jobs list to indicate that the job has been migrated.
Procedure
1. From the IBM InfoSphere DataStage client command line, go to the <InformationServer>\Clients\CCMigrationTool directory.
2. Enter the command CCMigration, followed by the following required parameters:
   • -h host:port, where host:port is the host name and port of the InfoSphere DataStage server. If you do not specify a port, the port is 9080 by default.
   • -u user name, where user name is the name of the InfoSphere DataStage user.
   • -p password, where password is the password of the InfoSphere DataStage user.
   • -P project, where project is the name of the project to connect to. To specify an InfoSphere DataStage server that is remote from the domain server, specify the fully qualified project name by using the format server:[port]/project.
   • One of the following:
     -M If you specify this parameter, the original jobs are migrated, and backup jobs are not created.
     -B job name extension, where job name extension is a set of alphanumeric characters and underscores. If you specify this parameter, the Connector Migration Tool creates backup jobs, names the backup jobs source job name+job name extension, and then migrates the original jobs. The backup jobs are saved in the same location in the repository as the source jobs.
     -C job name extension, where job name extension is a set of alphanumeric characters and underscores. If you specify this parameter, the Connector Migration Tool clones the source jobs, names the cloned jobs source job name+job name extension, and then migrates the cloned jobs. The cloned jobs are saved in the same location in the repository as the source jobs.
     If you specify one of these options, the migration proceeds without requiring any additional user input. If you do not specify -M, -B, or -C, the user interface is displayed so that you can make additional choices for how to migrate the jobs.
3. Optional: Enter any of the following optional parameters:
   • -L log file, where log file is the file name and path for the log file that records the results of the migration.
   • -S stage types, where stage types is a comma-separated list of stage types. By default, the Connector Migration Tool migrates all stage types. Use this parameter to migrate only jobs that contain the specified stage types. If you specify both the -S and -J parameters, only the specified stage types within the specified jobs are migrated. If you specify the -S parameter and do not specify the -C, -M, or -B parameter, only jobs that contain the specified stage types appear in the job list that is displayed in the user interface. Limiting the jobs that are displayed can significantly reduce the startup time of the Connector Migration Tool.
   • -J job names, where job names is a comma-separated list of jobs. By default, the Connector Migration Tool migrates all eligible jobs in the project. Use this parameter to migrate only specific jobs. If you specify the -J parameter and do not specify the -C, -M, or -B parameter, only the specified jobs appear in the job list that is displayed in the user interface. Limiting the jobs that are displayed can significantly reduce the startup time of the Connector Migration Tool.
   • -c shared container names, where shared container names is a comma-separated list of shared containers. By default, the Connector Migration Tool migrates all eligible shared containers in the project. Use this parameter to migrate only specific shared containers. If you specify the -c parameter and do not specify the -C, -M, or -B parameter, only the specified shared containers appear in the job list that is displayed in the user interface. Limiting the shared containers that are displayed might significantly reduce the startup time of the Connector Migration Tool.
   • -R If you specify this parameter, the Connector Migration Tool reports the details of the migration that would occur if the specified jobs were migrated, but does not perform an actual migration. The details are reported in the log file that is specified by using the -L parameter.
   • -A If you specify this parameter, the Connector Migration Tool adds an annotation to the job design. The annotation describes the stages that were migrated, the job from which the stages were migrated, and the date of the migration.
   • -d job dump file, where job dump file is the file name and path for a file where a list of jobs, shared containers, and stages is written. Using a job dump file is helpful when you want to determine which jobs are suitable for migration. You can use the -d parameter with the -J, -c, and -S parameters to list particular jobs, shared containers, and stage types, respectively.
   • -V If you specify this parameter, the Connector Migration Tool specifies the target connector variant for migrated stages. The format of the list is a comma-separated list containing {StageTypeName=Variant}.
   • -v If you specify this parameter with the -d parameter, the values of stage properties are included in the report. If omitted, the report contains only stage names and types, not the stage properties. This option is useful to identify jobs that have stages with certain property values. If this option is specified, then -S is ignored.
   • -T If you specify this parameter, the Connector Migration Tool enables the variant migration mode. All connector stages found in jobs and containers whose stage type matches those listed by the -V parameter are modified.
   • -U If you specify this parameter, the Connector Migration Tool enables the property upgrade migration mode. All connector stages found in jobs and containers whose properties match the conditions specified in the StageUpgrade.xml file are upgraded.
Example
The following command starts the Connector Migration Tool, connects to the project billsproject on the server dsserver as user billg, and migrates the jobs db2write and db2upsert:
CCMigration -h dsserver:9080 -u billg -p padd0ck -P billsproject -J db2write,db2upsert -M
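When migrations are scripted, it can help to assemble the parameter list programmatically rather than by hand. The following is a minimal sketch, not part of the product: the helper name build_ccmigration_args is hypothetical, it covers only the -h, -u, -p, -P, -J, -L, -M, -B, and -C parameters described above, and it only builds the argument list without running the tool.

```python
import re
from typing import List, Optional, Sequence


def build_ccmigration_args(
    host: str,
    user: str,
    password: str,
    project: str,
    jobs: Sequence[str] = (),
    mode: str = "-M",
    extension: Optional[str] = None,
    log_file: Optional[str] = None,
) -> List[str]:
    """Assemble a CCMigration argument list (hypothetical helper)."""
    if mode not in ("-M", "-B", "-C"):
        raise ValueError("mode must be -M, -B, or -C")
    if mode in ("-B", "-C"):
        # -B and -C take a job name extension made of alphanumeric
        # characters and underscores.
        if not extension or not re.fullmatch(r"[A-Za-z0-9_]+", extension):
            raise ValueError("-B and -C require an alphanumeric/underscore extension")
    args = ["CCMigration", "-h", host, "-u", user, "-p", password, "-P", project]
    if jobs:
        # -J expects a comma-separated list of job names.
        args += ["-J", ",".join(jobs)]
    if log_file:
        args += ["-L", log_file]
    args.append(mode)
    if mode != "-M":
        args.append(extension)
    return args


# Rebuilds the example command shown above as a list of arguments.
example = build_ccmigration_args(
    "dsserver:9080", "billg", "padd0ck", "billsproject",
    jobs=["db2write", "db2upsert"],
)
```

The resulting list could be passed to a process launcher on the client computer; keeping it as a list avoids quoting problems with job names.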
Deprecated stage type                                                   Connector
[...]                                                                   DB2 Connector, Oracle Connector, ODBC Connector
ODBC Enterprise                                                         ODBC Connector
Oracle OCI Load, Oracle Enterprise                                      Oracle Connector
Teradata API, Teradata Enterprise, Teradata Load, Teradata Multiload    Teradata Connector
WebSphere MQ                                                            WebSphere MQ Connector
To use any of the deprecated stage types in new jobs, drag the stage type from the Repository to the canvas or to the palette. In the Repository tree, navigate to Stage Types. Under Stage Types, open the Parallel or the Server subdirectory, depending on which stage you want to use. Then drag the stage type to the job canvas or to the palette.
If you set both the TNS_ADMIN and ORACLE_HOME environment variables, the TNS_ADMIN environment variable takes precedence over the ORACLE_HOME environment variable for locating the tnsnames.ora configuration file. The TNS_ADMIN and the ORACLE_HOME environment variables are not mandatory. However, if one or both are not specified, you cannot select a connect descriptor name to define the connection to the Oracle database. Instead, when you define the connection, you must provide the complete connect descriptor definition or specify an Oracle Easy Connect string. Note: If you use the Oracle Basic Instant Client or the Basic Lite Instant Client, the tnsnames.ora file is not automatically created for you. You must manually create the file and save it to a directory. Then specify the location of the file in the TNS_ADMIN environment variable.
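The precedence rule can be sketched as a small function. This is an illustration only, not the connector's implementation: the function name tnsnames_path is hypothetical, and the network/admin subdirectory under ORACLE_HOME is the conventional Oracle client location, which is an assumption that this guide does not state.

```python
import posixpath
from typing import Mapping, Optional


def tnsnames_path(env: Mapping[str, str]) -> Optional[str]:
    """Return the tnsnames.ora path implied by the environment, or None."""
    tns_admin = env.get("TNS_ADMIN")
    if tns_admin:
        # TNS_ADMIN takes precedence when both variables are set.
        return posixpath.join(tns_admin, "tnsnames.ora")
    oracle_home = env.get("ORACLE_HOME")
    if oracle_home:
        # Assumption: conventional location under the Oracle home.
        return posixpath.join(oracle_home, "network", "admin", "tnsnames.ora")
    # Neither variable is set: no connect descriptor names can be listed,
    # so a full connect descriptor or an Easy Connect string is needed.
    return None
```

The function takes the environment as a mapping so that it can be called with os.environ or with a test dictionary.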
Procedure
1. Importing Oracle metadata on page 14
2. Creating a job that includes the Oracle connector and the required links on page 15
3. Defining a connection to an Oracle database on page 29
4. Setting up column definitions on a link on page 30
5. Specifying the read mode and the data source on page 33
6. Compiling and running a job on page 38
Procedure
1. Importing Oracle metadata on page 14
2. Creating a job that includes the Oracle connector and the required links on page 15
3. Defining a connection to an Oracle database on page 29
4. Setting up column definitions on a link on page 30
5. Specifying the write mode and the target table on page 34
6. Optional: Rejecting records that contain errors on page 36
7. Compiling and running a job on page 38
In a normal lookup, the connector runs the specified SELECT statement or PL/SQL block only once; therefore, the SELECT statement or PL/SQL block cannot include any input parameters. The Lookup stage searches the retrieved result set data and looks for matches for the parameter sets that arrive in the form of records on the input link to the Lookup stage. A normal lookup is also known as an in-memory lookup because the lookup is performed on the cached data in memory. In a sparse lookup, the connector runs the specified SELECT statement or PL/SQL block once for each parameter set that arrives in the form of a record on the input link to the Lookup stage. The specified input parameters in the statement must have corresponding columns defined on the reference link. Each input record includes a set of parameter values that are represented by key columns that the connector sets on the bind variables in the SELECT statement or PL/SQL block, and then the connector runs the statement or block. The result of the lookup is routed as one or more records through the reference link from the connector back to the Lookup stage and from the Lookup stage to the output link of the Lookup stage. A sparse lookup is also known as a direct lookup because the lookup is performed directly on the database.
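The difference between the two lookup types can be illustrated with a small simulation. This sketch makes several assumptions: it uses an in-memory SQLite database as a stand-in for Oracle, the customers table and all function names are hypothetical, and the question-mark placeholder is the SQLite form of a bind variable rather than the syntax that the connector uses.

```python
import sqlite3
from typing import Dict, List, Optional, Tuple


def make_reference_db() -> sqlite3.Connection:
    """Create a tiny stand-in reference table (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(1, "Ada"), (2, "Grace"), (3, "Edsger")],
    )
    return conn


def normal_lookup(conn, input_records: List[Dict]) -> List[Tuple[int, Optional[str]]]:
    # Normal lookup: the statement runs once, without input parameters,
    # and every input record is matched against the cached result set.
    cache = {}
    for cust_id, name in conn.execute("SELECT id, name FROM customers"):
        cache[cust_id] = name
    return [(rec["id"], cache.get(rec["id"])) for rec in input_records]


def sparse_lookup(conn, input_records: List[Dict]) -> List[Tuple[int, Optional[str]]]:
    # Sparse lookup: the statement runs once per input record, with the
    # key column value bound to the parameter in the WHERE clause.
    results = []
    for rec in input_records:
        row = conn.execute(
            "SELECT name FROM customers WHERE id = ?", (rec["id"],)
        ).fetchone()
        results.append((rec["id"], row[0] if row else None))
    return results
```

Both functions return the same matches. The trade-off is that the normal lookup holds the whole result set in memory, whereas the sparse lookup issues one database round trip per input record.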
Procedure
1. Complete these steps:
   a. Add the Oracle connector to the job, create a reference link from the Oracle connector to the Lookup stage, and then double-click the connector to open the properties.
   b. In the Lookup Type field, choose Normal or Sparse.
2. Complete these tasks:
   a. Defining a connection to an Oracle database on page 29
   b. Specifying the read mode and the data source on page 33
   c. Setting up column definitions on a link on page 30
   d. Compiling and running a job on page 38
Procedure
1. To grant SELECT access to a single dictionary view, issue the following statement:
GRANT SELECT ON dictionary_view TO user_name
where dictionary_view is the name of the view and user_name is the user name with which the connector connects to the database.
2. To use a role to grant a user SELECT access to multiple dictionary views, use statements that are similar to the following sample statements. These sample statements show how to create a role, grant access to two dictionary views, and then assign the role to a user. To use these sample statements, replace role_name, dictionary_view, and user_name with the names that are specific to your configuration and issue one GRANT SELECT ON statement for each dictionary view.

CREATE ROLE role_name
GRANT SELECT ON dictionary_view1 TO role_name
GRANT SELECT ON dictionary_view2 TO role_name
GRANT role_name TO user_name
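If the list of dictionary views is long, the role-based statements can be generated instead of typed. The following sketch is an illustration only: the function name role_grant_statements and the role, user, and view names in the example are hypothetical, and the function does no validation of identifiers, so it is not suitable for untrusted input.

```python
from typing import List, Sequence


def role_grant_statements(
    role_name: str, user_name: str, dictionary_views: Sequence[str]
) -> List[str]:
    """Build the role-based GRANT statements in the order shown above."""
    statements = [f"CREATE ROLE {role_name}"]
    # One GRANT SELECT ON statement is issued for each dictionary view.
    statements += [
        f"GRANT SELECT ON {view} TO {role_name}" for view in dictionary_views
    ]
    statements.append(f"GRANT {role_name} TO {user_name}")
    return statements


# Example with hypothetical role and user names.
for statement in role_grant_statements(
    "ds_dict_role", "dsuser", ["DBA_EXTENTS", "ALL_TAB_COLUMNS"]
):
    print(statement)
```

The view names in the example are placeholders; use the dictionary views that your configuration requires.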
Procedure
1. From the IBM InfoSphere DataStage and QualityStage Designer client, choose Import > Table Definitions > Start Connector Import Wizard.
2. Select the variant of the Oracle connector that corresponds to the release of the Oracle client that you installed on the InfoSphere DataStage server.
3. Click the down arrow in the Server field to obtain a list of Oracle services, and then do one of the following:
   • Select the Oracle service to connect to. If the list is empty, the connector cannot locate the Oracle tnsnames.ora file. The connector tries to locate the file by checking the TNS_ADMIN or ORACLE_HOME environment variables.
   • Enter the complete content of the connect descriptor, as it would appear in the Oracle tnsnames.ora file, or enter the Easy Connect string that defines the connection to the Oracle database.
4. In the Username and Password fields, enter the user ID and password to use to authenticate with the Oracle service. By default, the connector is configured for Oracle database authentication. This form of authentication requires that the values that you specify in the Username and Password fields match the credentials that are configured for the user in the Oracle database.
5. Optional: In the Use External authentication field, select Yes. This form of authentication requires that the user be registered in Oracle and identified as a user who is authenticated by the operating system.
   Note: When the Connector Import Wizard or the connector stage dialog invokes the Oracle connector to perform a design-time operation such as importing metadata, viewing data, testing a connection, or enumerating services, the connector runs within the ASB agent process. This process runs on the computer where the InfoSphere DataStage server is installed. On a computer that runs Microsoft Windows, the ASB agent runs under the Local System account; on a computer that runs Linux or UNIX, the ASB agent runs under the root system account. Therefore, choosing Use External Authentication causes design-time operations to use the built-in system accounts to authenticate with the database, a scenario that you typically want to avoid.
6. Click Test connection, and then click Save to save the connection. If you do not save the connection definition in the repository, only InfoSphere DataStage can access the imported metadata; other components and tools of InfoSphere Information Server might have no access to the metadata.
7. In the Host name and Database name fields, specify the names of the repository objects under which to import the metadata. If you choose the values that the connector provides as defaults, the objects are created in the metadata repository if they do not already exist. Alternatively, choose from the list of Host and Database objects that are already present in the repository. To define new Host and Database objects, click New location. The names of the Host and Database objects do not need to match the actual names of the Oracle server host system and database. However, using matching names makes it easy to track the imported metadata. The Host object in the repository serves as a logical container of the Database objects, which in turn serve as containers for the imported metadata objects.
8. The wizard provides three levels of filtering so that you can narrow down the list of objects to import. Perform these steps to specify one or more filters to use:
   a. The Schema filter displays a list of all of the table owners in the database. Select a schema as the first filter.
   b. The table types filter displays a list of schema object types to include in the results. By default, the options Include views, Include tables, Include materialized views, and Include index-organized tables are selected. You can also select Include external tables and Include synonyms in the results. The types filter is the second filter.
   c. In the Name filter, enter additional criteria that filter the list of objects by name. You can use the percent sign (%) as a wildcard character. For example, to obtain a list of objects that contain the word blue in their names, enter %blue% in the Name filter.
9. On the Selection panel, select one or more tables to import. To import definitions for the primary keys, foreign keys, and indexes that are associated with the selected tables, check the related boxes. To view the current data in a table, select the table and then click View data. To select tables that have a primary key or foreign key relationship with the selected table, click Related tables.
10. Click Import, and then select the location in the metadata repository under which to import the table definitions.
Creating a job that includes the Oracle connector and the required links
Before you can read, write, or look up data in an Oracle database, you must create a job that includes the Oracle connector, add any required additional stages, and create the necessary links.
Procedure
1. From the IBM InfoSphere DataStage and QualityStage Designer client, select File > New from the menu.
2. In the New window, select the Parallel or Server Job icon, and click OK.
3. Follow these steps to add the Oracle connector to the job:
   a. In the Designer client palette, select the Database category.
   b. Locate Oracle in the list of available databases, and click the down arrow to display the available stages.
   c. Drag the Oracle connector to the canvas.
4. Create the necessary links and add additional stages for the job:
   • For a job that reads Oracle data, create the next stage in the job, and then create an output link from the Oracle connector to the next stage.
   • For a job that writes Oracle data, create one or more input links from the previous stage in the job to the Oracle connector. If you use multiple input links, you can specify the link order for the input data and the order for the record processing. If you want to manage rejected records, add a stage to hold the rejected records, and then add a reject link from the Oracle connector to that stage.
   • For a job that looks up Oracle data, create a job that includes a Lookup stage, and then create a reference link from the Oracle connector to the Lookup stage.
Record ordering
If the connector has multiple input links, you can control the processing order of input data across links. Use one of these methods to control the processing order of input data:

Specifying the order of input data by input link

When the connector uses multiple input links, you can control the sequence in which records are processed by ordering the links.

About this task

The order in which you specify the links on the Link Ordering tab determines the order in which the records in the links are processed for each unit of work.

Procedure

1. From the stage editor, select an input link.
2. Click the Link Ordering tab.
3. Click a link that you want to reorder, and use the arrow buttons to move the link up or down.

Specifying the order for records

If a connector has multiple input links, you can control the order of record processing by specifying the order for the records.

Procedure

1. Double-click the connector stage icon to open the connector properties.
2. Set Record ordering to one of the following:
   • All records specifies that all records are processed from each link in order.
   • First record specifies that one record is processed from each link, in turn, until all records from all links have been processed.
   • Ordered specifies that records are processed from each link in an order you specify by using the Key column, Null order, and Case-sensitive properties.
3. If you choose Ordered, complete these additional properties:
   a. Key column: Specify the name of the column to use as the sort key.
   b. Sort order: Specify Ascending or Descending.
   c. Null order: Specify where to sort null values in the sort order. The choices are Before or After.
   d. Case sensitive: Specify whether or not text comparisons are case-sensitive. The choices are Yes or No.
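The three Record ordering settings can be compared with a small simulation. This is a sketch under stated assumptions, not the connector's algorithm: each link is modeled as a list of records, all function names are hypothetical, and the Ordered setting is modeled as a sort of the combined records on the key column. The Null order and Case sensitive properties are not modeled, so key values are assumed to be non-null and directly comparable.

```python
from itertools import zip_longest
from typing import Dict, List

_MISSING = object()  # marks links that have run out of records


def all_records(links: List[List[Dict]]) -> List[Dict]:
    # All records: every record from the first link, then the next link.
    return [rec for link in links for rec in link]


def first_record(links: List[List[Dict]]) -> List[Dict]:
    # First record: one record from each link, in turn, until every
    # link is exhausted.
    out = []
    for round_of_records in zip_longest(*links, fillvalue=_MISSING):
        out.extend(rec for rec in round_of_records if rec is not _MISSING)
    return out


def ordered(links: List[List[Dict]], key_column: str, descending: bool = False) -> List[Dict]:
    # Ordered (assumed model): records from all links, sorted on the key column.
    combined = [rec for link in links for rec in link]
    return sorted(combined, key=lambda rec: rec[key_column], reverse=descending)
```

For two links that hold keys 3, 1 and 2, the three functions yield the key sequences 3, 1, 2; 3, 2, 1; and 1, 2, 3, which shows how the setting changes the processing order for the same input.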
Configuring nodes
To modify the number of nodes on which a job runs, edit the configuration file that specifies nodes, node pools, and constraints.
Procedure
1. Create a new configuration file or edit an existing configuration file.
2. Set the APT_CONFIG_FILE environment variable to the full path of the configuration file that you want to use.
3. On the Advanced tab of the connector stage properties, perform these tasks:
   a. Verify that the Execution mode property is set to Parallel, which is the default setting.
   b. Optional: Use the Node pool and resource constraints and the Node map constraint fields to restrict the nodes on which the connector runs.
   Note: If you plan to use the Oracle partitions read method to read data from a partitioned Oracle table or if you plan to use the Oracle connector partition type to write data to a partitioned Oracle table, do not specify any constraints. As an alternative, include the ORACLE resource constraint in the configuration file to specify the nodes on which you want to run the Oracle connector.

Examples: Constraining nodes in a parallel job

These examples present a sample parallel configuration file and show how to use node pools to constrain the nodes on which the connector runs. In the parallel configuration file, you specify nodes and node pools. Then you use one of the following methods to configure the connector to run on only a subset of the nodes that are specified in the parallel configuration file:
• Define a node pool in the parallel configuration file. Then on the Advanced tab of the connector properties, select that node pool. The connector runs on only the nodes that are members of that node pool.
• On the Advanced tab of the connector properties, select specific nodes.

The following example parallel configuration file defines four nodes: node1, node2, node3, and node4. Each node runs on MYHOST, which is the computer that runs the IBM InfoSphere DataStage server. The file defines three node pools: group1, group2, and group3, as well as the default "" pool. The APT_CONFIG_FILE environment variable points to this parallel configuration file.
{
    node "node1"
    {
        fastname "MYHOST"
        pools "" "group1" "group2" "group3"
        resource disk "/opt/IBM/InformationServer/Server/Datasets" {pools ""}
        resource scratchdisk "/opt/IBM/InformationServer/Server/Scratch" {pools ""}
    }
    node "node2"
    {
        fastname "MYHOST"
        pools "" "group1" "group2"
        resource disk "/opt/IBM/InformationServer/Server/Datasets" {pools ""}
        resource scratchdisk "/opt/IBM/InformationServer/Server/Scratch" {pools ""}
    }
    node "node3"
    {
        fastname "MYHOST"
        pools "" "group1" "group3"
        resource disk "/opt/IBM/InformationServer/Server/Datasets" {pools ""}
        resource scratchdisk "/opt/IBM/InformationServer/Server/Scratch" {pools ""}
    }
    node "node4"
    {
        fastname "MYHOST"
        pools "" "group1" "group2"
        resource disk "/opt/IBM/InformationServer/Server/Datasets" {pools ""}
        resource scratchdisk "/opt/IBM/InformationServer/Server/Scratch" {pools ""}
    }
}
For the first example, assume that you perform these steps:
1. On the Advanced tab of the stage properties, select Node pool and resource constraints.
2. In the Constraint field, select Node pool.
3. In the Name field, select group3.

The connector is restricted to running on only node1 and node3 because only these two nodes belong to the group3 node pool.

For the second example, assume that you perform these steps:
1. On the Advanced tab of the stage properties, select Node map constraint.
2. Select node1 and node2.

The connector is restricted to running on only node1 and node2 because those nodes are explicitly specified.
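The outcome of both examples can be checked with a short sketch that models the pool membership from the sample configuration file. The dictionary and function names are hypothetical; the sketch only mirrors the membership lists in the file and does not parse a real configuration file.

```python
from typing import Dict, Iterable, List

# Pool membership copied from the sample parallel configuration file.
NODE_POOLS: Dict[str, List[str]] = {
    "node1": ["", "group1", "group2", "group3"],
    "node2": ["", "group1", "group2"],
    "node3": ["", "group1", "group3"],
    "node4": ["", "group1", "group2"],
}


def nodes_in_pool(node_pools: Dict[str, List[str]], pool: str) -> List[str]:
    # Node pool constraint: keep the nodes that are members of the pool.
    return [node for node, pools in node_pools.items() if pool in pools]


def nodes_in_map(node_pools: Dict[str, List[str]], selected: Iterable[str]) -> List[str]:
    # Node map constraint: keep only the explicitly selected nodes.
    chosen = set(selected)
    return [node for node in node_pools if node in chosen]


print(nodes_in_pool(NODE_POOLS, "group3"))
print(nodes_in_map(NODE_POOLS, ["node1", "node2"]))
```

The first call corresponds to selecting the group3 node pool, and the second call corresponds to selecting node1 and node2 in the node map.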
Procedure
1. On the Advanced tab, set Execution mode to Parallel.
2. On the Properties tab, set Enable partitioned reads to Yes.
3. Set Read mode to Select.
4. Use one of these methods to define the SELECT statement that the connector will use at runtime:
   • Set Generate SQL at runtime to Yes, and then enter the name of the table in the Table name property. Use the syntax schema_name.table_name, where schema_name is the owner of the table. If you do not specify schema_name, the schema that belongs to the currently connected user is used. The connector automatically generates and runs the SELECT * FROM schema_name.table_name statement.
     Note: To read data from a particular partition of a partitioned table, set the Table scope property to Single partition, and specify the name of the partition in the Partition name property. The connector then automatically adds a PARTITION(partition_name) clause to the generated SELECT statement. To read data from a particular subpartition of a composite partitioned table, set the Table scope property to Single subpartition and specify the name of the subpartition in the Subpartition name property. The connector then automatically adds a SUBPARTITION(subpartition_name) clause to the generated SELECT statement.
   • Set Generate SQL at runtime to No, and then specify the SELECT statement in the Select statement property. You can enter the SQL statement or enter the fully-qualified file name of the file that contains the SQL statement. If you enter a file name, you must also set Read select statement from file to Yes.
5. On the Partitioning tab, set the Partitioned reads method property to the partitioning method that you want to use. The default partitioning method is Rowid range.
6. To provide the input values that the partitioned read method uses, complete these steps:
   a. In the Table name for partitioned reads property, specify the name of the table that the partitioned read method uses to define the subsets of data that each node reads from the source table.
      Note: If you do not specify a table name, the connector uses the value of the Generate SQL at runtime property to determine the table name. If Generate SQL at runtime is set to Yes, the connector uses the table name that is specified in the Table name property. If Generate SQL at runtime is set to No, the connector looks at the SELECT statement that is specified in the Select statement property and uses the first table name that is specified in the FROM clause.
   b. If you choose the Rowid range or the Minimum and maximum range partitioned read method, in the Partition or subpartition name for partitioned reads property, specify the name of the partition or subpartition that the partitioned read method uses.
      Note: If you do not specify a value for the Partition or subpartition name for partitioned reads property, the connector uses the entire table as input for the partitioned read method. When the connector is configured to read data from a single partition or subpartition, you typically specify the name of the partition or subpartition in the Partition or subpartition name for partitioned reads property. Then the connector analyzes only the data that belongs to that partition or subpartition. This process typically results in a more even distribution of data and a more efficient use of nodes.
   c. If you choose the Modulus or the Minimum and maximum range partitioned read method, in the Column name for partitioned reads property, enter the name of the column from the source table to use for the method. The column must be an existing column in the table, must be of NUMBER(p) data type, where p is the number precision, and must have a scale of zero.

Support for partitioned read methods

The connector supports these partitioned read methods: Rowid range, Rowid round robin, Rowid hash, Modulus, Minimum and maximum range, and Oracle partitions.

For all partitioned read methods except the Oracle partitions method, the connector modifies the WHERE clause in the specified SELECT statement. If the WHERE clause is not included in the specified SELECT statement, the connector adds a WHERE clause. For the Oracle partitions method, the connector modifies the specified SELECT statement by adding a PARTITION(partition_name) clause. When the specified SELECT statement contains subqueries, the connector modifies the first SELECT...FROM subquery in the SELECT statement.
Rowid range
Every Oracle table includes the ROWID pseudo column that contains a rowid value that uniquely identifies each row in the table. When you use the Rowid range method, the connector performs these steps:
1. The connector queries the DBA_EXTENTS dictionary view to obtain storage information about the source table.
2. The connector uses the information from the DBA_EXTENTS dictionary view to define a range of ROWID values for each node.
3. At runtime, each node runs the specified SELECT statement with a slightly modified WHERE clause. The modified WHERE clause ensures that the node reads only the rows that have ROWID values in its assigned range. If the specified SELECT statement does not have a WHERE clause, the connector adds one.

The connector does not support the Rowid range method in these cases:
• When select access is not granted on the DBA_EXTENTS dictionary view for the currently connected user
• When the connector reads from an index-organized table
• When the connector reads from a view
In these cases, the connector logs a Warning message and automatically uses the Rowid hash method, which does not have these restrictions.

Rowid round robin
The Rowid round robin method uses the ROWID_ROW_NUMBER function from the DBMS_ROWID package to obtain the row number of the row within the table block in which the row resides and uses the MOD function on the row number to distribute rows evenly among the nodes.

These are the advantages of using the Rowid round robin method instead of using the Rowid range method:
• The currently connected user does not require select access on the DBA_EXTENTS dictionary view.
• The Rowid round robin method supports reading data from an index-organized table.
• The Rowid round robin method supports reading data from a view. The rows in the view must correspond to the physical rows of the table. The Rowid round robin method cannot read rows from a view that is derived from a join operation on two or more tables.

These are the advantages of using the Rowid range method instead of using the Rowid round robin method:
• The SELECT statement for each node is less complex because it does not require as many SQL functions.
• The Rowid range method provides a better distribution of rows across the nodes because the distribution is based on the physical collocation of the rows.
In general, the Rowid range method requires that the connector perform fewer read operations on the source table.
Rowid hash
The Rowid hash method is similar to the Rowid round robin method except that instead of using the ROWID_ROW_NUMBER function to obtain the row number, the Rowid hash method uses the ORA_HASH function to obtain a hash value for the rowid value of each row. Then the Rowid hash method applies the MOD function to the hash value to distribute rows evenly among the nodes.

Modulus
To use this method, you must specify a column name from the input table in the Column name for partitioned reads property. The specified column must be of the data type NUMBER(p), where p is a value between 1 and 38. The specified column must exist in the table that is specified in the Table name for partitioned reads property, the Table name property, or the Select statement property, which is used only if you do not explicitly specify the table name in one of the other two properties.

For each node, the connector reads the rows that satisfy the following condition: MOD(column_value, number_of_nodes) = node_number, where MOD is the modulus function, column_value is the value for the column that is specified in the Column name for partitioned reads property, number_of_nodes is the total number of nodes on which the stage runs, and node_number is the index of the current node. The indexes are zero-based. Therefore, the first node has index 0, the second node has index 1, and so on.

Minimum and maximum range
To use this method, you must specify a column name in the Column name for partitioned reads property. The specified column must be of the data type NUMBER(p), where p is a value between 1 and 38. The column name must be from the table that is specified in the Table name for partitioned reads property, the Table name property, or the Select statement property, which is used only if you do not explicitly specify the table name in one of the other two properties.

The connector calculates the minimum and maximum value for the specified column and then divides the calculated range into equal subranges. The number of subranges equals the number of nodes that are configured for the stage. On each node, the connector runs a SELECT statement that returns the rows for which the specified column has values that are within the subrange that is associated with that node.

Oracle partitions
The Oracle partitions method can be used with partitioned tables. When this method is specified, the connector determines the number of partitions in the table and dynamically configures the number of nodes to match the number of table partitions. The connector associates each node with one table partition. For each node, the connector reads the rows that belong to the partition that is associated with that node. To perform this operation, the connector adds the PARTITION(partition_name) clause to the SELECT statement, where partition_name is the name of the partition that is associated with the current node. Consequently, when you specify a value for the Select statement property, do not include a PARTITION or SUBPARTITION clause.
Note that the connector can dynamically adjust the number of nodes on which it runs. However, for this process to work, do not use the Advanced page of the stage dialog to constrain the node configuration at design time. If the node configuration is constrained at design time and the resulting number of nodes does not match the number of partitions in the table, the connector returns an error, and the job fails.

Examples: Using partitioned read methods:
To understand how each partitioned read method works, review these examples of using the Rowid range, Rowid round robin, Rowid hash, Modulus, Minimum and maximum range, and Oracle partitions methods.

Rowid range
This is the configuration for this example:
• The Select statement property is set to SELECT * FROM TABLE1 WHERE COL1 > 10.
• The Table name for partitioned reads property is set to TABLE1.
• The connector is configured to run in parallel mode on four nodes.
• The Partitioned reads method property is set to Rowid range.
In this example, the connector calculates the rowid range for each processing node and runs a SELECT statement on each node. For each node, the SELECT statement specifies the rowid range that is assigned to that node. The SELECT statements are similar to the following statements, but the actual rowid range values will vary:
Node 1
SELECT * FROM TABLE1 WHERE TABLE1.ROWID BETWEEN 'AAARvrAAEAAAAVpAAA' AND 'AAARvrAAEAAAAVuH//' AND (COL1 > 10)
Node 2
SELECT * FROM TABLE1 WHERE TABLE1.ROWID BETWEEN 'AAARvrAAEAAAAVvAAA' AND 'AAARvrAAEAAAAV0H//' AND (COL1 > 10)
Node 3
SELECT * FROM TABLE1 WHERE TABLE1.ROWID BETWEEN 'AAARvrAAEAAAAV1AAA' AND 'AAARvrAAEAAAAV6H//' AND (COL1 > 10)
Node 4
SELECT * FROM TABLE1 WHERE TABLE1.ROWID BETWEEN 'AAARvrAAEAAAAV7AAA' AND 'AAARvrAAEAAAAWAH//' AND (COL1 > 10)
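The way a per-node condition is combined with the user's WHERE clause can be illustrated with a short Python sketch. This is not the connector's implementation: the function names are hypothetical, the string handling is deliberately naive (one top-level WHERE keyword, no subqueries, no GROUP BY or ORDER BY), and the rowid ranges are passed in rather than derived from DBA_EXTENTS.

```python
import re


def add_node_predicate(select_stmt, predicate):
    """Restrict a SELECT statement with an extra per-node predicate.

    Hypothetical helper: if the statement has a WHERE clause, the original
    condition is parenthesized and ANDed with the predicate; otherwise a
    WHERE clause is appended.
    """
    match = re.search(r"\bWHERE\b", select_stmt, flags=re.IGNORECASE)
    if match is None:
        return f"{select_stmt} WHERE {predicate}"
    head = select_stmt[:match.start()]
    original_condition = select_stmt[match.end():].strip()
    return f"{head}WHERE {predicate} AND ({original_condition})"


def rowid_range_statements(select_stmt, table, ranges):
    """Build one statement per node from (low_rowid, high_rowid) pairs."""
    return [
        add_node_predicate(
            select_stmt, f"{table}.ROWID BETWEEN '{low}' AND '{high}'"
        )
        for low, high in ranges
    ]


if __name__ == "__main__":
    # Rowid values copied from the example above; real values will vary.
    example_ranges = [
        ("AAARvrAAEAAAAVpAAA", "AAARvrAAEAAAAVuH//"),
        ("AAARvrAAEAAAAVvAAA", "AAARvrAAEAAAAV0H//"),
    ]
    for stmt in rowid_range_statements(
        "SELECT * FROM TABLE1 WHERE COL1 > 10", "TABLE1", example_ranges
    ):
        print(stmt)
```

The original condition is wrapped in parentheses so that an OR in the user's WHERE clause cannot escape the node's range restriction.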
Rowid round robin
This is the configuration for this example:
• The Select statement property is set to SELECT * FROM TABLE1 WHERE COL1 > 10.
• The Table name for partitioned reads property is set to TABLE1.
• The connector is configured to run in parallel mode on four nodes.
• The Partitioned reads method property is set to Rowid round robin.
The connector runs these SELECT statements on the nodes:
Node 1
SELECT * FROM TABLE1 WHERE MOD(DBMS_ROWID.ROWID_ROW_NUMBER(TABLE1.ROWID), 4) = 0 AND (COL1 > 10)
Node 2
SELECT * FROM TABLE1 WHERE MOD(DBMS_ROWID.ROWID_ROW_NUMBER(TABLE1.ROWID), 4) = 1 AND (COL1 > 10)
Node 3
SELECT * FROM TABLE1 WHERE MOD(DBMS_ROWID.ROWID_ROW_NUMBER(TABLE1.ROWID), 4) = 2 AND (COL1 > 10)
Node 4
SELECT * FROM TABLE1 WHERE MOD(DBMS_ROWID.ROWID_ROW_NUMBER(TABLE1.ROWID), 4) = 3 AND (COL1 > 10)
Rowid hash
This is the configuration for this example:
• The Select statement property is set to SELECT * FROM TABLE1 WHERE COL1>10.
• The Table name for partitioned reads property is set to TABLE1.
• The connector is configured to run in parallel mode on four nodes.
• The Partitioned reads method property is set to Rowid hash.
The connector runs these SELECT statements on the nodes:
Node 1
SELECT * FROM TABLE1 WHERE MOD(ORA_HASH(TABLE1.ROWID), 4) = 0 AND (COL1 > 10)
Node 2
SELECT * FROM TABLE1 WHERE MOD(ORA_HASH(TABLE1.ROWID), 4) = 1 AND (COL1 > 10)
Node 3
SELECT * FROM TABLE1 WHERE MOD(ORA_HASH(TABLE1.ROWID), 4) = 2 AND (COL1 > 10)
Node 4
SELECT * FROM TABLE1 WHERE MOD(ORA_HASH(TABLE1.ROWID), 4) = 3 AND (COL1 > 10)
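The Rowid round robin and Rowid hash examples differ only in the expression that is passed to MOD. The following Python sketch makes that shared pattern explicit. The function name is hypothetical, and the sketch only formats text; it does not reproduce how the connector builds its statements internally.

```python
def mod_predicates(expression, node_count):
    """Return one 'MOD(expression, node_count) = node' predicate per node.

    Node indexes are zero-based, so the first node gets remainder 0.
    """
    return [
        f"MOD({expression}, {node_count}) = {node}"
        for node in range(node_count)
    ]


# Expressions taken from the two examples above.
ROUND_ROBIN_EXPR = "DBMS_ROWID.ROWID_ROW_NUMBER(TABLE1.ROWID)"
HASH_EXPR = "ORA_HASH(TABLE1.ROWID)"

if __name__ == "__main__":
    for label, expr in (("round robin", ROUND_ROBIN_EXPR), ("hash", HASH_EXPR)):
        for number, predicate in enumerate(mod_predicates(expr, 4), start=1):
            print(f"{label}, node {number}: "
                  f"SELECT * FROM TABLE1 WHERE {predicate} AND (COL1 > 10)")
```

Because every row yields exactly one remainder in the range 0 through node_count - 1, each row is read by exactly one node.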
Modulus
This is the configuration for this example:
• The Select statement property is set to SELECT * FROM TABLE1 WHERE COL1>10.
• The Table name for partitioned reads property is set to TABLE1.
• The connector is configured to run in parallel mode on four nodes.
• The Partitioned reads method property is set to Modulus.
• The Column name for partitioned reads property is set to COL2, and COL2 is defined as NUMBER(5) in TABLE1.
The connector runs the following SELECT statements on the nodes:
Node 1
SELECT * FROM TABLE1 WHERE MOD(TABLE1.COL2, 4) = 0 AND (COL1 > 10)
Node 2
SELECT * FROM TABLE1 WHERE MOD(TABLE1.COL2, 4) = 1 AND (COL1 > 10)
Node 3
SELECT * FROM TABLE1 WHERE MOD(TABLE1.COL2, 4) = 2 AND (COL1 > 10)
Node 4
SELECT * FROM TABLE1 WHERE MOD(TABLE1.COL2, 4) = 3 AND (COL1 > 10)
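To see which rows each node would read under the Modulus method, the condition MOD(column_value, number_of_nodes) = node_number can be simulated in Python. The sample COL2 values below are invented for illustration. The sketch is limited to non-negative values because Python's % operator and Oracle's MOD function return different results for negative operands.

```python
from collections import defaultdict


def assign_rows_by_modulus(column_values, node_count):
    """Group column values by the zero-based node index that would read them.

    Assumes non-negative integers, for which Python's % matches Oracle MOD.
    """
    nodes = defaultdict(list)
    for value in column_values:
        if value < 0:
            raise ValueError("sketch supports non-negative values only")
        nodes[value % node_count].append(value)
    return dict(nodes)


if __name__ == "__main__":
    sample_col2 = [0, 1, 2, 3, 4, 5, 10, 11]  # hypothetical COL2 values
    for node, values in sorted(assign_rows_by_modulus(sample_col2, 4).items()):
        print(f"node index {node}: COL2 values {values}")
```

The distribution is only as even as the column values are: a column whose values are all multiples of the node count would send every row to the node with index 0.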
Minimum and maximum range
This is the configuration for this example:
• The Select statement property is set to SELECT * FROM TABLE1 WHERE COL1>10.
• The Table name for partitioned reads property is set to TABLE1.
• The connector is configured to run in parallel mode on four nodes.
• The Partitioned reads method property is set to Minimum and maximum range.
• The Column name for partitioned reads property is set to COL2, and COL2 is defined as NUMBER(5) in TABLE1.
The connector determines the minimum and maximum value for column COL2. If the minimum value is -20 and the maximum value is 135, the connector runs the following SELECT statements on the nodes:
Node 1
SELECT * FROM TABLE1 WHERE TABLE1.COL2 <= 18 AND (COL1 > 10)
Node 2
SELECT * FROM TABLE1 WHERE TABLE1.COL2 BETWEEN 19 AND 57 AND (COL1 > 10)
Node 3
SELECT * FROM TABLE1 WHERE TABLE1.COL2 BETWEEN 58 AND 96 AND (COL1 > 10)
Node 4
SELECT * FROM TABLE1 WHERE TABLE1.COL2 >= 97 AND (COL1 > 10)
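The boundaries in this example can be reproduced with simple arithmetic: the range -20 through 135 contains 156 integer values, and 156 / 4 = 39 values per node. The Python sketch below encodes that calculation. Rounding the width up when the range does not divide evenly is an assumption made for this sketch; the source does not state how the connector handles that case.

```python
import math


def subrange_predicates(column, minimum, maximum, node_count):
    """Split [minimum, maximum] into node_count subranges of equal width.

    The first and last predicates are open-ended, as in the example above.
    """
    width = math.ceil((maximum - minimum + 1) / node_count)  # assumed rounding
    predicates = []
    for node in range(node_count):
        low = minimum + node * width
        high = low + width - 1
        if node == 0:
            predicates.append(f"{column} <= {high}")
        elif node == node_count - 1:
            predicates.append(f"{column} >= {low}")
        else:
            predicates.append(f"{column} BETWEEN {low} AND {high}")
    return predicates


if __name__ == "__main__":
    for number, predicate in enumerate(
        subrange_predicates("TABLE1.COL2", -20, 135, 4), start=1
    ):
        print(f"Node {number}: SELECT * FROM TABLE1 "
              f"WHERE {predicate} AND (COL1 > 10)")
```

Equal-width subranges give an even row distribution only when the column values are spread uniformly between the minimum and the maximum.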
Oracle partitions
This is the configuration for this example:
• The Select statement property is set to SELECT * FROM TABLE1 WHERE COL1>10.
• The Table name for partitioned reads property is set to TABLE1.
• The connector is configured to run in parallel mode on five nodes.
• The Partitioned reads method property is set to Oracle partitions.
• TABLE1 has four partitions:
CREATE TABLE TABLE1
(
  COL1 NUMBER(10),
  COL2 DATE
)
PARTITION BY RANGE (COL2)
(
  PARTITION PART1 VALUES LESS THAN (TO_DATE('01-JAN-2006','DD-MON-YYYY')),
  PARTITION PART2 VALUES LESS THAN (TO_DATE('01-JAN-2007','DD-MON-YYYY')),
  PARTITION PART3 VALUES LESS THAN (TO_DATE('01-JAN-2008','DD-MON-YYYY')),
  PARTITION PART4 VALUES LESS THAN (MAXVALUE)
);
The connector determines that TABLE1 has four partitions: PART1, PART2, PART3, and PART4. The connector concludes that the stage must run on four processing nodes. Because the stage was configured to run on five nodes, the connector removes the fifth node from the list of nodes and logs an Informational message to indicate that the list of nodes was adjusted and that the stage will run on four nodes. The connector runs the following SELECT statements on the nodes:
Node 1
SELECT * FROM TABLE1 PARTITION(PART1) WHERE COL1 > 10
Node 2
SELECT * FROM TABLE1 PARTITION(PART2) WHERE COL1 > 10
Node 3
SELECT * FROM TABLE1 PARTITION(PART3) WHERE COL1 > 10
Node 4
SELECT * FROM TABLE1 PARTITION(PART4) WHERE COL1 > 10
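The four statements above follow one pattern: a PARTITION clause is placed directly after the table name in the FROM clause. A minimal Python sketch of that rewrite follows. The function name is hypothetical and the regular expression handles only a simple FROM table_name form, so it illustrates the result rather than the connector's parsing logic.

```python
import re


def partition_statements(select_stmt, table, partitions):
    """Return one statement per partition, each with a PARTITION clause
    inserted after the first 'FROM <table>' occurrence."""
    pattern = re.compile(
        rf"\bFROM\s+{re.escape(table)}\b", flags=re.IGNORECASE
    )
    statements = []
    for name in partitions:
        statements.append(
            pattern.sub(
                lambda m, name=name: f"{m.group(0)} PARTITION({name})",
                select_stmt,
                count=1,
            )
        )
    return statements


if __name__ == "__main__":
    for number, stmt in enumerate(
        partition_statements(
            "SELECT * FROM TABLE1 WHERE COL1 > 10",
            "TABLE1",
            ["PART1", "PART2", "PART3", "PART4"],
        ),
        start=1,
    ):
        print(f"Node {number}: {stmt}")
```

This also shows why the Select statement property must not already contain a PARTITION or SUBPARTITION clause: the rewrite would produce a statement with two such clauses.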
Procedure
1. From the parallel canvas, double-click the connector icon, and then select the input link.
2. On the Partitioning tab, select a partition type.

Oracle connector partition type:
For a range-partitioned, list-partitioned, or interval-partitioned table, the Oracle connector partition type ensures that the distribution of input records matches the organization of the partitions in the table.
The Oracle connector supports the use of the built-in partition types such as Random and Modulus. In addition, the connector provides one additional partition type: Oracle connector. The following information describes how the connector works when you select Oracle connector as the partition type.

To partition the input records across nodes when the Oracle connector partition type is selected, the connector first looks at the partitioning information for the table. In most cases, the name of the table matches the name of the table to which the connector writes the data; therefore, the table name is usually specified in the Table name property or is implicitly specified in the INSERT, UPDATE, or DELETE SQL statement. To configure the connector to use the partitioning information from one table but write the data to a different table, specify the table name in the Table name for partitioned writes property. The connector logs an Informational message that contains the name of the table from which it collects partitioning information. If the connector lacks sufficient information to determine the name of the table, the connector logs a Warning message and forces sequential execution.

After determining the table name for the partitioned write, the connector determines the set of nodes on which to run. The connector determines the number of partitions that are on the table and associates one node with each partition. The number of partitions must match the number of nodes. Three cases result in a mismatch between the number of nodes and the number of partitions:
• In the first case, the configuration of the parallel processing nodes specifies a node pool, a resource constraint, or a node map. If the configuration specifies a constraint, the connector cannot dynamically modify the set of processing nodes, reports a Fatal error, and stops the operation.
• In the second case, the list of nodes that are configured for the stage contains more nodes than the number of partitions in the table. In this case, the connector removes the excess nodes from the end of the list.
• In the third case, the list of nodes that are configured for the stage contains fewer nodes than the number of partitions in the table. In this case, the connector adds nodes to the end of the list. The definition for each added node matches the definition of the last node in the original list.

Next, the connector determines the node to which to send each input record. For each incoming record, the connector inspects the data in the fields that correspond to the table columns that constitute the partition key for the table. The connector compares those values to the boundary values that are specified for the individual partitions of the table and determines the partition that will store the record. Because the number of nodes matches the number of partitions and each partition has only one node assigned to it, the connector routes each record to the node that is associated with its partition, and the node writes the record into the database.

For the connector to determine both the number of partitions in a table and the partitioning type that was used to partition the table, the table must exist in the database before you run the job. The only exception to this rule is when the Table action property is set to Create or Replace and the Create statement property specifies a CREATE TABLE statement. In this case, the connector analyzes the CREATE TABLE statement to determine the number of partitions and the partition type that the table will have after it is created at runtime. The connector uses this information to determine the number of nodes that the stage will run on.
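The three mismatch cases described above can be summarized in a small Python sketch. It models nodes as plain strings and uses a hypothetical constrained flag to stand in for a node pool, resource constraint, or node map; the real connector works with node definitions from the parallel configuration, which are not modeled here.

```python
def adjust_nodes(nodes, partition_count, constrained=False):
    """Return a node list whose length equals partition_count.

    nodes: non-empty list of node names configured for the stage.
    constrained: hypothetical flag for a node pool, resource constraint,
    or node map that prevents the list from being changed.
    """
    if not nodes:
        raise ValueError("at least one node is required")
    if len(nodes) == partition_count:
        return list(nodes)
    if constrained:
        # Case 1: the set of nodes cannot be modified dynamically.
        raise RuntimeError("Fatal: constrained node configuration does not "
                           "match the number of partitions")
    if len(nodes) > partition_count:
        # Case 2: remove the excess nodes from the end of the list.
        return list(nodes[:partition_count])
    # Case 3: append copies of the last node's definition.
    return list(nodes) + [nodes[-1]] * (partition_count - len(nodes))


if __name__ == "__main__":
    print(adjust_nodes(["node1", "node2", "node3", "node4", "node5"], 4))
    print(adjust_nodes(["node1", "node2"], 4))
```

In the third case the added entries share the definition of the last configured node, so several partitions can end up being written from the same physical node.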
Note that if the table uses a supported partition type, for example range, list, or interval, but the partition key in the table includes a virtual column, the connector does not force sequential execution. Instead, the connector runs on a number of nodes that is equal to the number of table partitions. However, because only one node actually processes the data, the connector effectively runs in sequential mode.

If the Table action property is set to Create or Replace and the Generate create statement at runtime property is set to Yes, the connector does not create the table as a partitioned table. Therefore, the connector cannot associate the table partitions with the nodes. In this case, the connector logs a Warning message and runs the stage in sequential mode.

If the table does not exist and the Before SQL statement property or the Before SQL (node) statement property specifies the CREATE TABLE statement, the connector reports an error because it tries to determine the number of partitions and the partition type before it runs the before SQL statement that creates the table.

When the Table scope property is set to Single partition or Single subpartition, the connector runs the stage in sequential mode and logs a Warning message. In this case, the connector is explicitly configured to write data to only one partition or subpartition; therefore, only one node is assigned to that partition or subpartition.

Oracle partition types
The following list describes how the Oracle connector partition type supports specific Oracle partition types.

Range, Composite range-range, Composite range-list, Composite range-hash
The Oracle connector partition type supports writing to range-partitioned tables. The connector inspects the values of the record fields that correspond to the partition key columns, determines the partition to which the record belongs, and redirects the record to the node that is associated with that table partition.

List, Composite list-range, Composite list-list, Composite list-hash
The Oracle connector partition type supports writing to list-partitioned tables.
The connector inspects the value of the record that corresponds to the partition key column, determines the partition to which the record belongs, and redirects the record to the node that is associated with that table partition.

Hash
The Oracle connector partition type does not support writing to hash-partitioned tables. In this case, the connector runs the stage in sequential mode and logs a Warning message.

Interval, Composite interval-range, Composite interval-list, Composite interval-hash
The Oracle connector partition type supports writing to interval-partitioned tables. The connector inspects the value of the record that corresponds to the partition key column and determines the partition to which the record belongs. If the record belongs to one of the partitions that existed when the job started, the connector redirects the record to the node that is associated with that table partition. Otherwise, the connector redirects the record to a special node that is reserved for loading records into new, dynamically created partitions.

Reference
The Oracle connector partition type does not support writing to reference-partitioned tables. In this case, the connector runs the stage in sequential mode and logs a Warning message.

Virtual
The Oracle connector partition type does not support writing to a table in which the partition key includes a virtual column. In this case, the connector runs the stage in sequential mode and logs a Warning message.

System
The Oracle connector partition type does not support writing to system-partitioned tables. In this case, the connector runs the stage in sequential mode and logs a Warning message.
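For a range-partitioned table, routing a record amounts to comparing its partition key with the upper bound of each partition. The Python sketch below illustrates this with a binary search. The partition names and bounds are borrowed from the TABLE1 definition in the partitioned read examples, and the function name is hypothetical; NULL keys, multi-column keys, and the other partition types are not covered.

```python
from bisect import bisect_right
from datetime import date

# Exclusive upper bounds (VALUES LESS THAN) for PART1 through PART3.
# PART4 is bounded by MAXVALUE, so it needs no entry.
UPPER_BOUNDS = [date(2006, 1, 1), date(2007, 1, 1), date(2008, 1, 1)]
PARTITIONS = ["PART1", "PART2", "PART3", "PART4"]


def partition_index_for(key):
    """Return the zero-based index of the partition (and therefore the
    node) that stores a record with the given partition key.

    A key that equals a bound belongs to the next partition because
    VALUES LESS THAN bounds are exclusive.
    """
    return bisect_right(UPPER_BOUNDS, key)


if __name__ == "__main__":
    for key in (date(2005, 6, 1), date(2006, 1, 1), date(2009, 3, 15)):
        index = partition_index_for(key)
        print(f"{key} -> {PARTITIONS[index]} (node index {index})")
```

Because each partition has exactly one node, the partition index doubles as the node index.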
Procedure
1. Double-click the connector stage icon to open the connector properties.
2. In the Server field, do one of the following:
   • Click Select to display a list of Oracle services, and then select the Oracle service to connect to. If the list is empty, the connector cannot locate the Oracle tnsnames.ora file. The connector tries to locate the file by checking the TNS_ADMIN and ORACLE_HOME environment variables.
   • Enter the complete content of the connect descriptor, as it would appear in the Oracle tnsnames.ora file.
   • Use the following syntax to enter an Oracle Easy Connect string: host[:port][/service_name]
   • Leave the property blank to connect to the default local Oracle service. The ORACLE_SID environment variable defines the default local service. The TWO_TASK environment variable on Linux or UNIX and the LOCAL environment variable on Microsoft Windows define the default remote service.
   Note: Selecting an Oracle service is preferable to using the TWO_TASK or LOCAL environment variables.
3. In the Username and Password fields, enter the user ID and password to use to authenticate with the Oracle service. By default, the connector is configured for Oracle database authentication. This form of authentication requires that the specified name and password match the credentials that are configured for the user in the Oracle database.
4. Optional: In the Use external authentication field, select Yes. This form of authentication requires that the user be registered in Oracle and identified as a user who is authenticated by the operating system.
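When the list of Oracle services is empty, it can help to check where a tnsnames.ora file would be expected. The Python sketch below lists candidate paths and builds an Easy Connect string. The function names are hypothetical, the lookup order (TNS_ADMIN first) is an assumption, and the ORACLE_HOME/network/admin location is the conventional Oracle default rather than something this guide states.

```python
import os


def candidate_tnsnames_paths(env):
    """Return paths where tnsnames.ora is conventionally expected.

    env: mapping of environment variables, for example os.environ.
    """
    paths = []
    if env.get("TNS_ADMIN"):
        paths.append(os.path.join(env["TNS_ADMIN"], "tnsnames.ora"))
    if env.get("ORACLE_HOME"):
        paths.append(os.path.join(
            env["ORACLE_HOME"], "network", "admin", "tnsnames.ora"))
    return paths


def easy_connect(host, port=None, service_name=None):
    """Build a string in the host[:port][/service_name] syntax."""
    result = host
    if port is not None:
        result += f":{port}"
    if service_name:
        result += f"/{service_name}"
    return result


if __name__ == "__main__":
    for path in candidate_tnsnames_paths(os.environ):
        status = "found" if os.path.isfile(path) else "missing"
        print(f"{path}: {status}")
    # Host, port, and service name below are placeholders.
    print(easy_connect("dbhost.example.com", 1521, "orcl"))
```

If neither environment variable is set in the environment of the engine process, the sketch returns an empty list, which mirrors the situation in which the service list cannot be populated.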
Procedure
1. From the parallel canvas, double-click the Oracle connector icon.
2. In the top left corner of the stage editor, select the link to edit. Note that you cannot directly edit columns on a reject link. The columns on the reject link are copies of the columns that are defined on the input link and any reject-specific columns that you select on the Reject tab of the stage dialog.
3. Use one of the following methods to set up the column definitions:
   a. Drag and drop a table definition from the repository view to the link on the job canvas. Then use the arrow keys to move the columns back and forth between the Available columns and Selected columns lists.
   b. From the Columns tab, click Load and select a table definition from the metadata repository. Then to choose which columns from the table definition to apply to the link, move the columns from the list of Available columns to the list of Selected columns.
4. Right-click within the columns grid, and select Properties from the menu. Select the properties to display, specify the order in which to display them, and then click OK.
5. Modify the column definitions. You can change the column names, data types, and other attributes. In addition, you can manually add or insert new columns or remove existing columns.
6. To save the new table definition in the metadata repository, complete these steps:
   a. From the Columns tab, click Save and then click OK to display the repository view.
   b. Navigate to an existing folder, or create a new folder in which to save the table definition.
   c. Select the folder, and then click Save.
match. If you set the Enable quoted identifiers property to Yes, the connector performs case-sensitive name matching. Otherwise, the connector performs case-insensitive name matching, which is the default.

If the Read mode property is set to PL/SQL and the Lookup type is set to Sparse, the connector matches by name the reference link columns with the parameters in the PL/SQL block. The connector maps the columns marked as key columns to PL/SQL input/output parameters and maps the remaining columns to the PL/SQL output parameters. If the connector cannot match the names, the connector attempts to use the column order to associate link columns and parameters. Therefore, the connector associates the first column on the link with the first parameter, associates the second column on the link with the second parameter, and so on.

When the Write mode property is set to Insert, Update, Delete, or PL/SQL, the connector maps the columns on the input link to the input parameters that are specified in the SQL or PL/SQL statement. Two formats are available for specifying parameters in the statement: DataStage syntax and Oracle syntax. The following list describes how the connector performs matching, based on the format that you use to specify the parameters:

DataStage syntax
The DataStage syntax is ORCHESTRATE.parameter_name. If you use DataStage syntax to specify parameters, the connector uses name matching. Therefore, every parameter in the statement must match a column on the link, and the parameter and the column must have the same name. If the connector cannot locate a matching column for a parameter, a message is logged and the operation stops.
Note: To use a keyword other than ORCHESTRATE in the DataStage syntax, define the CC_ORA_BIND_KEYWORD environment variable, and set its value to the keyword that you want to use.

Oracle syntax
The Oracle syntax is :name, where name is the parameter name or parameter number. If you use the Oracle syntax to specify parameters, the connector first tries name matching. If name matching fails because some or all of the names do not match, the connector checks whether the name values are integers. If all of the name values are integers, the connector uses these integers as 1-based ordinals for the columns on the link. If all of the name values are integers but some or all of the integer values are invalid, meaning smaller than 1 or larger than the total number of columns on the link, the connector reports a Fatal error and the operation stops. If some or all of the name values are not integers, the connector performs matching based on column order.
Note: For PL/SQL blocks, you must use the Oracle syntax. If you use DataStage syntax, the connector logs an error and the operation stops. If you use integer values for parameter names, you must specify the integers in increasing order; otherwise, the connector logs a Fatal message, and the operation stops.

Both DataStage syntax and Oracle syntax
If you use both DataStage syntax and Oracle syntax to specify parameters, the connector logs a Fatal error, and the operation stops. To avoid this problem, you must consistently use the same format to specify parameters.
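The fallback order that is described for the Oracle syntax (names first, then 1-based ordinals, then column order) can be sketched in Python as follows. The function name is hypothetical, name comparison is case-insensitive to mirror the default behavior, and the rule about integers in increasing order for PL/SQL blocks is not modeled.

```python
def match_oracle_parameters(param_names, link_columns):
    """Return the link column bound to each :name parameter, in the order
    in which the parameters appear in the statement.

    param_names: parameter names without the leading colon.
    link_columns: column names on the link, in link order.
    """
    by_name = {column.upper(): column for column in link_columns}

    # Step 1: name matching succeeds only if every parameter has a column.
    if all(name.upper() in by_name for name in param_names):
        return [by_name[name.upper()] for name in param_names]

    # Step 2: if all names are integers, treat them as 1-based ordinals.
    if all(name.isdecimal() for name in param_names):
        ordinals = [int(name) for name in param_names]
        if any(o < 1 or o > len(link_columns) for o in ordinals):
            raise ValueError("Fatal: parameter ordinal is out of range")
        return [link_columns[o - 1] for o in ordinals]

    # Step 3: otherwise fall back to matching by column order.
    return list(link_columns[:len(param_names)])


if __name__ == "__main__":
    columns = ["COL1", "COL2"]
    print(match_oracle_parameters(["col2", "COL1"], columns))  # by name
    print(match_oracle_parameters(["1", "2"], columns))        # by ordinal
    print(match_oracle_parameters(["a", "b"], columns))        # by order
```

Matching by order is the least explicit of the three strategies, so statements that rely on it depend on the column order of the link staying unchanged.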
RPAD (COL2, 20, '*') is not mapped to any column on the output link. Therefore, the connector adds the following column to the link: CC_2_RPAD_COL2__20______

In the new column name, the number 2 is used in the column name prefix because the SQL expression appears as the second column in the SELECT statement list. Each non-alphanumeric character (, ' *) is replaced by a two underscore characters. The space characters in the SQL expression are ignored. Finally, the connector removes the COL2 column from the output link because that column is unmapped.

If runtime column propagation is not enabled, the connector performs matching by position. Consequently, COL1 and COL2 remain on the link, and COL2 on the link represents the values of the SQL expression from the SELECT statement.

If the column alias COL2 is used for the SQL expression and runtime column propagation is enabled, the mapping by name is successful, and the two existing link columns, COL1 and COL2, are used. The SELECT statement in this case is SELECT COL1, RPAD(COL2, 20, '*') COL2 FROM TABLE1.
When you set the Read mode property to PL/SQL, there are two ways to define the source of the data:
• Enter the PL/SQL block manually.
• Enter the name of a file that contains the PL/SQL block.
When you run the job, the connector runs the specified PL/SQL block only once and returns the output bind variables that are specified in the PL/SQL block. A PL/SQL block is useful for running a stored procedure that takes no input parameters but that returns values through one or more output parameters.
Procedure
1. From the parallel canvas, double-click the Oracle connector icon, and then select the output link to edit.
2. Set Read mode to Select or PL/SQL.
3. If you set Read mode to Select, use one of these methods to specify the source of the data:
   • Set Generate SQL at runtime to Yes, and then enter the name of the table or view in the Table name property. Use the syntax schema_name.table_name, where schema_name is the owner of the table. If you do not specify schema_name, the schema that belongs to the currently connected user is used.
   • Set Generate SQL at runtime to No, and then specify the SELECT statement in the Select statement property.
   • Set Generate SQL at runtime to No, and then enter the fully-qualified file name of the file that contains the SQL statement in the Select statement property. If you enter a file name, you must also set Read select statement from file to Yes.
   • Click the Select statement property, and then next to the property, click Build to start the SQL Builder. To construct the SQL statement, drag and drop table and column definitions that are stored in the repository and choose options for configuring clauses in the SQL statement.
4. If you set Read mode to PL/SQL, use one of these methods to specify the source of the data:
   • Manually enter the PL/SQL block in the PL/SQL block property.
   • Enter the fully-qualified file name of the file that contains the PL/SQL block in the PL/SQL block property. If you enter a file name, you must also set Read PL/SQL block from file to Yes.
   Note: The specified PL/SQL block must begin with the keyword DECLARE or BEGIN and must end with the keyword END, and you must enter a semicolon after the END keyword.
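The rule in the preceding note can be checked before a job runs. The following Python sketch is a hypothetical pre-flight check, not a connector feature. It tests only the leading keyword and the trailing END followed by a semicolon, so it does not validate the PL/SQL itself and does not handle a labeled END such as END block_name;.

```python
import re


def looks_like_plsql_block(text):
    """Return True if the text starts with DECLARE or BEGIN and ends with
    END followed by a semicolon (surrounding whitespace is ignored)."""
    stripped = text.strip()
    starts_ok = re.match(
        r"(DECLARE|BEGIN)\b", stripped, flags=re.IGNORECASE) is not None
    ends_ok = re.search(
        r"\bEND\s*;$", stripped, flags=re.IGNORECASE) is not None
    return starts_ok and ends_ok


if __name__ == "__main__":
    samples = [
        "BEGIN :1 := 42; END;",          # accepted
        "BEGIN NULL; END",               # rejected: semicolon is missing
        "SELECT COL1 FROM TABLE1",       # rejected: not a PL/SQL block
    ]
    for sample in samples:
        print(f"{looks_like_plsql_block(sample)!s:5} {sample}")
```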
Update
Table 2. Write modes and descriptions (continued)
Write mode: Description
Delete: The connector attempts to delete rows in the target table that correspond to the records that arrive on the input link. Matching records are identified by the values that correspond to link columns that are marked as key columns.
Insert new rows only: The behavior of this write mode is very similar to the Insert write mode. However, when this write mode is selected, the records that could not be written to the database because of a primary key or unique constraint are ignored, and the connector proceeds to process the remaining records. Any error other than a primary key or unique constraint violation still results in a logged fatal error message and stops the job.
Insert then update: For each input record, the connector first tries to insert the record as a new row in the target table. If the insert operation fails because of a primary key or unique constraint, the connector updates the existing row in the target table with the new values from the input record.
Update then insert: For each input record, the connector first tries to locate the matching rows in the target table and to update them with the new values from the input record. If the rows cannot be located, the connector inserts the record as a new row in the target table.
Delete then insert: For each input record, the connector first tries to delete the matching rows in the target table. Regardless of whether rows were actually deleted or not, the connector then runs the insert statement to insert the record as a new row in the target table.
PL/SQL: For each input record, the connector runs the specified PL/SQL block.
Bulk load: The connector uses the Oracle direct path load method to bulk load data.
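The Insert then update behavior can be illustrated with a small runnable Python sketch. To stay self-contained, it uses an in-memory SQLite database from the Python standard library as a stand-in for Oracle, and the table name, columns, and function names are invented for the illustration. It shows the order of operations only, not the connector's actual statements or error handling.

```python
import sqlite3


def make_demo_connection():
    """Create an in-memory database with a hypothetical target table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, name TEXT)")
    return conn


def insert_then_update(conn, record):
    """Try to insert the (id, name) record; if the primary key already
    exists, update the existing row instead."""
    record_id, name = record
    try:
        conn.execute(
            "INSERT INTO target (id, name) VALUES (?, ?)", (record_id, name))
        return "inserted"
    except sqlite3.IntegrityError:
        conn.execute(
            "UPDATE target SET name = ? WHERE id = ?", (name, record_id))
        return "updated"


if __name__ == "__main__":
    connection = make_demo_connection()
    for row in [(1, "first"), (2, "second"), (1, "first, revised")]:
        print(row, insert_then_update(connection, row))
    print(connection.execute("SELECT * FROM target ORDER BY id").fetchall())
```

Update then insert reverses the two attempts, which tends to be the cheaper choice when most incoming records already exist in the target table.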
Procedure
1. From the parallel canvas, double-click the Oracle connector icon and then select the input link to edit.
2. To automatically generate the SQL at runtime, perform these steps:
   a. Set Generate SQL at runtime to Yes.
   b. Set Write mode to Insert, Update, Delete, Insert then update, Update then insert, or Delete then insert.
   c. Enter the name of the target table in the Table name property.
3. To manually enter the SQL, perform these steps:
   a. Set Generate SQL at runtime to No.
Chapter 3. Oracle connector
   b. Set Write mode to Insert, Update, Delete, Insert then update, Update then insert, or Delete then insert.
   c. Enter SQL statements in the properties that correspond to the selected Write mode: Insert statement, Update statement, Delete statement. As an alternative, click Build beside each property to start the SQL Builder. Then to build the statement, drag and drop column definitions that are stored in the repository, and choose options for configuring clauses in the statement.
4. To read the SQL statement from a file, perform these steps:
   a. Set Generate SQL at runtime to No.
   b. Enter the fully-qualified name of the file that contains the SQL statement in the Insert, Update, Insert then update, Update then insert, or Delete then insert property.
   c. Set Read insert statement from file, Read update statement from file, or Read delete statement from file to Yes.
5. To specify a PL/SQL block, perform these steps:
   a. Set the Write mode to PL/SQL.
   b. Enter the PL/SQL block in the PL/SQL block property.
   Note: The PL/SQL block must begin with the keyword DECLARE or BEGIN and end with the keyword END. You must include a semicolon character after the END keyword.
6. To bulk load data, perform these steps:
   a. Set Write mode to Bulk load.
   b. Enter the name of the table in the Table name property. Use the syntax schema_name.table_name, where schema_name is the owner of the table. If you do not specify schema_name, the schema that belongs to the currently connected user is used.
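The rule in the note about the PL/SQL block can be checked before the job runs. The following Python sketch is a hypothetical pre-check, not part of the connector: it only tests that the text begins with DECLARE or BEGIN and ends with END followed by a semicolon, and it does not parse or validate the PL/SQL itself.

```python
import re

def looks_like_plsql_block(text):
    """Return True if text starts with DECLARE or BEGIN and ends with END;"""
    stripped = text.strip()
    # Case-insensitive check for the leading keyword.
    starts_ok = re.match(r"(?i)(DECLARE|BEGIN)\b", stripped) is not None
    # The block must finish with the END keyword and a semicolon.
    ends_ok = re.search(r"(?i)\bEND\s*;$", stripped) is not None
    return starts_ok and ends_ok

ok_simple = looks_like_plsql_block("BEGIN NULL; END;")
ok_declare = looks_like_plsql_block("DECLARE x NUMBER; BEGIN x := 1; END;")
missing_semicolon = looks_like_plsql_block("BEGIN NULL; END")
not_a_block = looks_like_plsql_block("SELECT 1 FROM DUAL")
```

A check like this catches the most common mistake (a missing semicolon after END) early, before the statement reaches the database.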
target table and the operation succeeds, but no data is updated. These situations result in a row not being updated:
    - The key field values in the input record do not match the key column values of any row in the target table.
    - The key field values in the input record match the key column values in some rows in the target table, and the remaining column values in the input record match the corresponding column values in those same rows.
    The connector checks for this condition only when the Write mode property is set to Update. Note that this condition does not have a corresponding Oracle error code and error message.

Row not deleted
    This condition occurs when the connector attempts to delete a row in the target table and the operation succeeds, but no data is deleted. This situation occurs when the key field values in the input record do not match the key column values of any row in the target table. The connector checks for this condition only when the Write mode property is set to Delete. Note that this condition does not have a corresponding Oracle error code and message.

SQL error - constraint check
    This condition occurs when an operation cannot be completed because of a constraint check. Note that there are some situations when this SQL error does not result in a record being sent to the reject link. For example, when the Write mode property is set to Insert then update and the insert operation fails because of a primary key constraint, the connector attempts to update the row, rather than send the record to the reject link. However, if the update operation fails for one of the selected reject conditions, the connector sends the input record to the reject link.

SQL error - type mismatch
    This condition occurs when a data value in the record is not compatible with the data type of the corresponding column in the target table. In this case, Oracle cannot convert the data and returns an error.

SQL error - data truncation
    This condition occurs when the data types of the columns on the link are compatible with the column data types in the target table, but there is a loss of data because of a size mismatch.

SQL error - character set conversion
    This condition occurs when the record contains Unicode data for some of its NChar, NVarChar or LongNVarChar columns, and conversion errors happen when that data is converted to the database character set specified in the NLS_CHARACTERSET database parameter.

SQL error - partitioning
    This condition occurs when the connector tries to write a record to a particular partition in the partitioned table, but the specified partition is not the partition to which the record belongs.

SQL error - XML processing
    This condition occurs when a record that contains an XML data document cannot be inserted into an XMLType column in a table because the XML data contains errors. For example, if the specified XML document is not well-formed or if the document is invalid in relation to its XML schema, this error condition occurs.
SQL error - other
    This condition covers all SQL errors that are not covered explicitly by one of the conditions listed above.
Procedure
1. Configure a target stage to receive the rejected records.
2. Right-click the Oracle connector and drag to create a link from the Oracle connector to the target stage.
3. If the link is the first link for the Oracle connector, right-click the link and choose Convert to reject. If the Oracle connector already has an input link, the new link automatically displays as a reject link.
4. Double-click the connector to open the stage editor, and then in the navigator, highlight the reject link, which is represented by a line of wide dashes.
5. Click the Reject tab.
6. In the Filter rejected rows based on selected conditions list, select one or more conditions to use to reject records. If you do not choose any conditions, none of the rows are rejected. In this case, any error that occurs while the records are being written to the target table results in job failure.
7. Use one of the following methods to specify when to stop a job because of too many rejected rows:
   - In the Abort when field, select Percent. Then in the Abort when (%) field, enter the percentage of rejected rows that will cause the job to stop. In the Start count after (rows) field, specify the number of input rows to process before calculating the percentage of rejected rows.
   - In the Abort when field, select Rows. Then in the Abort after (rows) field, specify the maximum number of reject rows allowed before the job stops.
8. Optional: In the Add to reject row list, select ERRORCODE or ERRORMESSAGE or select both. Then when a record fails, the rejected record includes the Oracle error code and the corresponding message that describes the failure. For a complete list of the Oracle error codes and messages, see the Oracle documentation.
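The two ways of stopping a job in step 7 can be sketched as a small decision function. The function name, its parameters, and the exact comparisons are assumptions made for illustration. In particular, this guide does not state whether the job stops when the threshold is reached or only when it is exceeded, so the boundary behavior below is a guess.

```python
def should_abort(mode, rejected, processed, threshold, start_after=0):
    """Decide whether the reject threshold has been crossed.

    mode        -- "rows" or "percent" (mirrors the Abort when field)
    rejected    -- number of rows rejected so far
    processed   -- number of input rows processed so far
    threshold   -- Abort after (rows) or Abort when (%) value
    start_after -- Start count after (rows); used only for "percent"
    """
    if mode == "rows":
        # Assumption: threshold is the maximum allowed, so abort only above it.
        return rejected > threshold
    if mode == "percent":
        # No percentage is calculated until enough input rows were processed.
        if processed == 0 or processed < start_after:
            return False
        # Assumption: reaching the percentage is enough to stop the job.
        return 100.0 * rejected / processed >= threshold
    raise ValueError(f"unknown mode: {mode}")

rows_at_limit = should_abort("rows", rejected=5, processed=100, threshold=5)
rows_over_limit = should_abort("rows", rejected=6, processed=100, threshold=5)
too_early = should_abort("percent", 10, 50, threshold=10, start_after=100)
percent_hit = should_abort("percent", 10, 100, threshold=10, start_after=100)
```

The Start count after (rows) value prevents a few early rejects from producing a misleadingly high percentage and stopping the job prematurely.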
Procedure
1. In the Designer client, open the job that you want to compile.
2. Click Compile.
3. If the Compilation Status area displays errors, edit the job to resolve the errors. After resolving the errors, click Compile again.
4. When the job compiles successfully, click Run, and specify the job run options:
   a. Enter the job parameters, as required.
   b. Click the Validate button to check the job configuration without actually reading or writing any data.
   c. Click the Run button to read, write, or look up data.
5. View the status of the job:
   a. Open the Director client.
   b. In the Status column, verify that the job was validated and completed successfully. If the job or the validation failed, choose View > Log to view messages that describe runtime problems.
6. If the job has runtime problems, fix the problems, recompile, validate, and run the job until it completes successfully.
Procedure
To configure transactions, set the Isolation level property to one of the following:
Option / Description

Read committed
    Each SELECT statement that runs within the transaction sees the rows that were committed when the current statement started.
Serializable
    Each SELECT statement that runs within the transaction sees only the rows that were committed when the transaction started.
Read only
    Read only isolation works the same way that serializable isolation works, except that the DML statements INSERT, UPDATE, DELETE and MERGE are not allowed in the transaction. This isolation level prevents the PL/SQL block from running DML statements. However, be aware that if the PL/SQL block overrides the isolation level, the block can run DML statements, even if you set the isolation level to Read only.
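For reference, each of these isolation levels has a standard Oracle SET TRANSACTION statement. The following Python sketch maps the property values to those statements. Which statements the connector actually issues is not stated in this guide, so the mapping is an illustration of the Oracle equivalents rather than a description of connector internals, and the function name is hypothetical.

```python
# Standard Oracle statements that select each isolation level.
ISOLATION_STATEMENTS = {
    "Read committed": "SET TRANSACTION ISOLATION LEVEL READ COMMITTED",
    "Serializable": "SET TRANSACTION ISOLATION LEVEL SERIALIZABLE",
    "Read only": "SET TRANSACTION READ ONLY",
}

def isolation_statement(level):
    """Return the Oracle statement that corresponds to an Isolation level value."""
    try:
        return ISOLATION_STATEMENTS[level]
    except KeyError:
        raise ValueError(f"unsupported isolation level: {level}") from None

statement = isolation_statement("Read only")
```

A PL/SQL block that runs one of these statements itself is an example of the override that the Read only description warns about.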
Procedure
To configure Oracle row prefetching, set one or both of the following properties:
- Set Prefetch row count to the number of rows to prefetch for each fetch request that results in a roundtrip to the Oracle server.
- Set Prefetch buffer size to the size in KB to use as the buffer for the prefetched rows.
Procedure
1. Set Array size to a number between 1 and 999,999,999. The default is 2,000.
2. Set Record count to the number of records to process before the connector commits the current transaction. The default is 2,000. The value that you specify must be a number between 0 and 999,999,999 and be a multiple of the value that you specify for the Array size property. If you enter 0, the connector processes all records before it commits the transaction.
   Note: If the value that you specify for the Record count property is not 0 and is not a multiple of the value that you specify for the Array size property, the connector automatically chooses an array size so that the record count is a multiple of it. When choosing the array size, the connector attempts to find a value that is close to the value that you specified. If the connector cannot find that value, it chooses the value 1 or the value that matches the record count value, whichever is closer to the value that you specified. Then the connector logs an informational message to inform you that it modified the value of the Array size property.
3. Optional: Use the Mark end of wave property to specify whether to insert an end-of-wave marker after the number of records that are specified in the Record count property are processed. By default, end-of-wave markers are not inserted.
   Note: When the end-of-wave marker is inserted, any records that the Oracle connector buffered are released from the buffer and pushed into the job flow so that downstream stages can process them.
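The array size adjustment described in the note for step 2 can be approximated by choosing the divisor of the record count that is nearest to the requested array size. The following Python sketch is one possible reading of that rule. The function names and the tie-breaking choice (prefer the smaller value) are assumptions, and the connector's actual search may differ.

```python
def divisors(n):
    """Return all positive divisors of n in ascending order."""
    small, large = [], []
    d = 1
    while d * d <= n:
        if n % d == 0:
            small.append(d)
            if d != n // d:
                large.append(n // d)
        d += 1
    return small + large[::-1]

def adjust_array_size(array_size, record_count):
    """Pick an array size so that record_count is a multiple of it."""
    if record_count == 0 or record_count % array_size == 0:
        return array_size  # already consistent, nothing to change
    # Nearest divisor to the requested value; ties go to the smaller one
    # (the tie-break is an assumption, not documented behavior).
    return min(divisors(record_count), key=lambda d: (abs(d - array_size), d))

adjusted = adjust_array_size(2000, 5000)
prime_case = adjust_array_size(3, 7)
```

For a record count of 5,000 and a requested array size of 2,000, the nearest divisor is 2,500. For a prime record count such as 7, the only candidates are 1 and 7, which corresponds to the fallback that the note describes.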
Procedure
Set Runtime column propagation to Yes.

After completing the mapping, the connector removes any output link columns that were not mapped. If the job later references one of the unmapped columns, a runtime error occurs. For example, if the statement SELECT COL1, COL2 FROM TABLE1 is specified for the stage and the output link defines the columns COL1, COL2, and COL3, the connector performs the following tasks:
1. Binds column COL1 from the statement to column COL1 on the link.
2. Binds column COL2 from the statement to column COL2 on the link.
3. Removes column COL3 from the link at runtime because COL3 is unmapped.

If a downstream Transformer stage references column COL3, the job fails at runtime, and the Transformer stage generates a message that indicates that column COL3 could not be found. To correct the error condition, add column COL3 to the SELECT statement, or remove column COL3 from the output link.

Note that in the following cases, the connector does not remove unused columns from the output link:
- The Read mode property is set to PL/SQL.
- The Before SQL or Before SQL (node) property is set, and the property creates the table from which the connector subsequently reads the data.

When the Runtime column propagation property is enabled for the stage, a SELECT statement contains an SQL expression for a column name, and no alias is specified for the column, the connector automatically adds a new column to the link and specifies a column name that matches the SQL expression. The following rules apply to how the column name is derived from the SQL expression:
- Non-alphanumeric characters, underscore characters (_), dollar signs ($), and pound signs (#) are replaced with a pair of underscore characters.
- The dollar sign is replaced with __036__.
- The pound sign is replaced with __035__.
- White space characters are ignored.
- If any character replacement is performed, the prefix CC_N_ is added to the column name, where N is the index of the SQL expression column in the SELECT statement list. The first column in the SELECT statement list has index 1; the second column has index 2; and so on.

The following example illustrates how runtime column propagation works. Assume that the Runtime column propagation property is enabled for the stage, that the statement SELECT COL1, RPAD (COL2, 20, '*') FROM TABLE1 is specified in the stage, and that the output link defines two columns: COL1 and COL2. Because runtime column propagation is enabled, the connector tries to match columns only by name, not by position. The COL1 column from the SELECT statement is mapped to the COL1 column on the output link, but the SQL expression RPAD (COL2, 20, '*') is not mapped to any column on the output link. Therefore, the connector adds the following column to the link: CC_2_RPAD_COL2__20______

In the new column name, the number 2 is used in the column name prefix because the SQL expression appears as the second column in the SELECT statement list. Each non-alphanumeric character (, ' *) is replaced by two underscore characters. The space characters in the SQL expression are ignored. Finally, the connector removes the COL2 column from the output link because that column is unmapped.

If runtime column propagation is not enabled, the connector performs matching by position. Consequently, COL1 and COL2 remain on the link, and COL2 on the link represents the values of the SQL expression from the SELECT statement.

If the column alias COL2 is used for the SQL expression and runtime column propagation is enabled, the mapping by name is successful, and the two existing link columns, COL1 and COL2, are used. The SELECT statement in this case is SELECT COL1, RPAD(COL2, 20, '*') COL2 FROM TABLE1.
When the Oracle connector dynamically adds a column to the link at runtime in a job that has the Runtime column propagation property enabled and the link column corresponds to a LONG or LONG RAW table column in the database, the connector sets the link column length to the maximum possible value that meets both of these conditions:
- The value does not exceed 999999.
- When the value is multiplied by the value that is specified in the Array size property for the stage, the product does not exceed 10485760 (the number of bytes in 10 MB).
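The two conditions for the LONG and LONG RAW column length reduce to a single calculation: the smaller of 999999 and the integer quotient of 10485760 divided by the array size. A minimal Python sketch follows; the function and constant names are hypothetical.

```python
MAX_COLUMN_LENGTH = 999999      # upper limit for the link column length
MAX_TOTAL_BYTES = 10485760      # 10 MB limit for length * array size

def long_column_length(array_size):
    """Largest length that satisfies both documented conditions."""
    return min(MAX_COLUMN_LENGTH, MAX_TOTAL_BYTES // array_size)

default_length = long_column_length(2000)
single_row_length = long_column_length(1)
```

With the default array size of 2,000, the column length is 5242. Lowering the array size allows longer LONG values, up to the 999999 cap.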
Procedure
For the Drop unmatched fields property, choose Yes (default) or No. If you choose Yes, the connector drops any unused columns on the input link. For each dropped column, the connector writes an Informational message in the job log to indicate that the column and its associated values were ignored. When the Drop unmatched fields property is set to No, the connector logs a Fatal error message and stops the job when it encounters an unused column on the input link.

You use the Enable quoted identifiers property to specify whether the name matching between the input link columns and target SQL statement parameters or table columns is case-sensitive.

The following example describes a job and illustrates the effects of setting the Drop unmatched fields property and the Enable quoted identifiers property:
- The connector stage is configured to use Bulk load as the Write mode.
- The target table in the database contains these columns: FIRSTNAME, LASTNAME and DATEOFBIRTH.
- The input link of the connector contains these columns: FirstName, LastName, Address, DateofBirth, Phone, and Email.

The results differ, depending on how you set the properties:
- If you set Drop unmatched fields to Yes and set Enable quoted identifiers to No, the connector logs Informational messages to indicate that the Address, Phone, and Email columns from the input link are not used. The connector loads only the data provided for the FirstName, LastName and DateofBirth input link columns.
- If you set Drop unmatched fields to No and set Enable quoted identifiers to No, the connector logs a Fatal message to indicate that the Address column from the input link is not used, and the job stops.
- If you set Drop unmatched fields to No and set Enable quoted identifiers to Yes, the connector logs a Fatal message to indicate that the FirstName column from the input link is not used, and the job stops.
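The interaction between the two properties in the example can be reproduced with a short simulation. The following Python sketch is a simplified model written for this illustration. The function name and return values are hypothetical, and it models only the name matching and the message severity, not the actual load.

```python
def match_columns(link_columns, table_columns, drop_unmatched, quoted_identifiers):
    """Return (used_columns, messages) for the given property settings.

    messages is a list of (severity, column_name) tuples.
    """
    def normalize(name):
        # Quoted identifiers make matching case-sensitive.
        return name if quoted_identifiers else name.upper()

    known = {normalize(c) for c in table_columns}
    used, messages = [], []
    for name in link_columns:
        if normalize(name) in known:
            used.append(name)
        elif drop_unmatched:
            messages.append(("Informational", name))
        else:
            messages.append(("Fatal", name))
            break  # the job stops at the first unused column
    return used, messages

link = ["FirstName", "LastName", "Address", "DateofBirth", "Phone", "Email"]
table = ["FIRSTNAME", "LASTNAME", "DATEOFBIRTH"]

used_1, msgs_1 = match_columns(link, table, drop_unmatched=True, quoted_identifiers=False)
used_2, msgs_2 = match_columns(link, table, drop_unmatched=False, quoted_identifiers=False)
used_3, msgs_3 = match_columns(link, table, drop_unmatched=False, quoted_identifiers=True)
```

The three calls correspond to the three bullet points above: three Informational messages, a Fatal message for Address, and a Fatal message for FirstName.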
Procedure
To configure the Preserve trailing blanks property, select one of the following values:
- If Yes is selected, the trailing whitespace characters are treated as any other characters. They are preserved along with the other characters, and the data is passed to the database in its original form. This is the default behavior for the connector.
- If No is selected, the stage removes trailing whitespace characters from the text field values that it receives from the framework. The trimmed values are passed to the database. Any leading whitespace characters in the values are preserved.
Procedure
Set the Fail on row error property to one of the following options:
- If you select Yes, when a record is not written to the database, the connector logs an unrecoverable error and the job stops.
- If you select No, when a record is not written to the database, the connector logs a warning message and continues processing the remaining input records.
Procedure
In the Log multiple matches property, specify one of the following values:
- Select None to not log any message for multiple matches.
- Select Informational to log messages of informational severity.
- Select Warning to log messages of warning severity.
- Select Fatal to log messages of fatal severity and stop the job.
Procedure
1. To create a table at runtime, perform these steps:
   a. Set Table action to Create.
   b. Use one of these methods to specify the CREATE TABLE statement:
      - Set Generate create table statement at runtime to Yes and enter the name of the table to create in the Table name property. In this case, the connector automatically generates the CREATE TABLE statement from the column definitions on the input link. The column names in the new table match the column names on the link. The data types of columns in the new table are mapped to the column definitions on the link.
      - Set Generate create table statement at runtime to No, and enter the CREATE TABLE statement in the Create table statement property.
2. To replace a table at runtime, perform these steps:
   a. Set Table action to Replace.
   b. Use one of these methods to specify the DROP TABLE statement:
      - Set Generate drop table statement at runtime to Yes, and enter the name of the table to drop in the Table name property.
      - Set Generate drop table statement at runtime to No, and enter the DROP TABLE statement in the Drop table statement property.
   c. Use one of these methods to specify the CREATE TABLE statement:
      - Set Generate create table statement at runtime to Yes, and enter the name of the table to create in the Table name property.
      - Set Generate create table statement at runtime to No, and enter the CREATE TABLE statement in the Create table statement property.
3. To truncate a table at runtime, perform these steps:
   a. Set Table action to Truncate.
   b. Use one of these methods to specify the TRUNCATE TABLE statement:
      - Set Generate truncate table statement at runtime to Yes, and enter the name of the table to truncate in the Table name property.
      - Set Generate truncate table statement at runtime to No, and enter the TRUNCATE TABLE statement in the Truncate table statement property.
4. To cause the job to fail when a statement fails, set Fail on error for [create, truncate, drop] statement to Yes. Then when the statement fails, the job stops. Otherwise, when the statement fails, the connector logs a Warning message and the job continues.
you might use an SQL statement to create a target table and add an index to it. The SQL statement that you specify is performed once for the whole job, before any data is processed.

After running the statement that is specified in the Before SQL statement property or After SQL statement property, the connector explicitly commits the current transaction. For example, if you specify a DML statement, such as INSERT, UPDATE, DELETE, or MERGE, in the Before SQL statement property, the results of the DML statement are visible to individual nodes.

When the connector stage is used to write records to the database and is configured to perform a table action on the target table before writing data, you can use the Run table action first property to control whether the Before SQL statement or the Table action is performed first.

To run an SQL statement on each node that the connector is configured to run on, use the Before SQL (node) statement property or the After SQL (node) statement property. The connector runs the specified SQL statement once before any data is processed on each node or once after any data is processed on each node. For example, to set the data format to use for the client session on a node, you specify the ALTER SESSION statement in the Before SQL (node) statement property. After running the statement that is specified in the Before SQL (node) statement property or After SQL (node) statement property, the connector explicitly commits the current transaction.

You use the same basic procedure to configure the Before SQL statement, After SQL statement, Before SQL (node) statement, and After SQL (node) statement properties. The following steps describe how to configure the Before SQL statement property.
Procedure
1. Set Run before and after SQL statements to Yes.
2. In the Before SQL statement property, enter the SQL or PL/SQL statement, or enter the fully-qualified path to the file that contains the SQL or PL/SQL statement.
   Note: Do not include input bind variables or output bind variables in the SQL or PL/SQL statement. If the statement contains these types of variables, the connector logs a Fatal message, and the operation stops.
   If you specify a file name, the file must be on the computer where the IBM InfoSphere DataStage server is installed.
3. If you specify a file name, set Read Before SQL statement from file to Yes.
4. Set Fail on error for before SQL statement to Yes (default) or No. If this property is set to Yes and the SQL or PL/SQL statement fails, the connector logs a Fatal message, and the job stops. Otherwise, the connector logs a Warning message, and the job continues.
Procedure
1. Set the CC_ORA_NODE_USE_PLACEHOLDER environment variable to TRUE.
2. Set the CC_ORA_NODE_PLACEHOLDER_NAME environment variable to the node placeholder name that you will use in user-defined SQL statements or PL/SQL blocks.
3. Include the CC_ORA_NODE_PLACEHOLDER_NAME and CC_ORA_NODE_USE_PLACEHOLDER environment variables as parameters of the job.
Example
In this example, two nodes insert data into two different tables. The following are the assumptions for this example:
- The connector is configured to write data to a database table.
- The Write mode property is set to PL/SQL.
- The connector is configured to run on two nodes.
- The Partition type property is set to Entire so that all input records are sent to all of the nodes and no partitioning takes place.
- The CC_ORA_NODE_USE_PLACEHOLDER environment variable is set to TRUE, and the CC_ORA_NODE_PLACEHOLDER_NAME environment variable is set to DSNODENUM so that the connector substitutes the current node number for each occurrence of DSNODENUM.

The PL/SQL property contains this value:
BEGIN
  IF DSNODENUM = 0 THEN
    INSERT INTO TABLE1 VALUES (:COL1, :COL2);
  ELSIF DSNODENUM = 1 THEN
    INSERT INTO TABLE2 VALUES (:COL1, :COL2);
  END IF;
END;
After the connector substitutes the node number, the block that runs on node 0 is as follows:

BEGIN
  IF 0 = 0 THEN
    INSERT INTO TABLE1 VALUES (:COL1, :COL2);
  ELSIF 0 = 1 THEN
    INSERT INTO TABLE2 VALUES (:COL1, :COL2);
  END IF;
END;
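The placeholder substitution in this example amounts to a textual replacement of the placeholder name with the node number. The following Python sketch shows the idea for two nodes. The function name is hypothetical, and plain string replacement is an assumption: this guide says only that the connector substitutes the node number for each occurrence of the placeholder.

```python
def substitute_node(block, placeholder, node_number):
    """Replace every occurrence of the placeholder with the node number."""
    return block.replace(placeholder, str(node_number))

# Fragment of the PL/SQL block from the example above.
template = "IF DSNODENUM = 0 THEN ... ELSIF DSNODENUM = 1 THEN ... END IF;"

# One substituted statement per processing node (nodes 0 and 1).
per_node = [substitute_node(template, "DSNODENUM", n) for n in range(2)]
```

Because the replacement is textual, choose a placeholder name that cannot appear elsewhere in the statement, for example as part of a column or table name.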
- The connector cannot be configured to automatically delete the rows that violate the constraint.
- If you define a reject link and select the SQL Error - constraint violation condition for the reject link, the job fails, and the message IIS-CONN-ORA001058 is written to the job log, indicating that an exceptions table is required.

The format of the exceptions table is specified in the utlexcpt.sql and utlexcpt1.sql scripts, which are in the Oracle installation directory. For example, for installations on Microsoft Windows, the scripts are under the directory %ORACLE_HOME%\RDBMS\ADMIN. The utlexcpt.sql script defines the format for exceptions tables that accept the physical ROWID values that conventional tables use. The utlexcpt1.sql script defines the format for exceptions tables that accept the universal ROWID (UROWID) values that both conventional and index-organized tables use.

When a database already has an exceptions table, the table must use the format specified in the script that corresponds to the type of the target table; otherwise, the connector reports a fatal error about the table format, and the job stops. If the database does not already have an exceptions table, the connector uses the correct format to create one.

When you configure the connector to disable triggers before loading the data, the connector disables the triggers and logs a message about this action. If disabling some of the triggers fails, the connector logs an error message, and the job fails. The connector uses a similar process to enable triggers after loading the data.
Procedure
1. To disable and enable constraints, complete these steps:
   a. Set Perform operations before bulk load to Yes.
   b. Set Disable constraints to Yes.
   c. Set Perform operations after bulk load to Yes.
   d. Set Enable constraints to Yes.
   e. Enter the name of the exceptions table in the Exceptions table name property. If the exceptions table does not exist, the connector creates it. If the exceptions table already exists, the connector deletes any data that is in the table and then uses it.
   f. Set Process exception rows to Yes. When Process exception rows is set to Yes, the connector deletes from the target table the rows that fail the constraint checks. If you defined a reject link for the connector and enabled the SQL error - constraint check reject condition, the connector sends the deleted rows to the reject link. If Process exception rows is set to No and some rows fail a constraint check, the job stops.
2. To disable and enable triggers, complete these steps:
   a. Set Perform operations before bulk load to Yes.
   b. Set Disable triggers to Yes.
   c. Set Perform operations after bulk load to Yes.
   d. Set Enable triggers to Yes.
Procedure
1. Set Use Oracle date cache to Yes.
2. In the Cache size property, enter the maximum number of entries that the cache stores. The default is 1,000.
3. Set Disable cache when full to Yes. When the number of entries in the cache reaches the number specified in the Cache size property and the next lookup in the cache results in a miss, the cache is disabled.
Managing indexes
Specify how to control table indexes during a bulk load and how to rebuild indexes after a bulk load completes.
Procedure
1. To control how to handle table indexes during a bulk load, set the Index maintenance option property to one of the following:
Option / Description

Do not skip unusable
    While loading rows into the table, the connector tries to maintain indexes. If an index on the table is in an unusable state, the bulk load fails.
Skip unusable
    The connector skips indexes that are in an unusable state and maintains indexes that are in a usable state.
    Note: When performing a bulk load into a partitioned table that has a global index defined, the bulk load fails.
Skip all
    The connector skips all indexes. Any index that is usable before the load is marked unusable after the load.
2. To rebuild indexes after a bulk load, complete these steps:
   a. Set Perform operations after bulk load to Yes.
   b. Set Rebuild indexes to Yes.
   c. Optional: To include a parallel clause in the ALTER INDEX statement when the index is rebuilt, select one of the following for the Parallel clause property:
      - Select Do not include to include no parallel clause and to use the existing setting for the index.
      - Select NOPARALLEL to disable parallelism. In this case, access to the index segment is serialized.
      - Select PARALLEL to enable parallelism for rebuilding the index and for all subsequent queries and DML statements that are performed on the index segment. As an option, for the Degree of parallelism property, enter a number that represents the degree of parallelism to use in the parallel clause. If you leave the property blank, the Oracle client automatically calculates the optimal parallelism degree.
   d. Optional: To include a logging clause in the ALTER INDEX statement when the index is rebuilt, select one of the following for the Logging clause property:
      - Select Do not include to include no logging clause and to use the existing setting for the index.
      - Select NOLOGGING to disable logging to the redo log.
      - Select LOGGING to enable logging to the redo log.
   e. Optional: To stop rebuilding an index if the index rebuild statement fails, set Fail on error for index rebuilding to Yes. If an index rebuild fails, the connector logs a Fatal message.
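The Parallel clause and Logging clause settings determine which optional clauses appear in the ALTER INDEX statement. The following Python sketch assembles such a statement with standard Oracle syntax. The exact statement that the connector generates is not shown in this guide, so the function, its parameters, and the index name EMP_IDX are hypothetical.

```python
def rebuild_index_statement(index_name, parallel=None, degree=None, logging=None):
    """Build an ALTER INDEX ... REBUILD statement.

    parallel -- None (do not include), "PARALLEL", or "NOPARALLEL"
    degree   -- optional degree of parallelism, used only with "PARALLEL"
    logging  -- None (do not include), "LOGGING", or "NOLOGGING"
    """
    parts = [f"ALTER INDEX {index_name} REBUILD"]
    if parallel == "PARALLEL":
        # Without a degree, Oracle chooses the degree of parallelism.
        parts.append("PARALLEL" if degree is None else f"PARALLEL {degree}")
    elif parallel == "NOPARALLEL":
        parts.append("NOPARALLEL")
    if logging in ("LOGGING", "NOLOGGING"):
        parts.append(logging)
    return " ".join(parts)

plain = rebuild_index_statement("EMP_IDX")
tuned = rebuild_index_statement("EMP_IDX", "PARALLEL", 4, "NOLOGGING")
```

Selecting Do not include for both properties corresponds to the first call, which leaves the existing parallel and logging settings of the index unchanged.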
Based on the types and lengths of the columns that are defined on the input link, the connector calculates whether the specified array size can always fit into the specified buffer size. If the buffer is too small to accommodate the number of records specified for the array size, the connector automatically resets the array size to the maximum number of records that fit in the buffer.

When an upstream stage provides records to the Oracle connector in the form of waves, each wave includes an end-of-wave marker, which is a special record that signifies the end of the wave. In this case, the array size applies to each separate wave of records. If there are not enough records to fill the buffer to the specified array size value, the connector loads the incomplete buffer of records as a batch and then processes the next wave of records. When records do not arrive in waves and instead all arrive in a single wave, the array size applies to that single wave.

If the connector stage is configured to load data to a table, a partition, or a subpartition segment from a single processing node, you can set the Allow concurrent load sessions property to No to prevent other applications, such as external applications or other DataStage jobs, from loading data to the same segment while the connector stage is loading data.

If the connector stage is configured to run in parallel on more than one processing node, each of the processing nodes establishes a separate Oracle session and loads data to the target table concurrently. In this scenario, setting the Allow concurrent load sessions property to No prevents multiple processing nodes from concurrently loading data to the same segment in the database. This situation might lead to the Oracle error ORA-00054, which occurs when a processing node tries to load data to a segment while another processing node is loading data to the same segment. To avoid this situation, set the Allow concurrent load sessions property to Yes.

Sometimes, the connector stage is configured to load data from multiple processing nodes to a partitioned Oracle table, and the stage is configured to partition the input data by setting the Partition type option on the Partitioning tab. In this scenario, for the supported table partitioning types, each processing node loads data to its assigned partition segment or set of subpartition segments, and the processing nodes do not compete for access to the segment. Setting the Allow concurrent load sessions property to No in this case does not prevent the connector stage from loading data in parallel from multiple processing nodes, but it prevents other applications from concurrently loading data to the segments accessed by this connector stage.
Procedure
1. Set Array size to a value from 1 to 999,999,999. The default is 2,000.
2. Set Buffer size to a value from 4 to 100,240, which represents the buffer size in KB. The default is 1,024 KB.
3. Set the Allow concurrent load sessions property according to your requirements.
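The interaction between Array size and Buffer size described above can be sketched as follows. This is a minimal illustration, not the connector's implementation: the function name and the row_size_bytes argument are hypothetical, and the way the connector actually derives the size of a record from the column types and lengths is not described here.

```python
def effective_array_size(array_size, buffer_size_kb, row_size_bytes):
    """Return the array size that fits in the buffer (illustrative only).

    row_size_bytes is an assumed, caller-supplied estimate of the size of
    one record; the connector computes its own value from the link columns.
    """
    if not 1 <= array_size <= 999_999_999:
        raise ValueError("Array size must be from 1 to 999,999,999")
    if not 4 <= buffer_size_kb <= 100_240:
        raise ValueError("Buffer size must be from 4 to 100,240 KB")
    if row_size_bytes < 1:
        raise ValueError("row_size_bytes must be positive")

    # Largest number of whole records that fit in the buffer.
    max_rows = (buffer_size_kb * 1024) // row_size_bytes
    if max_rows < 1:
        raise ValueError("A single record does not fit in the buffer")

    # If the requested array size does not fit, fall back to the maximum
    # number of records that does fit.
    return min(array_size, max_rows)


if __name__ == "__main__":
    print(effective_array_size(2000, 1024, 100))
    print(effective_array_size(2000, 1024, 1000))
```

With the default values (array size 2,000 and buffer size 1,024 KB), the requested array size is kept for 100-byte records but is reduced for 1,000-byte records.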
Procedure
1. Set Manual mode to Yes.
2. In the Directory for data and control files property, specify a directory to which the connector must save the control and data files that it generates for manual load. If the connector fails to open the specified directory, it logs a fatal message and the job stops.
3. In the Control file name property, specify a name for the control file. The stage generates this file and stores it in the directory specified in the Directory for data and control files property. If the control file name value is not specified, the connector generates the name automatically in the servername_tablename.ctl format, where servername is the value specified in the Server property and tablename is the value specified in the Table name property. If the connector fails to save the control file under the specified file name, it logs a fatal message and the job stops.
4. In the Data file name property, specify the name of the data file. The stage generates this file and stores it in the directory specified in the Directory for data and control files property. If the data file name value is not specified, the connector generates the name automatically in the servername_tablename.dat format. If the connector fails to save the data file under the specified file name, it logs a fatal message and the job stops.
5. In the Load options property, specify the bulk load options that the connector should include in the generated control file. The value contains parameters that are passed to the Oracle SQL*Loader utility when it is invoked to process the generated control and data files. The default value is OPTIONS(DIRECT=FALSE,PARALLEL=TRUE). The DIRECT=FALSE parameter tells Oracle SQL*Loader to use the conventional path load instead of the direct path load. The PARALLEL=TRUE parameter tells it that the data can be loaded in parallel from multiple concurrent sessions. Refer to the Oracle product documentation for information about these options and other available options.
   Note: The word OPTIONS and the parentheses must be included in the value specified for the property. The connector saves this property value in its original form to the generated control file and does not check its syntax.
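The default file naming and the form of the Load options value described in the steps above can be sketched as follows. The function names are hypothetical, and the pre-check is only a convenience that a job designer might run, because the connector itself does not validate the Load options value.

```python
import re


def manual_load_file_names(server, table, control_file_name="", data_file_name=""):
    """Return (control file name, data file name) for a manual load.

    When a name is not specified, fall back to the documented
    servername_tablename.ctl and servername_tablename.dat formats.
    """
    base = f"{server}_{table}"
    return (control_file_name or f"{base}.ctl",
            data_file_name or f"{base}.dat")


def looks_like_options_clause(value):
    """Rough check that the value includes the word OPTIONS and parentheses.

    This does not check the parameters inside the parentheses.
    """
    pattern = r"OPTIONS\s*\(.*\)"
    return re.fullmatch(pattern, value.strip(), re.IGNORECASE | re.DOTALL) is not None


if __name__ == "__main__":
    print(manual_load_file_names("orcl", "TABLE1"))
    print(looks_like_options_clause("OPTIONS(DIRECT=FALSE,PARALLEL=TRUE)"))
    print(looks_like_options_clause("DIRECT=FALSE,PARALLEL=TRUE"))
```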
Case-sensitivity
To maintain the case-sensitivity of Oracle schema object names, you can manually enter double quotation marks around each name or set the Enable quoted identifiers property to Yes. The Oracle connector automatically generates and runs SQL statements when either of these conditions is true:
• Generate SQL at runtime is set to Yes.
• Table action is set to Create, Replace, or Truncate.
Chapter 3. Oracle connector
In these cases, the generated SQL statements contain the names of the columns and the name of the table on which to perform the operation. The column names in the database table match the column names that are specified on the link for the stage. The table name matches the table specified in the Table name property.

By default, the Oracle database converts all object names to uppercase before it matches the names against the Oracle schema object names in the database. If the Oracle schema object names all use uppercase, then how you specify the names in the connector properties, by using uppercase, lowercase, or mixed case, has no effect on schema matching. The names will match. However, if the Oracle schema object names use all lowercase or mixed case, you must specify the names exactly as they appear in the Oracle schema. In this case, you must manually enter double quotation marks around each name or set the Enable quoted identifiers property to Yes.

For example, assume that the Enable quoted identifiers property is set to No and that you want to create a table that contains one column and then use a SELECT statement that references the column. The statement CREATE TABLE Table2b (Col1 VARCHAR2(100)) creates the table TABLE2B, which contains one column, COL1. The statement SELECT Col1 FROM tABLE2B runs successfully because the Oracle database automatically changes the Col1 and tABLE2B names in the statement to the uppercase versions COL1 and TABLE2B and matches these names with the actual schema object name and column name in the database.

Now assume that you use the statement CREATE TABLE "Table2b" ("Col1" VARCHAR2(100)) to create the table Table2b, which contains one column, Col1. Case-sensitivity is preserved because you enclosed the table and column names in quotation marks.
Now the statement SELECT Col1 FROM tABLE2B fails because the Oracle database automatically changes Col1 and tABLE2B to the uppercase versions COL1 and TABLE2B, and these names do not match the actual names, Col1 and Table2b, in the database. However, the statement SELECT "Col1" FROM "Table2b" runs successfully.

Now consider an example that illustrates the effect of the Enable quoted identifiers property on table and column creation. Assume that the Table name property is set to john.test; that the input link contains columns Col1, Col2, and Col3, all of which are of VarChar(10) data type; and that the Table action property is set to Create. If the Enable quoted identifiers property is set to No, the connector generates and runs this SQL statement at runtime and creates the table JOHN.TEST with the columns COL1, COL2, and COL3:
CREATE TABLE john.test(Col1 VARCHAR2(10),Col2 VARCHAR2(10),Col3 VARCHAR2(10));
However, if the Enable quoted identifiers property is set to Yes, the connector generates and runs this SQL statement at runtime and creates the table john.test with the columns Col1, Col2, and Col3:
CREATE TABLE "john"."test"("Col1" VARCHAR2(10),"Col2" VARCHAR2(10), "Col3" VARCHAR2(10));
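The name matching rules in this section can be sketched with two small helper functions. Both are hypothetical illustrations: resolve_identifier mimics how the Oracle database treats unquoted and quoted names, and create_table_sql mimics the shape of the generated statement for each setting of Enable quoted identifiers. Splitting the table name on a period is a simplifying assumption that ignores periods inside quoted names.

```python
def resolve_identifier(name):
    """Return the name that the database matches against schema objects.

    Names in double quotation marks keep their case; unquoted names are
    converted to uppercase.
    """
    if len(name) >= 2 and name.startswith('"') and name.endswith('"'):
        return name[1:-1]
    return name.upper()


def create_table_sql(table_name, columns, enable_quoted_identifiers):
    """Build a CREATE TABLE statement in the style shown in the examples.

    columns is a list of (column name, Oracle data type) pairs.
    """
    def fmt(identifier):
        return f'"{identifier}"' if enable_quoted_identifiers else identifier

    table = ".".join(fmt(part) for part in table_name.split("."))
    column_list = ",".join(f"{fmt(col)} {col_type}" for col, col_type in columns)
    return f"CREATE TABLE {table}({column_list})"


if __name__ == "__main__":
    cols = [("Col1", "VARCHAR2(10)"), ("Col2", "VARCHAR2(10)"), ("Col3", "VARCHAR2(10)")]
    print(resolve_identifier("tABLE2B"))
    print(resolve_identifier('"Table2b"'))
    print(create_table_sql("john.test", cols, False))
    print(create_table_sql("john.test", cols, True))
```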
(carriage return), and LF (line feed). In addition, the connector treats text values as-is and does not trim leading or trailing white space characters. The Oracle database does not support empty string values in text columns. Instead, the Oracle database treats these values as NULL values. Before writing values into fixed-size text columns, the Oracle database pads all non-empty values with space characters. For example, assume that you use the following statement to create a target table named TABLE1 and configure the connector to insert or bulk load data into this table:
CREATE TABLE TABLE1 (COL1 VARCHAR2(10) NULL, COL2 CHAR(3) NULL);
The following table shows the input data for columns COL1 and COL2 and the corresponding values that will be stored in TABLE1. In the table, the dash (-) represents a space character.
Table 3. Example input column values and corresponding table values that are stored in the database

Column values           Table values
"VAL1-1-", "V1-"        "VAL1-1-", "V1-"
"V2--", "2-"            "V2--", "2--"
"-", "-"                "-", "---"
"3", NULL               "3", NULL
NULL, "4"               NULL, "4--"
"", ""                  NULL, NULL
NULL, NULL              NULL, NULL
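The rows of Table 3 follow from two rules stated above: empty strings are stored as NULL, and non-empty values in fixed-size columns are padded with spaces. The following sketch reproduces those rules with two hypothetical helper functions. Python None stands in for NULL, and real space characters are used where the table shows dashes.

```python
def stored_varchar2(value):
    """Value stored in a VARCHAR2 column: empty strings become NULL (None)."""
    if value is None or value == "":
        return None
    return value  # no trimming and no padding


def stored_char(value, length):
    """Value stored in a CHAR(length) column.

    Empty strings become NULL (None); other values are padded on the right
    with spaces to the column length.
    """
    if value is None or value == "":
        return None
    return value.ljust(length)


if __name__ == "__main__":
    rows = [("VAL1 1 ", "V1 "), ("V2  ", "2 "), (" ", " "),
            ("3", None), (None, "4"), ("", ""), (None, None)]
    for col1, col2 in rows:
        print(repr(stored_varchar2(col1)), repr(stored_char(col2, 3)))
```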
When you define the connection between the Oracle connector and the Oracle database, you must complete these fields:

XA database name
    Enter the value from the DB field of the XAOpenString entry. This field is required only if you register more than one Oracle resource manager with the MQ queue manager that the DTS stage references.

Server
    Enter the value of the SqlNet field of the XAOpenString entry.
Before SQL (node) statement property is set to Yes, the connector reruns the statement specified in the Before SQL (node) statement property once on each node.
5. If all of the TAF attempts fail or if the Oracle client indicates that TAF cannot be completed, the connector logs a Warning message, and the operation stops because there is no valid connection to the database.
Multiple database connections are configured, and Manage application failover is set to No
This is the configuration for this example:
• The connector is configured to run a SELECT statement that reads 1,000,000 rows from a table.
• The Manage application failover property is set to No.
• The connector is configured to connect to an Oracle RAC system.
• The connector specifies ORCL_1 as the connect descriptor to use to connect to the database instance orcl1.
• The tnsnames.ora configuration file contains the following connect descriptors:
ORCL_1 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = tcp)(HOST = orcl1-server)(PORT = 1521))
    (CONNECT_DATA =
      (SERVICE_NAME = orcl)(INSTANCE_NAME = orcl1)
      (FAILOVER_MODE = (BACKUP = ORCL_2)(TYPE = select)(METHOD = preconnect))))

ORCL_2 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = orcl2-server)(PORT = 1521))
    (CONNECT_DATA =
      (SERVICE_NAME = orcl)(INSTANCE_NAME = orcl2)
      (FAILOVER_MODE = (BACKUP = ORCL_1)(TYPE = select)(METHOD = preconnect))))
The connection that is established through the ORCL_1 connect descriptor has the following characteristics:
• The Oracle client connects to the listener on host orcl1-server and port 1521 and attaches to the service orcl and the instance orcl1.
• The FAILOVER_MODE specifies that if the orcl1 instance becomes unavailable while the application is connected to it, the SELECT type of TAF takes place.
• The BACKUP option specifies the backup connect descriptor that the Oracle client uses if failover occurs.
• The METHOD option specifies when the Oracle client connects to the backup instance. The value PRECONNECT specifies that the backup connection be established at the same time that the primary connection is established. Then if the primary connection fails, the failover to the backup connection occurs. The alternative value for the METHOD option is BASIC. When BASIC is specified, the connection to the backup instance happens when the failover actually occurs.

If the connection to the instance orcl1 fails while the connector is fetching data from a table, the connector stops processing data until the failover to the instance orcl2 takes place. Because Manage application failover is set to No, the connector does not receive any notification when failover starts or completes. Because the connection to the backup instance is established at the same time that the primary connection is established, the failover occurs quickly and might occur
so quickly that the delay is not noticeable. After the failover completes, the connector continues fetching data because the failover TYPE is set to SELECT. If the failover TYPE was set to SESSION, the next fetch request that the connector issued would fail. Then the connector would log a Fatal message, and the job would stop. If the connector was configured to write data and was running an INSERT statement when the connection to the instance failed, after the failover completed and the connector attempted to insert new data or commit the data that was inserted just prior to the instance failing, the statement would fail. The connector would log an error message, and the job would stop.
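The outcomes described in this example, where Manage application failover is set to No, can be summarized as a small lookup. The function and the wording of the outcome strings are hypothetical; only the combinations that this example describes are listed, and any other combination is reported as not described.

```python
# Keys are (failover TYPE in tnsnames.ora, statement that the connector runs).
# The entries restate the outcomes described in the example above.
OUTCOMES = {
    ("SELECT", "SELECT"): "connector continues fetching data",
    ("SESSION", "SELECT"): "next fetch fails; Fatal message is logged; job stops",
    ("SELECT", "INSERT"): "statement fails; error message is logged; job stops",
}


def outcome_after_failover(failover_type, statement):
    """Return the described outcome after TAF completes, if one is described."""
    key = (failover_type.upper(), statement.upper())
    return OUTCOMES.get(key, "not described in this example")


if __name__ == "__main__":
    print(outcome_after_failover("select", "SELECT"))
    print(outcome_after_failover("session", "SELECT"))
    print(outcome_after_failover("select", "INSERT"))
```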
A single database connection is configured, and Manage application failover is set to Yes
In this example, there is only one database instance, and failover occurs only after the Oracle administrator restarts the instance. This is the configuration for this example:
• The connector is configured to run a SELECT statement that reads 1,000,000 rows from a table.
• The Manage application failover property is set to Yes.
• The connector is configured to connect to a single database instance.
• The connector specifies ORCL as the connect descriptor to use to connect to the database instance orcl.
• The tnsnames.ora configuration file contains the following connect descriptor:
ORCL =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = orcl-server)(PORT = 1521))
    (CONNECT_DATA =
      (SERVICE_NAME = orcl)
      (FAILOVER_MODE = (TYPE=select)(METHOD=basic)(RETRIES=20)(DELAY=5))))
The connection that is established through the ORCL connect descriptor has the following characteristics:
• The Oracle client connects to the listener on host orcl-server and port 1521 and attaches to service orcl, which implements a single instance.
• The FAILOVER_MODE specifies that if the instance becomes unavailable while the application is connected to it, the SELECT type of TAF takes place.
• The METHOD option, which is set to BASIC, specifies that the attempt to reconnect to the instance happens when the failover occurs.

If the connection to the instance fails while the connector is fetching data from a table, the connector receives a notification that failover is taking place because Manage application failover is set to Yes. Each time that the Oracle client attempts to reestablish the connection, the Oracle client notifies the connector, and the connector logs a message. The Oracle client ignores the RETRIES and DELAY options because the Number of retries and Time between retries properties are configured for the connector.

If Manage application failover is set to No, the Oracle client tries up to 20 times, the value of the RETRIES option, to reestablish the connection and waits 5 seconds, the value of the DELAY option, between failover attempts. During that time, the connector appears to stop and might seem to fail, when, in fact, the delay occurs
• Fail on error for index rebuilding

By default, all of the properties, except Fail on error for drop table statement and Fail on error for index rebuilding, are set to Yes. When a property is set to Yes and an error occurs, the message is reported to the log file, and the job stops. When a property is set to No and an error occurs, the corresponding message is reported to the log file, and the job continues.

If you set the Process warning messages as fatal errors property to Yes, the job stops when the first Warning message is issued, and the connector reports the error in the log. By default, this property is set to No. In this case, when the first Warning message is issued, it is sent to the log, and the job continues.
Messages
Identify an error or problem and resolve the problem by using the appropriate recovery action.

Messages have the following severity levels: Fatal, Error, Warning, Informational, Debug, and Trace. The environment variable CC_MSG_LEVEL controls which messages are reported to the log file. By default, Informational level messages and higher are reported. To change the level of messages that are reported to the log, edit the CC_MSG_LEVEL environment variable, and set it to one of the following values:
• 1 - Trace
• 2 - Debug
• 3 - Informational
• 4 - Warning
• 5 - Error
• 6 - Fatal
For example, when you perform problem diagnostics, you might want to set CC_MSG_LEVEL to 2 so that you can view Debug messages, as well as higher level messages. You can set properties that control when to stop a job. For example, if the Process warning messages as fatal errors property is set to Yes, a job stops when the connector reports the first Warning message. The following topics list the messages by severity. Each message is documented along with corrective actions that might fix the error condition.
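The effect of CC_MSG_LEVEL on which severities reach the log can be sketched as follows. The function name is hypothetical. The default of 3 follows from the statement that Informational level messages and higher are reported by default, and rejecting values outside 1 to 6 is an assumption, because the behavior for other values is not described here.

```python
import os

# Numeric values of CC_MSG_LEVEL and the severity that each one selects.
LEVELS = {1: "Trace", 2: "Debug", 3: "Informational",
          4: "Warning", 5: "Error", 6: "Fatal"}
DEFAULT_LEVEL = 3  # Informational and higher are reported by default


def reported_severities(env=None):
    """Return the severities that are reported for the current CC_MSG_LEVEL."""
    env = os.environ if env is None else env
    level = int(env.get("CC_MSG_LEVEL", DEFAULT_LEVEL))
    if level not in LEVELS:
        # Assumption: treat values outside 1-6 as a configuration mistake.
        raise ValueError(f"CC_MSG_LEVEL must be from 1 to 6, got {level}")
    return [name for number, name in sorted(LEVELS.items()) if number >= level]


if __name__ == "__main__":
    print(reported_severities({}))
    print(reported_severities({"CC_MSG_LEVEL": "2"}))
```

Passing a dictionary instead of reading os.environ keeps the example independent of the environment in which it runs.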
Fatal messages
Read the text of each fatal message, along with a description of the cause of the error and recommendations for corrective actions to take. When a fatal message occurs, the job stops.
IIS-CONN-ORA-001001 The variable {0} has value {1} which is not valid in the current context.
Explanation: Fatal. This is a generic error message that the connector reports when it cannot choose a more specific message.
User response: Use the specified variable and value to try to determine the source of the problem. For example, the values may be related to a property name and a value that was configured for the connector.

IIS-CONN-ORA-001002 The OCI function {0} returned status {1}: OCI_INVALID_HANDLE.
Explanation: Fatal. This is an internal error that occurs in the communication between the connector and the Oracle client. The problem might be related to an external problem.
User response: Check the log for additional Informational, Warning, and Fatal messages that might describe the situation that resulted in this Fatal error.

IIS-CONN-ORA-001003 The OCI function {0} returned status {1}, Error code {2}, Error message: {3}.
Explanation: Fatal. This is a generic message that indicates that the Oracle client returned an error after the connector called the specified function.
User response: Evaluate the reported error status, error code, and error message. Based on that information, try to deduce the reason for the error.

IIS-CONN-ORA-001004 The connector could not establish connection to Oracle server {0}. Method: {1}, Error code: {2}, Error message {3}.
Explanation: Fatal.
User response: Ensure that the Server, Username, and Password properties are correctly specified, and verify that the Oracle service and the listener are running. If you are using OS authentication, ensure that OS authentication is correctly configured in the Oracle database.

IIS-CONN-ORA-001010 Unsupported data type: {0}.
Explanation: Fatal. This is an internal error that indicates that the connector encountered an Oracle data type that is not supported for the current context. The error might be related to an external problem.

IIS-CONN-ORA-001011 Unsupported type code: {0}.
Explanation: Fatal. This is an internal error that indicates that the connector encountered an Oracle data type that is not supported for the current context. The error might be related to an external problem.

IIS-CONN-ORA-001012 Memory allocation failed for {0} bytes.
Explanation: Fatal. The system ran out of available memory, and the connector could not allocate free memory for the operation that it was performing.
User response: Close some applications to free up the memory.

IIS-CONN-ORA-001013 The connector could not initialize XA environment by calling Oracle function {0}.
Explanation: Fatal. There was a problem initializing the distributed transaction environment at runtime.
User response: Ensure that the system is correctly configured for use with the Distributed Transaction stage (DTS) and the transaction manager.

IIS-CONN-ORA-001014 The statement failed with status {0}: {1} for input row {2}.
Explanation: Fatal.
User response: Look at the reported status and error message and the input data to try to determine what caused the error.

IIS-CONN-ORA-001015 The connector could not create table {0} because the data type information for the column {1} could not be obtained.
Explanation: Fatal. This is an internal error that might be related to an external problem.

IIS-CONN-ORA-001016 The array size must be set to 1 so that the connector can process LOB values.
Explanation: Fatal.
User response: Set the Array size property to 1.

IIS-CONN-ORA-001017 The connector could not determine the ROWID value for LOB column {0} in table {1}.
Explanation: Fatal. The connector could not obtain the row identifier for the row that contains the LOB value that is passed by reference (locator).
User response: Ensure that the column and the table that the error message references are available and accessible to the current user.

IIS-CONN-ORA-001018 The connector could not obtain the table name for the LOB column {0}. The LOB reference was not created.
Explanation: Fatal. The connector could not obtain the row identifier for the row that contains the LOB value that is passed by reference (locator).
User response: Ensure that the table that contains the LOB column is available and accessible to the current user.
IIS-CONN-ORA-001019 The connector could not find the tnsnames.ora file. Verify that ORACLE_HOME or TNS_ADMIN environment variables are set. Alternatively, use Oracle Easy Connect naming method or specify a full connect descriptor.
Explanation: Fatal.

IIS-CONN-ORA-001020 The connector could not open Oracle network configuration file {0}.
Explanation: Fatal. The connector failed to open the tnsnames.ora configuration file. The file might be in the correct location but might not have read-access granted, or a system-level error prevented the connector from opening the file.
User response: Verify that the file exists and that its contents can be viewed. To work around this problem, specify a full connect descriptor or an Easy Connect string in the Server property.

IIS-CONN-ORA-001021 The connector could not read Oracle network configuration file {0}.
Explanation: Fatal. The connector failed to read the contents of the tnsnames.ora configuration file.
User response: Verify that the file exists and that it is not empty. To override the tnsnames.ora configuration file, specify a full connect descriptor or an Easy Connect string for the Server property.

IIS-CONN-ORA-001022 The following SQL statement failed: {0}.
Explanation: Fatal.
User response: Verify that the syntax of the statement is correct. Look at the log file for additional warning and error messages that might contain information that identifies why the statement failed.

IIS-CONN-ORA-001023 The connector could not find a column in the input schema to match parameter {0}.
Explanation: Fatal.
User response: Ensure that the statement parameters match the column names on the input link. Look at the number and names of the parameters and columns, and ensure that an unambiguous mapping exists between each pair.

IIS-CONN-ORA-001024 While reading data for column {1}, the connector received Oracle error code ORA-{0}.
Explanation: Fatal.
User response: See the Oracle documentation for more information about the Oracle error. Compare the input column definition with the database column definition. Evaluate whether the Oracle error is a result of a mismatch between the two column definitions.

IIS-CONN-ORA-001025 The connector could not automatically generate the UPDATE statement. Specify at least one non-key column in the input schema.
Explanation: Fatal.

IIS-CONN-ORA-001026 The connector could not automatically generate the WHERE clause for the {0} statement. Specify at least one key column in the input schema.
Explanation: Fatal. For the connector to generate the UPDATE or DELETE statement automatically at runtime, the input link must have at least one key column.

IIS-CONN-ORA-001027 The connector is constrained to run on {0} processing nodes, but the Oracle partitioning scheme that was specified for table {1} requires that the total number of processing nodes be {2}.
Explanation: Fatal.
User response: Modify the constraint rules so that the total number of nodes matches the number of nodes that the connector requires, or remove the constraints so that the connector can dynamically specify the number of nodes that it needs.

IIS-CONN-ORA-001028 The connector could not find the specified column {0} in table {1}.
Explanation: Fatal.
User response: Ensure that the specified column name and table name are correct and that the column exists in the table. Also ensure that the current user owns the table or that the owner name is included with the table name.

IIS-CONN-ORA-001029 The data type of column {0} in table {1} is {2}, and the scale is {3}. Specify the data type NUMBER with the scale of 0.
Explanation: Fatal. When the connector reads data in parallel by using the Modulus method or the Minimum and maximum range method, the specified column must be a NUMBER column with the scale set to 0 or not specified.

IIS-CONN-ORA-001030 The connector could not match name {0} with any partition or subpartition name in table {1}.
Explanation: Fatal.
User response: Ensure that the specified values are correct and that the specified name matches a partition or subpartition name in the specified table.

IIS-CONN-ORA-001031 The connector could not find the specified table or view {0}.
Explanation: Fatal.
User response: Ensure that the specified table or view name matches an existing table or view name. Note that if the schema name is not specified with the table name or view name, the connector assumes that the table or view is owned by the currently connected user.

IIS-CONN-ORA-001032 The connector could not match the partition key column {0} with any column in the input schema.
Explanation: Fatal. The connector could not determine which column on the input link to use for comparing values against the partition key boundary value.
User response: Verify the number and names of the columns on the input link. Ensure that one of the columns matches the partition key column.

IIS-CONN-ORA-001033 The input schema column {0} is not compatible with the partition key column {1} of type {2}.
Explanation: Fatal. The data type of the selected column is not compatible with the data type of the partition key column.
User response: Ensure that the data types of the two columns are compatible.

IIS-CONN-ORA-001034 The connector could not validate the input schema. Specify at least one column in the input schema.
Explanation: Fatal.
User response: Define at least one column on the link of the stage.

IIS-CONN-ORA-001035 The property {0} requires a value, but no value was specified.
Explanation: Fatal.

IIS-CONN-ORA-001036 The index {0} is out of boundary for property {1}.
Explanation: Fatal. This is an internal error that might be related to an external problem. The message does not refer to the table index, but instead refers to the internal connector property index.

IIS-CONN-ORA-001038 The connector could not find any tables to include in the SELECT statement for the view data operation.
Explanation: Fatal.
User response: Look at the specified SQL statement in the connector properties and ensure that it correctly specifies table names. If the Generate SQL at runtime property is set to Yes, ensure that the Table name property specifies a table name.

IIS-CONN-ORA-001039 While parsing parameter {0}, the connector detected an unmatched double quote character at position {1}.
Explanation: Fatal.
User response: Ensure that the specified SQL statement has a valid value and that all double quote characters in the statement are properly matched.

IIS-CONN-ORA-001040 While parsing parameter {0}, the connector detected an unmatched single quote character at position {1}.
Explanation: Fatal.
User response: Ensure that the specified SQL statement has a valid value and that all single quote characters are properly matched.

IIS-CONN-ORA-001041 While parsing parameter {0}, the connector detected an unexpected character at position {1}.
Explanation: Fatal.
User response: Ensure that the specified SQL statement has a valid value.

IIS-CONN-ORA-001042 While parsing parameter {0}, the connector expected an identifier at position {1}.
Explanation: Fatal.
User response: Ensure that the specified SQL statement has a valid value.
IIS-CONN-ORA-001043 While parsing table name {0}, the connector detected an unmatched double quote character at position {1}.
Explanation: Fatal.
User response: Ensure that the specified SQL statement has a valid value.

IIS-CONN-ORA-001044 While parsing table name {0}, the connector detected an unmatched single quote character at position {1}.
Explanation: Fatal.
User response: Ensure that the specified SQL statement contains a valid value.

IIS-CONN-ORA-001045 While parsing table name {0}, the connector detected an unexpected character at position {1}.
Explanation: Fatal.
User response: Ensure that the specified SQL statement contains a valid value.

IIS-CONN-ORA-001046 While parsing table name {0}, the connector expected an identifier at position {1}.
Explanation: Fatal.
User response: Ensure that the specified SQL statement contains a valid value.

IIS-CONN-ORA-001047 The connector could not find the specified file {0}; or the current user does not have read permission on the file; or the file is empty.
Explanation: Fatal.
User response: Ensure that the file location is specified correctly and that the specified file exists, has read permissions granted, and is not empty.

IIS-CONN-ORA-001049 The connector encountered parameter {0} in DataStage format and parameter {1} in Oracle format. A single format must be used consistently for all parameters.
Explanation: Fatal.
User response: Use a single syntax consistently when specifying statement parameters. DataStage syntax is ORCHESTRATE.parameter_name. Oracle syntax is :name, where name is the parameter name or parameter number.

IIS-CONN-ORA-001050 A warning message was issued, and the connector is configured to stop when this type of message occurs.
Explanation: Fatal.
User response: Resolve the reported issue so that the connector no longer reports it as a Warning message, or change the connector properties to allow it to proceed when it reports a Warning message.

IIS-CONN-ORA-001051 An unsupported data type {0} was encountered during a bulk load.
Explanation: Fatal.
User response: Change the type of the column in the target table to one of the supported types, load data from a different table, or change the value of the Write mode property from Bulk load to Insert statement.

IIS-CONN-ORA-001052 While loading data, the connector received Oracle error code {0}.
Explanation: Fatal.
User response: Consult the Oracle documentation for information about the error code. Verify that the definitions for the input columns match the definitions for the columns in the target table.

IIS-CONN-ORA-001053 An index rebuild operation failed, and the connector is configured to stop when a rebuild fails.
Explanation: Fatal.
User response: Resolve the problem that caused the index rebuild to fail, or change the properties that configure how to proceed when an index rebuild fails.

IIS-CONN-ORA-001054 System call {0} failed with OS error {1} ({2}).
Explanation: Fatal. This is a generic error that the connector receives after it invokes an operating system function.

IIS-CONN-ORA-001055 The specified statement: {0} is of incorrect type. The required statement type is: {1}.
Explanation: Fatal.
User response: Ensure that the statement type is appropriate for the property for which it was specified.
IIS-CONN-ORA-001056 The schema column {0} must have the length specified because it is used to access database column {1} of data type {2}.
Explanation: Fatal.
User response: Enter a length value for the specified column. Choose a length that is appropriate for the data that will be transferred through the column.

IIS-CONN-ORA-001057 The connector encountered a parameter that uses IBM InfoSphere DataStage syntax (ORCHESTRATE.parameter_name), which is not allowed in PL/SQL blocks. Use Oracle syntax (:name, where name is the parameter name or the parameter number) to specify the parameter.
Explanation: Fatal.
User response: Use only Oracle syntax for specifying bind parameters in the PL/SQL block.

IIS-CONN-ORA-001058 The connector was configured to write data in bulk load mode and the reject condition for checking constraints was selected for the reject link. For this operation to work, it is necessary to also provide the exceptions table name.
Explanation: Fatal.
User response: Specify a value for the Exceptions table name property, or uncheck the SQL error constraint violation reject condition for the reject link.
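Messages IIS-CONN-ORA-001049 and IIS-CONN-ORA-001057 both concern the two parameter syntaxes, ORCHESTRATE.parameter_name and :name. The following sketch shows one way to check a statement for these conditions before running a job. The function names and regular expressions are hypothetical simplifications: they do not skip string literals or comments, and they are not the connector's own parser.

```python
import re

# DataStage syntax: ORCHESTRATE.parameter_name
DATASTAGE_PARAM = re.compile(r"ORCHESTRATE\.(\w+)")
# Oracle syntax: :name or :number. The lookbehind avoids matching after
# another colon or a word character; ":=" is not matched because "=" is
# not a word character.
ORACLE_PARAM = re.compile(r"(?<![:\w]):(\w+)")


def parameter_styles(statement):
    """Return the set of parameter styles that appear in the statement."""
    styles = set()
    if DATASTAGE_PARAM.search(statement):
        styles.add("datastage")
    if ORACLE_PARAM.search(statement):
        styles.add("oracle")
    return styles


def likely_messages(statement, is_plsql_block=False):
    """Return the IDs of the messages that the statement would likely cause."""
    styles = parameter_styles(statement)
    messages = []
    if styles == {"datastage", "oracle"}:
        messages.append("IIS-CONN-ORA-001049")  # mixed parameter formats
    if is_plsql_block and "datastage" in styles:
        messages.append("IIS-CONN-ORA-001057")  # DataStage syntax in PL/SQL
    return messages


if __name__ == "__main__":
    print(likely_messages("INSERT INTO T1 (C1, C2) VALUES (ORCHESTRATE.C1, :2)"))
    print(likely_messages("BEGIN proc(ORCHESTRATE.C1); END;", is_plsql_block=True))
    print(likely_messages("UPDATE T1 SET C2 = :C2 WHERE C1 = :C1"))
```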
Warning messages
Read the text of each Warning message along with a description of the cause of the error and recommendations for corrective actions to take. If you set the Process warning messages as fatal errors property to Yes, a job stops when the connector reports the first Warning message.
IIS-CONN-ORA-003001 While dropping table {0}, the connector encountered an error.
Explanation: Warning. The connector receives this error when it tries to run a DROP TABLE statement. This error typically indicates that the table does not exist.
User response: Look in the log file for additional messages that indicate the actual reason for this failure at the SQL level.

IIS-CONN-ORA-003002 While creating table {0}, the connector encountered an error.
Explanation: Warning. The connector receives this message when it tries to run the CREATE TABLE statement. For example, this message is returned when the database already contains a table that has the specified name and that is owned by the specified table owner.
User response: Look at the log file for additional messages that indicate the actual reason for this failure at the SQL level.

IIS-CONN-ORA-003003 While truncating table {0}, the connector encountered an error.
Explanation: Warning. The connector received an error when it tried to delete rows from the table.
User response: Look in the log file for additional messages that indicate the actual reason for the failure at the SQL level.

IIS-CONN-ORA-003004 The connector was configured to load data in parallel, but the reject condition for checking constraints was selected for the reject link. This combination is not supported. The connector will run in sequential mode.
Explanation: Warning. The connector can send records down the reject link only on a processing node and not on the conductor node. The constraint checking in bulk load mode must be performed only once. To ensure that the constraint checking is done only once for the stage and is done on the processing node, set the Execution mode property to Sequential. Then there is exactly one node.
User response: To eliminate the warning, perform one of the following tasks:
• Change the Execution mode to Sequential.
• Change the Write mode to Insert.
• Deselect the SQL error-constraint check condition for the reject link.

IIS-CONN-ORA-003005 A data conversion error was encountered in bulk load mode for row {0}, column {1}.
Explanation: Warning.
User response: Look at the input value for the specified row and column. Ensure that the value is valid for the corresponding output table column.
IIS-CONN-ORA-003006 A data load error was encountered in bulk load mode for row {0}.
Explanation: Warning.
User response: Look at the input data for the specified row. Ensure that the column values are valid for the corresponding output table definition.

IIS-CONN-ORA-003007 The connector could not enable constraint {0} on table {1} because some row in the table violated the constraint. The ROWID values of those rows are stored in exception table {2}.
Explanation: Warning. Some rows in the loaded data violate the constraints that the connector disabled during the load and tried to re-enable after the load.
User response: If the connector is configured to process rejected records, look at the rejected records to determine the cause of the problem. If the connector is not configured to process rejected records, look at the target table rows whose ROWID values match the ROWID values that are stored in the specified exceptions table to determine how the rows violated the specified constraint.

IIS-CONN-ORA-003008 The connector could not rebuild index {0} on table {1}. Error code: {2}, Error message: {3}.
Explanation: Warning. The index rebuild operation failed to complete. An Oracle error code and error message were returned.
User response: Look at the returned Oracle error code and error message to determine why the operation failed.

IIS-CONN-ORA-003009 The connector was configured to use the Oracle partitions method to perform partitioned reads on {0} processing nodes, but table {1} is not partitioned. The connector will run in sequential mode.
Explanation: Warning.
User response: Change the Execution mode property from Parallel to Sequential, or change the Partitioned reads method property from Oracle partitions to another value.

IIS-CONN-ORA-003010 The connector was configured to perform partitioned writes on {0} processing nodes, but the table {1} is not partitioned. The connector will run in sequential mode.
Explanation: Warning.
User response: Change the Execution mode property from Parallel to Sequential, or change the Partition type property from Oracle connector to another value.

IIS-CONN-ORA-003011 The connector was configured to perform partitioned writes on table {0}, but this table uses partitioning scheme {1}, for which the connector does not support partitioned writes. The connector will run in sequential mode.
Explanation: Warning.
User response: Change the setting for the Execution mode property from Parallel to Sequential, or change the setting for the Partition type property from Oracle connector to another value.

IIS-CONN-ORA-003012 The connector was configured to perform partitioned reads on table {0} using the Oracle partitions method, but a single partition or subpartition {1} was specified. The connector will run in sequential mode.
Explanation: Warning.
User response: Change the setting for the Execution mode property from Parallel to Sequential, or change the setting for the Partitioned reads method property from Oracle partitions to another value.

IIS-CONN-ORA-003013 The connector was configured to perform partitioned writes on table {0}, but a single partition or subpartition {1} was specified. The connector will run in sequential mode.
Explanation: Warning.
User response: Change the setting for the Execution mode property from Parallel to Sequential, or change the setting for the Partition type property from Oracle connector to another value.

IIS-CONN-ORA-003014 The connector was configured to perform partitioned reads using the Oracle partitions method, but the specified SELECT statement already contains PARTITION or SUBPARTITION clauses. The connector will run in sequential mode.
Explanation: Warning.
User response: Change the setting for the Execution mode property from Parallel to Sequential, or change the setting for the Partitioned reads method property from Oracle partitions to another value.
IIS-CONN-ORA-003015 The connector could not obtain access to the {0} system view. Access to that system view is required for the Rowid range read method. The connector will use the Rowid hash read method instead.
Explanation: Warning.
User response: Ensure that the current user has read access to the specified Oracle static dictionary view, or change the setting for the Partitioned reads method property from Rowid range to another value.

IIS-CONN-ORA-003016 Transparent application failover is not enabled for the current service.
Explanation: Warning.
User response: Enable transparent application failover for the current service, or set the Manage application failover property to No.

IIS-CONN-ORA-003017 Transparent application failover was initiated. The type of failover is {0}.
Explanation: Warning. The connection to the currently connected database instance failed, and the Oracle client initiated transparent application failover (TAF) for the connector.
User response: Try one of these solutions:
- Wait for the failover to finish so that the job can continue running. Note that in some cases, even after TAF finishes, the job might still fail.
- Investigate why the database instance failed and correct the problem. Then run the job again.
- Wait until the instance is back up. Then run the job again.

IIS-CONN-ORA-003018 The connector will wait {0} seconds for transparent application failover to complete; attempt {1} of {2}.
Explanation: Warning.

IIS-CONN-ORA-003019 Transparent application failover completed. The connector will attempt to resume data processing.
Explanation: Warning. Note that even after transparent application failover (TAF) completes, the job might still fail. Job failure occurs when the operation that was interrupted when TAF started cannot be configured on the newly established client session.

IIS-CONN-ORA-003020 Transparent application failover failed.
Explanation: Warning. This message indicates that transparent application failover did not complete within the specified number of attempts. In most cases, the job fails because the connection to the database is invalid.

IIS-CONN-ORA-003021 Transparent application failover did not complete within the specified time and number of attempts.
Explanation: Warning. The Oracle client determined that it cannot complete the transparent application failover for the connector. In most cases, the job subsequently fails because the connection to the database is invalid.

IIS-CONN-ORA-003022 The connector was configured to perform partitioned writes, but the connector failed to determine the name of the table to use as input. The connector will run in sequential mode.
Explanation: Warning.
User response: Enter a value in the Table name property or in the Table name for partitioned writes property.

IIS-CONN-ORA-003023 The connector was configured to perform partitioned reads, but the connector could not determine the name of the table to use as input. The connector will run in sequential mode.
Explanation: Warning.
User response: Enter a value in the main Table name property, or enter a value in the Table name subproperty of the Enable partitioned reads property.

IIS-CONN-ORA-003024 While running the {0} statement {1}, the connector encountered an error.
Explanation: Warning.
User response: Check the syntax of the specified statement and look for any errors. Check the log file for other messages that might contain more information about the failure.

IIS-CONN-ORA-003025 Number of records rejected on the current node: {0}.
Explanation: This message reports the number of records that were rejected on the current processing node. The total number of records that were rejected is the sum of all of the rejected records from all of the processing nodes. If the stage is running on a single node, the reported number matches the total number of records that were rejected by the stage.
User response: Inspect the rejected records, which contain the data from the original records. If you included the ERRORCODE and ERRORMESSAGE columns on the reject link, each rejected record includes information about the error that caused the record to be rejected.
Informational messages
Table 4. Informational message numbers and corresponding message text

Message number | Message text
IIS-CONN-ORA-004001 | The connector connected to Oracle server {0}.
IIS-CONN-ORA-004002 | The connector is configured to use external authentication.
IIS-CONN-ORA-004003 | The connector is configured to participate in distributed transaction environment.
IIS-CONN-ORA-004004 | The connector will run in sequential mode.
IIS-CONN-ORA-004005 | The connector will run in parallel on {0} processing nodes.
IIS-CONN-ORA-004006 | The connector generated the following {0} statement at runtime: {1}.
IIS-CONN-ORA-004007 | The connector created the table {0}.
IIS-CONN-ORA-004008 | The connector dropped the table {0}.
IIS-CONN-ORA-004009 | The connector truncated the table {0}.
IIS-CONN-ORA-004010 | The connector ran the specified Before SQL statement.
IIS-CONN-ORA-004011 | The connector ran the specified After SQL statement.
IIS-CONN-ORA-004012 | The connector ran the specified Before SQL (node) statement.
IIS-CONN-ORA-004013 | The connector ran the specified After SQL (node) statement.
IIS-CONN-ORA-004014 | Number of rows fetched on the current node: {0}.
IIS-CONN-ORA-004015 | Number of rows inserted on the current node: {0}.
IIS-CONN-ORA-004016 | Number of rows updated on the current node: {0}.
IIS-CONN-ORA-004017 | Number of rows deleted on the current node: {0}.
IIS-CONN-ORA-004018 | Number of rows processed by the PL/SQL block on the current node: {0}.
IIS-CONN-ORA-004019 | Number of records processed by the lookup select statement on the current node: {0}.
IIS-CONN-ORA-004021 | The connector was configured to run on {0} processing nodes, but the Oracle partitioning scheme used for table {1} requires a total of {2} processing nodes. The stage will run on {3} processing nodes.
IIS-CONN-ORA-004022 | The connector was configured to run in parallel mode on {0} nodes, but the partitioned reads were not enabled. The connector will run in sequential mode.
IIS-CONN-ORA-004023 | The connector has matched partition key column {0} with input schema field {1}.
IIS-CONN-ORA-004024 | Date cache statistics: cache size: {0}, number of elements in the cache: {1}, number of hits: {2}, number of misses: {3}, cache was disabled (1 - Yes, 0 - No): {4}.
IIS-CONN-ORA-004025 | The connector disabled constraint {0} on table {1}.
IIS-CONN-ORA-004026 | The connector disabled all triggers on table {0}.
IIS-CONN-ORA-004027 | The connector enabled constraint {0} on table {1}.
IIS-CONN-ORA-004028 | The connector deleted the rows in table {0} that violated constraint {1}.
IIS-CONN-ORA-004029 | The connector enabled all triggers on table {0}.
IIS-CONN-ORA-004030 | The connector will load {0} rows at a time.
IIS-CONN-ORA-004031 | The connector rebuilt index {0} on table {1}.
IIS-CONN-ORA-004032 | Transparent application failover is enabled for the current service.
IIS-CONN-ORA-004033 | Number of rows loaded on the current node: {0}.
IIS-CONN-ORA-004034 | The connector will use table {0} as input for the partitioned reads method.
IIS-CONN-ORA-004035 | The connector will use table {0} as input for the partitioned writes method.
IIS-CONN-ORA-004036 | The connector ran the {0} statement: {1}.
IIS-CONN-ORA-004037 | The record count value was set to {0} and the array size value was set to {1}. The connector will change array size value to {2} because the record count value must be a multiple of the array size value.
Debug messages
There is only one generic debug message, which has up to four arguments. IIS-CONN-ORA-005001 has the message text CCORA DEBUG: {0}{1}{2}{3}{4}. The content of the debug message is useful for performing problem diagnostics on a job.
Trace messages
There are two trace messages. One specifies that a method was entered, and the other specifies that a method was exited. Both messages include the name of the class that defines the method, if applicable, and the name of the method.
Table 5. Trace message numbers and the corresponding message text

Message number | Message text
IIS-CONN-ORA-006001 | ->{0}::{1}
IIS-CONN-ORA-006002 | <-{0}::{1}
Reference
These reference topics provide detailed information about data type mappings, dictionary views, environment variables, and environment logging.
There are two ways to specify the session parameters. First, you can set the environment variables that have the same names as the session parameters. If you use this method, you must also define the NLS_LANG environment variable. Second, you can alter the current session by including ALTER SESSION SET parameter = value statements in the Before SQL statement (node) property.

When the Oracle connector forwards datetime values to the Oracle client as text, the Oracle client assumes that the values match the format that the NLS session parameters specify. If the format does not match, the Oracle client returns an error for the values, and the connector logs a message. For example, if the NLS_DATE_FORMAT session parameter is set to MM/DD/YYYY, then the text values that the connector writes to a column of DATE data type must adhere to that format. In this case, the value 12/03/2008 is acceptable, but the value 03-DEC-2008 is not.

When the design-time schema specifies a datetime data type for a column, the Oracle connector ignores the Oracle NLS settings and converts the values into the Oracle datetime data type.

You can configure the Oracle connector to log Debug messages that contain information about the current settings for the Oracle NLS session parameters, NLS database parameters, and the NLS_LANG environment variable. By default, Debug messages are not displayed in the log file. To view Debug messages in the log file, set the CC_MSG_LEVEL environment variable to 2.
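The format rule above can be sketched in Python for illustration. This is not connector or Oracle client code: the function names and the small mapping from Oracle format elements to Python strptime directives are assumptions made for this example, and the mapping covers only the elements that appear in the text (MM, DD, YYYY, MON).

```python
from datetime import datetime

# Hypothetical mapping from a few Oracle datetime format elements to
# Python strptime directives. Only the elements used in this example
# are covered; real Oracle format models support many more.
ORACLE_TO_STRPTIME = {
    "YYYY": "%Y",
    "MON": "%b",
    "MM": "%m",
    "DD": "%d",
}


def to_strptime_format(nls_format: str) -> str:
    """Translate a simple Oracle format such as MM/DD/YYYY to strptime syntax."""
    result = nls_format
    # Replace longer elements first so that MON is handled before MM.
    for element in sorted(ORACLE_TO_STRPTIME, key=len, reverse=True):
        result = result.replace(element, ORACLE_TO_STRPTIME[element])
    return result


def matches_nls_format(value: str, nls_format: str) -> bool:
    """Return True if the text value parses under the given format."""
    try:
        datetime.strptime(value, to_strptime_format(nls_format))
        return True
    except ValueError:
        return False


def alter_session_statement(parameter: str, value: str) -> str:
    """Build the text of an ALTER SESSION SET parameter = value statement."""
    return f"ALTER SESSION SET {parameter} = '{value}'"


if __name__ == "__main__":
    print(matches_nls_format("12/03/2008", "MM/DD/YYYY"))
    print(matches_nls_format("03-DEC-2008", "MM/DD/YYYY"))
    print(alter_session_statement("NLS_DATE_FORMAT", "MM/DD/YYYY"))
```

A statement such as the one that alter_session_statement builds is the kind of text that you would enter in the Before SQL statement (node) property. Checking values before the job runs is optional; the Oracle client still performs the authoritative validation.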
that is not LOB-aware, the target stage cannot recognize the reference string as a special locator value and treats the reference string as ordinary data.

There are advantages and disadvantages to using the reference form. The main advantage is that the reference form can transfer large LOB values from the source stage to the target stage. The main disadvantage is that interim stages cannot process the actual values. For example, if you add a Transformer stage to a job, the Transformer stage cannot perform operations on the actual LOB values because only the reference strings, not the actual values, are transferred through the job. The reference form is generally effective for transferring large LOB values that are 1 MB or more.

Be aware of these issues when you configure the connector to read and write LOB data:
- The connector supports both the inline and reference forms to transfer BFILE, BLOB, CLOB, NCLOB, and XMLType columns.
- The connector supports only the inline form to transfer LONG and LONG RAW columns. The length attribute for the column on the link must be set to the maximum expected length for the actual data at runtime.
- When you configure the Oracle connector to read data from a BFILE column, you can transfer the actual file contents, or you can transfer a reference to the file location. To transfer the file contents of a BFILE, set the Transfer BFILE contents property to Yes. By default, Transfer BFILE contents is set to No, and the connector transfers the reference to the file location.
- When you configure the connector to read XMLType data and manually create the SELECT statement, you must use an alias to reference the table, and the XMLType column must use the Oracle GETCLOBVAL() or GETBLOBVAL() member function to get the actual XML content as CLOB or BLOB. If the column on the output link is defined as LongVarChar or LongNVarChar and passed inline, use the Oracle GETCLOBVAL() member function. If the column is defined as LongVarBinary and passed inline, use the GETBLOBVAL() member function. Do not use the GETCLOBVAL() and GETBLOBVAL() member functions when passing XMLType columns as LOB references. To read from an XMLType object table or object view, use the OBJECT_VALUE pseudocolumn for the column name.
- When you configure the connector to write XMLType data, if the column on the input link is defined as Binary, VarBinary, or LongVarBinary, you must use the Oracle SYS.XMLTYPE.CREATEXML() member function in the SQL statement to create the XML content.

To configure the Oracle connector to use the reference form, set Enable LOB references to Yes, and then, in the Columns for LOB references property, select the columns to pass by reference. Only link columns of LongVarChar, LongNVarChar, and LongVarBinary data types are available for selection.

Examples: Transferring XMLType data

These examples illustrate reading XMLType data from a standard table, object table, and object view.

Writing to an XMLType column

The following is the table definition:
CREATE TABLE TABLE1 (COL1 NUMBER(10), COL2 XMLTYPE) XMLTYPE COL2 STORE AS BINARY XML;
To write the binary XML value to the XMLType column, enter this INSERT statement in the Insert statement property in the connector:
INSERT INTO TABLE1 (COL1, COL2) VALUES (ORCHESTRATE.COL1, SYS.XMLTYPE.CREATEXML(ORCHESTRATE.COL2, 1, NULL, 1, 1));
Note: In this example, the second parameter of the SYS.XMLTYPE.CREATEXML function specifies the character set ID for the US7ASCII character set in Oracle. The third parameter is an optional schema URL that forces the input to conform to the specified schema. The fourth parameter is a flag that indicates that the instance is valid according to the specified XML schema. The fifth parameter is a flag that indicates that the input is well formed.

Reading XMLType data from a standard table or view

The following is the table definition:
CREATE TABLE TABLE1 (COL1 NUMBER(10), COL2 XMLTYPE) XMLTYPE COL2 STORE AS CLOB;
To retrieve the XML value as a CLOB value, enter this SELECT statement in the Select statement property in the connector:
SELECT COL1, T.COL2.GETCLOBVAL() FROM TABLE1 T;
To retrieve the XML value as a BLOB value that uses the character encoding AL32UTF8, enter this SELECT statement in the Select statement property in the connector:
SELECT COL1, T.COL2.GETBLOBVAL(873) FROM TABLE1 T;

Note: The number 873 is the character set ID for the AL32UTF8 character set in Oracle. Oracle defines a character set ID for each character encoding that it supports; the Oracle NLS_CHARSET_ID function returns the ID for a given character set name. For information about the supported character encodings and IDs, see the Oracle documentation.

Reading XMLType data from an object table

The following is the table definition:
CREATE TABLE TABLE1 OF XMLTYPE XMLTYPE STORE AS BINARY XML;
To retrieve the XML value as a CLOB value, enter this SELECT statement in the Select statement property in the connector:
SELECT T.OBJECT_VALUE.GETCLOBVAL() FROM TABLE1 T;
To retrieve the XML value as a BLOB value that uses the US7ASCII character encoding, enter this SELECT statement in the Select statement property in the connector:
SELECT T.OBJECT_VALUE.GETBLOBVAL(1) FROM TABLE1 T;
Note: The number 1 is the character set ID for the US7ASCII character set in Oracle.

Reading XMLType data from an object view

This example uses TABLE1, which was defined in the previous example. The following is the view definition:
CREATE VIEW VIEW1 AS SELECT * FROM TABLE1;
Chapter 3. Oracle connector
To retrieve the XML value from VIEW1 as a CLOB value, enter this SELECT statement in the Select statement property in the connector:
SELECT V.OBJECT_VALUE.GETCLOBVAL() FROM VIEW1 V;
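The rules for choosing between GETCLOBVAL(), GETBLOBVAL(), and a plain column reference can be summarized in a short Python sketch. The helper name xmltype_select_expression and its parameters are hypothetical and exist only for this illustration; the connector does not expose such a function, and you still enter the resulting expression manually in the Select statement property.

```python
from typing import Optional

# Link column types for which the text above names a member function.
CLOB_LINK_TYPES = {"LongVarChar", "LongNVarChar"}
BLOB_LINK_TYPES = {"LongVarBinary"}


def xmltype_select_expression(
    alias: str,
    column: str,
    link_type: str,
    by_reference: bool = False,
    charset_id: Optional[int] = None,
) -> str:
    """Return the select-list expression for an XMLType column.

    alias is the table alias that the SELECT statement must use.
    column is the XMLType column name, or OBJECT_VALUE for an
    object table or object view.
    """
    if by_reference:
        # LOB references: do not call GETCLOBVAL() or GETBLOBVAL().
        return f"{alias}.{column}"
    if link_type in CLOB_LINK_TYPES:
        return f"{alias}.{column}.GETCLOBVAL()"
    if link_type in BLOB_LINK_TYPES:
        if charset_id is None:
            raise ValueError("GETBLOBVAL needs a character set ID in this sketch")
        return f"{alias}.{column}.GETBLOBVAL({charset_id})"
    raise ValueError(f"no XMLType rule for link column type {link_type!r}")


if __name__ == "__main__":
    print(xmltype_select_expression("T", "COL2", "LongVarChar"))
    print(xmltype_select_expression("T", "OBJECT_VALUE", "LongVarBinary", charset_id=1))
    print(xmltype_select_expression("T", "COL2", "LongVarChar", by_reference=True))
```

The character set ID 1 in the usage lines is the US7ASCII value from the object table example; any other ID must be looked up in the Oracle documentation.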
CHAR multibyte
Table 6. Oracle data types and corresponding DataStage data types (continued)

Oracle data type | DataStage data type | Length | Scale | Extended
VARCHAR2(n BYTE) | VARCHAR | n | unset | unset
VARCHAR2(n CHAR) single-byte | VARCHAR | n | unset | unset
VARCHAR2(n CHAR) multibyte | NVARCHAR | n | unset | unset
CLOB single-byte | LONGVARCHAR | unset | unset | unset
CLOB multibyte | LONGNVARCHAR | unset | unset | unset
LONG single-byte | LONGVARCHAR | unset | unset | unset
LONG multibyte | LONGNVARCHAR | unset | unset | unset
NCHAR(n) | NCHAR | n | unset | unset
NCHAR | See NCHAR(n) and assume n=1.
NVARCHAR2(n) | NVARCHAR | n | unset | unset
NCLOB | LONGNVARCHAR | unset | unset | unset
NUMBER | DOUBLE | unset | unset | unset
NUMBER(p, s) {p>=s} {s>=0} | DECIMAL | p | s | unset
NUMBER(p, s) {p<s} {s>=0} | DECIMAL | s | s | unset
NUMBER(p, s) {s<0} | DECIMAL | p-s | unset | unset
FLOAT(p) {1<=p<=63} | FLOAT | unset | unset | unset
FLOAT(p) {64<=p<=126} | DOUBLE | unset | unset | unset
BINARY_FLOAT | FLOAT | unset | unset | unset
BINARY_DOUBLE | DOUBLE | unset | unset | unset
LONG RAW | LONGVARBINARY | unset | unset | unset
RAW(n) | VARBINARY | n | unset | unset
BLOB | LONGVARBINARY | unset | unset | unset
BFILE | VARCHAR | 285 | unset | unset
DATE | TIMESTAMP | unset | unset | unset
TIMESTAMP(fsp) | TIMESTAMP | unset | fsp | Microseconds
TIMESTAMP(fsp) WITH TIME ZONE | TIMESTAMP | unset | fsp | Microseconds
TIMESTAMP(fsp) WITH LOCAL TIME ZONE | TIMESTAMP | unset | fsp | Microseconds
TIMESTAMP | See TIMESTAMP(fsp) and assume fsp=6.
TIMESTAMP WITH TIME ZONE | See TIMESTAMP(fsp) WITH TIME ZONE and assume fsp=6.
TIMESTAMP WITH LOCAL TIME ZONE | See TIMESTAMP(fsp) WITH LOCAL TIME ZONE and assume fsp=6.
INTERVAL YEAR (yp) TO MONTH | VARCHAR | yp+4 | unset | unset
INTERVAL DAY TO SECOND (sp) | VARCHAR | sp+13 | unset | unset
INTERVAL DAY (dp) TO SECOND | VARCHAR | dp+17 | unset | unset
INTERVAL DAY (dp) TO SECOND (sp) | VARCHAR | dp+sp+11 | unset | unset
INTERVAL YEAR TO MONTH | See INTERVAL YEAR (yp) TO MONTH and assume yp=2.
INTERVAL DAY TO SECOND | See INTERVAL DAY (dp) TO SECOND (sp) and assume dp=2 and sp=6.
ROWID | CHAR | 18 | unset | unset
UROWID(n) | VARCHAR | n | unset | unset
UROWID | See UROWID(n) and assume n=4000.
XMLType stored as CLOB or OBJECT_RELATIONAL single-byte | See CLOB single-byte.
XMLType stored as CLOB or OBJECT_RELATIONAL multibyte | See CLOB multibyte.
XMLType stored as BINARY XML | See BLOB.
Other | UNKNOWN | unset | unset | unset
Table 7. DataStage column definitions and corresponding Oracle column definitions

DataStage data type | Length | Scale | Extended | Oracle column definition
Bit | any | any | | NUMBER(5,0)
Char | unset | any | unset | CHAR(2000)
Char | n | any | unset | CHAR(n)
VarChar | unset | any | unset | VARCHAR2(4000)
VarChar | n | any | unset | VARCHAR2(n)
LongVarChar | any | any | unset | CLOB
Char | unset | any | Unicode | NCHAR(1000)
Char | n | any | Unicode | NCHAR(n)
VarChar | unset | any | Unicode | NVARCHAR2(2000)
VarChar | n | any | Unicode | NVARCHAR2(n)
LongVarChar | n | any | Unicode | NCLOB
NChar | unset | any | | NCHAR(1000)
NChar | n | any | | NCHAR(n)
NVarChar | unset | any | | NVARCHAR2(2000)
NVarChar | n | any | | NVARCHAR2(n)
LongNVarChar | any | any | | NCLOB
Binary | unset | any | | RAW(2000)
Binary | n | any | | RAW(n)
VarBinary | unset | any | | RAW(2000)
VarBinary | n | any | | RAW(n)
LongVarBinary | any | any | | BLOB
Decimal | p | unset | | NUMBER(p)
Decimal | p | s | | NUMBER(p,s)
Double | any | any | | BINARY_DOUBLE
Float | any | any | | BINARY_FLOAT
Real | any | any | | BINARY_FLOAT
TinyInt | any | any | unset | NUMBER(3,0)
SmallInt | any | any | unset | NUMBER(5,0)
Integer | any | any | unset | NUMBER(10,0)
BigInt | any | any | unset | NUMBER(19,0)
TinyInt | any | any | unsigned | NUMBER(3,0)
SmallInt | any | any | unsigned | NUMBER(5,0)
Integer | any | any | unsigned | NUMBER(10,0)
BigInt | any | any | unsigned | NUMBER(19,0)
Numeric | p | unset | | NUMBER(p)
Numeric | p | s | | NUMBER(p,s)
Date | any | any | | DATE
Time | any | unset | unset | DATE
Time | any | sp | unset | TIMESTAMP(sp)
Timestamp | any | unset | unset | DATE
Timestamp | any | sp | unset | TIMESTAMP(sp)
Time | any | unset | Microseconds | TIMESTAMP(6)
Time | any | sp | Microseconds | TIMESTAMP(sp)
Timestamp | any | unset | Microseconds | TIMESTAMP(6)
Timestamp | any | sp | Microseconds | TIMESTAMP(sp)
Unknown | any | any | any | NCLOB
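To make the default-length behavior in Table 7 concrete, the following Python sketch looks up the Oracle column definition for the Char and VarChar rows only. The dictionary and the function name oracle_char_definition are assumptions made for this illustration and are not part of the connector; the default lengths are copied from the table.

```python
from typing import Optional

# (DataStage data type, Extended attribute) -> (Oracle type, default length
# that Table 7 lists when Length is unset). None means Extended is unset.
CHAR_DEFAULTS = {
    ("Char", None): ("CHAR", 2000),
    ("VarChar", None): ("VARCHAR2", 4000),
    ("Char", "Unicode"): ("NCHAR", 1000),
    ("VarChar", "Unicode"): ("NVARCHAR2", 2000),
}


def oracle_char_definition(
    data_type: str,
    length: Optional[int] = None,
    extended: Optional[str] = None,
) -> str:
    """Return the Oracle column definition for a Char or VarChar link column."""
    oracle_type, default_length = CHAR_DEFAULTS[(data_type, extended)]
    effective_length = default_length if length is None else length
    return f"{oracle_type}({effective_length})"


if __name__ == "__main__":
    print(oracle_char_definition("Char"))
    print(oracle_char_definition("VarChar", 30))
    print(oracle_char_definition("VarChar", extended="Unicode"))
```

The same lookup pattern extends to the other rows of the table; only the character rows are shown to keep the sketch short.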
Dictionary views
To complete specific tasks, the Oracle connector requires access to a set of Oracle dictionary views. The following list describes how the Oracle connector uses each view.

ALL_CONSTRAINTS
The Oracle connector accesses this view to obtain the list of constraints for a table. Obtaining a list of constraints is required for these tasks: importing a table definition, disabling constraints, and enabling constraints.

ALL_INDEXES
The Oracle connector accesses this view to obtain the list of indexes for a table. Obtaining this information is required for these tasks: importing a table definition, determining the list of indexes to rebuild, and determining how a table is organized, either by heap or by index.

ALL_OBJECTS
The Oracle connector accesses this view to obtain additional metadata, such as table names and view names, for the objects that the user specifies. For example, for a parallel read that is based on Oracle partitions, the connector accesses this view to determine the object type, either table or view, and the partitions and subpartitions.

ALL_PART_COL_STATISTICS
The Oracle connector accesses this view to determine the boundary (high) value for each partition in a table. This information is used in a partitioned write.

ALL_PART_KEY_COLUMNS
The Oracle connector accesses this view to determine the list of columns that are in the partition key for a table. This information is used in a partitioned write.

ALL_PART_TABLES
The Oracle connector accesses this view to determine the partitioning method that the table uses. When the Oracle connector value is specified for the Partition type property, the Oracle connector uses the information from this view to determine the partition to which each record belongs and then to direct each record to the node that is associated with that partition.

ALL_TAB_COLS
The Oracle connector accesses this view to determine column metadata, for example data type, length, precision, and scale; to determine if a column is a virtual column; and to determine if a column exists and if it is of the correct data type when the Modulus or the Minimum and Maximum range partitioned read method is specified.

ALL_TAB_PARTITIONS
The Oracle connector accesses this view to determine the number and names of the partitions in a partitioned table. This information is required for read and write operations.

ALL_TAB_SUBPARTITIONS
The Oracle connector accesses this view to determine the number and names of all subpartitions in a composite-partitioned table. This information is required for read and write operations.

ALL_TABLES
The Oracle connector accesses this view to determine the list of tables that are accessible by the current user. Obtaining this information is required for these tasks: importing a table definition, determining which users have tables with SYSTEM or SYSAUX table space as their default table space, and determining if a specified table is partitioned.

ALL_VIEWS
The Oracle connector accesses this view to determine the list of views that are accessible by the current user. This information is used to present the user with a list of views that can be imported.

ALL_XML_TAB_COLS
To determine the XML storage option that was specified in the column definitions, the Oracle connector accesses this view during the metadata import of tables that contain XMLType columns.

ALL_XML_TABLES
To determine the XML storage option that was specified in the table definitions, the Oracle connector accesses this view during the metadata import of XMLType tables.

DBA_EXTENTS
The Oracle connector accesses this view to gather information about the table storage organization. The connector uses the information when the Rowid range partitioned read method is selected. If select access is not granted to this view, the connector automatically switches to the Rowid hash partitioned read method.

DUAL
To obtain and calculate various intermediate values that the connector needs for its operation, the connector issues SELECT statements on this table.

USER_TAB_PRIVS
The Oracle connector accesses this view to determine if the current user was granted select privilege on a particular dictionary view, for example the DBA_EXTENTS view. If the current user was not granted select privilege, the connector takes corrective action.
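The fallback that the DBA_EXTENTS and USER_TAB_PRIVS entries describe can be sketched in Python. This is only an illustration of the documented behavior, not the connector's implementation: the function names are hypothetical, and the input is assumed to be a list of (TABLE_NAME, PRIVILEGE) pairs such as a user might obtain by querying USER_TAB_PRIVS.

```python
from typing import Iterable, Tuple

PrivilegeRow = Tuple[str, str]  # (TABLE_NAME, PRIVILEGE), assumed row shape


def has_select_privilege(rows: Iterable[PrivilegeRow], view_name: str) -> bool:
    """Return True if the rows include a SELECT grant on the named view."""
    wanted = view_name.upper()
    return any(
        table_name.upper() == wanted and privilege.upper() == "SELECT"
        for table_name, privilege in rows
    )


def effective_read_method(requested: str, rows: Iterable[PrivilegeRow]) -> str:
    """Apply the documented fallback from Rowid range to Rowid hash."""
    if requested == "Rowid range" and not has_select_privilege(rows, "DBA_EXTENTS"):
        # Without select access to DBA_EXTENTS the connector switches methods
        # and reports warning IIS-CONN-ORA-003015.
        return "Rowid hash"
    return requested


if __name__ == "__main__":
    granted = [("DBA_EXTENTS", "SELECT")]
    print(effective_read_method("Rowid range", granted))
    print(effective_read_method("Rowid range", []))
```

Other partitioned read methods pass through unchanged in this sketch because the text describes the fallback only for the Rowid range method.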
Environment variables
The Oracle connector queries and uses these environment variables.

CC_MSG_LEVEL
This connector environment variable specifies the minimum severity of the messages that the connector reports in the log file. The default value is 3: Informational messages, as well as messages of a higher severity, are reported to the log file. The Oracle connector does not have any messages that are of the Error severity. The following list contains the valid values:
- 1 - Trace
- 2 - Debug
- 3 - Informational
- 4 - Warning
- 5 - Error
- 6 - Fatal

CC_ORA_BIND_KEYWORD
This connector environment variable specifies the identifier that indicates a bind parameter in a user-defined SQL statement. The default identifier is ORCHESTRATE. Use this environment variable to specify a different identifier when SQL statements require the literal ORCHESTRATE in the name of a schema, table, or column, for example.

CC_ORA_CHECK_CONVERSION
This connector environment variable controls whether exceptions are thrown when data loss occurs because of a conversion from the Unicode character set to the native character set of the database. The default value is FALSE. When the value of this variable is TRUE (case-insensitive), an exception is thrown when data loss occurs.

CC_ORA_MAX_ERRORS_REPORT
This connector environment variable specifies the maximum number of errors to report to the log file when an operation involves writing an array or bulk loading data. This variable is relevant only when a reject link is not defined. The default value is -1, which reports all errors.

CC_ORA_NLS_LANG_ENV
This connector environment variable controls whether the NLS_LANG character set is used when the connector initializes the Oracle client environment. The default value is FALSE. When the value of this variable is TRUE (case-insensitive), the NLS_LANG character set is used; otherwise, the UTF-16 character set is used.

CC_ORA_NODE_USE_PLACEHOLDER
This connector environment variable controls whether the connector replaces occurrences of the processing node number placeholder with the current processing node number in all SQL statements that run on processing nodes. When the value of this variable is TRUE (case-insensitive), the connector replaces the occurrences.

CC_ORA_NODE_PLACEHOLDER_NAME
This connector environment variable specifies the case-sensitive value for the processing node numbers in SQL statements.

Library path
This variable must include the directory where the Oracle client libraries are stored. The following list contains the name of the library path variable for each operating system:
- HP-UX - LD_LIBRARY_PATH or SHLIB_PATH
- IBM AIX - LIBPATH
- Linux - LD_LIBRARY_PATH
- Microsoft Windows - PATH

LOCAL
This Oracle environment variable specifies the default remote Oracle service. When this variable is defined, the connector connects to the specified database by going through an Oracle listener that accepts connection requests. This variable is for use on Microsoft Windows only. Use the TWO_TASK environment variable for Linux and UNIX.

ORACLE_HOME
This Oracle environment variable specifies the location of the home directory of the Oracle client installation. The connector uses the variable to locate the tnsnames.ora configuration file, which is required to make a connection to an Oracle database. The connector looks for the tnsnames.ora file under the ORACLE_HOME/network/admin directory.

ORACLE_SID
This Oracle environment variable specifies the default local Oracle service. When this variable is defined, the connector connects to the specified database without going through an Oracle listener. On Microsoft Windows, you can specify this environment variable in the Windows registry.
Note: If both ORACLE_SID and TWO_TASK or LOCAL are defined, TWO_TASK or LOCAL takes precedence.

TWO_TASK
This Oracle environment variable specifies the default remote Oracle service. When this variable is defined, the connector connects to the specified database by going through an Oracle listener that accepts connection requests. This variable is for use on Linux and UNIX only. Use the LOCAL environment variable for Microsoft Windows.
Chapter 3. Oracle connector
Note: If both ORACLE_SID and TWO_TASK are defined, TWO_TASK takes precedence.

TNS_ADMIN
This Oracle environment variable specifies the location of the directory that contains the tnsnames.ora configuration file. When this variable is specified, it takes precedence over the value of the ORACLE_HOME environment variable when the Oracle connector tries to locate the configuration file. The connector looks for the tnsnames.ora file directly under the TNS_ADMIN directory.
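The precedence rules described for TNS_ADMIN, ORACLE_HOME, ORACLE_SID, TWO_TASK, and LOCAL can be sketched in a few lines of Python. This is only an illustration of the documented lookup order, not the connector's implementation; the function names tnsnames_path and default_service are hypothetical.

```python
import os
from pathlib import Path


def tnsnames_path(env):
    """Return the expected tnsnames.ora location for an environment mapping."""
    tns_admin = env.get("TNS_ADMIN")
    if tns_admin:
        # TNS_ADMIN takes precedence; the file sits directly in that directory.
        return Path(tns_admin) / "tnsnames.ora"
    oracle_home = env.get("ORACLE_HOME")
    if oracle_home:
        # Otherwise the file is expected under ORACLE_HOME/network/admin.
        return Path(oracle_home) / "network" / "admin" / "tnsnames.ora"
    return None


def default_service(env, windows=False):
    """Return (kind, name) for the default Oracle service, or None."""
    # LOCAL is the Windows counterpart of TWO_TASK.
    remote_var = "LOCAL" if windows else "TWO_TASK"
    if env.get(remote_var):
        # A remote service (through a listener) wins over ORACLE_SID.
        return ("remote", env[remote_var])
    if env.get("ORACLE_SID"):
        return ("local", env["ORACLE_SID"])
    return None


if __name__ == "__main__":
    print(tnsnames_path(os.environ))
    print(default_service(os.environ, windows=(os.name == "nt")))
```

Passing the environment as a plain mapping keeps the sketch easy to try with different combinations of variables before changing a real project environment.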
Procedure
1. Create the user-defined environment variable ORACLE_HOME and set it to the $ORACLE_HOME path (for example, /disk3/oracle10g).
2. Add ORACLE_HOME/bin to your PATH and ORACLE_HOME/lib to your LIBPATH, LD_LIBRARY_PATH, or SHLIB_PATH.
3. Have login privileges to Oracle using a valid Oracle user name and corresponding password. These must be recognized by Oracle before you attempt to access it.
4. Have SELECT privilege on:
- DBA_EXTENTS
- DBA_DATA_FILES
- DBA_TAB_PARTITIONS
- DBA_TAB_SUBPARTITIONS
- DBA_OBJECTS
- ALL_PART_INDEXES
- ALL_PART_TABLES
- ALL_INDEXES
- SYS.GV_$INSTANCE (only if Oracle Parallel Server is used)

Note: APT_ORCHHOME/bin must appear before ORACLE_HOME/bin in your PATH.

You can create a role that has the appropriate SELECT privileges, as follows:

CREATE ROLE DSXE;
GRANT SELECT on sys.dba_extents to DSXE;
GRANT SELECT on sys.dba_data_files to DSXE;
GRANT SELECT on sys.dba_tab_partitions to DSXE;
GRANT SELECT on sys.dba_tab_subpartitions to DSXE;
GRANT SELECT on sys.dba_objects to DSXE;
GRANT SELECT on sys.all_part_indexes to DSXE;
GRANT SELECT on sys.all_part_tables to DSXE;
GRANT SELECT on sys.all_indexes to DSXE;

Once the role is created, grant it to users who will run the IBM InfoSphere DataStage and QualityStage jobs, as follows:

GRANT DSXE to <oracle userid>;
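The environment requirements in steps 1 and 2 and in the note about PATH ordering can be checked with a small script before a job is run. The following is a minimal sketch that assumes UNIX-style paths and exact string matches on directory names; check_oracle_env is a hypothetical helper, not part of the product.

```python
import os


def check_oracle_env(env, pathsep=os.pathsep):
    """Return a list of problems found in an environment mapping (empty if none)."""
    oracle_home = env.get("ORACLE_HOME")
    if not oracle_home:
        return ["ORACLE_HOME is not set"]

    problems = []
    path_dirs = [d for d in env.get("PATH", "").split(pathsep) if d]

    # Step 2: ORACLE_HOME/bin must be on PATH.
    oracle_bin = oracle_home.rstrip("/") + "/bin"
    if oracle_bin not in path_dirs:
        problems.append("ORACLE_HOME/bin is not in PATH")

    # Step 2: ORACLE_HOME/lib must be in one of the library path variables.
    oracle_lib = oracle_home.rstrip("/") + "/lib"
    lib_vars = ("LIBPATH", "LD_LIBRARY_PATH", "SHLIB_PATH")
    if not any(oracle_lib in env.get(v, "").split(pathsep) for v in lib_vars):
        problems.append("ORACLE_HOME/lib is not in any library path variable")

    # Note: APT_ORCHHOME/bin must come before ORACLE_HOME/bin in PATH.
    orch_home = env.get("APT_ORCHHOME")
    if orch_home and oracle_bin in path_dirs:
        orch_bin = orch_home.rstrip("/") + "/bin"
        if orch_bin not in path_dirs:
            problems.append("APT_ORCHHOME/bin is not in PATH")
        elif path_dirs.index(orch_bin) > path_dirs.index(oracle_bin):
            problems.append("APT_ORCHHOME/bin must appear before ORACLE_HOME/bin in PATH")
    return problems


if __name__ == "__main__":
    for problem in check_oracle_env(os.environ):
        print(problem)
```

The check does not verify database privileges; those still have to be granted as described above.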
- In the InfoSphere DataStage and QualityStage Administrator, open the Environment Variables dialog for the project in question, and set the environment variable DS_ENABLE_RESERVED_CHAR_CONVERT to true (this can be found in the General\Customize branch).
- Avoid using the strings __035__ and __036__ in your Oracle column names. __035__ is the internal representation of # and __036__ is the internal representation of $.

When using this feature in your job, import metadata using the Plug-in Meta Data Import tool and avoid hand-editing; this minimizes the risk of mistakes or confusion.

Once the table definition is loaded, the internal column names are displayed rather than the original Oracle names, both in table definitions and in the Data Browser. They are also used in derivations and expressions. The original names are used in generated SQL statements, however, and you should use them if you enter SQL in the job yourself.

Generally, in the Oracle stage, you enter external names everywhere except when referring to stage column names, where you use names in the form ORCHESTRATE.internal_name.

When using the Oracle stage as a target, enter external names as follows:
- For Load options, use external names for select list properties.
- For Upsert option, for update and insert, use external names when referring to Oracle table column names, and internal names when referring to the stage column names. For example:
INSERT INTO tablename (A#, B$#) VALUES (ORCHESTRATE.A__036__A__035__, ORCHESTRATE.B__035__035__B__036__)

UPDATE tablename SET B$# = ORCHESTRATE.B__035__035__B__036__ WHERE (A# = ORCHESTRATE.A__036__A__035__)
When using the Oracle stage as a source, enter external names as follows:
- For Read using the user-defined SQL method, use external names for Oracle columns for SELECT. For example:

SELECT M#$, D#$ FROM tablename WHERE (M#$ > 5)

- For Read using Table method, use external names in select list and where properties.

When using the Oracle stage in parallel jobs as a look-up, enter external or internal names as follows:
- For Lookups using the user-defined SQL method, use external names for Oracle columns for SELECT, and for Oracle columns in any WHERE clause you might add. Use internal names when referring to the stage column names in the WHERE clause. For example:

SELECT M$##, D#$ FROM tablename WHERE (B$# = ORCHESTRATE.B__035__B__036__)

- For Lookups using the Table method, use external names in select list and where properties.
- Use internal names for the key option on the Inputs page Properties tab of the Lookup stage to which the Oracle stage is attached.
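The relationship between external and internal column names can be illustrated with a short Python sketch. It assumes that each # or $ is replaced in place by its internal token, which is how the text above describes the representation; to_internal and to_external are hypothetical helper names, and the product's own conversion may handle cases that this sketch does not.

```python
# Internal tokens for the reserved characters, as stated in the text.
# They match the decimal character codes of '#' (35) and '$' (36).
RESERVED = {"#": "__035__", "$": "__036__"}


def to_internal(external_name):
    """Replace each reserved character with its internal token."""
    return "".join(RESERVED.get(ch, ch) for ch in external_name)


def to_external(internal_name):
    """Replace each internal token with the reserved character it stands for."""
    for ch, token in RESERVED.items():
        internal_name = internal_name.replace(token, ch)
    return internal_name


if __name__ == "__main__":
    for name in ("A#", "B$#", "M#$"):
        print(name, "->", "ORCHESTRATE." + to_internal(name))
```

The reverse mapping is ambiguous for a column whose external name already contains __035__ or __036__, which is why such names should be avoided.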
Chapter 4. Oracle enterprise stage
Loading tables
There are some special points to note when using the Load method in this stage (which uses the Oracle SQL*Loader utility) to load tables with indexes.

By default, the stage sets the following options in the Oracle load control file:
- DIRECT=TRUE
- PARALLEL=TRUE

This causes the load to run using parallel direct load mode. To use the parallel direct mode load, the table must not have indexes, or you must include one of the Index Mode properties, 'rebuild' or 'maintenance' (see the Index Mode section). If the only index on the table is from a primary key or unique key constraint, you can instead use the Disable Constraints property (see the Disable Constraints section), which disables the primary key or unique key constraint and enables it again after the load.

If you set the Index Mode property to rebuild, the following options are set in the file:
- SKIP_INDEX_MAINTENANCE=YES
- PARALLEL=TRUE

If you set the Index Mode property to maintenance, the following option is set in the file:
- PARALLEL=FALSE

You can use the environment variable APT_ORACLE_LOAD_OPTIONS to control the options that are included in the Oracle load control file. You can load a table with indexes without using the Index Mode or Disable Constraints properties by setting the APT_ORACLE_LOAD_OPTIONS environment variable appropriately. You need to set the DIRECT option or the PARALLEL option or both to FALSE, for example:
APT_ORACLE_LOAD_OPTIONS=OPTIONS(DIRECT=FALSE,PARALLEL=TRUE)
In this example the stage still runs in parallel; however, because DIRECT is set to FALSE, the conventional path mode is used rather than the direct path mode. If APT_ORACLE_LOAD_OPTIONS is used to set PARALLEL to FALSE, you must set the execution mode of the stage to run sequentially on the Advanced tab of the Stage page (see the Advanced tab section). If you are loading index organized tables (IOTs), do not set both DIRECT and PARALLEL to true, because direct parallel path load is not allowed for IOTs.
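The option combinations described in this section can be summarized in a short Python sketch. It only restates the rules given above; control_file_options and load_options_value are hypothetical helper names, and the output format simply follows the APT_ORACLE_LOAD_OPTIONS example shown earlier.

```python
def control_file_options(index_mode=None):
    """Options the text says the stage sets for each Index Mode setting."""
    if index_mode is None:
        return {"DIRECT": "TRUE", "PARALLEL": "TRUE"}
    if index_mode == "rebuild":
        return {"SKIP_INDEX_MAINTENANCE": "YES", "PARALLEL": "TRUE"}
    if index_mode == "maintenance":
        return {"PARALLEL": "FALSE"}
    raise ValueError("index_mode must be None, 'rebuild' or 'maintenance'")


def load_options_value(direct, parallel, index_organized=False):
    """Build a value for APT_ORACLE_LOAD_OPTIONS in the format of the example."""
    if index_organized and direct and parallel:
        # Direct parallel path load is not allowed for index organized tables.
        raise ValueError("DIRECT and PARALLEL cannot both be TRUE for an IOT")
    flag = lambda value: "TRUE" if value else "FALSE"
    return "OPTIONS(DIRECT={},PARALLEL={})".format(flag(direct), flag(parallel))


if __name__ == "__main__":
    # Conventional path load that still runs in parallel.
    print(load_options_value(direct=False, parallel=True))
```

When PARALLEL is set to FALSE this way, the stage execution mode still has to be set to sequential by hand; the sketch does not model that step.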
Table 8. Data type conversion for writing data to an Oracle database

InfoSphere DataStage SQL Data Type | Underlying Data Type | Oracle Data Type
Date | date | DATE
Time | time | DATE (does not support microsecond resolution)
Timestamp | timestamp | DATE (does not support microsecond resolution)
Timestamp with Extended=Microseconds | timestamp[microseconds] | TIMESTAMP (6)
Decimal; Numeric | decimal (p, s) | NUMBER (p, s)
TinyInt | int8 | NUMBER (3, 0)
TinyInt with Extended=Unsigned | uint8 | NUMBER (3, 0)
SmallInt | int16 | NUMBER (5, 0)
SmallInt with Extended=Unsigned | uint16 | NUMBER (5, 0)
Integer | int32 | NUMBER (10, 0)
Integer with Extended=Unsigned | uint32 | NUMBER (10, 0)
BigInt | int64 | NUMBER (19)
BigInt with Extended=Unsigned | uint64 | NUMBER (20)
Float; Real | sfloat | BINARY_FLOAT
Double | dfloat | BINARY_DOUBLE
Binary, VarBinary, or LongVarBinary with Length undefined | raw; raw[] | RAW (2000)
Binary, VarBinary, or LongVarBinary with Length=n | raw[n]; raw[max=n] |
Char with Extended undefined and Length undefined | string | CHAR (32)
NChar with Length undefined; Char with Extended=Unicode and Length undefined | ustring | NVARCHAR (32)
Char with Extended undefined and Length=n | string[n] |
NChar with Length=n; Char with Extended=Unicode and Length=n | ustring[n] |
Bit | uint16 | NUMBER (5)
Unknown | fixed-length string in the form string[n] and ustring[n]; length <= 255 bytes | NVARCHAR(32)
LongVarChar with Extended undefined and Length undefined; VarChar with Extended undefined and Length undefined | string[] | VARCHAR2 (32)
NVarChar with Length undefined; LongNVarChar with Length undefined; LongVarChar with Extended=Unicode and Length undefined; VarChar with Extended=Unicode and Length undefined | ustring[] | NVARCHAR2 (32)
LongVarChar with Extended undefined and Length=n; VarChar with Extended undefined and Length=n | string[max=n] | VARCHAR2 (n)
NVarChar with Length=n; LongNVarChar with Length=n; LongVarChar with Extended=Unicode and Length=n; VarChar with Extended=Unicode and Length=n | ustring[max=n] | NVARCHAR2 (n)
The default length of VARCHAR is 32 bytes. That is, 32 bytes are allocated for each variable-length string field in the input data set. If an input variable-length string field is longer than 32 bytes, the stage issues a warning.
Table 9. Data type conversion for reading data from an Oracle database InfoSphere DataStage SQL Data Type Unknown Char LongVarChar VarChar NChar NVarChar LongNVarChar Unknown Char LongVarChar VarChar NChar NVarChar LongNVarChar Timestamp Decimal Numeric Integer Decimal Numeric Underlying Data Type string[n] or ustring[n] Fixed length string with length = n Oracle Data Type CHAR(n)
VARCHAR(n)
Timestamp decimal (38,10) int32 if precision (p) <11 and scale (s) = 0 decimal[p, s] if precision (p) =>11 and scale (s) > 0 not supported
not supported
v LONG v CLOB v NCLOB v BLOB v INTERVAL YEAR TO MONTH v INTERVAL MONTH TO DAY v BFILE
The job is illustrated in the following figure. The stage editor that you use to edit this stage is based on the generic stage editor. The Data_set stage provides the primary input, the Oracle_8 stage provides the lookup data, and Lookup_1 performs the lookup and outputs the resulting data to Data_Set_3. In the Oracle stage, specify that you are going to look up the data directly in the Oracle database, and specify the name of the table that you are going to look up. In the Lookup stage, you specify the column that you are using as the key for the lookup.
The properties for the Oracle stage are given in the following table:
Table 13. Properties for Oracle stage

Property name | Setting
Lookup Type | Sparse
Read Method | Table
Table | interest
Specify upsert as the write method and select User-defined Update & Insert as the upsert mode. The existing name column is not included in the INSERT statement. The properties (showing the INSERT statement) are shown below. The INSERT statement is as generated by the IBM InfoSphere DataStage, except the name column is removed.
INSERT INTO horse_health (wormer_type, dose_interval, dose_level) VALUES (ORCHESTRATE.wormer_type, ORCHESTRATE.dose_interval, ORCHESTRATE.dose_level)
Must Do's
The IBM InfoSphere DataStage has many defaults, which means that it can be very easy to include Oracle enterprise stages in a job. This section specifies the minimum steps to take to get an Oracle enterprise stage functioning. The InfoSphere DataStage provides a versatile user interface, and there are many shortcuts to achieving a particular end. This section describes the basic method; you will learn where the shortcuts are when you get familiar with the product. The steps required depend on what you are using an Oracle enterprise stage for.
Procedure
1. In the Input link Properties tab, under the Target category:
a. Specify a Write Method of Load.
b. Specify the Table you are writing.
c. Specify the Write Mode. By default, the IBM InfoSphere DataStage appends to existing tables; you can also decide to create a new table, replace an existing table, or keep existing table details but replace all the rows.
Under the Connection category, you can either manually specify a connection string, or have the InfoSphere DataStage generate one for you by using a user name and password you supply. Either way you need to supply a valid user name and password. The InfoSphere DataStage encrypts the password when you use the auto-generate option. By default, the InfoSphere DataStage assumes Oracle resides on the local server, but you can specify a remote server if required.
2. Ensure column metadata has been specified for the write.
using a user name and password you supply. Either way you need to supply a valid user name and password. The InfoSphere DataStage encrypts the password when you use the auto-generate option. By default, the InfoSphere DataStage assumes Oracle resides on the local server, but you can specify a remote server if required. 3. Ensure column meta data has been specified for the lookup.
Stage page
The General tab allows you to specify an optional description of the stage. The Advanced tab allows you to specify how the stage executes. The NLS Map tab appears if you have NLS enabled on your system; it allows you to specify a character set map for the stage.
Advanced tab
This tab allows you to specify the following values:
- Execution Mode. The stage can run in parallel mode or sequential mode. In parallel mode the data is processed by the available nodes as specified in the Configuration file, and by any node constraints specified on the Advanced tab. In Sequential mode the data is processed by the conductor node.
- Combinability mode. This is Auto by default, which allows the IBM InfoSphere DataStage to combine the operators that underlie parallel stages. Then they run in the same process if it is sensible for this type of stage.
- Preserve partitioning. You can select Set or Clear. If you select Set, read operations will request that the next stage preserves the partitioning as is (it is ignored for write operations). Note that this field is only visible if the stage has output links.
- Node pool and resource constraints. Select this option to constrain parallel execution to the node pool or pools or the resource pool or pools specified in the grid. The grid allows you to make choices from drop down lists populated from the Configuration file.
- Node map constraint. Select this option to constrain parallel execution to the nodes in a defined node map. You can define a node map by typing node numbers into the text box or by clicking the browse button to open the Available Nodes dialog box and selecting nodes from there. You are effectively defining a new node pool for this stage (in addition to any node pools defined in the Configuration file).
NLS Map
The NLS Map tab allows you to define a character set map for the Oracle enterprise stage. You can set character set maps separately for NCHAR and NVARCHAR2 types and all other data types. This overrides the default character set map set for the project or the job. You can specify that the map be supplied as a job parameter if required. Load performance might be improved by specifying an Oracle map instead of an IBM InfoSphere DataStage map. To do this, add an entry to the file oracle_cs, located at $APT_ORCHHOME/etc, to associate the InfoSphere DataStage map with an Oracle map. The oracle_cs file has the following format:
UTF-8        UTF8
ISO-8859-1   WE8ISO8859P1
EUC-JP       JA16EUC
The first column contains the InfoSphere DataStage map names and the second column the Oracle map names they are associated with. By using the example file shown above, specifying the InfoSphere DataStage map EUC-JP in the Oracle stage will cause the data to be loaded using the Oracle map JA16EUC.
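The two-column layout of the oracle_cs file can be read into a lookup table with a few lines of Python. This is a minimal sketch that assumes whitespace-separated columns and no comment syntax, neither of which is stated in this section; parse_oracle_cs is a hypothetical helper name.

```python
def parse_oracle_cs(text):
    """Map InfoSphere DataStage map names to Oracle map names."""
    mapping = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            # Skip blank or incomplete lines.
            continue
        datastage_map, oracle_map = fields[0], fields[1]
        mapping[datastage_map] = oracle_map
    return mapping


if __name__ == "__main__":
    sample = """UTF-8        UTF8
ISO-8859-1   WE8ISO8859P1
EUC-JP       JA16EUC
"""
    print(parse_oracle_cs(sample)["EUC-JP"])
```

With the example file, looking up EUC-JP yields JA16EUC, which matches the behavior described above.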
Inputs page
The Inputs page allows you to specify details about how the Oracle enterprise stage writes data to an Oracle database. The Oracle enterprise stage can have only one input link writing to one table. The General tab allows you to specify an optional description of the input link. The Properties tab allows you to specify details of exactly what the link does. The Partitioning tab allows you to specify how incoming data is partitioned before being written to the database. The Columns tab specifies the column definitions of incoming data. The Advanced tab allows you to change the default buffering settings for the input link. Details about Oracle enterprise stage properties, partitioning, and formatting are given in the following sections. See the IBM InfoSphere DataStage and QualityStage Parallel Job Developer's Guide for a general description of the other tabs.
Table 15. Input link properties and values (continued) Category/ Property Target/Delete Rows Mode Target/Delete SQL Target/Upsert mode Values Auto-generated delete/userdefined delete String Default Auto-generated delete N/A Required? Dependent of
Y if Write N/A method = Delete Rows Y if Write N/A method = Delete Rows Y (if Write Method = Upsert) N/A
Auto-generated Update & insert/Autogenerated Update Only/Userdefined Update & Insert/Userdefined Update Only Insert then update/Update then insert string number string
Target/Upsert Order Target/Insert SQL Target/Insert Array Size Target/Update SQL Target/Write Method Target/Write Mode
N/A
Delete Rows/Upsert/ Load Append/ Create/ Replace/ Truncate string Auto-generate/ User-defined string
Load
N/A
Append
N/A
Connection/DB Options Connection/DB Options Mode Connection/ User Connection/ Password Connection/ Additional Connection Options Connection/ Remote Server
Y Y
N/A N/A
Y (if DB Options DB Options Mode Mode = Auto-generate) Y (if DB Options DB Options Mode Mode = Auto-generate) N DB Options Mode
string
N/A
string
N/A
string
N/A
N/A
Table 15. Input link properties and values (continued) Category/ Property Options/Output Reject Records Values True/False Default False Required? Y (if Write Method = Upsert) Y (if Write Method = Load) Y (if Write Method = Load and Write Mode = Create or Replace) Y (if Write Method = Load) N N N N Dependent of N/A
False
N/A
Heap
N/A
True/False
False
N/A
string
Options/Default number String Length Options/Index Mode Options/Add NOLOGGING clause to Index rebuild Options/Add COMPUTE STATISTICS clause to Index rebuild Options/Open Command Options/Oracle Partition Options/Create Primary Keys Options/Create Statement Maintenance/ Rebuild True/False
True/False
False
Index Mode
N N
N/A N/A
Y (if Write Mode N/A = Create or Replace) N Y (if Write Method = Load) N N N/A N/A Disable Constraints N/A
string
Options/Disable True/False Constraints Options/ string Exceptions Table Options/Table has NCHAR/ NVARCHAR True/False
Target category
These are the properties available in the Target category.
Table
Specify the name of the table to write to. You can specify a job parameter if required.
Delete SQL
Only appears for the Delete Rows write method. This property allows you to view an auto-generated Delete statement, or to specify your own (depending on the setting of the Delete Rows Mode property).
Upsert mode
This only appears for the Upsert write method. Allows you to specify how the insert and update statements are to be derived. Select from:
- Auto-generated Update & Insert. The InfoSphere DataStage generates update and insert statements for you, based on the values you have supplied for table name and on column details. The statements can be viewed by selecting the Insert SQL or Update SQL properties.
- Auto-generated Update Only. The InfoSphere DataStage generates an update statement for you, based on the values you have supplied for table name and on column details. The statement can be viewed by selecting the Update SQL properties.
- User-defined Update & Insert. Select this to enter your own update and insert statements. Then select the Insert SQL and Update SQL properties and edit the statement proformas.
- User-defined Update Only. Select this to enter your own update statement. Then select the Update SQL property and edit the statement proforma.
Upsert Order
This only appears for the Upsert write method. Allows you to decide between the following values:
- Insert and, if that fails, update (Insert then update)
- Update and, if that fails, insert (Update then insert)
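The difference between the two Upsert Order values can be illustrated outside the product. The sketch below uses an in-memory SQLite database from the Python standard library as a stand-in for Oracle. It assumes that a failed insert means a key violation and that a failed update means no row matched; the table t and both function names are hypothetical.

```python
import sqlite3


def insert_then_update(conn, key, value):
    """Try the insert first; fall back to an update on a key violation."""
    try:
        conn.execute("INSERT INTO t (id, val) VALUES (?, ?)", (key, value))
        return "inserted"
    except sqlite3.IntegrityError:
        conn.execute("UPDATE t SET val = ? WHERE id = ?", (value, key))
        return "updated"


def update_then_insert(conn, key, value):
    """Try the update first; insert if no existing row was changed."""
    cur = conn.execute("UPDATE t SET val = ? WHERE id = ?", (value, key))
    if cur.rowcount == 0:
        conn.execute("INSERT INTO t (id, val) VALUES (?, ?)", (key, value))
        return "inserted"
    return "updated"


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, val TEXT)")
    results = [
        insert_then_update(conn, 1, "a"),
        insert_then_update(conn, 1, "b"),
        update_then_insert(conn, 2, "c"),
        update_then_insert(conn, 2, "d"),
    ]
    rows = conn.execute("SELECT id, val FROM t ORDER BY id").fetchall()
    conn.close()
    return results, rows


if __name__ == "__main__":
    print(demo())
```

Both orders leave the table in the same final state; they differ in which statement runs first, so the better choice generally depends on whether most incoming rows are new or already exist.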
Insert SQL
Only appears for the Upsert write method. This property allows you to view an auto-generated Insert statement, or to specify your own (depending on the setting of the Upsert Mode property). It has a dependent property:
- Insert Array Size. Specify the size of the insert host array. The default size is 500 records. If you want each insert statement to be executed individually, specify 1 for this property.
Update SQL
Only appears for the Upsert write method. This property allows you to view an auto-generated Update statement, or to specify your own (depending on the setting of the Upsert Mode property).
Write Method
Select from Delete Rows, Upsert or Load (the default value). Upsert allows you to provide the insert and update SQL statements and uses Oracle host-array processing to optimize the performance of inserting records. Load sets up a connection to Oracle and inserts records into a table, taking a single input data set. The Write Mode property determines how the records of a data set are inserted into the table.
Write Mode
This only appears for the Load Write Method. Select from the following values:
- Append. This is the default value. New records are appended to an existing table.
- Create. Create a new table. If the Oracle table already exists, an error occurs and the job terminates. You must specify this mode if the Oracle table does not exist.
- Replace. The existing table is first dropped and an entirely new table is created in its place. Oracle uses the default partitioning method for the new table.
- Truncate. The existing table attributes (including schema) and the Oracle partitioning keys are retained, but any existing records are discarded. New records are then appended to the table.
Connection category
These are the properties available in the Connection category.
DB Options
Specify a user name and password for connecting to Oracle in the form:
user=<user>,password=<password>[,arraysize=<num_records>]
The IBM InfoSphere DataStage does not encrypt the password when you use this option. Arraysize is only relevant to the Upsert Write Method.
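A DB Options value in the form shown above can be assembled programmatically, for example when the value is passed in as a job parameter. The following is a minimal sketch; db_options is a hypothetical helper, and any enclosing delimiters or quoting that the stage might expect around the string are not confirmed by this section.

```python
def db_options(user, password, arraysize=None):
    """Build a DB Options string: user=...,password=...[,arraysize=...]."""
    parts = ["user={}".format(user), "password={}".format(password)]
    if arraysize is not None:
        if arraysize < 1:
            raise ValueError("arraysize must be a positive number of records")
        # arraysize is only relevant to the Upsert Write Method.
        parts.append("arraysize={}".format(arraysize))
    return ",".join(parts)


if __name__ == "__main__":
    # Placeholder credentials only; the password is not encrypted in this form.
    print(db_options("example_user", "example_password", arraysize=500))
```

Because the password appears in clear text in a user-defined DB Options string, the Auto-generate mode described in the next section may be preferable where encryption matters.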
DB Options Mode
If you select Auto-generate for this property, the InfoSphere DataStage will create a DB Options string for you. If you select User-defined, you have to edit the DB Options property yourself. When Auto-generate is selected, there are three dependent properties:
- User. The user name to use in the auto-generated DB options string.
- Password. The password to use in the auto-generated DB options string. The InfoSphere DataStage encrypts the password. Note: If you have a password with special characters, enclose the password in quotation marks. For example: "passw#rd".
- Additional Connection Options. Optionally allows you to specify additional options to add to the Oracle connection string.
Remote Server
This is an optional property. Allows you to specify a remote server name.
Options category
These are the properties available in the Options category.
Create Statement
This is an optional property available with a Write Method of Load and a Write Mode of Create. Contains an SQL statement to create the table (otherwise the IBM InfoSphere DataStage will auto-generate one).
Disable Constraints
This is False by default. Set True to disable all enabled constraints on a table when loading, then attempt to re-enable them at the end of the load. This option is not available when you select a Table Organization type of Index to use index organized tables. When set True, it has a dependent property:
- Exceptions Table. This property enables you to specify an exceptions table, which is used to record ROWID information for rows that violate constraints when the constraints are re-enabled. The table must already exist.
Table Organization
This appears only for the Load Write Method using the Create or Replace Write Mode. Allows you to specify Index (for index organized tables) or heap organized tables (the default value). When you select Index, you must also set Create Primary Keys to true. In index organized tables (IOTs) the rows of the table are held in the index created from the primary keys.
Close Command
This is an optional property and only appears for the Load Write Method. Use it to specify any command, in single quotes, to be parsed and executed by the Oracle database on all processing nodes after the stage finishes processing the Oracle table. You can specify a job parameter if required.
Index Mode
This is an optional property and only appears for the Load Write Method. Lets you perform a direct parallel load on an indexed table without first dropping the index. You can select either Maintenance or Rebuild mode. The Index Mode property applies only to the Append and Truncate Write Modes.

Rebuild skips index updates during table load and instead rebuilds the indexes after the load is complete using the Oracle alter index rebuild command. The table must contain an index, and the indexes on the table must not be partitioned. The Rebuild option has two dependent properties:
- Add NOLOGGING clause to Index rebuild. This is False by default. Set True to add a NOLOGGING clause.
- Add COMPUTE STATISTICS clause to Index rebuild. This is False by default. Set True to add a COMPUTE STATISTICS clause.

Maintenance results in each table partition being loaded sequentially. Because of the sequential load, the table index that exists before the table is loaded is maintained after the table is loaded. The table must contain an index and be partitioned, and the index on the table must be a local range-partitioned index that is partitioned according to the same range values that were used to partition the table. Note that in this case sequential means sequential per partition, that is, the degree of parallelism is equal to the number of partitions.
Open Command
This is an optional property and only appears for the Load Write Method. Use it to specify a command, in single quotes, to be parsed and executed by the Oracle database on all processing nodes before the Oracle table is opened. You can specify a job parameter if required.
Oracle Partition
This is an optional property and only appears for the Load Write Method. Name of the Oracle table partition that records will be written to. The stage assumes that the data provided is for the partition specified.
Partitioning tab
The Partitioning tab allows you to specify details about how the incoming data is partitioned or collected before it is written to the Oracle database. It also allows you to specify that the data should be sorted before being written. By default the stage partitions in Auto mode. This attempts to work out the best partitioning method depending on execution modes of current and preceding stages and how many nodes are specified in the Configuration file. If the Oracle enterprise stage is operating in sequential mode, it will first collect the data before writing it to the file by using the default Auto collection method. The Partitioning tab allows you to override this default behavior. The exact operation of this tab depends on: v Whether the Oracle enterprise stage is set to run in parallel or sequential mode. v Whether the preceding stage in the job is set to run in parallel or sequential mode. If the Oracle enterprise stage is set to run in parallel, then you can set a partitioning method by selecting from the Partition type drop-down list. This will override any current partitioning.
If the Oracle enterprise stage is set to run in sequential mode, but the preceding stage is executing in parallel, then you can set a collection method from the Collector type drop-down list.

The following partitioning methods are available:
- (Auto). The IBM InfoSphere DataStage attempts to work out the best partitioning method depending on execution modes of current and preceding stages and how many nodes are specified in the Configuration file. This is the default partitioning method for the Oracle enterprise stage.
- Entire. Each file written to receives the entire data set.
- Hash. The records are hashed into partitions based on the value of a key column or columns selected from the Available list.
- Modulus. The records are partitioned using a modulus function on the key column selected from the Available list. This is commonly used to partition on tag fields.
- Random. The records are partitioned randomly, based on the output of a random number generator.
- Round Robin. The records are partitioned on a round robin basis as they enter the stage.
- Same. Preserves the partitioning already in place. This is the default value for Oracle enterprise stages.
- DB2. Replicates the partitioning method of the specified IBM DB2 table. Requires extra properties to be set. Access these properties by clicking the properties button.
- Range. Divides a data set into approximately equal size partitions based on one or more partitioning keys. Range partitioning is often a preprocessing step to performing a total sort on a data set. Requires extra properties to be set. Access these properties by clicking the properties button.

The following Collection methods are available:
- (Auto). This is the default collection method for Oracle enterprise stages. Normally, when you are using the Auto mode, the InfoSphere DataStage will eagerly read any row from any input partition as it becomes available.
- Ordered. Reads all records from the first partition, then all records from the second partition, and so on.
- Round Robin. Reads a record from the first input partition, then from the second partition, and so on. After reaching the last partition, the operator starts over.
- Sort Merge. Reads records in an order based on one or more columns of the record. This requires you to select a collecting key column from the Available list.

The Partitioning tab also allows you to specify that data arriving on the input link should be sorted before being written to the file or files. The sort is always carried out within data partitions. If the stage is partitioning incoming data, the sort occurs after the partitioning. If the stage is collecting data, the sort occurs before the collection. The availability of sorting depends on the partitioning or collecting method chosen (it is not available with the default Auto methods).

Select the check boxes as follows:
- Perform Sort. Select this to specify that data coming in on the link should be sorted. Select the column or columns to sort on from the Available list.
- Stable. Select this if you want to preserve previously sorted data sets. This is the default value.
- Unique. Select this to specify that, if multiple records have identical sorting key values, only one record is retained. If stable sort is also set, the first record is retained.

If NLS is enabled, an additional button opens a dialog box allowing you to select a locale specifying the collate convention for the sort.

You can also specify sort direction, case sensitivity, whether sorted as ASCII or EBCDIC, and whether null columns will appear first or last for each column. Where you are using a keyed partitioning method, you can also specify whether the column is used as a key for sorting, for partitioning, or for both. Select the column in the Selected list and right-click to invoke the pop-up menu.
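Several of the partitioning and collection methods listed above are simple enough to sketch in Python. The functions below illustrate the general techniques only; they are not the product's implementation, all names are hypothetical, and the CRC-32 hash is an arbitrary stand-in because this section does not say which hash function the stage uses.

```python
import zlib


def round_robin_partition(records, n):
    """Deal records out to n partitions in turn as they arrive."""
    parts = [[] for _ in range(n)]
    for i, rec in enumerate(records):
        parts[i % n].append(rec)
    return parts


def modulus_partition(records, n, key):
    """Partition on an integer key column using key value modulo n."""
    parts = [[] for _ in range(n)]
    for rec in records:
        parts[rec[key] % n].append(rec)
    return parts


def hash_partition(records, n, keys):
    """Partition on a hash of one or more key columns (CRC-32 as a stand-in)."""
    parts = [[] for _ in range(n)]
    for rec in records:
        key_bytes = repr(tuple(rec[k] for k in keys)).encode("utf-8")
        parts[zlib.crc32(key_bytes) % n].append(rec)
    return parts


def ordered_collect(parts):
    """Read all records from the first partition, then the second, and so on."""
    return [rec for part in parts for rec in part]


def round_robin_collect(parts):
    """Read one record from each partition in turn until all are exhausted."""
    out = []
    iterators = [iter(part) for part in parts]
    while iterators:
        still_open = []
        for it in iterators:
            try:
                out.append(next(it))
                still_open.append(it)
            except StopIteration:
                pass
        iterators = still_open
    return out


if __name__ == "__main__":
    parts = round_robin_partition(list(range(5)), 2)
    print(parts, ordered_collect(parts), round_robin_collect(parts))
```

Keyed methods such as hash and modulus send records with equal key values to the same partition, which is the property that makes per-partition sorting and lookups on the key meaningful.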
Outputs page
The Outputs page allows you to specify details about how the Oracle enterprise stage reads data from an Oracle database. The Oracle enterprise stage can have only one output link. Alternatively it can have a reference output link, which is used by the Lookup stage when referring to an Oracle lookup table. It can also have a reject link where rejected records are routed (used in conjunction with an input link). The Output Name drop-down list allows you to choose whether you are looking at details of the main output link or the reject link. The General tab allows you to specify an optional description of the output link. The Properties tab allows you to specify details of exactly what the link does. The Columns tab specifies the column definitions of the data. The Advanced tab allows you to change the default buffering settings for the output link. Details about Oracle enterprise stage properties are given in the following sections. See the IBM InfoSphere DataStage and QualityStage Parallel Job Developer's Guide for a general description of the other tabs.
Table 16. Output link properties and values Category/ Property Source/Lookup Type Values Normal/ Sparse Default Normal Required? Y (if output is reference link connected to Lookup stage) Y Dependent of N/A
Source/Read Method
Auto-generated SQL /Table/SQL builder generated SQL /User-defined SQL string string string string
N/A
N N N N N Y Y
Source/Partition string Table Connection/DB Options Connection/ DB Options Mode Connection/ User Connection/ Password Connection/ Additional Connection Options Connection/ Remote Server Options/Close Command Options/Open Command Options/Table has NCHAR/ NVARCHAR string Auto-generate/ User-defined string
N/A
Y (if DB Options DB Options Mode Mode = Auto-generate) Y (if DB Options DB Options Mode Mode = Auto-generate) N DB Options Mode
string
N/A
string
N/A
N N N N
Source category
These are the properties available in the Source category.
Lookup Type
Where the Oracle enterprise stage is connected to a Lookup stage using a reference link, this property specifies whether the Oracle enterprise stage will provide data for an in-memory look up (Lookup Type = Normal) or whether the lookup will access the database directly (Lookup Type = Sparse).
Read Method
This property specifies whether you are specifying a table or a query when reading the Oracle database, and how you are generating the query.
- Select the Table method in order to use the Table property to specify the read. This will read in parallel.
- Select Auto-generated SQL to have IBM InfoSphere DataStage automatically generate an SQL query based on the columns you have defined and the table you specify in the Table property.
- Select User-defined SQL to define your own query.
- Select SQL Builder Generated SQL to open the SQL Builder and define the query using its helpful interface. (See the IBM InfoSphere DataStage and QualityStage Designer Client Guide.)

By default, Read methods of SQL Builder Generated SQL, Auto-generated SQL, and User-defined SQL operate sequentially on a single node. You can have the User-defined SQL read operate in parallel if you specify the Partition Table property.
SQL Query
Optionally allows you to specify an SQL query to read a table. The query specifies the table and the processing that you want to perform on the table as it is read by the stage. This statement can contain joins, views, database links, synonyms, and other entities.
Table
Specifies the name of the Oracle table. The table must exist and you must have SELECT privileges on the table. If your Oracle user name does not correspond to the owner of the specified table, you can prefix it with a table owner in the form:
table_owner.table_name
Table has dependent properties:
- Where. Stream links only. Specifies a WHERE clause of the SELECT statement to specify the rows of the table to include or exclude from the read operation. If you do not supply a WHERE clause, all rows are read.
- Select List. Optionally specifies an SQL select list, enclosed in single quotes, that can be used to determine which columns are read. You must specify the columns in the list in the same order as the columns are defined in the record schema of the input table.
Partition Table
Specifies execution of the SELECT in parallel on the processing nodes containing a partition derived from the named table. If you do not specify this, the stage executes the query sequentially on a single node.
Connection category
These are the properties available in the Connection category.
DB Options
Specify a user name and password for connecting to Oracle in the form:
<user=<user>,password=<password>[,arraysize=<num_records>]>
The IBM InfoSphere DataStage does not encrypt the password when you use this option. Arraysize only applies to stream links. The default arraysize is 1000.
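As a rough illustration of the form described above, the following Python sketch assembles the comma-separated body of a DB Options string. The function name build_db_options and the sample credentials are hypothetical, and the sketch assumes that the angle brackets in the form above are placeholder notation rather than literal characters. Because the password in this property is not encrypted, a string built this way carries the password in plain text.

```python
def build_db_options(user, password, arraysize=None):
    """Return the body of a DB Options string: user=...,password=...[,arraysize=...].

    Hypothetical helper for illustration only; it is not part of the product.
    """
    if not user or not password:
        raise ValueError("user and password are both required")
    parts = [f"user={user}", f"password={password}"]
    if arraysize is not None:
        # arraysize is optional; when given it must be a positive row count
        if int(arraysize) <= 0:
            raise ValueError("arraysize must be a positive integer")
        parts.append(f"arraysize={int(arraysize)}")
    return ",".join(parts)


# Sample credentials are placeholders, not defaults of any product.
print(build_db_options("scott", "tiger"))
print(build_db_options("scott", "tiger", arraysize=500))
```

When arraysize is omitted, the sketch leaves it out of the string so that the documented default of 1000 would apply.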
DB Options Mode
If you select Auto-generate for this property, InfoSphere DataStage creates a DB Options string for you. If you select User-defined, you have to edit the DB Options property yourself. When Auto-generate is selected, the following dependent properties are available:
- User. The user name to use in the auto-generated DB options string.
- Password. The password to use in the auto-generated DB options string. InfoSphere DataStage encrypts the password.
  Note: If you have a password with special characters, enclose the password in quotation marks. For example: "passw#rd".
- Additional Connection Options. Optionally allows you to specify additional options to add to the Oracle connection string.
Remote Server
This is an optional property. Allows you to specify a remote server name.
Options category
These are the properties available in the Options category.
Close Command
This is an optional property and only appears for stream links. Use it to specify any command to be parsed and executed by the Oracle database on all processing nodes after the stage finishes processing the Oracle table. You can specify a job parameter if required.
Open Command
This is an optional property and only appears for stream links. Use it to specify any command to be parsed and executed by the Oracle database on all processing nodes before the Oracle table is opened. You can specify a job parameter if required.
For the SHLIB_PATH environment variable, the InfoSphere DataStage library entries must be referenced before any branded-ODBC library entries at run time. Note: You should have read and execute permissions to use libraries in the $ORACLE_HOME/lib and $ORACLE_HOME/bin directories and read permissions on all files in the $ORACLE_HOME directory. Otherwise, you might experience problems using Oracle OCI stage to connect to Oracle.
The NLS tab defines a character set map to be used with the stage. (The NLS tab appears only if you have installed NLS.) For details, see "Defining Character Set Mapping".
- Input. This page is displayed only if you have an input link to this stage. It specifies the SQL table to use and the associated column definitions for each data input link. This page also specifies the type of update action and transaction isolation level information for concurrency control and performance tuning. It also contains the SQL statement used to write the data and lets you enable case sensitivity for SQL statements.
- Output. This page is displayed only if you have an output link to this stage. It specifies the SQL tables to use and the associated column definitions for each data output link. This page also specifies the type of query and transaction isolation level information for concurrency control and performance tuning. It also contains the SQL SELECT statement used to extract the data, and lets you enable case sensitivity for SQL statements.
Procedure
1. Define the connection (see the following section).
2. Optional. Define a character set map.
3. Define the data on the input links.
4. Define the data on the output links.
Procedure
1. Enter the name of the Oracle database alias to access in the Database source name field. (This is the name you created using the Oracle Configuration Assistant.) Unless the database has a guest account, User ID must be a valid user in the database, have an alias in the database, or be a system administrator or system security officer. There is no default value.
2. Enter the user name to use to connect to the Oracle database in the User ID field. This user must have sufficient privileges to access the specified database and source and target tables. This field is required except when Use OS level authentication is selected. There is no default value.
3. Enter the password that is associated with the specified user name to use in the Password field. This field is required except when Use OS level authentication is selected. There is no default value.
4. Select an appropriate transaction isolation level to use from the Transaction Isolation list on the General tab on the Input page or Output page (see the General tab in "Defining Input Data" or "Defining Output Data"). This level provides the necessary consistency and concurrency control between transactions in the job and other transactions for optimal performance. Because Oracle does not prevent other transactions from modifying the data read by a query, that data might be changed by other transactions between two executions of the query. Thus, a transaction that executes a given query twice might experience both nonrepeatable reads and phantoms. Use one of the following transaction isolation levels:
   - Read Committed. Takes exclusive locks on modified data and sharable locks on all other data. Read committed is the default ISO level for all transactions.
   - Serializable. Takes exclusive locks on modified data and sharable locks on all other data. Serializable transactions see only those changes that were committed at the time the transaction began.
   For more information about using these levels, see your Oracle documentation.
5. Enter an optional description of the Oracle OCI stage in the Description field.
6. Select Use OS level authentication to automatically log on using your operating system user name and password. The default value is cleared. For further details on Oracle login information, see your Oracle documentation.
Procedure
Specify information using the following fields:
- Map name to use with stage. Defines the default character set map for the project or the job. You can change the map by selecting a map name from the list.
- Show all maps. Lists all the maps that are shipped with the IBM InfoSphere DataStage.
- Loaded maps only. Lists only the maps that are currently loaded.
- Use Job Parameter.... Specifies parameter values for the job. Use the format #Param#, where Param is the name of the job parameter. The string #Param# is replaced by the job parameter when the job is run.
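The #Param# replacement described for Use Job Parameter can be pictured with a short Python sketch. This is only an illustration of the substitution idea, not the product's own parser: the function name expand_job_parameters, the parameter name MapName, and the assumption that parameter names consist of letters, digits, and underscores are all hypothetical.

```python
import re


def expand_job_parameters(text, params):
    """Replace each #Name# token in text with the value of job parameter Name.

    Illustrative only: assumes names match \\w+ and fails on undefined names.
    """
    def replace(match):
        name = match.group(1)
        if name not in params:
            raise KeyError(f"job parameter not defined: {name}")
        return str(params[name])

    return re.sub(r"#(\w+)#", replace, text)


# MapName is a made-up job parameter used only for this example.
print(expand_job_parameters("#MapName#", {"MapName": "ISO8859-1"}))
```

Raising an error for an undefined parameter is a design choice of the sketch; the behavior of the product in that case is not described here.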
Procedure
1. Choose the name of the input link you want to edit from the Input name list. This list displays all the input links to the Oracle OCI stage.
2. Click Columns... to display a brief list of the columns designated on the input link. As you enter detailed metadata in the Columns tab, you can leave this list displayed.
3. Click View Data... to invoke the Data Browser. This lets you look at the data associated with the input link in the database.
Options tab
Use the Options tab to create or drop tables and to specify miscellaneous Oracle link options.
- Table name. Names the target Oracle table to which the data is written. The table must exist or be created by choosing Generate DDL from the Create table action list. Depending on the operations performed, you must be granted the appropriate permissions or privileges on the table. There is no default value. Click ... (Browse button) to browse the Repository to select the table.
- Create table action. Creates the target table in the specified database if Generate DDL is selected. It uses the column definitions in the Columns tab and the table name and the TABLESPACE and STORAGE properties for the target table. The generated Create Table statement includes the TABLESPACE and STORAGE keywords, which indicate the location where the table is created and the storage expression for the Oracle storage-clause. You must have CREATE TABLE privileges on your schema. You can also specify your own CREATE TABLE SQL statement. You must enter the storage clause in Oracle format. (Use the User-defined DDL tab on the SQL tab for a complex statement.) Select one of the following options to create the table:
  - Do not create target table. Specifies that the target table is not created, and the Drop table action field and the Create Table Properties button on the right of the dialog are disabled.
  - Generate DDL. Specifies that the stage generates the CREATE TABLE statement using information from Table name, the column definitions grid, and the values in the Create Table Properties dialog.
  - User-defined DDL. Specifies that you enter the appropriate CREATE TABLE statement.
  Click the button to open the Create Table Properties dialog to display the table space and storage expression values for generating the DDL.
- Drop table action. Drops the target table before it is created by the stage if Generate DDL is selected. This field is disabled if you decide not to create the target table. The list displays the same items as the Create table action list except that they apply to the DROP TABLE statement. You must have DROP TABLE privileges on your schema.
- Array size. Specifies the number of rows to be transferred in one call between IBM InfoSphere DataStage and Oracle before they are written. Enter a positive integer to indicate how often Oracle performs writes to the database. The default value is 1, that is, each row is written in a separate statement. Larger numbers use more memory on the client to cache the rows. This minimizes server round trips and maximizes performance by executing fewer statements. If this number is too large, the client might run out of memory. Array size has implications for the InfoSphere DataStage's handling of reject rows.
- Transaction size. This field exists for backward compatibility, but it is ignored for version 3.0 and later of the stage. The transaction size for new jobs is now handled by Rows per transaction on the Transaction Handling tab.
- Transaction Isolation. Provides the necessary concurrency control between transactions in the job and other transactions. Use one of the following transaction isolation levels:
  - Read committed. Takes exclusive locks on modified data and sharable locks on all other data. Each query executed by a transaction sees only data that was committed before the query (not the transaction) began. Oracle queries never read dirty (uncommitted) data. This is the default value.
  - Serializable. Takes exclusive locks on modified data and sharable locks on all other data. Serializable transactions see only the changes that were committed at the time the transaction began.
  Note: If Enable transaction grouping is selected on the Transaction Handling tab, only the Transaction Isolation value for the first link is used for the entire group.
- Treat warning message as fatal error. Determines the behavior of the stage when an error is encountered while writing data to a table. If the check box is selected, a warning message is logged as fatal, and the job aborts. The format of the error message is:
ORA-xxxxx Oracle error text message and row value
If the check box is cleared (the default), three warning messages are logged in the InfoSphere DataStage Director log file, and the job continues. The format of the error message is:
value of the row causing the error
ORA-xxxxx Oracle error text message
DBMS.CODE=ORA-xxxxx
The last warning message is used for Reject Link Variables. If you want to use the Reject Link Variables functionality, you must clear the check box.
- Enable case sensitive table/column name. Enables the use of case-sensitive table and column names. Select to enclose table and column names in SQL statements in double quotation marks (" "). It is cleared by default.
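To make the effect of the Array size option more concrete, the following Python sketch groups rows into batches and counts the calls that would result. It only models the arithmetic; it does not call Oracle, and the function name batches and the sample rows are hypothetical.

```python
from itertools import islice


def batches(rows, array_size):
    """Yield lists of at most array_size rows, in input order."""
    if array_size < 1:
        raise ValueError("array size must be a positive integer")
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, array_size))
        if not chunk:
            return
        yield chunk


# Ten made-up rows; each batch stands for one call to the database.
rows = [(n, f"name{n}") for n in range(1, 11)]
for size in (1, 4):
    calls = list(batches(rows, size))
    print(f"array size {size}: {len(calls)} calls")
```

With an array size of 1, every row travels in its own call, so a failure can be tied to a single row. With larger values, several rows share a call, which reduces round trips but means that a failing call covers more than one row.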
Columns Tab
On the Columns tab, you can view and modify column metadata for the input link. Use the Save button to save any modifications that you make in the column metadata. Use the Load button to load an existing source table. From the Table Definitions window, select the appropriate table to load and click OK. The Select Column dialog is displayed. To ensure appropriate conversion of data types, clear the Ensure all Char columns use Unicode check box.
SQL Tab
The SQL tab contains the Query, Before, After, Generated DDL, and User-defined DDL tabs. Use these tabs to display the stage-generated SQL statement and the SQL statement that you can enter.
- Query. This tab is displayed by default. It is similar to the General tab, but it contains the SQL statements that are used to write data to Oracle. It is based on the current values of the stage and link properties. You cannot edit these statements unless Query type is set to Enter custom SQL statement or Load SQL from a file at run time.
- Before. Contains the SQL statements executed before the stage processes any job data rows. The parameters on the Before tab correspond to the Before SQL and Continue if Before SQL fails grid properties. The Continue if Before SQL fails property is represented by the Treat errors as non-fatal check box, and the SQL statement is entered in a resizable edit box. The Before and After tabs look alike. If the property value begins with FILE=, the remaining text is interpreted as a path name, and the contents of the file supplies the property value.
  The Before SQL is the first SQL statement to be run. Depending on your choice, the job can continue or terminate after failing to execute a Before statement. It does not affect the transaction grouping scheme. The commit or rollback is performed on a per-link basis. Each SQL statement is executed as a separate transaction if the statement separator is a double semi-colon ( ;; ). All SQL statements are executed in a single transaction if a semi-colon ( ; ) is the separator.
  Treat errors as non-fatal. If selected, errors caused by Before SQL are logged as warnings, and processing continues with the next command batch. Each separate execution is treated as a separate transaction. If cleared, errors are treated as fatal to the job, and result in a transaction rollback. The transaction is committed only if all statements successfully run.
- After. Contains the SQL statements executed after the stage processes the job data rows. The parameters on this tab correspond to the After SQL and Continue if After SQL fails grid properties. The Continue if After SQL fails property is represented by the Treat errors as non-fatal check box, and the SQL statement is entered in a resizable edit box. The Before and After tabs look alike. If the property value begins with FILE=, the remaining text is interpreted as a path name, and the contents of the file supplies the property value.
  The After SQL statement is the last SQL statement to be run. Depending on your choice, the job can continue or terminate after failing to execute an After SQL statement. It does not affect the transaction grouping scheme. The commit or rollback is performed on a per-link basis. Each SQL statement is executed as a separate transaction if the statement separator is a double semi-colon ( ;; ). All SQL statements are executed in a single transaction if a semi-colon ( ; ) is the separator. The behavior of Treat errors as non-fatal is the same as for Before.
- Generated DDL. Select Generate DDL or User-defined DDL from the Create table action field on the Options tab to enable this tab. The CREATE TABLE statement field displays the CREATE TABLE statement that is generated from the column metadata definitions and the information provided on the Create Table Properties dialog box. If you select an option other than Do not drop target table from the Drop table action list, the DROP statement field displays the generated DROP TABLE statement for dropping the target table.
- User-defined DDL. Select User-defined DDL from the Create table action or Drop table action field on the Options tab to enable this tab. The generated DDL statement is displayed as a starting point to define a CREATE TABLE and a DROP TABLE statement. If the property value begins with FILE=, the remaining text is interpreted as a path name, and the contents of the file supplies the property value. The DROP TABLE statement field is disabled if User-defined DDL is not selected from the Drop table action field. If Do not drop target is selected, the DROP statement field is empty in the Generated DDL and User-defined DDL tabs.
  Note: Once you modify the user-defined DDL statement from the original generated DDL statement, changes made to other table-related properties do not affect the user-defined DDL statement. If, for example, you add a new column in the column grid after modifying the user-defined DDL statement, the new column appears in the generated DDL statement but does not appear in the user-defined DDL statement.
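The separator rule for Before and After SQL (a double semi-colon for one transaction per statement, a single semi-colon for one transaction overall) can be sketched in Python as follows. This is a simplified model under stated assumptions: the function name group_into_transactions is hypothetical, semi-colons inside string literals are not handled, and text that mixes both separators is split on the double semi-colon only, which is a choice of the sketch rather than documented behavior.

```python
def group_into_transactions(sql_text):
    """Return a list of transactions; each transaction is a list of statements.

    ';;' between statements -> every statement is its own transaction.
    ';'  between statements -> all statements share one transaction.
    """
    if ";;" in sql_text:
        statements = [part.strip() for part in sql_text.split(";;")]
        return [[statement] for statement in statements if statement]
    statements = [part.strip() for part in sql_text.split(";")]
    statements = [statement for statement in statements if statement]
    return [statements] if statements else []


# Table names t1 and t2 are made up for the example.
print(group_into_transactions("delete from t1;; delete from t2"))
print(group_into_transactions("delete from t1; delete from t2"))
```

In the first call each statement would be committed or rolled back on its own; in the second, both statements succeed or fail together.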
Handling Transactions
About this task
You can specify transaction control information for a transaction group.
Procedure
1. Click the Transaction Handling tab.
2. Select Enable transaction grouping.
3. For transaction groups, Rows per transaction is automatically set to 1, and you cannot change this setting.
4. Supply necessary details about the transaction group in the grid. The grid has a line for every link in the transaction group. The links are shown in transaction processing order, which is set in the preceding Transformer stage. Each line contains the following information:
   - Input name. The non-editable name of the input link.
   - On Skip. Specifies whether to continue or to roll back the transaction if a link is skipped because of an unsatisfied constraint on it. Rows arriving at its link are skipped until the controlling link starts another transaction. Choose Continue or Rollback from the list.
   - On Fail. Specifies whether to continue or roll back if the SQL statement fails to execute. Choose Continue or Rollback from the list.
Procedure
1. Set Array Size to 1.
2. Use a Transformer stage to redirect the rejected rows.
What to do next
You can design your job by selecting an appropriate target for the rejected rows, such as a Sequential stage. Reuse this target as an input source once you resolve the issues with the offending row values.
Procedure
1. Select Use SQL Builder tool as the Query Type from the General tab of the input or output link or from the SQL tab.
2. Click the SQL Builder button. The SQL Builder window opens.
Procedure
1. Select Generate Update actions from Options and Columns tabs from the Query Type list.
2. Specify how you want the data to be written by choosing a suitable option from the Update action list. Select one of these options for a generated statement:
   - Clear table then insert rows
   - Truncate table then insert rows
   - Insert rows without clearing
   - Delete existing rows only
   - Replace existing rows completely
   - Update existing rows only
   - Update existing rows or insert new rows
   - Insert new rows or update existing rows
   - User-defined SQL
   - User-defined SQL file
   See "Defining Input Data" for a description of each update action.
3. Enter an optional description of the input link in the Description field.
4. Enter a table name in the Table name field on the Options page.
5. Click the Columns tab on the Input page. The Columns tab appears.
6. Edit the Columns grid to specify column definitions for the columns you want to write. The SQL statement is automatically constructed using your chosen update action and the columns you have specified.
7. Click the SQL tab on the Input page, then the Generated tab to view this SQL statement. You cannot edit the statement here, but you can click this tab at any time to select and copy parts of the generated statement to paste into the user-defined SQL statement.
8. Click OK to close the ORAOCI9 Stage dialog box. Changes are saved when you save your job design.
Procedure
1. Select Enter custom SQL statement from the Query Type list.
2. Click the User-defined tab on the SQL tab.
3. Enter the SQL statement you want to use to write data to the target Oracle tables. This statement must contain the table name, the type of update action you want to perform, and the columns you want to write. Only two SQL statements are supported for input links. When writing data, the INSERT statements must contain a VALUES clause with a colon ( : ) used as a parameter marker for each stage input column. UPDATE statements must contain SET clauses with parameter markers for each stage input column. UPDATE and DELETE statements must contain a WHERE clause with parameter markers for the primary key columns. The parameter markers must be in the same order as the associated columns listed in the stage properties. For example:
insert into emp (emp_no, emp_name) values (:1, :2)
   If you specify two SQL statements, they are executed as one transaction. Do not use a trailing semicolon. You cannot call stored procedures as there is no facility for parsing the row values as parameters. Unless you specify a user-defined SQL statement, the stage automatically generates an SQL statement.
4. Click OK to close the ORAOCI9 Stage dialog box. Changes are saved when you save your job design.
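The parameter marker rules in step 3 can be illustrated with a small Python sketch that writes an INSERT and a DELETE statement for a list of input columns. The function names are hypothetical, and the sketch assumes that each marker number is the position of the column in the input link's column list; treat that numbering as an assumption to verify against the stage, not as a statement of how the stage resolves markers.

```python
def markers_by_position(columns):
    """Map each column name to a marker such as :1, :2 (assumed positional)."""
    return {name: f":{position}" for position, name in enumerate(columns, start=1)}


def insert_statement(table, columns):
    """Build an INSERT with one parameter marker per input column."""
    markers = markers_by_position(columns)
    column_list = ", ".join(columns)
    value_list = ", ".join(markers[name] for name in columns)
    return f"insert into {table} ({column_list}) values ({value_list})"


def delete_statement(table, columns, key_columns):
    """Build a DELETE whose WHERE clause has markers for the key columns."""
    markers = markers_by_position(columns)
    where = " and ".join(f"{name} = {markers[name]}" for name in key_columns)
    return f"delete from {table} where {where}"


columns = ["emp_no", "emp_name"]
print(insert_statement("emp", columns))
print(delete_statement("emp", columns, ["emp_no"]))
```

Neither statement ends with a semicolon, in line with the instruction above not to use a trailing semicolon.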
The following sections describe the differences when you use SQL SELECT statements for generated or user-defined queries that you define on the Output page in the ORAOCI9 Stage window of the GUI.
Options Tab
Use this tab to specify transaction isolation, array size, prefetch memory size, and case-sensitivity. The Options tab contains the following parameters:
- Transaction Isolation. Specifies the transaction isolation levels that provide the necessary consistency and concurrency control between transactions in the job and other transactions for optimal performance. Because Oracle does not prevent other transactions from modifying the data read by a query, that data might be changed by other transactions between two executions of the query. Thus, a transaction that executes a given query twice might experience both non-repeatable reads and phantoms. Use one of the following transaction isolation levels:
  - Read Committed. Takes exclusive locks on modified data and sharable locks on all other data. Each query executed by a transaction sees only data that was committed before the query (not the transaction) began. Oracle queries never read dirty, that is, uncommitted data. This is the default value.
  - Serializable. Takes exclusive locks on modified data and sharable locks on all other data. It sees only those changes committed when the transaction began plus those made by the transaction itself through INSERT, UPDATE, and DELETE statements. Serializable transactions do not experience non-repeatable reads or phantoms.
  - Read-only. Sees only those changes that were committed when the transaction began. This level does not allow INSERT, UPDATE, and DELETE statements.
- Array size. Specifies the number of rows read from the database at a time. Enter a positive integer to indicate the number of rows to prefetch in one call. This value is used both for prefetching rows and for array fetch. Larger numbers use more memory on the client to cache the rows. This minimizes server round trips and maximizes performance by executing fewer statements. If this number is too large, the client might run out of memory.
- Prefetch memory setting. Sets the memory level for top-level rows to be prefetched. See Oracle documentation for further information. Express the value in number of bytes.
- Disable array fetch. Enables or disables Oracle array fetch. Array fetch is enabled by default. The value in Array size is used for array fetch size.
- Enable case sensitive table/column name. Enables the use of case-sensitive table and column names. Select to automatically enclose table and column names in SQL statements in double quotation marks (" "). It is cleared by default.
  Note: If Enable case sensitive table/column name is selected, when qualified column names are specified in the Derivation cell on the Columns tab, you must enclose these table and column names in double quotation marks (" ").
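The quoting that the Enable case sensitive table/column name option implies for qualified names can be pictured with the following Python sketch. The function name quote_identifier and the sample names are hypothetical; the sketch quotes each dot-separated part separately and simply rejects names that already contain a double quotation mark.

```python
def quote_identifier(name, case_sensitive):
    """Return name unchanged, or with every dot-separated part double-quoted."""
    if not case_sensitive:
        return name
    parts = name.split(".")
    for part in parts:
        if '"' in part:
            raise ValueError(f"identifier part may not contain a quotation mark: {part}")
    return ".".join(f'"{part}"' for part in parts)


# Made-up qualified column name in owner.table.column form.
print(quote_identifier("scott.Emp.EmpName", case_sensitive=True))
print(quote_identifier("emp.emp_no", case_sensitive=False))
```

Quoting each part on its own keeps the dots as separators, so the owner, the table, and the column remain distinct names.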
Columns Tab
This tab contains the column definitions for the data being output on the chosen link. The column tab page behaves the same way as the Columns tab in the ODBC stage, and it specifies which columns are aggregated.
The column definitions for output links contain a key field. Key fields are used to join primary and reference inputs to a Transformer stage. For a reference output link, the Oracle OCI key reads the data by using a WHERE clause in the SQL SELECT statement.

The Derivation cell on the Columns tab contains fully-qualified column names when table definitions are loaded from the IBM InfoSphere DataStage Repository. If the Derivation cell has no value, Oracle OCI uses only the column names to generate the SELECT statement displayed in the Generated tab of the SQL tab. Otherwise, it uses the content of the Derivation cell. Depending on the format used in the Repository, the format is owner.tablename.columnname or tablename.columnname.

The column definitions for reference links require a key field. Key fields join reference inputs to a Transformer stage. Oracle OCI key reads the data by using a WHERE clause in the SQL SELECT statement.

See the IBM InfoSphere DataStage and QualityStage Designer Client Guide for:
- A description of how to enter and edit column definitions
- Details on how key fields are specified and used
SQL Tab
Use this tab page to build the SQL statements used to read data from Oracle. It contains the Query, Before, and After tab pages:
- Query. This tab is read-only if you select Use SQL Builder tool or Generate SELECT clause from column list; enter other clauses for Query Type. If Query Type is Enter Custom SQL statement, this tab contains the SQL statements executed to read data from Oracle. The GUI displays the stage-generated SQL statement on this tab as a starting point. However, you can enter any valid, appropriate SQL statement. If Query Type is Load SQL from a file at run time, enter the path name of the file.
- Before. Contains the SQL statements executed before the stage processes any job data rows. The Before is the first SQL statement to be executed, and you can specify whether the job continues or aborts after failing to run a Before SQL statement. It does not affect the transaction grouping scheme. The commit/rollback is performed on a per-link basis. If the property value begins with FILE=, the remaining text is interpreted as a path name, and the contents of the file supplies the property value.
- After. Contains the After SQL statement executed after the stage processes any job data rows. It is the last SQL statement to be executed, and you can specify whether the job continues or aborts after failing to run an After SQL statement. It does not affect the transaction grouping scheme. The commit/rollback is performed on a per-link basis. If the property value begins with FILE=, the remaining text is interpreted as a path name, and the contents of the file supplies the property value.
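The FILE= convention described for the Before and After properties can be sketched in Python as follows. The function name resolve_property_value, the file name before.sql, and the table name stage_emp are hypothetical, and the sketch assumes that the path starts immediately after the FILE= prefix and that the file is read with the platform's default text encoding.

```python
import os
import tempfile
from pathlib import Path

FILE_PREFIX = "FILE="


def resolve_property_value(value):
    """Return the file contents if value begins with FILE=, else value itself."""
    if value.startswith(FILE_PREFIX):
        return Path(value[len(FILE_PREFIX):]).read_text()
    return value


# Demonstration with a temporary file so that the sketch runs anywhere.
with tempfile.TemporaryDirectory() as folder:
    sql_file = os.path.join(folder, "before.sql")
    Path(sql_file).write_text("delete from stage_emp")
    print(resolve_property_value(FILE_PREFIX + sql_file))
print(resolve_property_value("delete from stage_emp"))
```

Both calls print the same statement: the first reads it from the file, the second takes it directly from the property value.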
The column definitions for reference links must contain a key field. You use key fields to join primary and reference inputs to a Transformer stage. Oracle OCI key reads the data by using a WHERE clause in SQL SELECT statements.
Procedure
1. Select Generate SELECT clause from column list; enter other clauses. Data is extracted from an Oracle database by using an SQL SELECT statement constructed by InfoSphere DataStage. The SQL Clauses button also appears.
2. Click SQL Clauses. The SQL Clauses window opens.

SQL SELECT statements have the following syntax:
SELECT clause FROM clause [WHERE clause] [GROUP BY clause] [HAVING clause] [ORDER BY clause];
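As a rough illustration of how the mandatory and optional clauses fit together, the following Python sketch assembles a SELECT statement in the order shown above. The function build_select and the EMP table are assumptions made for this example; the statement that the stage actually generates may be formatted differently.

```python
def build_select(select, from_, where=None, group_by=None,
                 having=None, order_by=None):
    """Assemble a SELECT statement; only SELECT and FROM are mandatory."""
    parts = ["SELECT " + select, "FROM " + from_]
    optional = [
        ("WHERE", where),
        ("GROUP BY", group_by),
        ("HAVING", having),
        ("ORDER BY", order_by),
    ]
    for keyword, clause in optional:
        # Clauses that are not supplied are left out of the statement.
        if clause:
            parts.append(keyword + " " + clause)
    return " ".join(parts)


query = build_select(
    "DEPTNO, COUNT(*)", "EMP",
    where="SAL > 1000", group_by="DEPTNO", order_by="DEPTNO",
)
print(query)
```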
Procedure
1. Select Enter custom SQL statement from the Query type list on the General tab on the Output page. The SQL tab is enabled.
2. You can edit or drag the selected columns into your user-defined SQL statement. Only one SQL statement is supported for an output link. You must ensure that the table definitions for the output link are correct and represent the columns that are expected.
3. If your entry begins with {FILE}, the remaining text is interpreted as a path name, and the contents of the file supply the text for the query.
4. Click OK to close this window. Changes are saved when you save your job design.
The results vary, depending on whether the Oracle OCI stage is used as an input or an output link:
• Input link. The stage generates the following SQL statement:
insert into dsdate(one) values(TO_DATE(:1, 'yyyy-mm-dd hh24:mi:ss'))
Table 18. Oracle's character data types and the InfoSphere DataStage's corresponding data types

Oracle Data Type: CHAR (size)
InfoSphere DataStage SQL Type: Char (size)
Length: size
Notes: Fixed length character data of length size. Fixed for every row in the table (with trailing spaces). Maximum size is 255 bytes per row, default size is 1 byte per row.

Oracle Data Type: VARCHAR2 (size)
InfoSphere DataStage SQL Type: VarChar (size)
Length: size
Notes: Variable length character data. A maximum size must be specified. VarChar is variable for each row, up to 2000 bytes per row.
Table 19. Oracle's numeric data types and the InfoSphere DataStage's corresponding data types

Oracle Data Type: NUMBER (p,s)
InfoSphere DataStage SQL Type: Decimal, Double, Float, Numeric, Integer, Real
Length: p
Scale: s
Notes: The InfoSphere DataStage SQL type definition used depends on the application of the column in the table, that is, how the column is used. Decimal values have a maximum precision of 38 digits. Decimal and Numeric are synonyms. The full range of Oracle NUMBER values is supported without loss of precision.
Table 20. Additional numeric data types and the corresponding data type in InfoSphere DataStage

Oracle Data Type: BINARY_DOUBLE
InfoSphere DataStage SQL Type: Double
Notes:
• When a table is read, InfoSphere DataStage converts columns with a data type of BINARY_DOUBLE to SQL_DOUBLE.
• When a table is updated, InfoSphere DataStage converts columns with a data type of SQL_DOUBLE to BINARY_DOUBLE.
Note: Perform the following steps to determine the data type of the source column. When importing metadata definitions, select Import > Table Definitions > Plug-in Meta Data Definitions. Select ORAOCI9. If you select Include Column Description, the metadata import includes the description column on the Columns tab.

Oracle Data Type: BINARY_FLOAT
InfoSphere DataStage SQL Type: Float
Notes:
• When a table is read, InfoSphere DataStage converts columns with a data type of either BINARY_FLOAT or FLOAT to SQL_FLOAT.
Note: Perform the following steps to determine the data type of the source column. When importing metadata definitions, select Import > Table Definitions > Plug-in Meta Data Definitions. Select ORAOCI9. If you select Include Column Description, the metadata import includes the description column on the Columns tab.
• When a table is updated, InfoSphere DataStage converts SQL_FLOAT to either BINARY_FLOAT or FLOAT. To indicate BINARY_FLOAT, place the keyword BINARY_FLOAT anywhere in the column description field on the Columns tab. If BINARY_FLOAT is present, InfoSphere DataStage converts SQL_FLOAT to BINARY_FLOAT. If BINARY_FLOAT is not present, InfoSphere DataStage converts SQL_FLOAT to FLOAT (with precision).
Table 21. Oracle's date data types and the InfoSphere DataStage's corresponding data types

Oracle Data Type: DATE
InfoSphere DataStage SQL Type: Timestamp
Notes: The default format for the default InfoSphere DataStage data type Timestamp is YYYY-MM-DD HH24:MI:SS. If the InfoSphere DataStage data type is Timestamp, InfoSphere DataStage uses the to_date function for this column when it generates the INSERT statement to write an Oracle date. If the InfoSphere DataStage data type is Timestamp or Date, InfoSphere DataStage uses the to_char function for this column when it generates the SELECT statement to read an Oracle date. For more information, see "DATE Data Type Considerations".
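The DATE handling in the notes above can be sketched in Python as two small helpers that wrap a column in TO_DATE for writing and TO_CHAR for reading. The helper names, the dictionary of format masks, and in particular the YYYY-MM-DD mask used for the Date type are assumptions for this example; only the Timestamp mask is stated in this guide.

```python
# Format masks per InfoSphere DataStage SQL type. The Timestamp mask is the
# documented default; the Date mask is an assumption for illustration.
MASKS = {
    "Timestamp": "YYYY-MM-DD HH24:MI:SS",
    "Date": "YYYY-MM-DD",
}


def insert_value(position, sql_type):
    """Return the VALUES entry for a bind variable at the given position."""
    placeholder = ":%d" % position
    if sql_type == "Timestamp":
        return "TO_DATE(%s, '%s')" % (placeholder, MASKS[sql_type])
    return placeholder


def select_expression(column, sql_type):
    """Return the SELECT list entry for a column of the given type."""
    if sql_type in MASKS:
        return "TO_CHAR(%s, '%s')" % (column, MASKS[sql_type])
    return column


insert_sql = "INSERT INTO dsdate(one) VALUES(%s)" % insert_value(1, "Timestamp")
select_sql = "SELECT %s FROM dsdate" % select_expression("one", "Timestamp")
print(insert_sql)
print(select_sql)
```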
For a list of unsupported Oracle data types, see "Functionality of Oracle OCI Stages" .
• Select
• Where clause

For example, for an update you might enter:
UPDATE tablename SET ##B$ = :1 WHERE $A# = :2
Note in particular that the key in this statement ($A#) is specified by using the external name.
Platforms
Your Oracle client and server machines must run the same type of operating system, such as UNIX to UNIX or Windows 2000 to Windows 2000. If you mix UNIX and Windows platforms for your Oracle client and Oracle server machines, the IBM InfoSphere DataStage job fails. For example, the job fails if the Oracle client is on a UNIX workstation and the Oracle server is on a Windows 2000 workstation.
oraclehome is the location where your Oracle software is installed. oraclemanager is the name of the Oracle Enterprise Manager home directory. Any changes to system environment variables might require a system reboot before the values of the variables take effect. The configuration of SQL*Net using a configuration program, for example, SQL*Net Easy Configuration, to set up and add database aliases is also required.
Load Modes
The Load Mode property specifies whether to populate the Oracle database immediately or to generate a control file and a data file to populate the database later. The load modes are automatic and manual.
Procedure
1. Add an Oracle OCI Load stage to an InfoSphere DataStage job.
2. Link the Oracle OCI Load stage to its data source.
3. Specify column definitions by using the Columns tab.
4. Determine the appropriate load mode, as documented in "Load Modes".
5. Add the appropriate property values on the Stage tab, as documented in "Properties".
6. Compile the job.
7. If the job compiles correctly, you can select one of the following actions:
• Run the job from within the InfoSphere DataStage and QualityStage Designer
• Run or schedule the job by using the InfoSphere DataStage and QualityStage Director
8. If the job does not compile correctly, correct the errors and recompile.
Properties
Use the Properties tab to specify the load operation. Each stage property is described in the order in which it appears.
Prompt: Service Name
Type: String
Description: The name of the Oracle service. It is the logical representation of the database, which is the way the database is presented to clients. The service name is a string that is the global database name, which consists of the database name and domain name and is entered during installation or database creation.

Prompt: User Name
Type: String
Description: The user name for connecting to the service.

Type: String
Description: The password for "User Name."

Type: String
Description: The name of the target Oracle table to load the files into.

Type: String
Description: The name of the schema where the table being loaded resides. If unspecified, the schema name is "User Name."

Prompt: Partition Name
Type: String
Description: The name of the partition or subpartition that belongs to the table to be loaded. If not specified, the entire table is loaded. The name must be a valid partition or subpartition name.

Prompt: Date Format
Type: String List
Default: DD-MON-YYYY
Description: The date format to be used. Use one of the following values:
• DD.MM.YYYY
• YYYY-MM-DD
• DD-MON-YYYY
• MM/DD/YYYY

Prompt: Time Format
Type: String List
Default: hh24:mi:ss
Description: The time format to be used. Use one of the following values:
• hh24:mi:ss
• hh:mi:ss am

Type: Long
Default: 100
Description: Specifies the maximum number of input records in a batch. This property is used only if "Load Mode" is set to Automatic.

Prompt: Load Mode
Default: Automatic
Description: The method used to load the data into the target file. This property specifies whether to populate the Oracle database or generate a control file and a data file to populate the database. Use one of the following values:
• Automatic (immediate mode). The stage populates an Oracle database immediately after loading the source data. Automatic data loading can occur only when the IBM InfoSphere DataStage server resides on the same system as an Oracle server.
• Manual (delayed mode). The stage generates a control file and a data file that you can edit and run on any Oracle host system. The stage does not establish a connection with the Oracle server.

Prompt: Directory Path
Type: String
Description: The path name of the directory where the Oracle SQL*Loader files are generated. This property is used only when "Load Mode" is set to Manual.

Type: String
Description: The name of the Oracle SQL*Loader control file generated when "Load Mode" is set to Manual. This text file contains the sequence of commands telling where to find the data, how to parse and interpret the data, and where to insert the data. You can modify and execute this file on any Oracle host system. This file has a .ctl extension.

Type: String
Default: servicename_tablename.dat
Description: The name of the Oracle SQL*Loader sequential data file created when "Load Mode" is set to Manual. This file has a .dat extension.

Prompt: Delimiter
Type: String
Default: , (comma)
Description: The character used to delimit fields in the loader input data.

Prompt: Preserve Blanks
Type: String List
Default: No
Description: The indicator specifying whether SQL*Loader should preserve blanks in the data file. If No, SQL*Loader treats blanks as nulls.

Type: String List
Default: No
Description: The indicator specifying whether both uppercase and lowercase characters can be used in column names. If No, all column names are handled as uppercase. If Yes, a combination of uppercase and lowercase characters is acceptable.
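To give a feel for what the Manual load mode produces, the following Python sketch writes a delimited data file and a matching SQL*Loader control file. It is a simplified illustration under stated assumptions: the function write_loader_files, the orcl service name, and the EMP table are hypothetical, the .ctl file name is formed by analogy with the documented .dat default, and the files that the stage actually generates may contain different or additional clauses.

```python
import os
import tempfile


def write_loader_files(directory, service, table, columns, rows,
                       delimiter=",", preserve_blanks=False):
    """Write servicename_tablename.dat and .ctl files; return both paths."""
    base = "%s_%s" % (service, table)
    data_path = os.path.join(directory, base + ".dat")
    control_path = os.path.join(directory, base + ".ctl")

    # Data file: one record per line, fields separated by the delimiter.
    with open(data_path, "w", encoding="utf-8") as data:
        for row in rows:
            data.write(delimiter.join(str(value) for value in row) + "\n")

    # Control file: where the data is, how to parse it, where to insert it.
    lines = ["LOAD DATA", "INFILE '%s'" % data_path]
    if preserve_blanks:
        lines.append("PRESERVE BLANKS")
    lines += [
        "INTO TABLE " + table,
        "FIELDS TERMINATED BY '%s'" % delimiter,
        "(" + ", ".join(columns) + ")",
    ]
    with open(control_path, "w", encoding="utf-8") as control:
        control.write("\n".join(lines) + "\n")
    return control_path, data_path


with tempfile.TemporaryDirectory() as directory:
    control_path, data_path = write_loader_files(
        directory, "orcl", "EMP", ["EMPNO", "ENAME"],
        [(1, "SMITH"), (2, "ALLEN")],
    )
    with open(control_path, encoding="utf-8") as handle:
        control_text = handle.read()
    with open(data_path, encoding="utf-8") as handle:
        data_text = handle.read()
print(control_text)
```

Because both files are plain text, they can be reviewed or edited before they are run with SQL*Loader on an Oracle host system.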
You can use the SQL builder from various connectivity stages that IBM InfoSphere DataStage supports. Different databases have slightly different SQL syntax (particularly when it comes to more complex operations such as joins). The exact form of the SQL statements that the SQL builder produces depends on which stage you invoke it from. You do not have to be an SQL expert to use the SQL builder, but it helps to have some familiarity with the basic structure of SQL statements in this documentation.
Procedure
1. In the Reference Provider pane, click Browse. The Browse Providers dialog box opens.
2. In the Select a Reference Provider type list, select Federation Server. In the Select a Federated Datasource tree, the list of database aliases opens.
3. Click a database alias. The list of schemas opens as nodes beneath each database alias.
4. In the SQL Type list, select the type of SQL query that you want to construct.
5. Click the SQL builder button. The SQL Builder - DB2 / UDB 8.2 window opens. In the Select Tables pane, the database alias appears as a node.
Procedure
1. Click the Selection tab.
2. Drag any tables you want to include in your query from the repository tree to the canvas. You can drag multiple tables onto the canvas to enable you to specify complex queries such as joins. You must have previously placed the table definitions in the IBM InfoSphere DataStage repository. The easiest way to do this is to import the definitions directly from your relational database.
3. Specify the columns that you want to select from the table or tables on the column selection grid.
4. If you want to refine the selection you are performing, choose a predicate from the Predicate list in the filter panel. Then use the expression editor to specify the actual filter (the fields displayed depend on the predicate you choose). For example, use the Comparison predicate to specify that a column should match a particular value, or the Between predicate to specify that a column falls within a particular range. The filter appears as a WHERE clause in the finished query.
5. Click the Add button in the filter panel. The filter that you specify appears in the filter expression panel and is added to the SQL statement that you are building.
6. If you are joining multiple tables, and the automatic joins inserted by the SQL builder are not what is required, manually alter the joins.
7. If you want to group your results according to the values in certain columns, select the Group page. Select the Grouping check box in the column grouping and aggregation grid for the column or columns that you want to group the results by.
8. If you want to aggregate the values in the columns, you should also select the Group page. Select the aggregation that you want to perform on a column from the Aggregation drop-down list in the column grouping and aggregation grid.
9. Click the Sql tab to view the finished query, and to resolve the columns generated by the SQL statement with the columns loaded on the stage (if necessary).
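The kind of query that these steps produce (a join, a Between filter, and grouping with an aggregation) can be tried out with the following Python sketch. It uses an in-memory SQLite database as a stand-in for Oracle, so the syntax that the SQL builder generates for a given stage may differ; the emp and dept tables and their contents are invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dept (deptno INTEGER PRIMARY KEY, dname TEXT);
CREATE TABLE emp (
    empno INTEGER PRIMARY KEY, ename TEXT, sal INTEGER,
    deptno INTEGER REFERENCES dept(deptno)
);
INSERT INTO dept VALUES (10, 'ACCOUNTING');
INSERT INTO dept VALUES (20, 'RESEARCH');
INSERT INTO emp VALUES (1, 'KING', 5000, 10);
INSERT INTO emp VALUES (2, 'CLARK', 2450, 10);
INSERT INTO emp VALUES (3, 'SMITH', 800, 20);
INSERT INTO emp VALUES (4, 'FORD', 3000, 20);
""")

# Two tables joined on the foreign key, a Between filter in the WHERE
# clause, and a COUNT aggregation grouped by department name.
query = """
SELECT d.dname, COUNT(e.empno) AS headcount
FROM emp e JOIN dept d ON e.deptno = d.deptno
WHERE e.sal BETWEEN 1000 AND 5000
GROUP BY d.dname
"""
rows = sorted(conn.execute(query).fetchall())
conn.close()
print(rows)
```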
Procedure
1. Click the Insert tab.
2. Drag the table you want to insert rows into from the repository tree to the canvas. You must have previously placed the table definitions in the IBM InfoSphere DataStage repository. The easiest way to do this is to import the definitions directly from your relational database.
3. Specify the columns that you want to insert on the column selection grid. You can drag selected columns from the table, double-click a column, or drag all columns.
4. For each column in the column selection grid, specify how values are derived. You can type a value or select a derivation method from the drop-down list.
• Job Parameters. The Parameter dialog box appears. Select from the job parameters that are defined for this job.
• Lookup Columns. The Lookup Columns dialog box appears. Select a column from the input columns to the stage that you are using the SQL builder in.
• Expression Editor. The Expression Editor opens. Build an expression that derives the value.
5. Click the Sql tab to view the finished query.
Procedure
1. Click the Update tab.
2. Drag the table whose rows you want to update from the repository tree to the canvas. You must have previously placed the table definitions in the IBM InfoSphere DataStage repository. The easiest way to do this is to import the definitions directly from your relational database.
3. Specify the columns that you want to update on the column selection grid. You can drag selected columns from the table, double-click a column, or drag all columns.
4. For each column in the column selection grid, specify how values are derived. You can type a value or select a derivation method from the drop-down list. Enclose strings in single quotation marks.
• Job Parameters. The Parameter dialog box appears. Select from the job parameters that are defined for this job.
• Lookup Columns. The Lookup Columns dialog box appears. Select a column from the input columns to the stage that you are using the SQL builder in.
• Expression Editor. The Expression Editor opens. Build an expression that derives the value.
5. If you want to refine the update you are performing, choose a predicate from the Predicate list in the filter panel. Then use the expression editor to specify the actual filter (the fields displayed depend on the predicate you choose). For example, use the Comparison predicate to specify that a column should match a particular value, or the Between predicate to specify that a column falls within a particular range. The filter appears as a WHERE clause in the finished statement.
6. Click the Add button in the filter panel. The filter that you specify appears in the filter expression panel and is added to the update statement that you are building.
7. Click the Sql tab to view the finished query.
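The shape of the statement that results from these steps can be sketched in Python. The function build_update and the EMP table are hypothetical; the sketch only shows how column derivations (typed values in single quotation marks, or bind variables) and an optional filter combine into one UPDATE statement.

```python
def build_update(table, assignments, where=None):
    """Build an UPDATE statement.

    assignments: list of (column, value_expression) pairs, where the
    expression is a quoted literal, a bind variable, or another expression.
    where: optional filter text that becomes the WHERE clause.
    """
    set_clause = ", ".join(
        "%s = %s" % (column, expression) for column, expression in assignments
    )
    statement = "UPDATE %s SET %s" % (table, set_clause)
    if where:
        statement += " WHERE " + where
    return statement


update_sql = build_update(
    "EMP", [("SAL", ":1"), ("JOB", "'CLERK'")], where="EMPNO = :2"
)
print(update_sql)
```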
Procedure
1. Click the Delete tab.
2. Drag the table from which you want to delete rows from the repository tree to the canvas. You must have previously placed the table definitions in the IBM InfoSphere DataStage repository. The easiest way to do this is to import the definitions directly from your relational database.
3. You must choose an expression which defines the rows to be deleted. Choose a predicate from the Predicate list in the filter panel. Then use the expression editor to specify the actual filter (the fields displayed depend on the predicate you choose). For example, use the Comparison predicate to specify that a column should match a particular value, or the Between predicate to specify that a column falls within a particular range. The filter appears as a WHERE clause in the finished statement.
4. Click the Add button in the filter panel. The filter that you specify appears in the filter expression panel and is added to the delete statement that you are building.
5. Click the Sql tab to view the finished query.
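Because a delete statement must have an expression that defines the rows to be deleted, a sketch of the resulting statement can enforce that requirement. The function build_delete and the EMP table are assumptions for this Python example, which is not the SQL builder's own logic.

```python
def build_delete(table, where):
    """Build a DELETE statement; a filter expression is mandatory."""
    if not where or not where.strip():
        raise ValueError("a filter expression is required for a delete")
    return "DELETE FROM %s WHERE %s" % (table, where.strip())


delete_sql = build_delete("EMP", "SAL BETWEEN 0 AND 500")
print(delete_sql)

# A missing filter is rejected instead of deleting every row.
try:
    build_delete("EMP", "")
    rejected = False
except ValueError:
    rejected = True
```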
Toolbar
The SQL builder toolbar contains the following tools.
• Clear Query removes the field entries for the current SQL query.
• Cut removes items and places them on the Microsoft Windows clipboard so they can be pasted elsewhere.
• Copy copies items and places them on the Windows clipboard so they can be pasted elsewhere.
• Paste pastes items from the Windows clipboard to certain places in the SQL builder.
• SQL properties opens the Properties dialog box.
• Quoting toggles quotation marks in table and column names in the generated SQL statements.
• Validation toggles the validation feature. Validation automatically occurs when you click OK to exit the SQL builder.
• View Data is available when you invoke the SQL builder from stages that support the viewing of data. It causes the calling stage to run the SQL as currently built and return the results for you to view.
• Refresh refreshes the contents of all the panels on the SQL builder.
• Window View allows you to select which panels are shown in the SQL builder window.
• Help opens the online help.
Tree Panel
This displays the table definitions that currently exist within the IBM InfoSphere DataStage repository. The easiest way to get a table definition into the repository is to import it directly from the database you want to query. You can do this via the Designer client, or you can do it directly from the shortcut menu in the tree panel. You can also manually define a table definition from within the SQL builder by selecting New Table... from the tree panel shortcut menu.

To select a table to query, select it in the tree panel and drag it to the table selection canvas. A window appears in the canvas representing the table and listing all its individual columns.

A shortcut menu allows you to:
• Refresh the repository view
• Define a new table definition (the Table Definition dialog box opens)
• Import metadata directly from a data source (a sub menu offers a list of source types)
• Copy a table definition (you can paste it in the table selection canvas)
• View the properties of the table definition (the Table Definition dialog box opens)

You can also view the properties of a table definition by double-clicking on it in the repository tree.
• Add a related table (select queries only). A submenu shows you tables that have a foreign key relationship with the currently selected one. Select a table to insert it in the canvas, together with the join expression inferred by the foreign key relationship.
• Remove the selected table.
• Select all the columns in the table (so that you could, for example, drag them all to the column selection grid).
• Open a Select Table dialog box to allow you to bind an alternative table for the currently selected table (select queries only).
• Open the Table Properties dialog box for the currently selected table.

With a join selected in the canvas (select queries only), a shortcut menu allows you to:
• Open the Alternate Relation dialog box to specify that the join should be based on a different foreign key relationship.
• Open the Join Properties dialog box to modify the type of join and associated join expression.

From the canvas background, a shortcut menu allows you to:
• Refresh the view of the table selection canvas.
• Paste a table that you have copied from the tree panel.
• View data - this is available when you invoke the SQL builder from stages that support the viewing of data. It causes the calling stage to run the SQL as currently built and return the results for you to view.
• Open the Properties dialog box to view details of the SQL syntax that the SQL builder is currently building a query for.
Selection Page
The Selection page appears when you are using the SQL builder to define a Select statement. Use this page to specify details of your select query. It has the following components.
Column expression
Identifies the column to be included in the query. You can specify:
• Job parameter. A dialog box appears offering you a choice of available job parameters. This allows you to specify the value to be used in the query at run time (the stage you are using the SQL builder from must allow job parameters for this to appear).
• Expression. An expression editor dialog box appears, allowing you to specify an expression that represents the value to be used in the query.
• Data flow variable. A dialog box appears offering you a choice of available data flow variables (the stage you are using the SQL builder from must support data flow variables for this to appear).
• Lookup Column. You can directly select a column from one of the tables in the table selection canvas.
Table
Identifies the table that the column belongs to. If you populate the column grid by dragging, copying or double-clicking on a column from the table selection canvas, the table name is filled in automatically. You can also choose a table from the drop-down list. To specify the table name at runtime, choose a job parameter from the drop-down list.
Column Alias
This allows you to specify an alias for the column.
Output
This is selected to indicate that the column will be output by the query. This is automatically selected when you add a column to the grid.
Sort
Choose Ascending or Descending to have the query sort the returned rows by the value of this column. Selecting to sort results in an ORDER BY clause being added to the query.
Sort Order
Allows you to specify the order in which rows are sorted if you are ordering by more than one column.
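The way the Sort and Sort Order settings combine into an ORDER BY clause can be sketched in Python. The function order_by_clause and the tuple layout are assumptions for this illustration; only columns with a sort direction contribute, and the sort order decides their position in the clause.

```python
def order_by_clause(columns):
    """Build an ORDER BY clause from grid rows.

    columns: list of (name, sort, sort_order) tuples, where sort is
    "Ascending", "Descending", or None, and sort_order is an integer
    for sorted columns.
    """
    sorted_columns = sorted(
        (column for column in columns if column[1]),
        key=lambda column: column[2],
    )
    if not sorted_columns:
        return ""
    terms = [
        "%s %s" % (name, "ASC" if sort == "Ascending" else "DESC")
        for name, sort, _ in sorted_columns
    ]
    return "ORDER BY " + ", ".join(terms)


grid = [
    ("ENAME", "Ascending", 2),
    ("DEPTNO", "Descending", 1),
    ("SAL", None, None),  # not sorted, so it is left out of the clause
]
clause = order_by_clause(grid)
print(clause)
```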
Context Menu
A shortcut menu allows you to:
• Paste a column that you have copied from the table selection canvas.
• Insert a row in the grid.
• Show or hide the filter panel.
• Remove a row from the grid.
Filter Panel
The filter panel allows you to specify a WHERE clause for the SELECT statement you are building. It comprises a predicate list and an expression editor panel, the contents of which depend on the chosen predicate. See Expression Editor for details on using the expression editor that the filter panel provides.
Group Page
The Group page appears when you are using the SQL builder to define a select statement. Use the Group page to specify that the results of a select query are grouped by a column, or columns. You can also use it to aggregate the results in some of the columns. For example, you could specify COUNT to count the number of rows that contain a not-null value in a column. The Group tab gives access to the toolbar, tree panel, and the table selection canvas, in exactly the same way as the Selection page.
Grouping Grid
This is where you specify which columns are to be grouped by or aggregated on. The grid is populated with the columns that you selected on the Selection page. You can change the selected columns or select new ones, which will be reflected in the selection your query makes. The grid has the following fields:
• Column expression. Identifies the column to be included in the query. You can modify the selections from the Selection page, or build a column expression.
  - Job parameter. A dialog box appears offering you a choice of available job parameters. This allows you to specify the value to be used in the query at run time (the stage you are using the SQL builder from must allow job parameters for this to appear).
  - Expression Editor. An expression editor dialog box appears, allowing you to specify an expression that represents the value to be used in the query.
  - Data flow variable. A dialog box appears offering you a choice of available data flow variables (the stage you are using the SQL builder from must support data flow variables for this to appear).
  - Lookup Column. You can directly select a column from one of the tables in the table selection canvas.
• Column Alias. This allows you to specify an alias for the column. If you select an aggregation operation for a column, SQL builder will automatically insert an alias of the form Aliasn; you can edit this if required.
• Output. This is selected to indicate that the column will be output by the query. This is automatically selected when you add a column to the grid.
• Distinct. Select this check box if you want to add the DISTINCT qualifier to an aggregation. For example, a COUNT aggregation with the distinct qualifier will count the number of rows with distinct values in a field (as opposed to just the not-null values). For more information about the DISTINCT qualifier, see SQL Properties Dialog Box.
• Aggregation. Allows you to select an aggregation function to apply to the column (note that this is mutually exclusive with the Group By option). See Aggregation Functions for details about the available functions.
• Group By. Select the check box to specify that query results should be grouped by the results in this column.
Aggregation Functions
The aggregation functions available vary according to the stage you have opened the SQL builder from. The following basic aggregation functions are supported by all SQL syntax variants:
• AVG. Returns the mean average of the values in a column. For example, if you had six rows with a column containing a price, the six prices would be added together and divided by six to yield the mean average. If you specify the DISTINCT qualifier, only distinct values will be averaged; if the six rows only contained four distinct prices then these four would be added together and divided by four to produce a mean average.
• COUNT. Counts the number of rows that contain a not-null value in a column. If you specify the DISTINCT qualifier, only distinct values will be counted.
• MAX. Returns the maximum value that the rows hold in a particular column. The DISTINCT qualifier can be selected, but has no effect on this function.
• MIN. Returns the minimum value that the rows hold in a particular column. The DISTINCT qualifier can be selected, but has no effect on this function.
• STDDEV. Returns the standard deviation for a set of numbers.
• VARIANCE. Returns the variance for a set of numbers.
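The effect of the DISTINCT qualifier on AVG and COUNT can be checked with a small Python sketch that mirrors the six-row price example. SQLite is used here as a stand-in for Oracle, the items table and its prices are invented, and STDDEV and VARIANCE are left out because SQLite does not provide them as built-in functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (price INTEGER)")
# Six rows that contain only four distinct prices.
conn.executemany(
    "INSERT INTO items VALUES (?)",
    [(price,) for price in (10, 10, 20, 20, 30, 40)],
)
row = conn.execute(
    "SELECT AVG(price), AVG(DISTINCT price), "
    "COUNT(price), COUNT(DISTINCT price), "
    "MAX(price), MIN(price) FROM items"
).fetchone()
conn.close()

avg_all, avg_distinct, count_all, count_distinct, max_price, min_price = row
# avg_all is 130 / 6; avg_distinct is (10 + 20 + 30 + 40) / 4.
print(avg_all, avg_distinct, count_all, count_distinct, max_price, min_price)
```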
Filter Panel
The filter panel allows you to specify a HAVING clause for the SELECT statement you are building. It comprises a predicate list and an expression editor panel, the contents of which depend on the chosen predicate. See Expression Editor for details on using the expression editor that the filter panel provides.
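A HAVING clause filters groups after aggregation, whereas a WHERE clause filters rows before it. The following Python sketch shows the difference on an invented sales table, again with SQLite standing in for Oracle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("EAST", 60), ("EAST", 70), ("WEST", 40)],
)
# Keep only the groups whose total exceeds 100.
having_query = (
    "SELECT region, SUM(amount) AS total "
    "FROM sales GROUP BY region "
    "HAVING SUM(amount) > 100"
)
having_rows = conn.execute(having_query).fetchall()
conn.close()
print(having_rows)
```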
Insert Page
The Insert page appears when you are using the SQL builder to define an insert statement. Use this page to specify details of your insert statement. This page contains the insert columns grid.
Insert Column
Identifies the columns to be included in the statement. You can populate this in a number of ways:
Chapter 7. Building SQL statements
• Drag columns from the table in the table selection canvas.
• Choose columns from a drop-down list in the grid.
• Double-click the column name in the table selection canvas.
• Copy and paste from the table selection canvas.
Insert Value
Identifies the values that you are setting the corresponding column to. You can specify one of the following in giving a value. You can also type a value directly into this field.
• Job parameter. A dialog box appears offering you a choice of available job parameters. This allows you to specify the value to be used in the query at run time (the stage you are using the SQL builder from must allow job parameters for this to appear).
• Expression. An expression editor dialog box appears, allowing you to specify an expression that represents the value to be used in the query.
• Data flow variable. A dialog box appears offering you a choice of available data flow variables (the stage you are using the SQL builder from must support data flow variables for this to appear).
• Lookup Column. You can directly select a column from one of the tables in the table selection canvas.
Update Page
The Update page appears when you are using the SQL builder to define an update statement. Use this page to specify details of your update statement. It has the following components.
Update Column
Identifies the columns to be included in the statement. You can populate this in a number of ways:
• Drag columns from the table in the table selection canvas.
• Choose columns from a drop-down list in the grid.
• Double-click the column name in the table selection canvas.
• Copy and paste from the table selection canvas.
Update Value
Identifies the values that you are setting the corresponding column to. You can specify one of the following in giving a value. You can also type a value directly into this field.
• Job parameter. A dialog box appears offering you a choice of available job parameters. This allows you to specify the value to be used in the query at run time (the stage you are using the SQL builder from must allow job parameters for this to appear).
• Expression. An expression editor dialog box appears, allowing you to specify an expression that represents the value to be used in the query.
• Data flow variable. A dialog box appears offering you a choice of available data flow variables (the stage you are using the SQL builder from must support data flow variables for this to appear).
• Lookup Column. You can directly select a column from one of the tables in the table selection canvas.
Filter Panel
The filter panel allows you to specify a WHERE clause for the update statement you are building. It comprises a predicate list and an expression editor panel, the contents of which depend on the chosen predicate. See Expression Editor for details on using the expression editor that the filter panel provides.
Delete Page
The Delete page appears when you are using the SQL builder to define a delete statement. Use this page to specify details of your delete statement. It has the following components.
Filter Panel
The filter panel allows you to specify a WHERE clause for the delete statement you are building. It comprises a predicate list and an expression editor panel, the contents of which depend on the chosen predicate. See "Expression Editor" for details on using the expression editor that the filter panel provides.
Sql Page
Click the Sql tab to view the generated statement. Using the shortcut menu, you can copy the statement for use in other environments. For select queries, if the columns you have defined as output columns for your stage do not match the columns that the SQL statement is generating, use the Resolve columns grid to reconcile them. In most cases, the columns match.
Expression Editor
The Expression Editor allows you to specify details of a WHERE clause that will be inserted in your select query or update or delete statement. You can also use it to specify a WHERE clause for a join condition where you are joining multiple tables, or for a HAVING clause. A variant of the expression editor allows you to specify a calculation, function, or a case statement within an expression. The Expression Editor can be opened from various places in the SQL builder.
Between
The expression editor when you have selected the Between predicate contains:
v Column. Choose the column on which you are filtering from the drop-down list. You can also specify:
  - Job parameter. A dialog box appears offering you a choice of available job parameters. This allows you to specify the value to be used in the query at run time (the stage you are using the SQL builder from must allow job parameters for this to appear).
  - Expression. An expression editor dialog box appears, allowing you to specify an expression that represents the value to be used in the query.
  - Data flow variable. A dialog box appears offering you a choice of available data flow variables (the stage you are using the SQL builder from must support data flow variables for this to appear).
  - Column. You can directly select a column from one of the tables in the table selection canvas.
v Between/Not Between. Choose Between or Not Between from the drop-down list to specify whether the value you are testing should be inside or outside your specified range.
v Start of range. Use this field to specify the start of your range. Click the menu button to the right of the field and specify details about the argument you are using to specify the start of the range, then specify the value itself in the field.
v End of range. Use this field to specify the end of your range. Click the menu button to the right of the field and specify details about the argument you are using to specify the end of the range, then specify the value itself in the field.
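The following minimal sketch shows the kind of WHERE clause that the Between predicate describes. It runs against an in-memory SQLite database rather than Oracle so that it is self-contained; the orders table, its columns, and the range values are hypothetical, and the bound parameters merely stand in for values that a job parameter might supply at run time.

```python
import sqlite3

# In-memory SQLite database used purely for illustration (not Oracle).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 50), (2, 150), (3, 250), (4, 350)],
)

start_of_range, end_of_range = 100, 300  # hypothetical range values

# Between: rows whose amount falls inside the range (bounds included).
inside = conn.execute(
    "SELECT order_id FROM orders WHERE amount BETWEEN ? AND ? ORDER BY order_id",
    (start_of_range, end_of_range),
).fetchall()

# Not Between: rows whose amount falls outside the range.
outside = conn.execute(
    "SELECT order_id FROM orders WHERE amount NOT BETWEEN ? AND ? ORDER BY order_id",
    (start_of_range, end_of_range),
).fetchall()

print(inside, outside)
```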
Comparison
The expression editor when you have selected the Comparison predicate contains:
v Column. Choose the column on which you are filtering from the drop-down list. You can specify one of the following in identifying a column:
  - Job parameter. A dialog box appears offering you a choice of available job parameters. This allows you to specify the value to be used in the query at run time (the stage you are using the SQL builder from must allow job parameters for this to appear).
  - Expression. An expression editor dialog box appears, allowing you to specify an expression that represents the value to be used in the query.
  - Data flow variable. A dialog box appears offering you a choice of available data flow variables (the stage you are using the SQL builder from must support data flow variables for this to appear).
  - Column. You can directly select a column from one of the tables in the table selection canvas.
v Comparison operator. Choose the comparison operator from the drop-down list. The available operators are:
  - = equals
  - <> not equal to
  - < less than
  - <= less than or equal to
  - > greater than
  - >= greater than or equal to
v Comparison value. Use this field to specify the value you are comparing to. Click the menu button to the right of the field and choose the data type for the value from the menu, then specify the value itself in the field.
In
The expression editor when you have selected the In predicate contains:
v Column. Choose the column on which you are filtering from the drop-down list. You can specify one of the following in identifying a column:
  - Job parameter. A dialog box appears offering you a choice of available job parameters. This allows you to specify the value to be used in the query at run time (the stage you are using the SQL builder from must allow job parameters for this to appear).
  - Expression. An expression editor dialog box appears, allowing you to specify an expression that represents the value to be used in the query.
  - Data flow variable. A dialog box appears offering you a choice of available data flow variables (the stage you are using the SQL builder from must support data flow variables for this to appear).
  - Column. You can directly select a column from one of the tables in the table selection canvas.
v In/Not In. Choose IN or NOT IN from the drop-down list to specify whether the value should be in the specified list or not in it.
v Selection. These fields allow you to specify the list used by the query. Use the menu button to the right of the single field to specify details about the argument you are using to specify a list item, then enter a value. Click the double right arrow to add the value to the list. To remove an item from the list, select it, then click the double left arrow.
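As an illustration of the In predicate, the sketch below builds an IN list from a Python list, much as the Selection fields build up a list of values. It uses SQLite in place of Oracle, and the employees table and department names are hypothetical.

```python
import sqlite3

# In-memory SQLite database used purely for illustration (not Oracle).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Ann", "SALES"), ("Bob", "HR"), ("Cy", "IT"), ("Di", "SALES")],
)

selection = ["SALES", "IT"]  # hypothetical list of values
placeholders = ", ".join("?" for _ in selection)  # one bind marker per item

# In: the column value must appear in the list.
in_rows = conn.execute(
    f"SELECT name FROM employees WHERE dept IN ({placeholders}) ORDER BY name",
    selection,
).fetchall()

# Not In: the column value must not appear in the list.
not_in_rows = conn.execute(
    f"SELECT name FROM employees WHERE dept NOT IN ({placeholders}) ORDER BY name",
    selection,
).fetchall()

print(in_rows, not_in_rows)
```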
Like
The expression editor when you have selected the Like predicate is as follows. The fields it contains are:
v Column. Choose the column on which you are filtering from the drop-down list. You can specify one of the following in identifying a column:
  - Job parameter. A dialog box appears offering you a choice of available job parameters. This allows you to specify the value to be used in the query at run time (the stage you are using the SQL builder from must allow job parameters for this to appear).
  - Expression. An expression editor dialog box appears, allowing you to specify an expression that represents the value to be used in the query.
  - Data flow variable. A dialog box appears offering you a choice of available data flow variables (the stage you are using the SQL builder from must support data flow variables for this to appear).
  - Column. You can directly select a column from one of the tables in the table selection canvas.
v Like/Not Like. Choose LIKE or NOT LIKE from the drop-down list to specify whether you are including or excluding a value in your comparison.
v Like Operator. Choose the type of Like or Not Like comparison you want to perform from the drop-down list. Available operators are:
  - Match Exactly. Your query will ask for an exact match to the value you specify.
  - Starts With. Your query will match rows that start with the value you specify.
  - Ends With. Your query will match rows that end with the value you specify.
  - Contains. Your query will match rows that contain the value you specify anywhere within them.
v Like Value. Specify the value that your LIKE predicate will attempt to match.
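The four Like operators correspond naturally to LIKE patterns built with the standard % wildcard. The sketch below shows one plausible mapping; it is an assumption about how such patterns could be formed, not a description of the SQL that the SQL builder actually generates. The like_pattern helper and the people table are hypothetical, and SQLite is used in place of Oracle (note that SQLite's LIKE ignores case for ASCII characters by default, whereas Oracle's is case-sensitive).

```python
import sqlite3

# Hypothetical mapping from Like operator to LIKE pattern template.
PATTERNS = {
    "Match Exactly": "{v}",
    "Starts With": "{v}%",
    "Ends With": "%{v}",
    "Contains": "%{v}%",
}

def like_pattern(operator, value):
    """Return the LIKE pattern for the chosen operator and Like Value."""
    return PATTERNS[operator].format(v=value)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?)",
    [("anderson",), ("sanders",), ("brand",), ("and",)],
)

def matches(operator, value):
    """Names that satisfy name LIKE <pattern>, in alphabetical order."""
    rows = conn.execute(
        "SELECT name FROM people WHERE name LIKE ? ORDER BY name",
        (like_pattern(operator, value),),
    ).fetchall()
    return [name for (name,) in rows]

for op in PATTERNS:
    print(op, matches(op, "and"))
```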
Null
The expression editor when you have selected the Null predicate is as follows. The fields it contains are:
v Column. Choose the column on which you are filtering from the drop-down list. You can specify one of the following in identifying a column:
  - Job parameter. A dialog box appears offering you a choice of available job parameters. This allows you to specify the value to be used in the query at run time (the stage you are using the SQL builder from must allow job parameters for this to appear).
  - Expression. An expression editor dialog box appears, allowing you to specify an expression that represents the value to be used in the query.
  - Data flow variable. A dialog box appears offering you a choice of available data flow variables (the stage you are using the SQL builder from must support data flow variables for this to appear).
  - Column. You can directly select a column from one of the tables in the table selection canvas.
v Is Null/Is Not Null. Choose whether your query will match a NULL or NOT NULL condition in the column.
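A dedicated Null predicate is needed because an ordinary comparison with NULL never evaluates to true. The sketch below illustrates this with SQLite standing in for Oracle; the contacts table and its data are hypothetical.

```python
import sqlite3

# In-memory SQLite database used purely for illustration (not Oracle).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (name TEXT, phone TEXT)")
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?)",
    [("Ann", "555-0100"), ("Bob", None), ("Cy", None)],
)

# Is Null: rows where the column holds no value.
is_null = conn.execute(
    "SELECT name FROM contacts WHERE phone IS NULL ORDER BY name"
).fetchall()

# Is Not Null: rows where the column holds a value.
is_not_null = conn.execute(
    "SELECT name FROM contacts WHERE phone IS NOT NULL ORDER BY name"
).fetchall()

# A plain comparison with NULL is never true, so no rows come back.
equals_null = conn.execute(
    "SELECT name FROM contacts WHERE phone = NULL"
).fetchall()

print(is_null, is_not_null, equals_null)
```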
Join
This predicate is only available when you are building an Oracle 8i query with an "old style" join expression. The Expression Editor contains these fields:
v Left column. Choose the column to be on the left of your join from the drop-down list.
v Join type. Choose the type of join from the drop-down list.
v Right column. Choose the column to be on the right of your join from the drop-down list.
Calculation
The expression editor when you have selected the Calculation predicate contains these fields:
v Left Value. Enter the argument you want on the left of your calculation. You can choose the type of argument by clicking the menu button on the right and choosing a type from the menu.
v Calculation Operator. Choose the operator for your calculation from the drop-down list.
v Right Value. Enter the argument you want on the right of your calculation. You can choose the type of argument by clicking the menu button on the right and choosing a type from the menu.
Functions
The expression editor when you have selected the Functions predicate contains these fields:
v Function. Choose a function from the drop-down list. The list of available functions depends on the database you are building the query for.
v Description. Gives a description of the function you have selected.
v Parameters. Enter the parameters required by the function you have selected. The parameters that are required vary according to the selected function.
Case
The case option on the expression editor enables you to include case statements in the SQL you are building. You can build case statements with the following syntax.
CASE WHEN condition THEN value
     WHEN ...
     ELSE value
END
or
CASE subject
     WHEN match_value THEN value
     WHEN ...
     ELSE value
END
The expression editor when you have selected the Case predicate contains these fields:
v Case Expression. This is the subject of the case statement. Specify this if you are using the second syntax described above (CASE subject WHEN). By default, the field offers a choice of the columns from the table or tables you have dragged to the table selection canvas. To choose an alternative, click the browse button next to the field. This gives you a choice of data types, or of specifying another expression, a function, or a job parameter.
v When. This allows you to specify a condition or match value for your case statement. By default, the field offers a choice of the columns from the table or tables you have dragged to the table selection canvas. To choose an alternative, click the browse button next to the field. This gives you a choice of data types, or of specifying another expression, a function, or a job parameter. You can access the main expression editor by choosing Case Expression Editor from the menu. This allows you to specify expressions such as comparisons. You would typically use this in the first syntax example. For example, you would specify grade=3 as the condition in the expression WHEN grade=3 THEN 'first class'.
v Then. Use this to specify the value part of the case expression. By default, the field offers a choice of the columns from the table or tables you have dragged to the table selection canvas. To choose an alternative, click the browse button next to the field. This gives you a choice of data types, or of specifying another expression, a function, or a job parameter.
v Add. Click this to add a case expression to the query. This clears the When and Then fields so that you can specify another case expression.
v Else Expression. Use this to specify the value for the optional ELSE part of the case expression.
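The sketch below shows the two forms of case expression, a condition-based form (CASE WHEN) and a subject-based form (CASE subject WHEN), producing the same result. It reuses the grade=3 example from the text; the students table, the other grade values, and the labels are hypothetical, and SQLite stands in for Oracle.

```python
import sqlite3

# In-memory SQLite database used purely for illustration (not Oracle).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT, grade INTEGER)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?)",
    [("Ann", 3), ("Bob", 2), ("Cy", 1)],
)

# First form: each WHEN carries a full condition.
searched = conn.execute(
    """
    SELECT name,
           CASE WHEN grade = 3 THEN 'first class'
                WHEN grade = 2 THEN 'second class'
                ELSE 'unclassified'
           END
    FROM students ORDER BY name
    """
).fetchall()

# Second form: the subject (grade) is named once, each WHEN is a match value.
simple = conn.execute(
    """
    SELECT name,
           CASE grade
                WHEN 3 THEN 'first class'
                WHEN 2 THEN 'second class'
                ELSE 'unclassified'
           END
    FROM students ORDER BY name
    """
).fetchall()

print(searched)
print(simple)
```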
v Function. You can specify a function as an argument to an expression. Selecting this causes the Functions Form dialog box to open. The functions available depend on the database that the query you are building is intended for.
v Job Parameter. You can specify that the argument is a job parameter, the value for which is supplied when you actually run the IBM InfoSphere DataStage job. Selecting this opens the Parameters dialog box.
v Integer. Choose this to specify that the argument is of integer type.
v String. Select this to specify that the argument is of string type.
v Time. Specifies that the argument is the current local time. You can edit the value.
v Timestamp. Specifies that the argument is a timestamp. You can edit the value. The SQL builder inserts the current date and time in the format expected by the database that the query is being built for.
that the SQL builder does not check that the type of parameter you are inserting matches the type expected by the argument you are using it for.
Joining Tables
When you use the SQL builder to help you build select queries, you can specify table joins within the query. When you drag multiple tables onto the table selection canvas, the SQL builder attempts to create a join between the table added and the one already on the canvas to its left. If foreign key metadata is available for the tables, the SQL builder uses it. The join is represented by a line joining the columns the SQL builder has decided to join on. After the SQL builder automatically inserts a join, you can amend it.
When you add a table to the canvas, the SQL builder determines how to join the table with the tables that are already on the canvas. The process depends on whether the added table is positioned to the right or left of the tables on the canvas.
To construct a join between the added table and the tables to its left:
1. The SQL builder starts with the added table.
2. Determine whether there is a foreign key between the added table and the subject table.
   v If a foreign key is present, continue to step 3.
   v If a foreign key is not present, skip to step 4.
3. Choose between the alternatives for joining the tables, in the following order of precedence:
   v Relations that apply to the key fields of the added tables
   v Any other foreign key relation
   Construct an INNER JOIN between the two tables, with the chosen relationship dictating the join criteria.
4. Take the next table to the left as the subject, and try again from step 2 until either a suitable join condition has been found or all tables to the left have been exhausted.
5. If no join condition is found among the tables, construct a default join. If the SQL grammar does not support a CROSS JOIN, an INNER JOIN is used with no join condition. Because this produces an invalid statement, you must set a suitable condition, either through the Join Properties dialog box or by dragging columns between tables.
To construct a join between the added table and the tables to its right:
1. The SQL builder starts with the added table.
2. Determine whether foreign key information exists between the added table and the subject table.
   v If a foreign key is present, continue to step 3.
   v If a foreign key is not present, skip to step 4.
3. Choose between the alternatives, in the following order of precedence:
   v Relations that apply to the key fields of the added tables
   v Any other joins
   Construct an INNER JOIN between the two tables, with the chosen relationship dictating the join criteria.
4. Take the next table to the right as the subject, and try again from step 2.
5. If no join condition is found among the tables, construct a default join. If the SQL grammar does not support a CROSS JOIN, an INNER JOIN is used with no join condition. Because this produces an invalid statement, you must set a suitable condition, either through the Join Properties dialog box or by dragging columns between tables.
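The search described in the steps above can be sketched roughly as follows. This is an illustrative model only, not the SQL builder's implementation: the Relation class, the choose_join function, and the table names are hypothetical, the subject tables are assumed to be visited nearest first, and the default join is assumed to be made with the nearest table.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Relation:
    """Hypothetical model of one foreign key relation between two tables."""
    table_a: str
    table_b: str
    condition: str       # join criteria implied by the relation
    on_key_fields: bool  # True if the relation applies to key fields

def choose_join(added, neighbours, relations, supports_cross_join=True):
    """Pick a join for the added table.

    neighbours lists the tables on one side of the added table,
    nearest first (an assumption). Returns (join type, table, condition).
    """
    for subject in neighbours:
        # Step 2: is there a foreign key between added and subject?
        candidates = [
            r for r in relations
            if {r.table_a, r.table_b} == {added, subject}
        ]
        if not candidates:
            continue  # Step 4: move on to the next subject table
        # Step 3: key-field relations take precedence over other relations.
        best = sorted(candidates, key=lambda r: not r.on_key_fields)[0]
        return ("INNER JOIN", subject, best.condition)
    # Step 5: no join condition found, so fall back to a default join.
    nearest = neighbours[0] if neighbours else None
    if supports_cross_join:
        return ("CROSS JOIN", nearest, None)
    # Invalid until the user supplies a join condition.
    return ("INNER JOIN", nearest, None)

relations = [
    Relation("orders", "customers", "orders.cust_id = customers.id", False),
    Relation("orders", "customers", "orders.id = customers.last_order_id", True),
]

found = choose_join("orders", ["products", "customers"], relations)
default = choose_join("orders", ["products"], relations)
no_cross = choose_join("orders", ["products"], relations, supports_cross_join=False)
print(found, default, no_cross)
```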
Specifying Joins
There are three ways of altering the automatic join that the SQL builder inserts when you add more than one table to the table selection canvas:
v Using the Join Properties dialog box. Open this by selecting the link in the table selection canvas, right-clicking, and choosing Properties from the shortcut menu. This dialog box allows you to choose a different type of join, choose alternative conditions for the join, or choose a natural join.
v Using the Alternate Relation dialog box. Open this by selecting the link in the table selection canvas, right-clicking, and choosing Alternate Relation from the shortcut menu. This dialog box allows you to change foreign key relationships that have been specified for the joined tables.
v By dragging a column from one table to another column in any table to its right on the canvas. This replaces the existing automatic join and specifies an equijoin between the source and target column. If the join being replaced is currently specified as an inner or outer join, the type is preserved; otherwise the new join will be an inner join.
Yet another approach is to specify the join using a WHERE clause rather than an explicit join operation (although this is not recommended where your database supports explicit join statements). In this case you would:
1. Specify the join as a Cartesian product. (The SQL builder does this automatically if it cannot determine the type of join required.)
2. Specify a filter in the Selection tab filter panel. This specifies a WHERE clause that selects rows from within the Cartesian product.
If you are using the SQL builder to build Oracle 8i, Microsoft SQL Server, IBM Informix, or Sybase queries, you can use the Expression Editor to specify a join condition, which will be implemented as a WHERE clause. Oracle 8i does not support JOIN statements.
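The sketch below illustrates why a WHERE clause can stand in for an explicit join: filtering a Cartesian product on the join condition returns the same rows as an INNER JOIN with that condition. SQLite is used in place of Oracle, and the customers and orders tables are hypothetical.

```python
import sqlite3

# In-memory SQLite database used purely for illustration (not Oracle).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer_id INTEGER)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(10, 1), (11, 1), (12, 2), (13, 3)],
)

# Join expressed as a Cartesian product (comma-separated tables)
# filtered by a WHERE clause.
where_rows = conn.execute(
    "SELECT c.name, o.order_id FROM customers c, orders o "
    "WHERE c.id = o.customer_id ORDER BY o.order_id"
).fetchall()

# The same join expressed with an explicit INNER JOIN.
join_rows = conn.execute(
    "SELECT c.name, o.order_id FROM customers c "
    "INNER JOIN orders o ON c.id = o.customer_id ORDER BY o.order_id"
).fetchall()

print(where_rows)
print(join_rows)
```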
example, if you selected from two tables, the database would pair every row in the first table with every row in the second table. If each table had 6 rows, the Cartesian product would return 36 rows. If the SQL builder cannot insert an explicit join based on available information, it will default to a Cartesian product that is formed with the CROSS JOIN syntax in the FROM clause of the resulting SQL statement: FROM FirstTable CROSS JOIN SecondTable. You can also specify a Cartesian product by selecting the Cartesian product option in the Join Properties dialog box. The cross join icon is shown on the join.
v Table join. Select the Table Join option to specify that your query will contain a join condition for the two tables being joined. The Join Condition panel is enabled, allowing you to specify further details about the join.
v Join Condition panel. This shows the expression that the join condition will contain. You can enter or edit the expression manually, or you can use the menu button to the right of the panel to specify a natural join, open the Expression Editor, or open the Alternate Relation dialog box.
v Include. These fields allow you to specify that the join should be an outer join, where the result of the query should include the rows as specified by one of the following:
  - Select All rows from left table name to specify a left outer join.
  - Select All rows from right table name to specify a right outer join.
  - Select both All rows from left table name and All rows from right table name to specify a full outer join.
v Join Icon. This tells you the type of join you have specified.
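The row counts mentioned above can be checked with a small experiment. The sketch below uses the FirstTable and SecondTable names from the text, each with six hypothetical rows, and runs on SQLite rather than Oracle. It also contrasts an inner join with a left outer join, which keeps all rows from the left table.

```python
import sqlite3

# In-memory SQLite database used purely for illustration (not Oracle).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE FirstTable (id INTEGER)")
conn.execute("CREATE TABLE SecondTable (id INTEGER)")
conn.executemany("INSERT INTO FirstTable VALUES (?)", [(i,) for i in range(1, 7)])
conn.executemany("INSERT INTO SecondTable VALUES (?)", [(i,) for i in range(4, 10)])

# Cartesian product: every row paired with every row (6 x 6).
cross_count = conn.execute(
    "SELECT COUNT(*) FROM FirstTable CROSS JOIN SecondTable"
).fetchone()[0]

# Table join with a join condition: only matching rows survive.
inner_rows = conn.execute(
    "SELECT f.id FROM FirstTable f INNER JOIN SecondTable s "
    "ON f.id = s.id ORDER BY f.id"
).fetchall()

# Left outer join: all rows from the left table, NULL where no match exists.
left_rows = conn.execute(
    "SELECT f.id, s.id FROM FirstTable f LEFT OUTER JOIN SecondTable s "
    "ON f.id = s.id ORDER BY f.id"
).fetchall()

print(cross_count, inner_rows, left_rows)
```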
Properties Dialogs
Depending on where you are in the SQL builder, choosing Properties from the shortcut menu opens one of the following dialog boxes:
v The Table Properties dialog box opens when you select a table in the table selection canvas and choose Properties from the shortcut menu.
v The SQL Properties dialog box opens when you select the Properties icon in the toolbox or Properties from the table selection canvas background.
v The Join Properties dialog box opens when you select a join in the table selection canvas and choose Properties from the shortcut menu. This dialog box is described in "Join Properties Dialog Box".
v Alias. The alias that the SQL builder uses to refer to this table. You can edit the alias if required. If the table alias is used in the selection grid or filters, changing the alias in this dialog box will update the alias there.
{}
Note:
v The maximum number of characters in an argument is 256.
v Enclose argument values that have embedded spaces with either single or double quotation marks.
For example:
wsetsrc [-S server] [-l label] [-n name] source
The source argument is the only required argument for the wsetsrc command. The brackets around the other arguments indicate that these arguments are optional.
wlsac [-l | -f format] [key... ] profile
In this example, the -l and -f format arguments are mutually exclusive and optional. The profile argument is required. The key argument is optional. The ellipsis (...) that follows the key argument indicates that you can specify multiple key names.
wrb -import {rule_pack | rule_set}...
In this example, the rule_pack and rule_set arguments are mutually exclusive, but one of the arguments must be specified. Also, the ellipsis marks (...) indicate that you can specify multiple rule packs or rule sets.
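To make the notation concrete, the sketch below models the shape of the wlsac syntax line with Python's argparse module: an optional, mutually exclusive pair (-l or -f format), zero or more key arguments, and one required profile argument. It illustrates the notation only. The parser is hypothetical and does not reproduce the behavior of the real command.

```python
import argparse

def build_parser():
    """Hypothetical parser mirroring: [-l | -f format] [key... ] profile"""
    parser = argparse.ArgumentParser(prog="wlsac-demo")
    # [-l | -f format]: optional, but at most one of the two may be given.
    group = parser.add_mutually_exclusive_group()
    group.add_argument("-l", action="store_true")
    group.add_argument("-f", metavar="format", dest="format")
    # [key... ]: optional and repeatable.
    parser.add_argument("key", nargs="*")
    # profile: required.
    parser.add_argument("profile")
    return parser

parser = build_parser()

# Options first, then two keys and the required profile.
args = parser.parse_args(["-f", "csv", "key1", "key2", "myprofile"])

# Only the required argument.
minimal = parser.parse_args(["myprofile"])

# Supplying both -l and -f violates the mutual exclusion.
try:
    parser.parse_args(["-l", "-f", "csv", "myprofile"])
    conflict_rejected = False
except SystemExit:
    conflict_rejected = True

print(args, minimal, conflict_rejected)
```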
Product accessibility
You can get information about the accessibility status of IBM products.
The IBM InfoSphere Information Server product modules and user interfaces are not fully accessible. The installation program installs the following product modules and components:
v IBM InfoSphere Business Glossary
v IBM InfoSphere Business Glossary Anywhere
v IBM InfoSphere DataStage
v IBM InfoSphere FastTrack
v IBM InfoSphere Information Analyzer
v IBM InfoSphere Information Services Director
v IBM InfoSphere Metadata Workbench
v IBM InfoSphere QualityStage
For information about the accessibility status of IBM products, see the IBM product accessibility information at http://www.ibm.com/able/product_accessibility/index.html.
Accessible documentation
Accessible documentation for InfoSphere Information Server products is provided in an information center. The information center presents the documentation in XHTML 1.0 format, which is viewable in most Web browsers. XHTML allows you to set display preferences in your browser. It also allows you to use screen readers and other assistive technologies to access the documentation.
Contacting IBM
You can contact IBM for customer support, software services, product information, and general information. You also can provide feedback to IBM about products and documentation. The following table lists resources for customer support, software services, training, and product and solutions information.
Table 24. IBM resources
Resource | Description and location
IBM Support Portal | You can customize support information by choosing the products and the topics that interest you at www.ibm.com/support/entry/portal/Software/Information_Management/InfoSphere_Information_Server
Software services | You can find information about software, IT, and business consulting services, on the solutions site at www.ibm.com/businesssolutions/
My IBM | You can manage links to IBM Web sites and information that meet your specific technical support needs by creating an account on the My IBM site at www.ibm.com/account/
Training and certification | You can learn about technical training and education services designed for individuals, companies, and public organizations to acquire, maintain, and optimize their IT skills at http://www.ibm.com/software/swtraining/
IBM representatives | You can contact an IBM representative to learn about solutions at www.ibm.com/connect/ibm/us/en/
Providing feedback
The following table describes how to provide feedback to IBM about products and product documentation.
Table 25. Providing feedback to IBM
Type of feedback | Action
Product feedback | You can provide general product feedback through the Consumability Survey at www.ibm.com/software/data/info/consumability-survey
Documentation feedback | To comment on the information center, click the Feedback link on the top right side of any topic in the information center. You can also send comments about PDF file books, the information center, or any other documentation in the following ways:
  v Online reader comment form: www.ibm.com/software/data/rcf/
  v E-mail: [email protected]
Notices
IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service. IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not grant you any license to these patents. You can send license inquiries, in writing, to: IBM Director of Licensing IBM Corporation North Castle Drive Armonk, NY 10504-1785 U.S.A. For license inquiries regarding double-byte character set (DBCS) information, contact the IBM Intellectual Property Department in your country or send inquiries, in writing, to: Intellectual Property Licensing Legal and Intellectual Property Law IBM Japan Ltd. 1623-14, Shimotsuruma, Yamato-shi Kanagawa 242-8502 Japan The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you. This information could include technical inaccuracies or typographical errors. 
Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice. Any references in this information to non-IBM Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web
sites. The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk. IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you. Licensees of this program who wish to have information about it for the purpose of enabling: (i) the exchange of information between independently created programs and other programs (including this one) and (ii) the mutual use of the information which has been exchanged, should contact: IBM Corporation J46A/G4 555 Bailey Avenue San Jose, CA 95141-1003 U.S.A. Such information may be available, subject to appropriate terms and conditions, including in some cases, payment of a fee. The licensed program described in this document and all licensed material available for it are provided by IBM under terms of the IBM Customer Agreement, IBM International Program License Agreement or any equivalent agreement between us. Any performance data contained herein was determined in a controlled environment. Therefore, the results obtained in other operating environments may vary significantly. Some measurements may have been made on development-level systems and there is no guarantee that these measurements will be the same on generally available systems. Furthermore, some measurements may have been estimated through extrapolation. Actual results may vary. Users of this document should verify the applicable data for their specific environment. Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. 
All statements regarding IBM's future direction or intent are subject to change or withdrawal without notice, and represent goals and objectives only. This information is for planning purposes only. The information herein is subject to change before the products described become available. This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental. COPYRIGHT LICENSE: This information contains sample application programs in source language, which illustrate programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to
IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs. The sample programs are provided "AS IS", without warranty of any kind. IBM shall not be liable for any damages arising out of your use of the sample programs. Each copy or any portion of these sample programs or any derivative work, must include a copyright notice as follows: (your company name) (year). Portions of this code are derived from IBM Corp. Sample Programs. Copyright IBM Corp. _enter the year or years_. All rights reserved. If you are viewing this information softcopy, the photographs and color illustrations may not appear.
Trademarks
IBM, the IBM logo, and ibm.com are trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at www.ibm.com/legal/copytrade.shtml.
The following terms are trademarks or registered trademarks of other companies:
Adobe is a registered trademark of Adobe Systems Incorporated in the United States, and/or other countries.
IT Infrastructure Library is a registered trademark of the Central Computer and Telecommunications Agency which is now part of the Office of Government Commerce.
Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.
Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.
Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.
ITIL is a registered trademark, and a registered community trademark of the Office of Government Commerce, and is registered in the U.S. Patent and Trademark Office.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Cell Broadband Engine is a trademark of Sony Computer Entertainment, Inc. in the United States, other countries, or both and is used under license therefrom.
Java and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or its affiliates. The United States Postal Service owns the following trademarks: CASS, CASS Certified, DPV, LACSLink, ZIP, ZIP + 4, ZIP Code, Post Office, Postal Service, USPS and United States Postal Service. IBM Corporation is a non-exclusive DPV and LACSLink licensee of the United States Postal Service. Other company, product or service names may be trademarks or service marks of others.
Index A
access control setting 13 accessing Oracle databases 86 adding deprecated stages to palette 7 Advanced tab 96 ALL_CONSTRAINTS dictionary view 80 ALL_INDEXES dictionary view 80 ALL_OBJECTS dictionary view 80 ALL_PART_COL_STATISTICS dictionary view 80 ALL_PART_KEY_COLUMNS dictionary view 80 ALL_PART_TABLES dictionary view 80 ALL_TAB_COLS dictionary view 80 ALL_TAB_SUBPARTITIONS dictionary view 80 ALL_TABLES dictionary view 80 ALL_VIEWS dictionary view 80 APT_CONFIG_FILE environment variable parallel configuration file and 17 array size setting 40, 51 ASB agent described 14 automatic loading, Oracle OCI Load 136 CC_MSG_LEVEL environment variable (continued) configuring 60 CC_ORA_BIND_KEYWORD environment variable 82 CC_ORA_CHECK_CONVERSION environment variable 82 CC_ORA_MAX_ERRORS_REPORT environment variable 82 CC_ORA_NLS_LANG_ENV environment variable 82 CC_ORA_NODE_PLACEHOLDER_ NAME environment variable 82 configuring 47 CC_ORA_NODE_USE_PLACEHOLDER environment variable 82 configuring 47 Char data type 76 characters case-sensitivity of 53 NULL 54 white space 54 CLOB data type 74 column definitions DataStage 76 Oracle 76 columns defining 30 mapping 30, 76 Columns mapping 41 propagating at runtime 41 command-line syntax conventions 165 commands syntax 165 configuration Oracle connector 9 configuring bulk load options 53 Configuring 44 manual mode 53 Connection category 101, 109 connections defining 29 Oracle resource managers and 55 Connector Migration Tool command line interface 3 Connector Migration Tool user interface 2 constraints node 17 containers 1 migrating to use connectors 2, 3 customer support contacting 169 data type conversion (continued) writing to Oracle 88 data types mapping 74, 76 Oracle datetime 70 Oracle LOB 71 XMLType 71 data types data type CHAR data type 74 NCHAR data type 74 NVARCHAR 74 Oracle to DataStage mapping 74 VARCHAR data type 74 VARCHAR2 data type 74 database connections defining 29 DataStage column definitions 76 date cache configuring 50 
statistics 50 Date data type 76 DATE data type 74 dates NLS session parameters and 70 DBA_EXTENTS dictionary view 80 accessing 13 Decimal data type 76 DECIMAL data type 74 default.apt file location 17 Deleting rows from an Oracle database 94 deprecated stages 7 dictionary views accessing 13 described 80 Distributed Transaction stage Oracle connector and 55 Double data type 76 DOUBLE data type 74 DUAL dictionary view 80
B
BFILE contents transferring 71 BFILE data type 74 BigInt data type 76 Binary data type 76 BINARY_DOUBLE data type 74 BINARY_FLOAT data type 74 Bit data type 76 BLOB data type 74 bulk loading from external files Oracle OCI Load stages 135, 137 bulk loads array size, configuring 51 buffer size, configuring 51 date cache and 50 exceptions table and 48 indexes and 50 table constraints, managing 48 triggers, managing 48
E
end-of-wave markers configuring 40 environment variables described 82 required 9 error conditions configuring 45, 50 examples constraining nodes 18 lookups 12 parallel configuration file 18 reading data 10 reject link 11 transferring XMLType data 72 transparent application failover 57 writing data 11
C
cannot find on palette 7 case-sensitivity examples 53 case-sensitivity 53 preserving 53 CC_MSG_LEVEL environment variable 82
D
data type conversion reading from Oracle 90
F
failover configuring 56 Fields dropping unmatched 42 Float data type 76 FLOAT data type 74
H
handling special characters (# and $) 86
I
index organized tables (Oracle) 88, 103 indexes bulk loads and 50 rebuilding 50 Input link dropping unmatched fields 42 Input Link Properties tab 97 input links ordering 16 ordering records 16 Inputs Page 97 Integer data type 76 INTERVAL DAY TO SECOND data type 74 INTERVAL YEAR TO MONTH data type 74 isolation level setting 39
load modes, Oracle OCI Load stages 136 loading an Oracle database 94 Loading an Oracle Database 94 loading tables 88 LOB data transferring 71 LOCAL environment variable 82 configuring 29 log file date cache statistics 50 messages, reporting 60 logging NLS database parameters 59 NLS session parameters 59 NLS_LANG environment variable 59 Oracle environment information 59 Logging messages 44 LONG data type 74 LONG RAW data type 74 LongNVarChar data type 76 LONGNVARCHAR data type 74 LongVarBinary data type 76 LONGVARBINARY data type 74 LONGVARCHAR data type 74 looking up an Oracle table 91 lookups configuring 12 links, required 15
NLS_CHARACTERSET database parameter 74 NLS_DATE_FORMAT session parameter 70 NLS_DATE_LANGUAGE session parameter 70 NLS_LENGTH_SEMANTICS database parameter 74 NLS_TIME_FORMAT session parameter 70 NLS_TIMESTAMP_FORMAT session parameter 70 NLS_TIMESTAMP_TZ_FORMAT session parameter 70 node numbers configuring use of 47 node placeholders substituting with numbers 47 node pools configuring 17 nodes configuring 17 non-IBM Web sites links to 175 not on palette 7 NUMBER data type 74 Numeric data type 76 NVarChar data type 76
O
OBJECT_RELATIONAL data type 74 Options category 102, 109 Oracle column definitions 76 Oracle connector 43 Oracle connector partitioned write type described 27 Oracle enterprise stage 85 Oracle environment logging information about 59 Oracle OCI Load stages automatic loading 136 configuration requirements 135, 136 description 135 functionality 135 introduction 135 load modes 136 loading manually 136 properties 137 Oracle OCI stages array size specifying 117 case-sensitive table or column names 118 character data types 128 character set mapping 114, 115 clearing tables 116, 122 CLOB data type 133 column-generated SQL queries. See generated SQL queries. 124 configuration requirements 113 connecting to Oracle databases 114 CREATE TABLE statement 117, 119, 123 creating tables 117, 119, 128 Data Browser 116
M
mapping data types 74, 76 Mapping columns 41 messages debug 68 fatal 60 informational 68 severity levels 60 trace 68 warning 65 metadata importing 14 migrating to use connectors 1 Modulus partitioned read method described 20 example 23 Minimum and maximum range partitioned read method described 20 example 23 multiple matches 44 must do's 93
J
jobs 1 compiling 38 constraints, configuring 17 creating 10, 11, 12, 15 failure, controlling 59 migrating to use connectors 2, 3 parameters, specifying 17 running 38
L
legal notices 171 length character sets specifying 74 specifying 74 library path environment variable 82 library path environment variable NLS_LANG environment variable setting 9 links creating 15 ordering 16 record processing 16 reject 36
N
NChar data type 76 NCLOB data type 74 NLS Map 96 NLS session parameters specifying 70 NLS_CALENDAR session parameter 70
Oracle OCI stages (continued) data types character 128 CLOB 133 DATE 128, 132 numeric 129 support for 128 DATE data type 128, 132 defining character set mapping 114, 115 OCI connections 115 OCI input data 115, 121 OCI output data 123, 126 DELETE statement 121, 122, 123 dialog boxes ORAOCI9 Stage 114, 123 dollar sign ($) character 134 DROP TABLE statement 117, 119 dropping tables 117 editing an ORAOCI9 stage 114 error handling 118 FROM clause 124, 127 functionality 112 generated SQL queries 124, 126, 127 generated SQL statements writing data to Oracle 122 generating SQL statements for reading data 126 for writing data 117, 118, 122 GROUP BY clause 124, 127 handling errors 118 rejected rows 121 HAVING clause 124, 127 input links 111, 115, 116, 120, 123, 128 Input page 114, 115, 121 General tab 116 table name for 117 update action for 116 INSERT statement 121, 122, 123 introduction 111 numeric data types 129 Oracle database, connecting to 114 ORAOCI9 Stage dialog box 123 ORAOCI9 Stage window 114, 115, 122, 124 ORDER BY clause 124, 127 output links 111, 114, 123, 126, 127, 128 Output page 114, 124, 126 pound (#) character 134 Query Type 116 reading data from Oracle 126, 128 reference links 111 reject row handling 121 Repository 117 SELECT statement 124, 126, 127 special characters 134 SQL 122 SQL builder 122 SQL Builder 116, 123, 124 SQL Clauses window 124, 127 SQL queries defined at run time 124 generated 124, 126, 127 in file 128
Oracle OCI stages (continued) SQL queries (continued) SQL Builder 124 user-defined 124, 127 SQL statements DELETE 123 examples 123, 127, 128 FROM clause 124, 127 GROUP BY clause 124, 127 HAVING clause 124, 127 INSERT 123 ORDER BY clause 124, 127 SELECT 124, 126, 127 syntax 127 UPDATE 123 WHERE clause 124, 126, 127 SQL, user-defined 117, 123, 124 Stage page 113, 115 table name 117 tables clearing 116, 122 creating 117, 119, 128 reading from 126, 128 writing to 121, 123 transaction grouping 112, 118, 119, 120 transaction handling 120 update action, input pages 116 UPDATE statement 121, 122, 123 user-defined SQL 117, 123, 124, 127 user-defined SQL statements writing data to Oracle 123 warning messages 118 WHERE clause 124, 126, 127 windows ORAOCI9 Stage 115, 122, 124 SQL Clauses 124, 127 Oracle partitions partitioned read method described 20 example 23 Oracle services connecting to 29 Oracle stages 85 input properties 97 output properties 106 Oracle syntax parameters and 30 ORACLE_HOME environment variable 82 setting 9 ORACLE_SID environment variable configuring 29 ORACLE-SID environment variable 82 Orchestrate syntax parameters and 30 Other data type 74 Output Link Properties tab 106 Outputs page 106
parallel configuration file (continued) location, specifying 17 parallel reads configuring 19 parameters lookups described 12 lookups and 12 specifying 30 partitioned read methods described 20 partitioned reads configuring 19 examples 23 partitioned write types described 27 Partitioning tab 104 partitions job constraints and 17 performing an in memory lookup on an Oracle database table 96 PL/SQL block required keywords 33 Preserving trailing blanks 43 product accessibility accessibility 167 product documentation accessing 163 properties Oracle stage input 97 Oracle stage output 106
Q
quotation marks using to preserve case-sensitivity 53
R
RAW data type 74 read mode specifying 33 reading data from Oracle tables Oracle OCI stages 123, 128 reads See also parallel reads See also partitioned reads configuring 10 links, required 15 parallel 10 partitioned 10 records ordering 16 prefetching 40 processing 16 rejected, managing 36 redo log managing 50 reject links configuring 36 roles creating 13 ROWID data type 74 Rowid hash partitioned read method described 20
P
palette displaying stages 7 parallel configuration file APT_CONFIG_FILE environment variable and 17 default 17
Rowid hash partitioned read method (continued) example 23 Rowid range partitioned read method described 20 example 23 Rowid round robin partitioned read method described 20 example 23 rows prefetching 40 rejected, managing 36
S
scripts exception table 48 SELECT statements specifying 33 XMLType data and 71 server stages SQL builder 122 services connecting to 29 default 29 SmallInt data type 76 software services contacting 169 Source category 107 special characters in command-line syntax 165 SQL builder server stages 122 SQL Builder Oracle OCI stages 122 SQL statements case-sensitivity and 53 running before or after processing data 45 stage not on palette 7 Stage page 96 stages adding to palette 7 support customer 169 syntax command-line 165 Easy Connect string 29 parameters 30 table name 19
Time data type 76 Timestamp data type 76 TIMESTAMP data type 74 TIMESTAMP WITH LOCAL TIME ZONE data type 74 TIMESTAMP WITH TIME ZONE data type 74 timestamps NLS session parameters and 70 TinyInt data type 76 TNS_ADMIN environment variable 82 setting 9 tnsnames.ora file FAILOVER mode, configuring 56 location of 9, 29 trademarks list of 171 transactions committing 39, 45 record count, specifying 40 transparent application failover configuring 56 examples 57 triggers managing 48 troubleshooting log files and 59 TWO_PHASE environment variable configuring 29 TWO_TASK environment variable 82 configuring 29
writes (continued) links, required 15 parallel See parallel writes partitioned See partitioned writes writing data to Oracle tables Oracle OCI stages 115, 123
X
XMLType data examples of transferring 72 transferring 71 XMLTYPE data type 74
U
Unknown data type 76 UNKNOWN data type 74 updating an Oracle database 94 updating an Oracle table 93 UROWID data type 74 user privileges setting 13 USER_TAB_PRIVS dictionary view 80 utlexcpt.sql script location of 48 utlexcpt1.sql script location of 48
V
values empty string 54 padding 54 text 54 VARBINARY data type 74 VarBinary type 76 VarChar data type 76
T
table actions configuring 45 table constraints managing 48 table names syntax for specifying 19 tables creating 76 creating before writing 45 replacing before writing 45 truncating before writing 45 Target category 100
W
Web sites non-IBM 175 write mode specifying 34 types, described 34 writes configuring 11
Printed in USA