Spotfire TDV 7.0.7 User Guide
User Guide
Version 7.0.7
Two-Second Advantage®
Important Information
SOME TIBCO SOFTWARE EMBEDS OR BUNDLES OTHER TIBCO SOFTWARE. USE OF SUCH EMBEDDED
OR BUNDLED TIBCO SOFTWARE IS SOLELY TO ENABLE THE FUNCTIONALITY (OR PROVIDE LIMITED
ADD-ON FUNCTIONALITY) OF THE LICENSED TIBCO SOFTWARE. THE EMBEDDED OR BUNDLED
SOFTWARE IS NOT LICENSED TO BE USED OR ACCESSED BY ANY OTHER TIBCO SOFTWARE OR FOR
ANY OTHER PURPOSE.
USE OF TIBCO SOFTWARE AND THIS DOCUMENT IS SUBJECT TO THE TERMS AND CONDITIONS OF A
LICENSE AGREEMENT FOUND IN EITHER A SEPARATELY EXECUTED SOFTWARE LICENSE
AGREEMENT, OR, IF THERE IS NO SUCH SEPARATE AGREEMENT, THE CLICKWRAP END USER
LICENSE AGREEMENT WHICH IS DISPLAYED DURING DOWNLOAD OR INSTALLATION OF THE
SOFTWARE (AND WHICH IS DUPLICATED IN THE LICENSE FILE) OR IF THERE IS NO SUCH SOFTWARE
LICENSE AGREEMENT OR CLICKWRAP END USER LICENSE AGREEMENT, THE LICENSE(S) LOCATED
IN THE “LICENSE” FILE(S) OF THE SOFTWARE. USE OF THIS DOCUMENT IS SUBJECT TO THOSE TERMS
AND CONDITIONS, AND YOUR USE HEREOF SHALL CONSTITUTE ACCEPTANCE OF AND AN
AGREEMENT TO BE BOUND BY THE SAME.
This document contains confidential information that is subject to U.S. and international copyright laws and
treaties. No part of this document may be reproduced in any form without the written authorization of TIBCO
Software Inc.
TIBCO and the TIBCO logo are either registered trademarks or trademarks of TIBCO Software Inc. in the United
States and/or other countries.
TIBCO, Two-Second Advantage, TIBCO Spotfire, TIBCO ActiveSpaces, TIBCO Spotfire Developer, TIBCO EMS,
TIBCO Spotfire Automation Services, TIBCO Enterprise Runtime for R, TIBCO Spotfire Server, TIBCO Spotfire
Web Player, TIBCO Spotfire Statistics Services, S-PLUS, and TIBCO Spotfire S+ are either registered trademarks
or trademarks of TIBCO Software Inc. in the United States and/or other countries.
All other product and company names and marks mentioned in this document are the property of their
respective owners and are mentioned for identification purposes only.
THIS SOFTWARE MAY BE AVAILABLE ON MULTIPLE OPERATING SYSTEMS. HOWEVER, NOT ALL
OPERATING SYSTEM PLATFORMS FOR A SPECIFIC SOFTWARE VERSION ARE RELEASED AT THE SAME
TIME. SEE THE README FILE FOR THE AVAILABILITY OF THIS SOFTWARE VERSION ON A SPECIFIC
OPERATING SYSTEM PLATFORM.
THIS DOCUMENT IS PROVIDED “AS IS” WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR
IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT. THIS DOCUMENT COULD INCLUDE
TECHNICAL INACCURACIES OR TYPOGRAPHICAL ERRORS. CHANGES ARE PERIODICALLY ADDED
TO THE INFORMATION HEREIN; THESE CHANGES WILL BE INCORPORATED IN NEW EDITIONS OF
THIS DOCUMENT. TIBCO SOFTWARE INC. MAY MAKE IMPROVEMENTS AND/OR CHANGES IN THE
PRODUCT(S) AND/OR THE PROGRAM(S) DESCRIBED IN THIS DOCUMENT AT ANY TIME.
THE CONTENTS OF THIS DOCUMENT MAY BE MODIFIED AND/OR QUALIFIED, DIRECTLY OR
INDIRECTLY, BY OTHER DOCUMENTATION WHICH ACCOMPANIES THIS SOFTWARE, INCLUDING
BUT NOT LIMITED TO ANY RELEASE NOTES AND "READ ME" FILES.
Copyright © 2002-2018 TIBCO Software Inc. All rights reserved.
TIBCO Software Inc. Confidential Information
Contents
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .18
Product-Specific Documentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
How to Access TIBCO Documentation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
How to Contact TIBCO Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
How to Join TIBCO Community . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
Procedures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267
About Procedures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267
Creating a SQL Script Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 268
Working with SQL Scripts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 270
Designing Parameters for a SQL Script . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271
Promoting Procedures to Custom Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 273
Pushing a Custom Function to the Native Data Source. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 274
Using Pipes and Cursors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276
Triggers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .391
About Triggers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 391
Creating a JMS Event Trigger . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 396
Creating a System Event Trigger . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 397
Creating a Timer Event Trigger . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 399
Creating a User-Defined Event Trigger . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 402
Creating Email Alerts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 403
Preface
Documentation for this and other TIBCO products is available on the TIBCO
Documentation site. This site is updated more frequently than any documentation
that might be included with the product. To ensure that you are accessing the
latest available help topics, please visit:
• https://docs.tibco.com
Product-Specific Documentation
The following documents form the TIBCO® Data Virtualization (TDV)
documentation set:
• TIBCO TDV and Business Directory Release Notes: Read the release notes for a
list of new and changed features. This document also contains lists of known
issues and closed issues for this release.
• TDV Installation and Upgrade Guide
• TDV Administration Guide
• TDV Reference Guide
• TDV User Guide
• TDV Security Features Guide
• Business Directory Guide
• TDV Application Programming Interface Guide
• TDV Tutorial Guide
• TDV Extensibility Guide
• TDV Getting Started Guide
• TDV Client Interfaces Guide
• TDV Adapter Guide
• TDV Discovery Guide
• TDV Active Cluster Guide
• TDV Monitor Guide
• TDV Northbay Example
Overview of Studio
This topic describes Studio, the main tool for working with TIBCO® Data
Virtualization (TDV). It covers how to run Studio and describes its user interface.
• About Studio, page 21
• Opening Studio and Switching User, page 27
• Customizing Studio, page 29
• Modifying Studio Memory Usage, page 35
• Changing Your TDV Password, page 36
• Studio Unicode Support, page 36
• Security Features, page 38
About Studio
Studio is the primary tool you use to work with TDV. Studio has three major
components, accessed using tabs along the left side of the Studio window:
• Modeler
— Model, manage, and publish data sources, transformations, views, SQL
scripts, parameterized queries, packaged queries, definition sets, triggers,
TDV databases, and Web services.
— Define and refine resource properties and parameters using one of the
work space editors.
— Archive, export, and import TDV resources.
— Configure TDV components and operational controls.
• Manager
— Monitor and manage resource activities, including data source interactions
with TDV. See the TDV Administration Guide for more information.
• Discovery
— Reveal hidden correlations in enterprise data so you can build data models
for data virtualization and reporting.
— Scan data and metadata from across data repositories, whether they are
applications, databases, or data warehouses.
— Using Discovery’s graphical tools, create data models based on system
metadata and discovered relationships. For details, see the Discovery User
Guide.
The following graphic identifies the main components of the Modeler window.
For more information about the Modeler resource tree, see Modeler Resource
Tree, page 23.
The contents of the Modeler work space depend on the type of resource you
have opened. Each resource has its own resource editor, and the editor usually
has multiple tabs. To learn about the work space contents and how to work with
each type of resource, refer to the appropriate topic of this manual.
The parent container path plus the resource name make up a unique identifier for
invoking and referencing any TDV-defined resource. For example, the
inventorytransactions table is referred to as
/shared/examples/ds_inventory/inventorytransactions. This reference is
different from the XML schema definition set with the same name
(/shared/examples/InventoryTransactions), both because the parent container
path is different and because the name and path used to refer to a resource are
case-sensitive.
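For illustration only, a view or SQL script could reference this table by its full
path in a FROM clause (assuming the example resources are installed):
SELECT * FROM /shared/examples/ds_inventory/inventorytransactions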
Note: The resource tree displays only the resources the current user has
permissions to view and use.
The resource tree displays all system-created containers and all resources
currently defined in TDV. When you create a new resource to use in the system,
you add that resource to a container. None of the top-level system-created
containers can be edited or deleted. The system-created nodes in the resource tree
are described in the following sections:
• Desktop, page 24
• Data Services, page 24
• My Home, page 25
• Shared, page 26
• <TDV Host>, page 26
Desktop
The Desktop represents the current user’s virtual work area on TDV Server. Some
container names shown on the Desktop are shortcuts that provide easy access to
containers deeper in the server’s resource hierarchy (Data Services, My Home,
Shared). The fourth container on the desktop, <TDV Host>:<port id>, represents
the complete resource hierarchy of the server, including the contents of the other
three Desktop containers.
Note: You cannot add a resource directly to the Desktop node of the resource tree.
Data Services
The Data Services folder contains two system-created containers, neither of which
can be edited or deleted:
• Databases, into which you publish the resources that you want to make
available to client applications that connect to TDV Server.
• Web Services, into which you publish the resources that you want to expose to
Web Services clients.
Databases and Web Services within Data Services display all the resources
published by any user in the system. Each published resource is tied to an
unpublished resource residing elsewhere in the system.
For details on publishing, see Publishing Resources, page 407. For details on how
clients can access the published resources, see the TDV Client Interfaces Guide.
The Databases/system folder of the resource tree contains system tables that are
used by the TDV system. For details on the contents of Databases/system, see the
TDV Reference Guide.
Note: The system tables are subject to change with new releases of the system.
My Home
My Home is a semi-private (visible only to you and an administrator with full
privileges) work area where you can design, develop, and test resources prior to
exposing them as shared resource definitions or publishing them as externally
available resources.
My Home is a shortcut to the directory folder
<hostname>/users/<Domain>/<CurrentUser>. The name of the Studio user
appears in parentheses at the top of the Studio navigation tree.
Shared
Shared is a system-created container that contains all resources made available to
all users in the system with appropriate access privileges. It is a place to store
projects for teams to work on. For details about access privileges, see the TDV
Administration Guide.
Shared/examples
This folder contains sample resources. For more information, see the TDV Getting
Started Guide.
<TDV Host>
<TDV Host> is the name and port ID (in the form <TDV Host:Port ID>) of the
TDV server to which Studio is connected. If Studio is connected to TDV on the
local computer, the hostname is localhost by default. The default port number is
9400. <TDV Host> represents the entire contents of the TDV Server on your
machine and has the following system-created containers:
Containers Description
services The services container has the same contents as the desktop folder named Data
Services.
lib The lib container contains all the system API calls. For details, see the TDV
Application Programming Interfaces Guide.
security policy This container holds security policies. TDV comes with a set of prebuilt security
policies and if you define custom security policies, they are saved in this location.
shared The shared container has the same contents as the desktop folder named Shared.
When you add any resource to <TDV Host>/shared, the change is reflected in
the structure of Desktop/Shared. For details, see Location where Resource Tree
Items Are Saved, page 23.
users The users container has one container for each security domain in the server. The
default domain composite is system-created, and cannot be edited or deleted.
The system-created user named admin belongs to the domain composite.
Each domain in users is represented by a folder-container, and each domain has a
container for each user in that domain. These containers are referred to as those
users’ “home folders.”
You cannot add a resource directly to the users node, but when an administrator
adds a user to a domain in the system, the new user’s name and resources are
displayed here. For details on adding users to the system, see the TDV
Administration Guide.
From users, you can view other users in the system and use their resources if you
have appropriate access privileges.
If you view your home folder from users, you see your Desktop/My Home,
because My Home represents your home folder.
Starting Studio
Studio provides access to TDV and its resources. When you start Studio, you need
to know the TDV Server instance to which you want to connect, the domain, and
the port ID. For details, see the TDV Installation and Upgrade Guide or the TDV
Getting Started Guide.
To start Studio
1. Select Start > All Programs > TIBCO > Studio <ver> > Start Studio <ver> from
the Windows desktop. This command opens the Studio sign-in window.
2. Enter your user credentials and information for the TDV to which you are
connecting. For details, see the TDV Installation and Upgrade Guide or the TDV
Getting Started Guide.
3. Click Connect.
4. Click OK.
Studio opens on the Modeler tab (the top tab along the left edge of the window).
See About Studio, page 21 for information about how the Studio user interface is
organized.
To start Studio from the command line
1. Open a command prompt.
2. Run one of the following commands (an example follows this procedure):
Command Description
studio.bat Starts Studio from a DOS prompt.
studio.bat -server=hostname Starts Studio from a DOS prompt for a specific TDV Server.
studio.sh -server=hostname Starts Studio from a UNIX or Linux shell for a specific TDV Server.
3. Enter your user credentials and information for the TDV to which you are
connecting. For details, see the TDV Installation and Upgrade Guide or the TDV
Getting Started Guide.
4. Click Connect.
5. Click OK.
Studio opens on the Modeler tab (the top tab along the left edge of the window).
See About Studio, page 21 for information about how the Studio user interface is
organized.
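For example, to start Studio on Windows and connect directly to a specific TDV
Server, a command such as the following could be used (the hostname shown is a
placeholder):
studio.bat -server=tdv-dev.example.com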
To change the user, port, domain, or connection type, and then restart
Studio
1. On the File menu, choose Switch User.
2. In the dialog box that appears, confirm that you want to switch user by
clicking Yes.
3. Supply your username and password, and enter the domain, server, and port
information.
4. (optional) If the connection must be secured, check the Encrypt check box to
initiate an SSL connection between Studio and TDV.
5. Click Connect. Studio reopens.
Customizing Studio
This section contains topics that you can use to customize your Studio experience.
• Customizing Studio Using the Studio Options Dialog, page 29
• Customizing Studio Result Panel XML Text, page 35
To customize Studio
1. Open Studio and log in.
2. Select Edit > Options.
Studio opens the Options dialog box with the General tab displayed.
3. Edit the options as necessary.
Option Description
Display custom name in the title bar: Enter a string to display your text in the
Studio application title bar.
Store execution parameters on disk and reuse when Studio runs again: Makes this
Studio instance remember the procedure input parameters used during Studio test
execution, so the test is easier to duplicate.
Copy privileges from parent folder when creating new resources: Sets the default
privileges to apply when creating new resources, making child resources
accessible to the same groups and users.
Use a colored background in the View Model: Toggles the View Model background
between blue and white.
Show the status bar at bottom of the main window: Toggles display of the Studio
status bar. This bar shows the encryption status of the connection between Studio
and TDV, Studio memory usage, and introspection progress (when underway).
Show the navigation tree in main window: Toggles display of the Studio resource
tree in Studio Modeler. This setting also toggles display of the Manager
navigation panel.
Enable VCS: Select this option to enable PDTool features. PDTool is available as
an open source tool through GitHub, and professional services can be contacted to
help you with your implementation.
Logging - Enable additional logging within Studio: Adds logged detail to assist
with debugging. This can add a lot of information to the log file but has little
effect on performance. Additional logging information is sent to the
<TDV_install_dir>/logs/sc_studio.log file, which you can view using the View
Studio Log button.
Session Heartbeat (milliseconds) - Session Heartbeat Interval: This value adjusts
the Studio session timeout. After a timeout, Studio prompts you to sign in again.
Warn before execution when other editors are unsaved: When checked, warns that a
procedure will return results based on the resource metadata definitions saved on
the server, and not on as-yet-unsaved definitions in editors currently open on the
Studio desktop.
Warn before data lineage calculation when current editor is unsaved: When checked,
warns that the data lineage calculation will return results based on resource
metadata definitions on the server, rather than metadata changed but not yet saved
in an open editor.
Display confirmation prior to removing the view model.
Display confirmation prior to removing the OLAP view model.
Option Description
Automatic Capitalization: Capitalizes and formats SQL keywords when they are typed
or pasted. You can set font, color, and size using the Studio Options dialog.
Automatic Indentation: Automatically indents text based on the SQL keyword on the
previous line. The following keywords have the same indentation, and the SQL
Editor indents the line that follows a keyword in this list:
SELECT, FROM, WHERE, GROUP BY, HAVING, ORDER BY, FULL OUTER JOIN,
LEFT OUTER JOIN, RIGHT OUTER JOIN, INNER JOIN, UNION, UNION ALL,
EXCEPT, EXCEPT ALL, INTERSECT, and INTERSECT ALL.
Additionally, newlines made with an Enter keystroke start with the indentation
present on the preceding line, whether indentation is done with spaces or tabs. A
single Backspace keystroke puts the cursor back at the far left position of the
line. (See the example after this table.)
Indent with Spaces (not tabs): Allows insertion of spaces (ASCII character 32) for
indentation instead of tabs (ASCII character 9). Both are handled as required for
querying the underlying data sources. By default, tabs are used for indentation.
Tab width lets you change the default tab width as represented by spaces. Manually
entered tabs can differ from the default SQL display of four spaces used as
indentation.
Tab Width: Lets you change tab spacing from four (default) character widths to any
value between 1 and 32.
Font list: Lets you select a font family from the scroll list.
Font size list: Lets you select the font size from the scroll list.
Bold: Check box that lets you select or unselect bold font.
Italic: Check box that lets you select or unselect italic font.
Scrollable field: Lets you preview what the SQL text looks like with your font
family, size, and bold or italic choices.
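For example, with Automatic Capitalization and Automatic Indentation enabled, a
simple query typed into the SQL Editor would be laid out with the listed keywords
aligned at the same indentation level (the table and column names here are
hypothetical):
SELECT orderid, status
FROM orders
WHERE status = 'OPEN'
ORDER BY orderid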
6. Select and edit the options on the SQL Scratchpad tab as necessary.
Some Studio SQL Scratchpad editor options are configurable in the SQL
Scratchpad tab of the Studio Options panel. See Views and Table Resources,
page 215 for more information about how to use the SQL Scratchpad. Each
group has a Reset Settings button on the right, and the SQL Scratchpad tab has
an overall Restore Defaults button.
Item Description
Open SQL Scratchpad Editor when Studio starts if it was open at exit check box:
When checked, causes the SQL Scratchpad editor to open automatically when Studio
restarts in cases where it was open when Studio was last exited. Default: checked.
Automatically save content when the editor is deactivated check box: When checked,
causes Studio to save the SQL query currently being displayed in the SQL
Scratchpad editor, as well as all queries in the History list, when you close the
SQL Scratchpad editor. Default: checked.
Maximum History Size drop-down list: Lets you retain between 2 and 50 of the most
recent SQL queries you have created in the SQL Scratchpad editor. Locked queries
in the History list do not count toward this maximum. Default: 12.
7. Select and edit the options on the Transformation Editor tab as necessary.
Item Description
Maximum Operation Width: Set the maximum width of the operations shown on the
Transformation Editor model page.
Error Level to Show in the Editor: Because defining a transform can be
complicated, you can configure the level of error reporting displayed by Studio
while you are defining your transform.
Show Insert Cast Dialog: When adding a cast function to your transform, you are
shown a dialog to help you define the details of the cast.
Show Prefixes in Type Names: The prefix values are used to indicate function type.
They can be used to identify whether a function is SQL or custom. Prefixes are
also used when defining namespaces. Seeing the namespace can help you avoid
namespace conflicts.
Show Operation Annotations: For the operations shown on the Transformation Editor,
display any annotations in the heading portion of the operations.
You can change the extended memory and maximum memory allocation pool
values for Studio.
Xmx The maximum memory allocation pool for a Java Virtual Machine (JVM).
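For example (as a general JVM reference value rather than a TDV default), an Xmx
setting of 2048m would allow the Studio JVM to use up to 2 GB for its maximum
memory allocation pool.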
You can change your password using Studio’s File menu, if you have the Access
Tools right.
Changes made to the user rights profile take effect nearly immediately because
TDV checks for appropriate rights every time feature access is attempted.
The TDV Server returns Unicode characters in all messages carrying data;
messages are transformed to Unicode. Studio can be configured with Unicode
TrueType fonts to display those characters:
• Upgrading Studio to Display Unicode, page 37.
Security Features
Security features are discussed throughout this guide, but especially in these
topics and sections:
• HTTPS
Publishing resources (Publishing Resources, page 407)
• Kerberos
Configuring and connecting data sources (Configuring Relational Data
Sources, page 77, Configuring Web-Based Data Sources, page 105, Configuring
File Data Sources, page 165)
Publishing resources (Publishing Resources, page 407)
• Passwords
In Studio (Changing Your TDV Password, page 36)
Working with data sources (Working with Data Sources, page 65)
Working with views (Views and Table Resources, page 215)
Publishing resources (Publishing Resources, page 407)
Setting up and using caching (TDV Caching, page 465)
Setting up and using data ship (Data Ship Performance Optimization,
page 597)
• SSL
In Studio (Customizing Studio, page 29)
Working with data sources (Working with Data Sources, page 65)
Publishing to a database service (Publishing Resources to a Database Service,
page 416)
This topic describes how to create and work with TDV resources in Studio.
• About Resource Management, page 39
• Creating a TDV Resource, page 40
• Opening a Resource, page 42
• Getting Information about a Resource, page 42
• Annotating a Resource, page 44
• Moving, Copying, Deleting, or Renaming a Resource, page 44
• Searching for Resources in the Server, page 46
• Impacted Resources, page 47
• Exporting (Archiving) and Importing Resources, page 50
• Using Studio for a Full Server Backup, page 60
• Built-in procedures are available in the TDV system library (in /lib/resource)
for managing resources. Refer to the TDV Reference Guide and the TDV
Application Programming Interface Guide for details.
When you create a resource, you are adding an instance of the resource to the
metadata repository. For a description of the hierarchy of user-created resources,
refer to Modeler Resource Tree, page 23.
The process you follow and the resource access rights depend on the type of
resource, as described in these sections:
• Locating a Container for a TDV Resource, page 40
• Creating a Resource, page 41
• Copying Privileges to Another Resource, page 41
You must grant access and use privileges explicitly to other users and groups
before they can use the resource you create. For details on access privileges, see
the TDV Administration Guide.
Creating a Resource
This section describes the common procedure for creating TDV resources. Several
basic resource types are created in the same way. These resource types include:
• Definition sets (SQL, XML, or WSDL)
• Folders
• Models
• Parameterized queries
• SQL scripts
• Triggers
• Views
• XQuery procedures
Creation and configuration of other resource types are described elsewhere:
• Data sources are discussed in Working with Data Sources, page 65 through
Configuring File Data Sources, page 165.
• Packaged queries are discussed in Procedures, page 267.
• Transformations are discussed in Using Transformations, page 311.
Opening a Resource
To open a resource
• Double-click the resource name in the resource tree.
• Right-click and select Open from the popup menu.
• Select the resource and from the File menu, choose Open <resource name>.
• Select the resource and type Ctrl-o.
Getting Information about a Resource
You can get information about a resource, such as who owns it, who created it,
and when it was created, as well as any annotations that might have been added.
The data source tabs display the following information about the resource.
Field Description
Name User-provided name and location.
Owner Name of the owner.
Orig Creation Date Date the resource was created. If created in a release prior to TDV 6.2,
this field contains Not Recorded.
Orig Owner Name of the owner who created this resource. If created in a release
prior to TDV 6.2, this field contains Not Recorded.
Last Modified Date Date when the resource was last edited.
Last Modified User Name of the user who last edited the resource.
Maximum Number In Queue (All triggers except JMS triggers; in the Queue Properties
section) Displays and lets you set the number of times a trigger can be queued
simultaneously before duplicates are dropped.
A value of 1 (default) means that the queue for this trigger can contain
only the trigger currently being processed. If the trigger repeats before
the previous one finishes its task, it is lost.
When the value is set to 2 or greater, additional triggers start execution
as soon as the current trigger finishes. Each refresh failure results in an
email notification. A larger number of queues for a trigger reduces
event loss but might use more memory.
(JMS triggers; in the Queue Properties section) TDV sets this field to 0.
You cannot change it, because no message queueing is done inside
TDV.
Target Definition Set (XQuery transforms) The path to the definition set containing the
schema.
Target Schema (XQuery transforms) The fully qualified name space for the schema.
Execute only once per transaction for each unique set of input values (XQuery
procedures) If a procedure is called multiple times with the same parameters and
values from within another procedure, it is invoked only once and the original
results are reused. This ensures that the execution results of a procedure are
kept intact for the transaction, as long as the procedure’s input parameter values
remain the same. For details, see Setting Transaction Options for a Procedure,
page 304.
Annotating a Resource
You can edit any resource that you own or for which you have editing privileges.
Whenever a resource has been edited or changed, Studio indicates this by
enabling the Save button on the Studio toolbar, displaying the name on the tab
editor in blue italic type with an asterisk, and displaying the editor tab in which
the change was made in blue type with an asterisk.
Saving the resource removes the asterisk and displays the tab text in black type.
To annotate a resource
1. Verify that you have Read Access and Write privilege on the specific resource.
2. Open the resource.
3. Select the Info tab to view and modify annotations.
You can move a resource by cutting it and pasting it into another
location, or by renaming it, subject to the following rules:
• You cannot move an unpublished resource into Data Services, which only
stores resources that have been published. See About Publishing Resources,
page 408.
• You cannot move a published resource out of a Data Services container. You
must delete it, and then, if you want, republish it.
• The only way to move the children of a data source is to move the entire data
source.
• You can move a container (except a system-created container). When you do,
all of the resources it contains are moved.
• You can move a leaf, such as a view or a procedure.
• As with any cut and paste sequence, if you cut a resource and do not paste it
in its new location before you cut or copy another resource, the original
resource is lost.
You can rename any resource in the resource tree. All references to the original
name in the SQL within resources in Studio are also changed to the new name.
Open resources that depend on the renamed resource are closed so that they can
be updated with the changed dependency name.
To move a resource
1. Right-click the name of the resource and select Cut, or select the resource
name and click the Cut toolbar button.
2. Right-click the name of the container in which to place the resource, and select
Paste into; or select the container name and select the Paste into button on the
toolbar.
To copy a resource
1. Right-click the name of the resource and select Copy; or select the resource
name and click the Copy toolbar button.
2. Right-click the name of the destination container and select Paste into; or
select the container name and select the Paste into button on the toolbar.
To delete a resource
1. Right-click the resource, and select Delete; or select the resource and click the
Delete button.
To select multiple resources for deleting, hold down Ctrl while selecting the
resources one by one, or hold down Shift while selecting a set of adjacent
resources.
2. In the Confirmation window, click Yes.
To rename a resource
1. Right-click the resource, and select Rename; or select the resource and then
the Edit > Rename option.
2. Place the cursor anywhere on the resource name, type a new name, and press
Enter.
You can search for any resource in the server using the Studio search feature.
The search field includes a timer that starts and waits about 0.3 seconds for you to
stop typing. Each time you type a character, the timer restarts. When you stop
typing, the server is queried for any resources whose name begins with the text
you have typed, and resource references containing the resource name, path, and
type are returned.
Impacted Resources
An impacted resource in TDV is displayed with a red impacted icon in the Studio
resource tree, because the resource definition is no longer valid. For example, if a
composite view includes tables from a data source that is later deleted, the
resource becomes impacted. The resource itself, any resource that depends upon
it, and its containers display a red circle with an exclamation point in it.
5. Click OK.
For more information, see Customizing Studio, page 29.
Mismatched Resources A resource depends on another resource that has changed and no
longer matches—for example, if the number of input variables changes
for a procedure referenced in a view.
Security Issue A resource refers to another resource that you no longer have
permission to access—for example, if the permission on a referenced
data source is made more restrictive and you no longer have the
required level of permission.
Syntax Error A syntax error is introduced in the resource—for example, if you type
an invalid SQL statement and save it.
You can save selected resources—or an entire TDV Server instance—to a TDV
archive (CAR) file so that you can port the resources to another TDV instance or
preserve a snapshot of TDV.
Archiving a resource is called exporting; deploying it to the same or another
location is called importing. When you export a resource, a copy of the resource is
archived into a CAR file and saved in a specified location, while the original
repository configuration for the resource tree remains intact. Archived resources
can be redeployed by importing this CAR file. This file is referred to as the export
file or import file.
If you want to archive an entire TDV Server instance, it is best to first perform a
full server backup. You can retain the resulting CAR file as backup, or use it to
restore the TDV Server to a known state. See Using Studio for a Full Server
Backup, page 60, for more information.
You can export and import resources using Studio or using TDV command-line
utilities. See the TDV Administration Guide for information about using
backup_export, backup_import, pkg_export, and pkg_import. With appropriate
access rights, you can export a single resource, a group of resources, or the entire
TDV configuration (except for local machine settings).
By default, importing adds CAR file resources to existing resources in identical
container paths. Whenever both the resource path and name in the CAR file
match the resource path and name on the target server, the CAR file version
overwrites the target resource.
• Access Rights for Export and Import, page 50
• Exporting Resources, page 51
• Export and Import Locked Resources, page 53
• Marking Rebindables, page 53
• Rules for Importing Resources, page 54
• Importing a Resource, page 55
• Rebinding an Imported Resource, page 59
Exporting Resources
You can export selected nodes and resources in the Studio resource tree to a CAR
file that contains the relevant resource definition metadata. Resources selected are
exported with all of their child objects.
Exported CAR files can include caching configurations and the cached data, data
source connections, access privileges, custom JAR files, dependencies, cardinality
statistics, Discovery files, and the owners of the exported resources (called
“required users”).
Custom Jars JAR files created to support new data sources. JAR drivers in the
conf\adapters\system folder are exported, but JAR files in the
apps\dlm\<DATASOURCE>\lib folder are not exported, regardless
of this setting.
Custom JAR settings overwrite system JAR settings, which in turn
overwrite DLM JAR settings.
Cardinality Statistics Includes data source statistics, and configurations such as refresh
mode, refresh timeframe, manual cardinality overwrites, and specific
column settings.
Discovery Files Be sure to include Discovery files, such as indexes and relationship
files.
Marking Rebindables
Many resources, such as views, procedures, and Web services, are dependent on
one or more underlying resources. Dependent resources are considered ‘bound’
to the underlying resources. Imported resources often need to be bound again in
their new location.
Binding a resource to its dependencies is also often necessary in the import
process. For example, a procedure P1 might retrieve data from table T1 in the
development data source DS1, and refer to it accordingly in its FROM clause. In
this case, P1 is said to depend on T1, and P1 and T1 are said to be bound. When
importing P1, the user must ensure that the connection to T1 is fed from either
DS1 or another data source—for example, a live production data source rather
than a QA or development data source.
For additional information, see Rebinding a View, page 248 and Rebinding a
Procedure, page 303.
Exported resources are often bound to underlying sources, and the binding
should be reestablished (or changed) when the resources are imported.
When a resource is marked as rebindable in the CAR file, Studio import of this
resource displays the path and description in a dialog box that lets the user
specify a new path for the dependency resource.
Or
1. Use the command-line pkg_import program.
2. Specify the following command:
-rebind <OldPath> <NewPath>
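For illustration, a pkg_import invocation that rebinds a dependency from a
development data source to a production data source might look like the
following. The archive name and resource paths are hypothetical, and the
-pkgfile option and any required connection arguments should be confirmed in the
TDV Administration Guide:
pkg_import -pkgfile mydeploy.car -rebind /shared/dev/ds_orders /shared/prod/ds_orders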
permission but not Sue, Jean’s privileges are updated to just Read, but Sue’s
are left intact.
— If the resource is a folder or data source, its child resources are not removed.
Importing a Resource
You import CAR files to replicate TDV resources and resource configurations.
CAR files are created by exporting TDV resources, and they usually contain many
object resources—possibly even a full server backup. For details on creating a
CAR file, see Exporting Resources, page 51 and Using Studio for a Full Server
Backup, page 60.
During the import process, you can choose to import or not import caching
information, privileges, custom adapter JARS, and data source connections. You
can also override locks, overwrite resources, merge folders, and create caching
tables.
Resources imported from an archive using Studio are placed in the selected folder
in Studio’s resource tree. Import performs a path relocation on all folder path
roots in the archive. See Rules for Importing Resources, page 54 for details about
the rights and privileges related to importing resources.
Prior to performing the import, you can view information about the archive file to
make sure it is the one you want. You can also preview the messages generated
during the import without actually performing the import.
To import a resource
1. Right-click the container into which you want to import a resource, and select
Import; or select the resource and then the option File > Import Into
<container>.
The Import into window opens.
2. Use the Browse button to locate and upload the CAR file, or type the full path
to the import file in the File field.
3. Optionally, type a password. If the CAR was created using a password, you
must type that password here. If the CAR was created without using a
password, you can leave this field blank.
4. Clear items in the Include Resource Information section that you do not want
to import.
Note: A resource information box is checked only if that resource is present in
the CAR file and you have appropriate user rights. For example, the Privileges
check box is enabled only for users with the Read All Users right.
Option Description
Caching Includes or excludes resource caching configurations, including
storage type, location, scheduling information, and materialized view
data.
Override Locks Clears existing resources from the target and sets the TDV Server to
the resource definitions present in the full server backup CAR file. The
importing user must have the Unlock Resources right to use Override
Locks.
Create Caching Tables If selected, the cache status, tracking and target tables required by
cached resources are created, if not already present in the database for
your data source.
Overwrite Overwrites or stops import of resources that have identical names and
folder paths. Owner and user privileges are not overwritten unless
they are explicitly changed to a new privilege setting.
Use this option to guarantee that the new TDV workspace matches the
folders and content present in the CAR file.
This option clears target folders before copying the CAR file contents
to them.
Custom Jars Includes or excludes any custom JAR created to support a new data
source.
Merge Users Folder Imports all users present in the CAR file who are not already present
in the server target. This option takes precedence over the Include
Users Folder option. This option’s behavior might be altered by the
Overwrite option.
Include Users Folder Includes or excludes resource owners who might be present in the
CAR file.
Imports all users present in the exported CAR file unless Merge Users
Folder is specified. By default, domain, group, and user information
are not included in export or import packages. When users are not
included, the importing user becomes the owner of the resource.
Include Cache Policies Includes or excludes cache policies that might be present in the CAR
file.
If your CAR file includes cache policies, import the CAR from the
Studio Desktop directory level. Cache policies must be saved and used
from the /policy/cache Studio directory.
Discovery Files Check it to include Discovery files (such as indexes and relationship
files) in the import.
This table shows all possible combinations of these options. Check marks on
the left side of the table show the import options used, and check marks on the
right side show the results on the TDV target after the import.
Note: If Overwrite and Include Users Folder are both checked, all existing
user definitions are removed and replaced with the users present in the
import CAR file. The user performing the import must have Read and Modify
All Users rights to use this combination.
Choosing check box options in this section is equivalent to enabling options in
the pkg_import command-line utility. See the TDV Administration Guide for
more information.
5. If you edit the entry in the File field, click Refresh to view the following
information about the import file:
— File Type—archive type (whether partial or the entire server)
— Date—day, date, and time when the resource was exported
— User—name of the user who exported the resource
— Server—host machine and port from which the resource was exported
6. Select the Show Rebinding Options box to bind a resource to an underlying
source for the first time. (For details about resource binding, see Marking
Rebindables, page 53.)
If you check the Show Rebinding Options box, the Import Into rebinding
window appears. See Rebinding an Imported Resource, page 59.
7. Optionally, click Preview to view a list of the contents of the import file
without actually performing any imports.
The Preview button works like the -messagesonly option to the pkg_import
command-line utility. This option displays the messages generated in a
package import without actually performing the import.
8. Optionally, type search criteria in Find, and click Search. Use the up or down
arrow key to jump to the next instance of the search text. The text that you are
searching for is highlighted in yellow.
9. Click Import to import the resources.
10. Click Done to complete the importing process.
Studio can perform a full TDV Server backup. This capability is available for
those with these administrative rights: Access Tools, Read All Resources, Read All
Users, and Read All Config.
This option is equivalent to the backup_export command-line utility.
To lock a resource
1. Verify that you are the resource owner.
2. Select one or more resources or containers in the resource tree container,
right-click, and select Lock Resource.
To unlock a resource
1. Verify that you are the resource owner or that you have the Unlock Resources
right.
2. Save the resource prior to unlocking it.
3. Right-click the locked resource and select Unlock Resource.
All resources that can be unlocked by the current user are shown.
TDV lets you work with a wide variety of industry data sources, including
relational databases, file data sources, WSDL data sources, application data
sources, and so on, by providing built-in data adapters that help you define and
configure the data sources for TDV. This topic describes how to work with data
sources in TDV.
• About Data Sources and TDV, page 65
• Data Source Categories, page 67
• Adding a Data Source, page 68
• Editing or Reviewing Configuration Information for a Data Source, page 71
• Viewing Data Source Table Details, page 73
• Adding and Removing Data Source Resources, page 74
• Testing the Connection to Your Data Source, page 75
TDV integrates your data sources into a single virtual data layer that can be used
to model rich information resources. You can add many types of data sources to
the TDV platform using Studio, defining connection profiles, specific source
capabilities, and detailed source introspection metadata. TDV views and
procedures built on the data modeling layer can take advantage of capabilities
unique to each data source, allowing you to integrate those disparate parts into a
single virtual data layer.
Adding a data source means creating a metadata representation of the underlying
native data source in the TDV repository. It does not mean replication of the data
or replication of the source. The data source metadata for each data source type
includes things like how the data source:
• Defines and stores data, including catalogs, schemas, tables, and columns.
• Accepts and responds to requests for data.
• Handles insert, update, and delete transactions.
• Executes stored procedures.
• Makes data-related comparisons.
A virtual layer of information about data source capabilities allows the TDV
Server Query Engine to create efficient query execution plans that leverage data
source strengths and inherent advantages of preprocessing data at the source.
The following sections elaborate on the preparation and use of data sources:
• About TDV Adapters, page 66
• About Introspection, page 66
• About Federated Queries, page 66
About Introspection
Introspection is the process of retrieving metadata and data from your data
sources. For more information, see About Introspection and Reintrospection,
page 190.
The following table lists where in the documentation you can find further
information about such issues.
Suppressing push of SQL operators: The five DISABLE_PUSH sections of TDV Query
Engine Optimizations, page 593.
Pushing custom functions: Pushing a Custom Function to the Native Data Source,
page 274.
TDV function support: “TDV Support for SQL Functions” section of the TDV
Reference Guide.
Miscellaneous function support issues, such as data precision, truncation vs.
rounding, collation, time zones, and interval calculations: “Function Support
Issues when Combining Data Sources” section of the TDV Reference Guide.
Type Description
Relational Data Sources See the TDV Installation and Upgrade Guide for supported data sources.
Specific connection details, parameters, and settings are slightly
different for each adapter.
For more information on using these, see Configuring Relational Data
Sources, page 77.
File Data Sources See the TDV Installation and Upgrade Guide for supported data sources.
For more information on using these, see Configuring File Data
Sources, page 165.
WSDL and Web Services Data Sources TDV supports Web Services Description Language
(WSDL), SOAP, and REST. Web services can be “bound” to any message format and
protocol, but these bindings are the most popular: SOAP, REST, HTTP, and MIME.
TDV supports several profiles for SOAP binding, including the following: SOAP over
HTTP, SOAP over WSIF JMS, and SOAP over TIBCO JMS.
See Configuring Web-Based Data Sources, page 105 for more
information.
LDAP Data Sources You can use LDAP sources as data sources or (in some cases) as
authentication servers. The TDV LDAP basic adapter supports the
LDAP v3 protocol, and it is used when you want to use an LDAP
source as an introspected data source. Most LDAP sources that are
LDAP v3 compliant can be used with the TDV LDAP driver to connect
and introspect the selected resource nodes for use in the modeling layer.
When you want to use an LDAP source as an authentication service, the
TDV Web Manager > Domain Management page is used to add and
initially configure the LDAP domain. See the TDV Installation and
Upgrade Guide for a list of compatible LDAP directory services. Refer to
the TDV Administration Guide section on LDAP Domain Administration
for more details on how to implement an LDAP domain for TDV user
and group authentication.
See Adding an LDAP Data Source, page 174.
This section is an overview of the process for adding a data source. It includes the
following topics:
• Custom Adapters, page 70
• Auto-Discovering Data Sources, page 71
3. Optionally, if none of the built-in adapters fits your data source, you can
create a new custom adapter (see Custom Adapters, page 70).
4. Optionally, if you are unsure of the data sources you want to define or that are
available to you on your network, you can auto-discover resources (see
Auto-Discovering Data Sources, page 71).
5. Search for and select the adapter to use to connect to your data source, and
click Next.
The configuration window that is displayed depends on which data source
adapter you have selected.
6. Provide the data source connection information to configure your data source:
a. For Datasource Name, enter a unique user-defined name for the
introspected data source. It can be any name that is valid in TDV. This
name will be displayed in the resource tree when the data source is added
to the TDV Server.
b. For Connection Information on the Basic tab, supply the required
information. Required information will vary depending on the type of
data source you are adding.
— Username and Password—Valid username and password to your data
source.
— Save Password and Pass-through sign-in—These fields work together. For a
discussion of how they work, see About Pass-Through Login, page 77.
c. Click the Advanced tab and specify details about the connection.
Some properties refer directly to the configuration of the data source Server
and must be provided by an Administrator. Other properties are specific to
TDV and how it interacts with the data source.
— Connection Pool Minimum Size, Maximum Size, Idle Timeout(s), and
Maximum Connection Lifetime—These control the JDBC connections
made to the data source, specifying the timeouts in seconds and the minimum
and maximum number of simultaneous connections in the connection pool.
— Connection Validation Query—A simple query to test whether the
connection is valid or not. You can leave it empty, or use simple queries like
“select * from dual.” (See the note after this list.)
— Organization Id Cache Timeout Seconds—Length of time to cache user
security information. After this time the security information is refreshed
from the data source. If a user's security information is changed before the
cache is refreshed, the changes will not be available to the Suite data source
until the cache is refreshed.
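As a general note on the Connection Validation Query mentioned above (this is not
TDV-specific guidance): "select * from dual" applies to Oracle, while many other
relational sources accept a minimal statement such as:
SELECT 1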
For the specific connection information needed to configure each data source
see:
— Configuring Relational Data Sources, page 77
— Configuring Web-Based Data Sources, page 105
— Configuring File Data Sources, page 165
For information on configuring any of the Advanced Data Sources, see the
additional adapter guides that are provided as online help or in PDF form.
7. Create the data source using one of these buttons:
— Click Create & Close to create this data source which can be introspected
later.
— Click Create & Introspect to select the data resources and introspect them.
See Retrieving Data Source Metadata, page 189 for how to do this.
Custom Adapters
You can create a new custom adapter with the New Adapter button on the right
side of New Physical Data Source dialog. Both system adapters and custom
adapters that were created earlier are listed in the Select Data Source Adapter list
in the left panel. There are specific reasons why you might want to create a
custom adapter; see the TDV Extensibility Guide for more information. For
information on adding and removing resources, see Adding and Removing Data
Source Resources, page 74. The TDV Extensibility Guide also describes the
capability changes that might require use of a custom adapter.
After a data source has been configured and introspected as a new physical data
source, the configuration information can be reviewed by opening the data source
in Studio. The tabs display details about the data source connection and other
configuration information. The details differ depending on data source type
(relational, web-based, file, and so on).
If you make changes, you need to save changes when you close the data source.
You also need to reintrospect the data source to apply your changes. For more
information, see Reintrospecting Data Sources, page 206.
Test Connection Provides a way to validate the connection definition for the data
source.
Caching - Tracking Table Tracks which views and procedures are currently using the
data source for caching, and what tables in the data source are in use.
Connection information Description
Basic tab Displays the basic connection details TDV uses to connect
and sign in to the data source.
Field Description
Name User-defined name given to the data source during introspection.
Owner Domain to which the owner of the data source belongs, or the sign-in name of
the user who created the data source.
Orig Creation Date Date the data source was originally created.
Orig Owner Name of the owner who originally created this data source.
Last Modified Date Date when the resource was last edited.
Last Modified User Name of the user who last edited the resource.
Lock Owner of the lock on the resource, or Not Locked. TDV resources can be
locked by a user to prevent concurrent and conflicting changes. See
Annotating a Resource, page 44.
Cluster Health Monitor Table Displays the table used to monitor the health of an
Active Cluster. It contains the heartbeat messages from all nodes. Its purpose is
to help caching determine the health of other nodes in the cluster in case of a
cluster split. Refer to the Active Cluster Guide for more information.
Refresh Capability Information Use to refresh the capability information for this
data source.
Case Sensitivity, Trailing Spaces Reports whether the case sensitivity and
trailing spaces settings match between TDV and the data source capability file. A
mismatch does not necessarily reduce performance unless it results in a full-table
scan performed locally by TDV. See the “Configuring Case and Space Settings” topic
in the TDV Administration Guide.
Annotation Area at the bottom of the panel that displays notes about the data source. Any
user with Write permission can edit the annotation.
Data source tables introspected in TDV can be opened to view the table’s
metadata attributes.
Column Description
Name Column name. A key icon to the left of the name indicates
that this column is a primary key in this table.
Native Type Displays the native data type of this column in the original
data source.
3. See the Discovery User Guide for information about the additional columns.
You can add or remove data resources at any time. Adding resources requires
introspection of the data source to obtain the metadata about the new resources.
When you remove resources from TDV, you are deleting the TDV metadata about
the resource and not the actual resource. If you remove a resource that had
dependencies, the dependent resource names in the Studio resource tree turn red,
alerting you to resource problems.
After adding a data source, you can test whether the connection details you
supplied are working.
If the connection fails, examine the error message and review the following
common remedies.
Password expired. Log in to the data source through its Web site and reset your
password. If your organization’s password policy forces users to change their
passwords after a certain number of days, you should enable the Password Never
Expires option on your profile.
The SAP account is not authorized for API access. Contact SAP or your
organization’s SAP administrator for assistance in enabling your organization's
account for API access.
The Siebel account is not authorized for API access. Contact Siebel or your
organization's Siebel administrator to have your organization's account enabled
for API access.
Username and password are invalid or locked out. Verify that you have provided
the correct user name and password by logging into the data source’s Web site
itself. Contact your organization's Administrator for assistance.
This topic describes the configuration options that apply generally to relational
data sources for connection with TDV. Follow the steps in Adding Relational Data
Sources, page 85 for every relational data source.
Data type and function support are described in the TDV Reference Guide.
For details on versions supported for each type of data source, see the TDV
Installation and Upgrade Guide.
The following topics are covered:
• About Pass-Through Login, page 77
• About Data Source Native Load Performance Options, page 79
• Data Source Limitations, page 80
• Netezza Setup, page 84
• Adding Relational Data Sources, page 85
— Basic Settings for a Relational Data Source, page 87
— Advanced Settings for a Relational Data Source, page 93
• Enabling Bulk Loading for SELECT and INSERT Statements, page 103
In pass-through mode:
• If you save the password, you can introspect without resupplying the
password.
• You can perform the following operations even if you did not save the
password:
— Query, update, insert and delete operations (but you need to resupply the
original login credentials for the current session).
— Reintrospect, add, or remove data source resources. You are prompted to
resupply the password that was used when the data source was originally
introspected.
• You cannot perform the following operations if you do not save the password:
— Schedule reintrospection
— Gather statistics using the query optimizer
— Perform scheduled automated cache updates
Note: Pass-through authentication is not allowed for TDV-built system-level
accounts.
When using scheduled cache refreshes, you must specify a login-password pair in
the data source, and the pair must be the same for the resource owner (or creator)
who is accessing TDV and the target data source. If this is not the case, the cache
refresh fails.
If a view uses a data source that was added to TDV Server using pass-through
mode, and the password has not been saved, row-based security may affect the
cache refresh. For example, a cached view named CachedCommonView uses the
SQL statement SELECT * FROM db2.T1; user John is allowed to view only 10
rows; user Jane is allowed to view 20 rows. Whenever Jane refreshes the view
cache, both Jane and John are able to view 20 rows, but whenever John refreshes
the view cache, both users can view only 10 rows.
The operations you can and cannot perform in pass-through mode depend on
whether Save Password is checked, as follows:
Save
password? Operations you can perform Operations you cannot perform
Most data sources have proprietary ways to optimize performance, and they are
typically enabled by default. If TDV detects that your operation (for example, a
direct SELECT or INSERT) can be made faster using a native loading option, TDV
attempts to use it.
Depending on the data source type and the TDV feature you are using, native
load performance options vary as follows.
DB2
INSERT and SELECT: Native load with INSERT and SELECT is supported.
Caching and Data Ship Performance Options: Native load with INSERT and SELECT;
native and bulk load with the LOAD utility.
Netezza
INSERT and SELECT: Native load with INSERT and SELECT is supported.
Caching and Data Ship Performance Options: Native load with INSERT and SELECT;
external tables.
PostgreSQL (Greenplum)
Caching and Data Ship Performance Options: COPY.
Sybase IQ
Caching and Data Ship Performance Options: Location; iAnywhere JDBC driver.
Teradata
INSERT and SELECT: Native load with INSERT and SELECT INTO is supported.
Caching and Data Ship Performance Options: Native load with INSERT and SELECT;
FastLoad/FastExport.
As:
select cast(null as varchar) a13, id
from /shared/postgreDrill/DRILL/"postgres.ga"/all_datatype_1k as
table1
To re-enable 3DES_EDE_CBC
1. Navigate to the <TDV_install_dir>/jre/lib/security/java.security file.
2. Open the file and remove 3DES_EDE_CBC from the
jdk.tls.disabledAlgorithms setting (an example follows these steps).
3. Restart the TDV Server.
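For illustration only, the relevant java.security entry might look like the following before
and after the edit; the exact list of algorithms in your installation may differ.

Before (3DES_EDE_CBC is disabled):
jdk.tls.disabledAlgorithms=SSLv3, RC4, DES, MD5withRSA, 3DES_EDE_CBC
After (3DES_EDE_CBC removed):
jdk.tls.disabledAlgorithms=SSLv3, RC4, DES, MD5withRSA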
Note: Oracle supports all ANSI-supported joins including LEFT OUTER JOIN
and RIGHT OUTER JOIN.
• Because Sybase IQ does not count trailing spaces in its LENGTH function, it
might be best to set the TDV configuration parameter to Ignore Trailing
Spaces.
• The Sybase data type UNSIGNED BIGINT is not fully supported by TDV
queries when its value is larger than +9,223,372,036,854,775,807
(LONG.MAX_VALUE), because the TDV JDBC LONG type is a signed 64-bit
value. (See the sketch after this list.)
• Queries submitted to Sybase IQ cannot contain column names in the IN
clause. The IN clause can contain constants only.
• ORDER BY is not allowed in a Sybase IQ subquery. (See Sybase IQ
documentation.)
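The following sketch illustrates the limit mentioned in the UNSIGNED BIGINT item above.
It is not TDV code; the class name is arbitrary.

import java.math.BigInteger;

public class UnsignedBigintLimit {
    public static void main(String[] args) {
        // Largest value a signed 64-bit long can hold; the TDV JDBC LONG type is bound by this.
        System.out.println(Long.MAX_VALUE); // 9223372036854775807
        // Sybase UNSIGNED BIGINT can hold values up to 2^64 - 1, which overflows a signed long.
        BigInteger unsignedMax = BigInteger.valueOf(2).pow(64).subtract(BigInteger.ONE);
        System.out.println(unsignedMax); // 18446744073709551615
    }
}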
Netezza Setup
The TDV Server can push Netezza SQL analytic functions, aggregate functions,
regular expressions, and associated keywords. For more information, see the TDV
Reference Guide.
System Tables and Views:
• ADMIN._T_AGGREGATE
• ADMIN._T_PROC
• _V_JDBC_PROCEDURE_COLUMNS2
• _V_JDBC_PKFK2
• _V_JDBC_PROCEDURES2
• _V_JDBC_PRIMARYKEYS2
• _V_JDBC_INDEXINFO2
• _V_JDBC_COLUMNS2
• _V_JDBC_TABLES2
<CurrentCatalog>.<CurrentSchema> Tables and Views:
• _v_function
• _v_aggregate
After successful connection with the data source, Netezza tables, views,
user-defined aggregates (UDAs), and user-defined functions (UDFs) are
displayed for TDV introspection.
The process of adding relational data sources for use with TDV is fundamentally
the same for all types.
In the New Physical Data Source dialog, you need to provide the connection
details that are used each time TDV accesses the data source. You can edit this
information later if necessary when you open this data source in Studio. TDV
automatically populates some fields for each data source type.
For descriptions of the fields on the two data source configuration tabs, go to:
• Basic Settings for a Relational Data Source, page 87
• Advanced Settings for a Relational Data Source, page 93
If you are using Sybase v12.x, use the Sybase 12 option; driver required: jConnect
for JDBC 16.
If you are using Sybase IQ v15 with the JDBC Type 2 driver, use the Sybase IQ
(Type 2) option; driver required: SQL Anywhere Database Client v12.0.1.
Note: If none of the data source types are appropriate, see Custom Adapters,
page 70 for more information.
6. Enter the name for the data source as you want it to appear in the Studio
resource tree.
7. Enter the basic connection information for the data source. See Basic Settings
for a Relational Data Source, page 87, for a description of the basic settings.
8. To enable Kerberos security through keytab files, select the Kerberos option on
the Basic tab. You can then type values for the Keytab File and the Service
Principal Name fields.
9. Enter the connection values and related fields. See Advanced Settings for a
Relational Data Source, page 93, for an overall description of the advanced
settings.
Note: Many of these fields control connection pools. Connection pool
configuration follows standard, widely documented connection-pooling conventions.
10. Scroll down to make sure that you have specified all necessary information.
For additional fields and options, see the section about the specific data source
you are adding.
11. Click one of these options:
— Create & Introspect—Introspect the data source immediately. See
Introspecting a Data Source, page 192.
— Create & Close—Save the data source definition but do not introspect now.
Basic Property (Data Source): Description
Create tables with this number of partitions... (SAP HANA): The number of partitions
you want used when tables are created. This number affects whether and how table
partitioning is done when TDV creates new tables in SAP HANA, for example as
cache targets.
• If you specify a positive number x (3 to 5 recommended per SAP HANA node), the
CREATE TABLE DDL will contain PARTITION BY ROUNDROBIN PARTITIONS x.
• If you specify zero, the CREATE TABLE DDL will not include a PARTITION BY clause.
Ultimately, the number of partitions affects performance when querying the resulting
table, so you should optimize it for your SAP HANA instance and usage.
Complete connection URL string (Apache Drill): A URL to connect to the physical data
source. TDV does not validate modifications. The data source adapter might not
validate changes. For example:
jdbc:drill:drillbit=<hostIP>;schema=<schemaname>
Database (DataDirect Mainframe): For DB2 DBMS types, name of the underlying data
source. For other DBMS types, enter NONE.
Database Name (All except Composite and Netezza): Name or alias of the underlying
data source. TDV Server uses this name to find and connect to the data source.
DSN (Microsoft Access): The Data Source Name. You might need to create a new User or
System DSN using the ODBC Data Source Administrator utility (available with
Windows Administrative Tools).
Host, Hostname (All): Name or IP address of the machine hosting the data source. For
Oracle data sources, you cannot enter an Oracle database link name in this field.
Net Service Name (Oracle with OCI Driver): The TNS name that is set up through Oracle
Net Configuration Assistant.
Plan (DataDirect Mainframe): Data isolation plan. The default value, SDBC1010, specifies
cursor stability. Other values specify repeatable reads, read stability, or uncommitted
reads. See the Shadow RTE Client/adapters Installation and Administration Guide for
more information.
Save Password check box (All): By default, the login and password are saved to create a
reusable TDV Server system connection pool, usable only by the resource owner,
explicitly authorized groups and users, and TDV administrators. They can:
• Introspect or reintrospect the current data source
• Add or remove data source resources
• Perform queries, updates, and inserts on tables
• Invoke a stored procedure
• Refresh a cached view based on data source resources
• Use the query optimizer to gather statistics
To disable saving of the password, enable Pass-through Login.
Server (SAP HANA): Name or IP address of the machine hosting the data source.
Kerberos Server Name (Greenplum): This field is available only if you choose Kerberos
authentication.
Include Realm (Greenplum): This field is available only if you choose Kerberos
authentication.
Keytab File (MS SQL Server, Greenplum): This field is available only if you choose
Kerberos authentication. Use to enable Kerberos security through keytab files. Type
the full path to the keytab file.
Transaction Isolation (All): The degree to which transactions are isolated from data
modifications made by other transactions. Netezza and Oracle have only Read
Committed (default) and Serializable. These values correspond to the standard JDBC
isolation levels; a sketch follows this table.
Read Uncommitted—Dirty reads, nonrepeatable reads, and phantom reads can occur.
Read Committed—Nonrepeatable reads and phantom reads can occur.
Repeatable Read—Only phantom reads can occur.
Serializable—Dirty reads, nonrepeatable reads, and phantom reads are prevented.
None
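For reference, the Transaction Isolation values above correspond to the standard JDBC
isolation levels. The following sketch is illustrative only: TDV applies the setting for you,
and the connection URL and credentials are placeholders.

import java.sql.Connection;
import java.sql.DriverManager;

public class IsolationSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder URL and credentials; not a real endpoint.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://dbhost:5432/sales", "user", "password")) {
            // Equivalent of choosing Read Committed in the Transaction Isolation field.
            conn.setTransactionIsolation(Connection.TRANSACTION_READ_COMMITTED);
            // Other JDBC constants: TRANSACTION_READ_UNCOMMITTED,
            // TRANSACTION_REPEATABLE_READ, TRANSACTION_SERIALIZABLE, TRANSACTION_NONE.
        }
    }
}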
Advanced Property (Data Source): Description
Collation Sensitive (All): TDV does not use the SORT MERGE join algorithm if any data
source involved in the join is marked Collation Sensitive.
Concurrent Request Limit (All): Works with the Massively Parallel Processing engine
configuration parameters to control the amount of parallelization for the queries for a
particular data source.
Connection Check-out Procedure (All): A procedure that returns a valid SQL statement
that can be used to initialize the connection—for example, an Oracle Virtual Private
Database (VPD) system.
VPD is a method of doing row-level security. After the connection is made, often with a
generic account, the client enables certain sets of access rights by setting a security
context. In this case, the init procedure returns something like
dbms_session.set_identifier('username'). This is executed on the connection, changing
connection privileges from the default to those associated with the user.
Other parameters can also be changed. For example (Oracle), a block like this might be
returned by the init procedure:
BEGIN
dbms_session.set_identifier('username');
EXECUTE IMMEDIATE 'alter session set
optimizer_index_cost_adj=10';
EXECUTE IMMEDIATE 'alter session set
optimizer_index_caching=90';
EXECUTE IMMEDIATE 'alter session set
"_complex_view_merging"=true';
END;
Write the code in such a way that the init procedure revokes rights if it is not called
within the appropriate context.
Connection Pool Idle Timeout (All): Number of seconds (default 30) that a connection can
remain idle without being dropped from the pool when there are more than the
minimum number of connections.
Connection Pool Maximum Size (All): Maximum number of connections (both active and
idle) allowed for the data source. When the maximum is reached, new requests must
wait until a connection is available.
If the maximum number of connections is in use when a request comes in (even with
pass-through authentication), the new request is blocked and queued until a connection
is available or the Connection Pool Idle Timeout is reached.
If no connection was made available within the specified timeout, a check is made for an
available connection by the same user. If none is available, the least recently used
connection for another user is dropped and a new connection is opened.
Studio reuses pooled connections if they continue to be valid after changes (such as
connection name), but JDBC requests are forced to use new connections if any part of
the data source connection configuration has changed.
Connection Pool Minimum Size (All): Minimum number of connections in the pool even
when the pool is inactive.
When a connection has been idle, a validation query is used to verify whether an open
connection is still valid just prior to submission of a request. If the connection is invalid,
the connection is discarded and another is used.
Connection Attributes (Apache Drill): Lets you specify property-value pairs to pass to the
data source. For example:
bootPassword=key
collation=collation
dataEncryption=true
drop=true
encryptionKey=key
encryptionProvider=providerName
encryptionAlgorithm=algorithm
failover=true
Connection URL String (All): The URL string generated from the connection URL pattern
with the connection information you provide. This string is used by the JDBC adapter
to connect to the physical data source. This field cannot be edited. For details, see the
section “Connecting through JDBC Adapters” in the TDV Administration Guide.
Data source driver doesn’t support query timeout (Apache Drill): Select or clear the check
box. If cleared, specify an Execution timeout value in seconds.
Enable Bulk Load (Various): Related to the data ship feature capability. Several fields are
available only if others are checked. For details, see Data Ship Performance
Optimization, page 597.
Enable Export To Another Vertica Database (Vertica): Several of these fields are available
only if others are checked. For details, see Data Ship Performance Optimization,
page 597. For setting up caching using bulk load features, see TDV Caching, page 465.
Note: All Netezza data sources should be configured to act as data ship targets.
Enable Native Data Loading (All): Let the data source use its proprietary functionality to
optimize performance. See About Data Source Native Load Performance Options,
page 79.
Enable Oracle Database Link (Oracle): Check to improve performance if you plan to use
this data source for data caching or data ship optimization. Also add one or more
Oracle database links. See Configuring Native Caching for Oracle, page 515, and Data
Ship Performance Optimization, page 597.
Enable Pass-Through Prepared Statements (Oracle): For pass-through to work, the
prepared statement must call data from only one Oracle database instance. Prepared
statements can use data from multiple tables within a single Oracle database instance.
Execute SELECTs Independently (All): Lets a SELECT statement be executed using a new
connection from the connection pool, and committed immediately after completion.
INSERT, UPDATE, and DELETE statements are executed using the same connection
as part of the transaction.
Execute SELECTs in separate transactions from INSERTs and UPDATEs (Apache Drill):
Lets a SELECT statement be executed using a new connection from the connection
pool, and committed immediately after completion. INSERT and UPDATE statements
are executed using the same connection as part of the transaction.
Execution Timeout (All): The number of seconds an execution query on the data source
can run before being canceled. Zero seconds (the default value) disables execution
timeout, allowing processes to run to completion—for example, resource-intensive
cache updates scheduled for non-peak processing hours.
Ignore Procedure Return Parameter (Sybase): Check to suppress the return parameter for
stored procedures. By default (unchecked), return parameters are inserted into
procedure definitions by the JDBC adapter.
Include Invalid Introspection Objects (Oracle): Check to return all objects during
introspection, including invalid objects.
Introspect Procedures (Oracle): Checked by default. Ignoring procedures speeds up the
initial introspection when only tables are wanted.
Introspect comments (Oracle): During the introspection process, TDV can retrieve table
and column level comments and add them to the annotations field for each resource.
See Introspecting Data Source Table and Column Comment Metadata, page 202.
Introspect Should Use Column Alias (Sybase): Check to return column aliases when
introspecting data sources. By default (unchecked), introspection returns column
names.
Introspect Using Oracle DBA_* Views (Oracle): Oracle maintains multiple metadata
views. By default, TDV introspection uses ALL_* views, which list resources for which
the user has access to both data and metadata. DBA_* views show all resources in the
database regardless of data access permissions. Refer to Oracle documentation for
differences and privileges.
Max Source Side Cardinality for Semi Join (All): See the documentation for semijoins and
the TDV Administration Guide for more information.
Max Source Side of Semi Join to Use OR Syntax (All): See the documentation for
semijoins and the TDV Administration Guide for more information.
Maximum Connection Lifetime (All): The number of minutes that a connection that was
returned to the pool persists if there are more open connections than the minimum
pool size.
The duration is calculated from connection creation. Default value is 60 minutes. Set a
smaller value if the pool is likely to run out of connections. Be sure to add a validation
query. Set a larger value if you want the connections to be held for a longer period. Set
a value of 0 to keep connections alive indefinitely.
Query Banding (Teradata): Turns the query banding feature on (At Session Level) or off
(Off). Stores query context information in the Teradata session table so it can be
recovered after a system reset. Query banding takes effect when the data source is next
used. Connection pooling has no effect on query band data.
QueryBand Properties (Teradata): Click QueryBand Properties on the right to open a
dialog box in which to specify property-value pairs to store in the session table.
Four properties are available by default. For the first three, if the default value
(including brackets) appears alone in the value field, it is replaced with the actual value
at run time.
• TDV_USER. ID of the TDV user, application or report that originated the request
from a middle-tiered application. The default value is <TDV_USER>.
• DOMAIN. The default value is <DOMAIN>.
• SESSION_NAME. The default value is <SESSION_NAME>.
• SYS. The default value is TDV.
A TDV administrator can click Add property to add custom name-value pairs.
Select Mode (Microsoft SQL Server): Choose one of two values: Cursor or Direct. These
two values affect how result sets are created and retrieved when a query is executed
against the data source.
Cursor—Generates a server-side cursor. Rows are fetched from the server in blocks. Use
the JDBC statement method setFetchSize to control the number of rows fetched per
request. Useful for queries that return more data than what can be cached on the
client. (A JDBC sketch follows this table.)
Streaming Results Mode (MySQL): If selected (default), streams the result set row by row
from MySQL to TDV. If not selected, MySQL does not send results to TDV until all
results are gathered. For details, see the README.txt file in the MySQL JDBC
adapter’s docs directory.
Supports Star Schema (All): Check only if this data source supports very large predicates
and very large cardinalities for star schema semijoins. Refer to the section Star Schema
Semijoin, page 587, for more information.
Transaction Lock Wait Timeout (SAP HANA): The length of time a transaction is to wait
on the lock before quitting.
Transaction isolation (Apache Drill): Valid values: none, Read committed, Read
uncommitted, Repeatable read, Serializable.
Use global temp space for temp tables (DB2): Option that can be used to improve
performance when using this data source with the TDV data ship feature. This option
would allow you to manage the temp tables created for the data source like any other
temp table that you have defined in your source database.
Use pass-through user’s certificate for encryption (Oracle): When pass-through login is
configured for use with an Oracle data source, check this box to include the user’s
certificate in the SSL negotiation (handshake).
Use X Views (Teradata): Returns data only for rows containing information on objects
that the requesting user owns, created, has privileges on, or has been granted access
through a current or nested role.
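As noted for the Select Mode property, a server-side cursor fetches rows in blocks whose
size a JDBC client can influence with setFetchSize. The following generic JDBC sketch
shows the call; the connection URL, credentials, and table name are placeholders, and this
is not TDV internal code.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class FetchSizeSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:sqlserver://dbhost:1433;databaseName=sales", "user", "password");
             Statement stmt = conn.createStatement()) {
            stmt.setFetchSize(500); // request rows from the server in blocks of 500
            try (ResultSet rs = stmt.executeQuery("SELECT * FROM orders")) {
                while (rs.next()) {
                    // process each row as it is fetched
                }
            }
        }
    }
}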
Optionally, enable bulk loading for SELECT and INSERT statements. Netezza,
Teradata, and SQL Server require extra steps to use this feature. Other data
sources might support this feature, but they require no extra configuration.
This topic describes how to configure Web-based data sources including WSDL
(Web Services Description Language) and REST (Representational State Transfer)
data sources.
• About OAuth Configuration for SOAP and REST Data Sources, page 105
• About WSDL Bare and Wrapper Parameter Styles, page 106
• WSDL and SOAP Data Sources, page 108
• REST Data Sources, page 128
• Client Authentication for Web Data Sources, page 145
• TDV SOAP and REST OAUTH Examples, page 147
• TDV OAuth Tab XML Processors Field Reference, page 153
• Partial Example of Using OAuth Customized Flow for WSDL, SOAP, and
REST, page 161
OAuth is a standard method for obtaining secure authorization from the Web.
The OAuth authorization framework enables a third-party application to obtain
limited access to an HTTP service. You are expected to be familiar with the OAuth
2.0 Authorization Framework (RFC 6749).
If 5 seconds is not the appropriate time for the data source to wait for a web page
execution to occur, you can modify the Default OAuth Page Execution Timeout
configuration parameter value.
The OAuth grant-flows between the client and the resource owner are:
• Authorization code grant—OAuth uses an authorization server as an
intermediary to obtain an authorization code. The authorization server
authenticates the resource owner and obtains authorization. The resource
owner's credentials are not shared with the client.
• Implicit grant—This simplified flow is optimized for browser clients using a
scripting language. The client is issued an access token directly; the
authorization server does not authenticate the client. However, the access
token may be exposed to the resource owner or other applications with access
to the resource server.
WSDL bare and wrapped parameter styles control the consumption of data that is
returned in a response. Typically if you have a parameter where you expect the
response values to be small, you can put the parameter in the WSDL header. If,
however, you expect there to be significant volumes of data returned in the
response for the parameter, then it is best to place the parameter or set of
parameters in the body of the WSDL.
TDV conforms to the open source standard for XML-based web services. Oracle
provides useful reference content that describes the standards fully. This section
attempts to summarize how TDV interprets those standards for using Bare and
Wrapped parameters for your WSDL.
Style Description
WRAPPED For multiple parameters that use the BODY location.
The wrapper element is the child of the SOAP BODY and the parameter elements
are children of the wrapper element.
BARE For exactly one parameter with a location of BODY and its element is the only
child of the SOAP BODY.
All other parameters of the same direction must be mapped to another location, for
example, HEADER.
WRAPPED
A parameter style of wrapped means that all of the input parameters are wrapped
into a single element on a request message and that all of the output parameters
are wrapped into a single element in the response message. If you set the style to
RPC you must use the wrapped parameter style. The wrapped style tells the Web
service provider that the root element of the message represents the name of the
operation and that children of the root element must map directly to parameters
of the operation's signature.
With wrapped all sub-parameters are in the first level, and the client
automatically wraps them in another top element.
Wrapped requests contain properties for each in and in/out non-header
parameter. Wrapped responses contain a property for the method return value and
properties for each out and in/out non-header parameter. The order of the
properties in the request is the same as the order of parameters in the method
signature. The order of the properties in the response is the property corresponding
to the return value followed by the properties for the parameters in the same order
as the parameters in the method signature.
Most Web services engines use positional parameter binding with wrapper style.
If a service signature changes, clients must also change. With the wrapper style,
all parameters have to be present in the XML message, even if the element
corresponding to the parameter can be defined as optional in the schema.
The wrapped request:
• Must be named the same as the method and the wrapper response bean class
must be named the same as the method with a “Response” suffix.
• Can only have one message part in WSDL, which guarantees that the message
(method element with parameters) is represented by one XML document with
a single schema.
• Must have parameters present in the XML message, even if the element
corresponding to the parameter can be defined as optional in the schema.
BARE
A parameter style of BARE means that each parameter is placed into the message
body as a child element of the message root. With BARE there is only a top
element parameter with sub-parameters inside. This one BARE parameter is sent
directly.
Whether a given SOAP message is valid is determined by the WSDL that it
conforms to. Web services using the bare style can have multiple parameters. It is
only the actual invocation of the web service where a single parameter is passed.
The Java API for XML-based Web Services requires that parameters for bare
mapping must meet all the following criteria (a Java sketch follows this list):
• It must have at most one in or in/out non-header parameter.
• If it has a return type other than void it must have no in/out or out
non-header parameters.
• If it has a return type of void it must have at most one in/out or out
non-header parameter.
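The following sketch shows how the two parameter styles are typically declared on a Java
(JAX-WS) endpoint interface. The service and operation names are hypothetical, the
javax.jws annotations ship with Java 8 or as a separate JAX-WS dependency on later
JDKs, and TDV itself does not require this code.

import javax.jws.WebMethod;
import javax.jws.WebParam;
import javax.jws.WebService;
import javax.jws.soap.SOAPBinding;
import javax.jws.soap.SOAPBinding.ParameterStyle;
import javax.jws.soap.SOAPBinding.Style;

@WebService
public interface OrderService {

    // WRAPPED (the default): customerId and quantity become children of a single
    // wrapper element, named after the operation, inside the SOAP Body.
    @WebMethod
    @SOAPBinding(style = Style.DOCUMENT, parameterStyle = ParameterStyle.WRAPPED)
    String placeOrder(@WebParam(name = "customerId") String customerId,
                      @WebParam(name = "quantity") int quantity);

    // BARE: the single non-header parameter element is itself the only child of the
    // SOAP Body; any additional parameters would have to be mapped to the header.
    @WebMethod
    @SOAPBinding(style = Style.DOCUMENT, parameterStyle = ParameterStyle.BARE)
    String submitOrder(@WebParam(name = "order") String order);
}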
Adding a new optional element does not affect clients, because binding is always
name-based. To get around having to have a contract defined by the schema of the
web service, you can define all child elements of the wrapper root element as
required. Optional elements must be made nillable.
• TDV supports the Document Literal message style for WSDL and SOAP.
Document Literal style messages have a single part whose schema defines the
message payload.
This section contains the following topics:
• Adding a WSDL or SOAP Data Source, page 109
• Enabling Tube Logs for SOAP Data Sources, page 113
• Subscribe to a WSDL Offered as a JMS Data Source, page 114
• Defining SOAP Message Pipelines, page 115
• Message Level Security (Pipelines) for Legacy Web Services Imported from
CAR Files, page 118
Field Description
URL URL to the WSDL or SOAP data source. WSDLs can be available by URL
over any accessible network. A locally mapped WSDL can be introspected
using a URL format like the following:
file:///Z:/test.wsdl
Save Password This check box is enabled only if Pass-through Login is enabled. See About
Pass-Through Login, page 77 for more information.
Authentication Choose the method of authentication for this data source: BASIC, NTLM,
NEGOTIATE, OAuth, or Digest.
When you select OAuth as the authentication mode, another tab is
displayed. Authentication for the data source must have been designated as
OAuth 2.0 when the physical data source was first added.
Service Principal Name For NEGOTIATE authentication with Kerberos only, enter the
service principal name.
See “Configuring Kerberos Single Sign-On” in the TDV Administration
Guide for more information.
SAML Header Class SAML assertions are defined in the headers of a class. Type the class name
that owns the SAML header.
5. If the data source requires client authentication, click the Advanced tab. See
Client Authentication for Web Data Sources, page 145 for how to configure
client authentication.
If the data source requires OAuth, select the OAuth 2.0 tab. Specify the values
appropriate to the OAuth flow you want to use. For examples, see TDV SOAP
and REST OAUTH Examples, page 147. (This tab and the fields are available
to edit after creation of the data source.) The following table describes the
values the user is to provide on the OAuth 2.0 tab:
Field Description
OAuth Flow Choose one of these OAuth flows:
• AUTHORIZATION_CODE
• IMPLICIT—Client Secret and Access Token URI are disabled.
• CLIENT_CREDENTIALS—Resource Owner Authentication fields are
disabled.
• RESOURCE_OWNER_PASSWORD_CREDENTIALS—Client
Authentication fields are disabled.
• CUSTOMIZED—User-specified flow.
Client Identification Used in the request-body of token requests. A unique string
representing the identifier issued to the client during registration. It is exposed to the
resource owner. Format: string of printable characters.
Client Secret Used in the request-body of token requests. Enabled only for the
AUTHORIZATION_CODE OAuth flow. Format: string of printable characters.
Authorization URI URI to use for establishing trust and obtaining the required client
properties.
Access Token URI URI to use for communicating access and refresh tokens to the client
or resource server. Disabled only for the IMPLICIT OAuth flow.
Redirect URI Client’s redirection end point, established during client registration or when
making an authorization request. Must be an absolute URI, and must not
contain a fragment component. The authorization server redirects the resource
owner to this end point.
Scope Authorization scope, which is typically limited to the protected resources under
the control of the client or as arranged with the authorization server. Limited
scope is necessary for an authorization grant made using client credentials or
other forms of client authentication. Format: one or more strings, separated by
spaces.
State A request parameter used in lieu of a complete redirection URI (if that is not
available), or to achieve per-request customization
Expiration Time (Sec) The lifetime of the access token.
Use Refresh Token To Get Access Token Checking this box enables the use of refresh
tokens to obtain access tokens, rather than obtaining them manually.
Login and Password User credentials with which the client or resource owner registers
with the authentication server before initiating the OAuth protocol.
Domain Domain to which the client or resource owner belongs; for example, composite.
Access Token Type Bearer—User of a bearer token does not need to prove possession of
cryptographic key material. Simple inclusion of the access token in the request is
sufficient.
Query—The query string “?access_token=<token>” is appended to the URL.
Not available for SOAP data sources.
Access Token Credentials used to access protected resources, with a specific scope and
duration. Usually opaque to the client.
Get Token button Initiates acquisition of an access token. Proper information must be
configured for this request to succeed.
Refresh Token Credentials used to obtain access tokens when they become invalid or
have expired. Intended for use with authorization servers, not resource servers.
Custom Flow The name of the custom flow for a data source with an OAuth flow of
CUSTOMIZED.
Using Processors check box and editable text field Check this box to use processors. The
editable text field allows you to enter JavaScript and XML. You can use this field to add
JavaScript that logs in automatically, or XML to customize any part of the authorization
or access token exchange that does not conform to specifications.
Editable text field
The editable text field underneath the Using Processors check box can be used to
type the XML elements necessary to establish authorization and access tokens.
For example:
<Authorization>
  <AuthorizationProcessors>
    <AuthorizationProcessor>
      document.getElementById('email').value='[email protected]';
      document.getElementById('pass').value='jellypassword';
      document.getElementById('loginbutton').click();
    </AuthorizationProcessor>
  </AuthorizationProcessors>
</Authorization>
<AccessToken>
  <RequestMsgStyle>QUERY</RequestMsgStyle>
  <ResponseMsgStyle>FORM</ResponseMsgStyle>
  <ExpireTime>1000</ExpireTime>
</AccessToken>
Sensitive Keyword in JavaScript Click the plus-sign icon one or more times to add
tag-keyword pairs to substitute into the JavaScript specified for a custom authorization
flow. The values sent to the server are encrypted, and then replaced with their decrypted
values where the tags are found in the JavaScript.
These pairs are used for the user name, email ID, password, and other sensitive
information.
Tag—The name of the tag to find in the JavaScript.
Keyword—The encrypted value to decrypt and substitute for the tag in the
JavaScript.
Parameter Description
Enable Tube log after executing Enable logging of SOAP messages or set to ALL for all
tubes.
Valid values are Terminal, Handler, Validation, MustUnderstand,
Monitoring, Addressing, At, Rm, Mc, Security, PacketFilter, Transport,
and Customized.
Enable Tube log before executing Enable logging of SOAP messages or set to ALL for all
tubes.
Valid values are Terminal, Handler, Validation, MustUnderstand,
Monitoring, Addressing, At, Rm, Mc, Security, PacketFilter, Transport,
and Customized.
administrator. See “Configuring TDV for Using a JMS Broker” in the TDV
Administration Guide.
— The specified JMS Destination can be changed to take advantage of
different queue destination aliases that offer the same service.
— Delivery Mode can be set to persistent so that messages are written to disk
as a safeguard against broker failure. Non-persistent messages are not
written to disk before acknowledgment of receipt.
— Message Expiry specifies the period of message validity in the broker
queue. An entry of 0 specifies no expiration, while a null entry for an
operation specifies that the port setting is to take precedence.
— Operations or messaging priority can be set to an integer of 1 through 9,
where 9 is the highest priority.
— Default Timeout is a setting for the consuming client, and it can be set to
some duration (in milliseconds). An entry of zero means no timeout, and a
null entry specifies that the default takes precedence.
— Individual JMS operations under the port can be configured with a
Message Type of Bytes or Text and with specific time-outs tailored to the
operation.
If you need to review or change configurations for a Web service, open the Web
service from Studio, click the Add/Remove Resources button, make required
changes, and reintrospect the data source. For more details about introspection,
see Retrieving Data Source Metadata, page 189.
1. Create a SOAP request by executing the operation.
2. Message pipelines that are defined as request.
3. The SOAP request message is handled according to the SOAP specifications that are
defined in the contract.
4. Message pipelines that are defined as request.
5. The SOAP request message is sent to the service side.
6. The SOAP response message is received from the service side.
7. Message pipelines that are defined as response.
8. The SOAP response message is handled according to the SOAP specifications that are
defined in the contract.
9. Message pipelines that are defined as response.
10. The SOAP message is received.
3. Use the green plus buttons to add steps to the Request Message Pipeline or
Response Message Pipeline. The following types can be added:
Sign Element
4. Follow the prompts on the screens for what is required for each of the
different step types.
5. You can delete, edit, and reorder the steps at any time after adding them.
Message Level Security (Pipelines) for Legacy Web Services Imported from
CAR Files
If messaging passes through intermediate sources in the transport layer (indirect
messaging), you must define message-level security. For indirect messaging, or
for multiple message receivers or senders, the message needs to be secured at
different levels to ensure data integrity.
A pipeline defines multiple instructions that are processed simultaneously. Except
for the Custom step, each pipeline step corresponds to a system built-in
procedure available at /lib/services/ in the server.
• Viewing Message Pipeline Steps, page 118
• Creating a Log Message to File Pipeline Step, page 119
• Adding a Username Token Pipeline Step, page 120
• Creating an Element Pipeline Step, page 122
• Creating a Pipeline Step to Process a Procedure, page 123
• Creating a Pipeline Step to Delete an Element, page 123
• Creating a Pipeline Step to Encrypt an Element, page 124
• Creating a Pipeline Step to Process a Security Header, page 125
• Creating a Pipeline Step to Set Environment From Node, page 126
• Creating a Pipeline Step to Set Node From Environment, page 127
• Creating a Pipeline Step to Sign an Element, page 128
3. In the File Path field, specify a file where the messages are to be logged. The
file path is relative to the TDV Server log directory. If the file or the directory
does not yet exist, it is created on the local file system of the server on
execution.
4. In the File Mode field, specify how the message should be logged.
APPEND—Adds new messages to the end of the log file.
OVERWRITE—Causes new messages to overwrite the log file.
5. (optional) Text entered into the Header field is added in front of the new
SOAP envelope message. The text supplies a header note, which is
written to the file right before the message contents. This value can be null.
6. (optional) Text entered into the Footer field is added to the end of the
processed message. This value can be null.
7. In the Pretty Print drop-down list, select true (default) if you want the message
to be formatted with tabbed indents, and false if you do not want the message
to be formatted.
Formatting the message can make it easier to read.
8. Click OK, and save the step.
The output of this pipeline step is the modified XML document or element.
3. Values for the following fields need to be supplied in the Add Username Token
window:
Username Valid user name to access the Web service server. Example: joeuser
Password Type Specify the password type, to determine how the password is encoded in
the Username Token. Example: DIGEST
DIGEST—Password is rendered in a digested text in the message
TEXT—Password is rendered in clear text in the message
4. Click OK after supplying all the required information, and save the step.
The UsernameToken is added to the SOAP header that is identified by the entry
supplied in the Actor field and Must Understand field. If the SOAP message does
not contain a SOAP header with the specified Actor and Must Understand values, a
header is created with those values.
UI Element Description
Procedure path The path to the pipeline step procedure (by default,
/lib/services/EncryptElement).
Actor Type a URI. See To create the pipeline step named Add Username Token,
page 120.
Element Name The default value of the Element Name field specifies the schema of the
SOAP message that is encrypted. This procedure can be used to encrypt the SOAP
message body.
The Element Name field can be null, but the default value is
{http://schemas.xmlsoap.org/soap/envelope/}Body
Encryption Algorithm Accept the default algorithm AES_128 or select a different one if
you have installed an unrestricted Java Cryptography Extension (JCE) policy file in the
server’s JVM.
Determines the method of encryption. The default value of AES_128 is
sufficient for most purposes. Stronger encryption algorithms such as AES_192
or AES_256 require that an unrestricted Java Cryptography Extension (JCE)
policy file be installed in the server’s JVM.
• If any header security element indicates that the envelope contains encrypted
elements, those encrypted elements are decrypted.
This pipeline step corresponds to the system procedure ProcessSecurityHeader,
which is available in /lib/services/.
5. In the Prefix field, specify the namespace prefix used in the XPath field.
6. In the Namespace field, specify the namespace URIs used in the XPath
expression.
7. Optionally, click the Add button next to Declare namespaces used in XPath
expressions to define an additional prefix-namespace pair.
8. Optionally, click the Delete button next to a prefix-namespace pair to delete
the pair.
9. Click OK.
When you define a REST data source, you define its URL connection information
and its operations, which include the input and output parameters. After
definition, the REST data source in Studio contains a Web Service Operation for
each defined REST operation. The Web Service Operations in Studio are similar to
stored procedures and can be used in the same way. See Procedures, page 267 for
more information.
This section contains the following topics:
• Adding a REST Data Source, page 129
• Setting a Value for NULL JSON Values in REST Responses, page 137
• Passing a Full XML Document in a POST Request to a REST Service, page 137
• Using an HTTP Header Parameter to Specify the Content Type, page 138
• Using Design By Example to Infer Parameter Data Types, page 139
• Cross-Origin Resource Sharing (CORS), page 142
Field Description
Base URL The base URL to access the REST data source. This is in the form:
http://<web site name>
Example: http://search.twitter.com
Login Optionally, provide a valid username to access the REST data source.
Password Optionally, provide a valid password to access the REST data source.
Save Password This check box is enabled only if Pass-through Login is enabled. See
Basic Settings for a Relational Data Source, page 87, for more
information.
Authentication Choose the method of authentication for this data source: BASIC,
NTLM, NEGOTIATE, OAuth, or Digest.
When you select OAuth as the authentication mode, another tab is
displayed. Authentication for the data source must have been designated as
OAuth 2.0 when the physical data source was first added. For more information,
see Client Authentication for Web Data Sources, page 145.
Service Principal Name For NEGOTIATE authentication with Kerberos only: enter the
service principal name.
See “Configuring Kerberos Single Sign-On” in the TDV Administration
Guide for more information.
Access Token For OAUTH 2.0 authentication, type the access token.
JSON Format Check this box if you want to use the JSON (JavaScript Object Notation)
standard for data interchange. When this box is checked, each HTTP
request-response pair is encoded and decoded in the JSON format.
BadgerFish Enabled Check this box to enable BadgerFish if the REST services outside of TDV
use the BadgerFish convention to translate XML into the JSON format.
JSON Format must also be checked.
Primitive Value Format Check this box to read and send the value in its primitive
presentation.
JSON Format must also be checked.
Package Name Prefix for each element to make the service name unique. The package
name plays the same role as namespace in the XML format. Package
name can consist of any characters that are legal in Java class names.
JSON Format must also be checked.
Wrapper of JSON Bare Response Type in the wrapper name. It can be the wrapper key of
the whole JSON response. This value makes it possible to convert JSON responses to
well-formed XML value types.
JSON Format must also be checked.
5. On the bottom part of the Basic tab, define the REST operations.
a. Under Operations, click the Add Operation button to add an operation.
b. In the New Operation dialog box, enter a name for the operation and click
OK.
c. For each operation, you must define an HTTP Verb and an Operation
URL, and then parse the operation URL.
Field Description
HTTP Verb Choose the operation type: GET, POST, PUT, or DELETE.
Operation Name Automatically filled in with the operation name you entered and
selected in the Operations box.
Operation URL Enter the URL for the REST operation in the format:
<operation_api_name>?<query_parameters>
Parse Click Parse to add all URL parameters specified in curly brackets in the
Operation URL to the URL Parameters list.
If the parser finds the syntax correct, it adds it to the URL Parameters list. For
example, if you enter tweet={mytweet} in the Operation URL field and click
Parse, mytweet appears under Param Name in the URL Parameters section.
Multi-Part/Form To support the transfer of multiple binary types of data for POST operations.
For example, you could use this to transfer multiple GIF files.
e. For URL parameters, you can edit parameters and their data types for
those parameters that you are passing through the operation URL.
f. For Header/Body Parameters, define the header and body parameters
and their information. If you want to use the design by example feature,
you must save your data source definition to activate the Design By
Example button. For more information on how to use this feature, see
Using Design By Example to Infer Parameter Data Types, page 139.
— Click Add Operation to add a parameter. A new row appears in the
Header/Body Parameters section with no name but with a default location
(Body), data type (VARCHAR) and in/out direction (IN).
— Double-click under Param Name and type the name of the parameter.
— Click under Location, and from the drop-down list choose one of these
options:
HTTP Header—Parameter is located in the HTTP header.
Body—Parameter is located in the body of the XML file.
— Click under Data Type, and from the drop-down list choose one of these
options:
Option Description
Decimal DECIMAL, DOUBLE, FLOAT, or NUMERIC.
Browse Choose this to browse for an XSD file that defines the data type
being used.
— Click under In/Out, and from the drop-down list choose IN for input
parameters or OUT for output parameters.
6. If the data source requires client authentication, click the Advanced tab. See
Client Authentication for Web Data Sources, page 145 for how to configure
client authentication.
7. If the data source requires OAuth, select the OAuth 2.0 tab. Specify the values
appropriate to the OAuth flow you want to use.
The following table describes the values the user is to provide on the OAuth
2.0 tab:
Field Description
OAuth Flow Choose one of these OAuth flows:
• AUTHORIZATION_CODE
• IMPLICIT—Client Secret and Access Token URI are disabled.
• CLIENT_CREDENTIALS—Resource Owner Authentication fields are
disabled.
• RESOURCE_OWNER_PASSWORD_CREDENTIALS—Client
Authentication fields are disabled.
• CUSTOMIZED—User-specified flow.
Client Identification Used in the request-body of token requests. A unique string
representing the identifier issued to the client during registration. It is exposed to the
resource owner. Format: string of printable characters.
Client Secret Used in the request-body of token requests. Enabled only for the
AUTHORIZATION_CODE OAuth flow. Format: string of printable characters.
Authorization URI URI to use for establishing trust and obtaining the required client
properties.
Access Token URI URI to use for communicating access and refresh tokens to the client
or resource server. Disabled only for the IMPLICIT OAuth flow.
Redirect URI Client’s redirection end point, established during client registration or when
making an authorization request. Must be an absolute URI, and must not
contain a fragment component. The authorization server redirects the resource
owner to this end point.
Scope Authorization scope, which is typically limited to the protected resources under
the control of the client or as arranged with the authorization server. Limited
scope is necessary for an authorization grant made using client credentials or
other forms of client authentication. Format: one or more strings, separated by
spaces.
State A request parameter used in lieu of a complete redirection URI (if that is not
available), or to achieve per-request customization
Use Refresh Token To Get Access Token Checking this box enables the use of refresh
tokens to obtain access tokens, rather than obtaining them manually.
Login and Password User credentials with which the client or resource owner registers
with the authentication server before initiating the OAuth protocol.
Domain Domain to which the client or resource owner belongs; for example, composite.
Access Token Type Bearer—User of a bearer token does not need to prove possession of
cryptographic key material. Simple inclusion of the access token in the request is
sufficient.
Query—The query string “?access_token=<token>” is appended to the URL.
Not available for SOAP data sources.
Access Token Credentials used to access protected resources, with a specific scope and
duration. Usually opaque to the client.
Get Token button Initiates acquisition of an access token. Proper information must be
configured for this request to succeed.
Refresh Token Credentials used to obtain access tokens when they become invalid or
have expired. Intended for use with authorization servers, not resource servers.
Custom Flow The name of the custom flow for a data source with an OAuth flow of
CUSTOMIZED.
Using Processors check box and editable text field Check this box to use processors. The
editable text field allows you to enter JavaScript and XML. You can use this field to add
JavaScript that logs in automatically, or XML to customize any part of the authorization
or access token exchange that does not conform to specifications.
Editable text field
The editable text field underneath the Using Processors check box can be used to
type the XML elements necessary to establish authorization and access tokens.
For example:
<Authorization>
  <AuthorizationProcessors>
    <AuthorizationProcessor>
      document.getElementById('email').value='[email protected]';
      document.getElementById('pass').value='jellypassword';
      document.getElementById('loginbutton').click();
    </AuthorizationProcessor>
  </AuthorizationProcessors>
</Authorization>
<AccessToken>
  <RequestMsgStyle>QUERY</RequestMsgStyle>
  <ResponseMsgStyle>FORM</ResponseMsgStyle>
  <ExpireTime>1000</ExpireTime>
</AccessToken>
Sensitive Keyword in JavaScript Click the plus-sign icon one or more times to add
tag-keyword pairs to substitute into the JavaScript specified for a custom authorization
flow. The values sent to the server are encrypted, and then replaced with their decrypted
values where the tags are found in the JavaScript.
These pairs are used for the user name, email ID, password, and other sensitive
information.
Tag—The name of the tag to find in the JavaScript.
Keyword—The encrypted value to decrypt and substitute for the tag in the
JavaScript.
To use an HTTP header parameter to submit an XML request and get a JSON
response
1. Define your REST data source.
2. On the Basic tab, select the check box next to the JSON Format label.
3. On the Basic tab, scroll down to the Operations section.
4. Define the details for your operation, including the HTTP Verb, Operation
Name, and Operation URL.
5. In the Header/Body Parameters table, add an OUT Param Name named
response with a Data Type of XML. This is a Body parameter with the
direction of OUT.
6. In the Header/Body Parameters table, add an IN Param Name named
[rawdata] with a Data Type of XML. This is a Body parameter with the
direction of IN.
7. In the Header/Body Parameters table, add an IN Param Name named
Content-Type with a Data Type of string. This is a Header parameter with the
direction of IN.
8. Clear all the XML <-> JSON check boxes for the [rawdata] parameter.
9. Select the XML <-> JSON check boxes for the response parameter.
10. Save your data source.
11. Invoke the service, providing a full request document for the [rawdata]
input parameter and the string application or XML for the
Content-Type parameter.
After you make a selection, the full Studio tree path to that sample definition
appears in the Type field.
10. Click OK.
A new row appears in the Header/Body Parameters section with no name,
but with default location (Body), data type, and in/out direction (IN). A new
definition set resource is created in Studio under the REST data source with
the name of the operation TDV used to suggest the schema.
11. Double-click under Param Name and type the name of the parameter.
12. Validate that the values listed for the other columns are consistent with your
design.
13. Save your changes.
CORS Configuration
TDV configures CORS using configuration parameters. The parameters are in
the Administration > Configuration window under Server > Web Services
Interface > CORS. These parameters are described in the following table.
Configuration Parameter Comments
Allowed Origins A comma-separated list of the origins that may access the
resources.
Default value is * (all origins).
Preflight Max Age The number of seconds that the client is allowed to cache
preflight requests. Default value is 1800 seconds (30 minutes).
Preflight Requests
A preflight request asks the other domain’s server which HTTP request methods
it supports. This request also asks for an indication of whether cookies or other
authentication data should be sent with the actual request.
The preflight request sends its HTTP request using the OPTIONS method, to
determine whether the actual request is safe to send. A preflight request is
required if the actual request wants to:
• Use a PUT or DELETE method
• Set custom headers in the request—for example, a custom application-specific
header
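As a sketch of what such a preflight looks like from a client's point of view (the host, port,
path, origin, and custom header name are placeholders; requires Java 11 or later for
java.net.http):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class PreflightSketch {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        // The OPTIONS request a browser would send before a cross-origin PUT
        // that also sets a custom header.
        HttpRequest preflight = HttpRequest.newBuilder()
                .uri(URI.create("http://tdvhost:9400/json/sales/orders"))
                .method("OPTIONS", HttpRequest.BodyPublishers.noBody())
                .header("Origin", "http://app.example.com")
                .header("Access-Control-Request-Method", "PUT")
                .header("Access-Control-Request-Headers", "X-Custom-Header")
                .build();
        HttpResponse<Void> response =
                client.send(preflight, HttpResponse.BodyHandlers.discarding());
        // If the origin is allowed, the server answers with Access-Control-Allow-* headers.
        System.out.println(response.headers()
                .firstValue("Access-Control-Allow-Origin").orElse("(origin not allowed)"));
    }
}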
Actual Requests
An actual request includes an HTTP request method and any cookies or other
authentication the other domain’s server requires. If the actual request qualifies
(see list below) as a simple cross-site request, it does not need to be preceded by a
preflight request.
A simple cross-site request has the following characteristics:
When Web data sources require client authentication, a keystore must be specified
to identify the TDV Server to the provider. The TDV Server configuration
keystore key alias has a default value that names a sample keystore, so that you
can use client authentication immediately upon installation.
If the TDV configuration setting for the keystore key alias is set to null, the method
described below is used for Web data sources to comply with client authentication
requirements. A TDV configuration that specifies a keystore key alias overrides the
keystore specification defined on individual data sources.
The use cases focus on how you would use TDV to customize sign-in automation,
and configuration of the parts that do not conform to RFC 6749. These are
examples—not exact, and not likely to run outside of one or two specialized
environments. OAuth service providers occasionally change their sign-in process,
which would require that you analyze the new sign-in process and design
accordingly.
• Google OAuth Example, page 147
• Facebook OAuth Example, page 148
• Linkedin OAUTH Example, page 149
• Salesforce OAuth Example, page 150
• Github OAuth Example, page 151
• Foursquare OAuth Example, page 152
The sensitive tag is now Testpassword, and the sensitive keyword is the real
password (xxxxxx).
OAUTH Tab
Field Example Value
Authorization URI https://accounts.google.com/o/oauth2/auth
</Authorization>
Another way to get expire time is to use TokenProcessor, which can handle the
input data and return standard JSON data. In this case, MessageValue is the value
to retrieve from the response body, because the valid response is in FORM format.
By retrieving access token and expire time from MessageValue, the token
processor can return standard parameters that conform to RFC 6749 and JSON
format.
<AccessToken>
<RequestMsgStyle>QUERY</RequestMsgStyle>
<ResponseMsgStyle>FORM</ResponseMsgStyle>
<ExpireTime>1000</ExpireTime>
</AccessToken>
<TokenProcessor>
VAR accesstoken;
VAR expires;
...//Get access token and expire-time value from MessageValue
MessageValue = "{access_token:" + accesstoken +",
expires_in:" + expires+ "}";
</TokenProcessor>
</Authorization>
In the example, using the JavaScript regular expression to fetch the authorization
code is just for demonstration purposes.
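For illustration, a fuller sketch of the token processor described above follows. It assumes that the TokenProcessor body is evaluated as JavaScript, that MessageValue holds the FORM-encoded response body, and that the provider names its fields access_token and expires; the field names vary by provider and are assumptions here:

<TokenProcessor>
  // MessageValue holds the raw FORM-encoded response body,
  // for example: access_token=abc123&expires=3600
  var accesstoken;
  var expires;
  var pairs = MessageValue.split("&");
  for (var i = 0; i < pairs.length; i++) {
    var kv = pairs[i].split("=");
    if (kv[0] == "access_token") { accesstoken = kv[1]; }
    if (kv[0] == "expires") { expires = kv[1]; }
  }
  // Return the parameter names that RFC 6749 expects, as JSON.
  MessageValue = "{access_token:" + accesstoken + ", expires_in:" + expires + "}";
</TokenProcessor>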
</Authorization>
</AccessToken>
<QueryTokenName>oauth_token</QueryTokenName>
On the OAuth tab for your WSDL, SOAP, and REST data sources, you can define
custom processors. The custom processors use XML to provide authorization and
access token customization. TDV provides a collection of processors and XML
elements that you can use to accommodate nonconforming requests and
responses. This topic provides examples that use these processors to illustrate
how they obtain access tokens from several well-known third parties.
This topic is a reference of the XML element syntax that is valid for entry in this
field.
<AccessToken>
<RequestMsgStyle>QUERY</RequestMsgStyle>
<ResponseMsgStyle>FORM</ResponseMsgStyle>
<ExpireTime>1000</ExpireTime>
</AccessToken>
<TokenProcessor>
VAR accesstoken;
VAR expires;
...//Get access token and expire-time value from MessageValue
</TokenProcessor>
• Authorization Element Reference, page 154
• AccessToken XML Element Reference, page 156
• RefreshToken Element, page 159
• QueryTokenName Element, page 161
</xs:element>
Sequence
Element Description of value
RequestMsgStyle According to RFC 6749, the default value is GET for Authorization Code
Grant, Implicit Grant, and Customized Flow.
<xs:element name="RequestMsgStyle" minOccurs="0">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:enumeration value="FORM"/>
<xs:enumeration value="QUERY"/>
<xs:enumeration value="QUERYPOST"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
ResponseMsgStyle According to RFC 6749, the default value is GET for Authorization Code
Grant, Implicit Grant, and Customized Flow.
<xs:element name="ResponseMsgStyle" minOccurs="0">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:enumeration value="FORM"/>
<xs:enumeration value="QUERY"/>
<xs:enumeration value="JSON"/>
<xs:enumeration value="RAWBODY"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
TokenProcessor Get tokens or other parameters if the response is not in the specified format.
</xs:element>
</xs:complexType>
</xs:element>
Sequence
Element            Description of Values
RequestMsgStyle According to RFC 6749, the default value is FORM for Authorization Code
Grant, Client Credentials Grant, Resource Owner Password Credentials
Grant, and Customized Flow.
<xs:element name="RequestMsgStyle" minOccurs="0">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:enumeration value="FORM"/>
<xs:enumeration value="QUERY"/>
<xs:enumeration value="QUERYPOST"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
ExpireTime Sets an expiration time for the access token, overriding the default time of 5
seconds.
</xs:element>
</xs:element>
RefreshToken Element
The RefreshToken element lets you customize the way the access token is
refreshed in the OAuth flow.
</xs:element>
According to RFC 6749, the default value is JSON for Authorization Code
Grant, Resource Owner Password Credentials Grant, and Customized Flow.
<xs:element name="ResponseMsgStyle" minOccurs="0">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:enumeration value="FORM"/>
<xs:enumeration value="QUERY"/>
<xs:enumeration value="JSON"/>
<xs:enumeration value="RAWBODY"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
ResponseMsgStyle
• QUERY—All response parameters of OAuth are returned as a query string appended to a redirect URL.
• FORM—All response parameters of OAuth are returned with the entity, and Content-Type is application/x-www-form-urlencoded.
• JSON—All response parameters of OAuth are returned with the entity, and Content-Type is application/json.
• RAWBODY—All response parameters of OAuth are returned with the entity, but the format is not clearly defined. In this case, use tokenProcessor (JavaScript) to retrieve all parameters.
</xs:element>
</xs:element>
QueryTokenName Element
If the access token type is Query, this element specifies the name in the query
string if the name is different from access_token.
package com.compositesw.extension.security.oauth;

import java.util.Map;

/**
 * Custom flow used for Extension Grants of OAuth2.0:
 * http://tools.ietf.org/html/rfc6749#section-4.5.
 * Any OAuth 2.0 flow with customized request or response that does not
 * conform to RFC 6749 can be customized.
 * With CustomFlow, a flow step is ignored if the request is NULL, or if no
 * request info is defined in OAuthProfileCallback.
 */
public interface CustomFlow {

    /**
     * Build authorization request. If request info is NULL, the authorization
     * step is ignored.
     */
    public void buildAuthRequest(OAuthProfileCallback callback) throws Exception;

    /**
     * Handle authorization response.
     */
    public void handleAuthResponse(OAuthProfileCallback callback) throws Exception;

    /**
     * Build access token request. If request info is NULL, the authorization
     * step is ignored. The flow fails if both authorization request and
     * access token request info are NULL.
     */
    public void buildAccessTokenRequest(OAuthProfileCallback callback) throws Exception;

    /**
     * Handle access token response.
     */
    public void handleAccessTokenResponse(OAuthProfileCallback callback) throws Exception;

    /**
     * Build refresh token request. If request info is NULL, the authorization
     * step is ignored.
     */
    public void buildRereshTokenRequest(OAuthProfileCallback callback) throws Exception;

    /**
     * Handle refresh token response.
     */
    public void handleRefreshTokenResponse(OAuthProfileCallback callback) throws Exception;

    /**
     * All OAuth elements (access_token, refresh_token, expires_in, token_type,
     * scope, etc.) extracted from response can be found in the value map
     * returned by getOAuthElements().
     */
    public Map<String, Object> getOAuthElements();

    /**
     * Get access token.
     */
    public String getAccessToken();

    /**
     * Get refresh token.
     */
    public String getRefreshToken();
}
This topic describes the configuration of file-based data sources. For the purposes
of TDV, the file data sources grouped in this topic are those that are file based and
require similar configuration options.
• Supported Character Encoding Types, page 165
• Supported URL Protocols, page 166
• About Export and Import of Custom Java Procedures, page 166
• Adding a Custom Java Procedure, page 167
• Adding a Delimited File Data Source, page 169
• Adding an XML File Data Source, page 171
• Adding a Cache File Data Source, page 174
• Adding an LDAP Data Source, page 174
• Adding Microsoft Excel Data Sources, page 177
• Configuring XML/HTTP Data Sources, page 184
• Cp1250
• Cp1257
• iso-8859-1
• us-ascii
• utf-8
• utf-16
• utf-16be
• utf-16le
• windows-1250
• windows-1251
• windows-1256
• windows-1257
Data sources that reference a file can do so through a URL. TDV supports several
URL protocols, including file, http, https, ldap, smb, and ftp. For example, your
file data source could be located at one of the following URL locations:
• file:///C:/projectstuff/build/trialrun/teststuff/flatfile/USPSaddresses.csv
• file:///\\megawat/engineering/bees/swarmhive/xml_files/royaljelly_xml_1000.xml
• http://rss.news.queenbee.com/rss/topstories
• https://dvirtual-weakhive1/beepackage1/shadyhive10.csv
• ftp://ftp.varoa.fi/pests/standards/RFC/treatment_options.txt
• ldap://dvirtual-waxmoth:389/dc=small,dc=net
• http://dvirtual-waggle/cgi-bin/dance_GetVoters.cgi
• jdbc:PostgreSQL://queenhost:3406/cs030101?continueBatchOnError=false&useUnicode=true
• smb://server/share/frame/file
Type                           Limitation
FTP                            HTTP, HTTPS and FTP are supported for reading the data. File must be in text format and unzipped.
network location               The URL to the single file must be relative to the machine where TDV Server is running.
machine without a Web server   It must be mapped to the machine where TDV Server is running.
Custom Java Procedure JARs are exported with a TDV full server backup (and
when using the backup_export tool), although the backup_export -excludejars
option can be used to omit those files when required.
An export exception: if the Custom Java Procedure uses one or more classpaths
referring to other JAR files, those files must be backed up or migrated
separately, because they are not picked up and replicated during export.
When a data source is exported, the adapter from which the data source was
created is exported as well. In particular, all the files in the data source directory
are included in the CAR file.
Custom Java Procedures (CJP) are normally imported into the directory
conf/customjars/ when restoring TDV from the full server backup CAR.
A CJP is immediately propagated across the cluster along with its CJP library.
Unlike other TDV-defined resources, resources referred to by a CJP data source's
classpath are not propagated; such resources must be manually distributed to all
cluster nodes.
TDV has a JDBC interface and provides a bridge interface so that you can connect
to a data source that is not currently supported. You can create a driver adapter
that connects to that interface.
TDV supports custom procedures written in Java created to interface with other
data sources. TDV provides APIs to create custom procedures.
A CJP library is a JAR file containing the Java classes implementing a set of
Custom Java Procedures (CJPs) and other resources used by the CJPs. A CJP data
source is a TDV custom data source that is created in Studio by specifying the
signature of the CJP, a CJP library, and, optionally, a classpath. The classpath is
needed only if the CJPs need resources that were not included in the CJP library.
For more details on TDV APIs to create custom Java procedures, see “JAVA APIs
for Custom Procedures” in the TDV Reference Guide.
One adapter is sufficient to connect to any number of the same type of data
sources. After it has been uploaded, the JDBC adapter functions like any other
JDBC adapter, such as those used by Oracle, SQL Server, or MySQL.
Customizations can be made to further change the adapter behavior.
You add a custom Java procedure to TDV Server as you would add a new data
source. You must supply the specific JDBC driver and direct the server to the
custom procedure JAR location so that TDV can upload it. The TDV server
assumes that the JDBC adapter is implemented correctly. The server does not
make any accommodations for JDBC adapters that do not supply correct
metadata about the data source and it does not retrieve result sets that are not
consistent with the metadata.
Note: If you need to export or import previously created custom Java procedures,
see About Export and Import of Custom Java Procedures, page 166.
If file is at Description
FTP URL HTTP, HTTPS and FTP are supported for reading the
data. The file must be in text format and unzipped.
network location The URL to a single file must be relative to the machine
where TDV Server is running.
8. Select Use Credentials if you want to specify user credentials here (rather than
with system configuration) for connecting the data source.
Field Description
Domain User’s domain; for example, composite.
14. Select the Ignore trailing delimiter check box if each row can contain a
trailing delimiter that you want to ignore.
15. Accept the default file extensions to filter the root directory for, or type in the
file extension values for which you want to filter. Example of two filters:
*.csv,*.txt (a space is allowed between two filters)
network location The URL to the single file must be relative to the machine where TDV
Server is running.
machine without a Web It must be mapped to the machine where TDV Server is running.
server
8. Select Use Credentials if you want to specify user credentials here (rather than
with system configuration) for connecting the data source.
Field Description
Domain User’s domain; for example, composite.
9. Accept the default or specify the Character Set encoding type. The Character
Set drop-down list includes <auto-detect> as the default option for file-XML
data sources, letting Studio detect the character set automatically.
10. Optionally, type in the location of the XML schema file using this syntax:
<namespace> <location> [<namespace> <location>]
namespace    http://www.compositesw.com/services/webservices/system/admin/resource
location     file:///C:/test.xsd
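Using the sample values above, the complete entry typed into the field is the namespace followed by a space and then the location:

http://www.compositesw.com/services/webservices/system/admin/resource file:///C:/test.xsd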
11. Optionally, type in the No Namespace Schema Location to specify the location
for an XML Schema that does not have a target namespace.
12. Accept the default file extensions to filter the root directory for, or type in the
file extension values for which you want to filter.
Rules for the filters:
— * (asterisk) means that any character in the filename occurs zero or more
times.
— ? (question mark) means that any character in the filename occurs exactly
once.
— , (comma) is a separator between filters.
— \ (backslash) is an escape character to escape a filename that contains *
(asterisk), ? (question mark), or , (comma).
The file-cache data source is used for storing resource data that are cached using
the Automatic storage mode. For additional information, see TDV Caching,
page 465. The file-cache data source uses a directory for each table in it, and a file
in that directory for storing the data. The data files are binary encoded.
You can add an LDAP data source and configure it to behave like a relational
table. During introspection, TDV maps all LDAP data types to the string data
type.
The Active Directory objectGUID attribute displays in the "binding GUID string"
format. For example, c208521a-6fcd-43f2-90ad-ed790c9715c1. If a value for the
objectGUID comes from anywhere other than LDAP or is specified in a TDV view
or script, that value must use the same dashed string format.
There are two ways to introspect and use Microsoft Excel files. The method used
depends on whether the TDV Server you are working with is hosted on a
Windows operating system or UNIX operating system:
• Adding Microsoft Excel Data Sources, page 177
• Adding Microsoft Excel (non-ODBC) Data Sources, page 181
In both cases the Microsoft Excel files must be locally accessible to the TDV Server
on a mapped or mounted drive. Flat files do not expose a JDBC interface, so direct
(mapped or mounted) LAN access to those flat files is required.
If you want to introspect Excel documents that contain non-US characters, you
should use the Non-ODBC Excel data source.
Note: Excel files are loaded into managed memory. If managed memory is
insufficient, the introspection fails.
Advanced Tab
Options Description
Connection URL Pattern    A URL pattern that functions as a template for generating an actual URL to connect to the physical data source. Modify this template per implementation requirements, but be aware that TDV does not validate modifications. The data source adapter may or may not validate changes, ignoring invalid URL connection parameters.
Connection URL String The literal URL string that is generated from the connection URL
pattern with the connection information you provide. This string is
used by the JDBC adapter to connect to the physical data source. This
field is generated by the system and is not editable. For further details,
see “Configuring TDV Data Connections” in the TDV Administration
Guide.
When TDV is running on a Windows platform you can access a file
located on the network with a file URL like the following:
file://10.1.2.199/d$/public/folder/ExcelFileName.xls
If you map a network drive from the computer hosting TDV to connect
the computer to file resources, you can browse to the directory to
introspect more than one Excel file at a time, or specify a file URL to
add a single file. The following is an example of a Windows file URL:
file:///Z:/shared/folder/ExcelFileName.xls
Connection Properties Enables specification of property-value pairs that are passed to the
targeted JDBC data source to determine the data source behavior for
the connection with TDV. A selection of commonly used properties for
all the supported versions of MySQL, Oracle, and Sybase are
populated on the Advanced tab with default values.
Maximum Connection Lifetime (m)    Sets the duration, in minutes, that an inactive connection (a connection that was returned to the pool) persists if there are more open connections than the minimum pool size. The duration is calculated from the creation time not from the time that the connection was released to the pool. A value of “0” allows connections to last indefinitely. Default: 60 minutes.
Connection Validation Query    A native data source query that is sent directly to the data source without evaluation by the TDV query engine. Enter a query that gives a quick return. If the validation query returns a non-error result, this fact validates the connection to the data source.
Execution Timeout (s) The number of seconds an execution query on the data source is
allowed to run before it is canceled. The default value of zero seconds
disables the execution timeout so processes that take a long time are
allowed to run. For example, cache updates set to run at non-peak
processing hours can be resource intensive processes that take much
longer than a client initiated request.
Execute SELECTs Independently    If this option is checked, a SELECT statement submitted to this data source is executed using a new connection from the connection pool and committed immediately after the SELECT is completed. INSERT, UPDATE, and DELETE statements continue to be executed using the same connection as part of the transaction.
The code should be written such that the init procedure causes rights
to be revoked if not called with the appropriate context.
Supports Star Schema Check this option if this data source can support large predicates for
star schema semijoins. Do not check this option unless you are sure the
data source can support receiving queries with large predicates and
large cardinalities. Refer to Star Schema Semijoin, page 587 for more
information.
Max Source Side Cardinality for Semi Join    Sets the maximum source-to-source ratio of cardinality for semijoins.
Max Source Side of Semi Join to Use OR Syntax    Sets the maximum source-side use of the OR syntax for semijoins.
Selection Description
Local File System Select if the Excel file is on the local file system. With this option, you can select
one, more, or all the files in a directory. You can also select all the directories
and all the files of the same type in those directories to introspect all Excel
spreadsheets at the same time.
Specify the root path to begin the search for the files on the local file system.
Root path is the absolute path to the root directory where the files reside.
URL For TDV running on UNIX operating systems, you can add an Excel file
located on the local machine with a URL file protocol like the following:
file:///usr/name/folder/excel_filename.xls
The directory containing the source Excel file must be mounted to the UNIX
server hosting TDV. For example if the computer directory
10.1.2.199/d$/public contains the Excel file, it could be mounted as
/root/public. The Excel file could be accessed with a file URL like:
file:///root/public/folder/excel_filename.xls
Element Description
Character Set Character encoding type. See Supported Character Encoding Types,
page 165.
Data Range Enter the value that indicates the data range you want to introspect.
Blank Column Type Choose the data type to apply to blank columns: Varchar, Double,
Boolean, or Datetime.
Has Header Row Check if the first row of all the introspected Excel sheets has a row of
column names. If it is not selected, the first row of each Excel data sheet is
introspected as a data row and the column names are: COL1, COL2,
COL3. After the connection is established and the Excel files are
introspected, each sheet is made available as a TABLE that can be defined
as having or not having a header row independently of the original
schema header row setting.
Columns in Every Row Use Format Categories of Columns in First Row    Check to introspect the data in every row formatted as specified in the first row.
An XML/HTTP data source collects data from raw XML over HTTP. It collects the
information for a single XML/HTTP operation, writes the WSDL for you, and
establishes the WSDL data source instance.
The following steps are involved in configuring an XML/HTTP data source for
use:
• Creating an XML Definition Set, page 184
• Adding an XML/HTTP Data Source, page 185
• Retrieving XML Data Over HTTP, page 187
After creating an XML/HTTP data source, you can use it as follows to get XML
data:
• Execute the XML source within the data source, and view the data in XML
format.
• Create a transformation of the data source, and execute the transformation to
view the XML data in tabular form.
To complete the data source creation process in Studio, the data source metadata
must be introspected from the source. This topic describes how to introspect the
data sources that you have configured, set filters for introspection, and
reintrospect data sources.
The following topics are covered:
• About Introspection and Reintrospection, page 190
• Introspecting Data Sources, page 191
• Tracking and Viewing the Introspection Process, page 200
— Tracking Introspection, page 200
— Viewing the Introspection Task Status Report, page 200
— Viewing the Introspection Report for a Data Source, page 201
• Introspecting Data Source Table and Column Comment Metadata, page 202
• Setting Introspection Filters on a Resource, page 203
• Reintrospecting Data Sources, page 206
— Reintrospecting a Data Source Immediately, page 207
— Scheduling Reintrospection, page 207
— Triggering Reintrospection, page 208
• Tips on Introspecting Multiple Files in a Network Share Folder, page 209
• Tips on Introspecting Based on a CREATE TABLE Statement, page 210
• Tips on Introspecting Large Data Sources, page 210
You can use the TDV API to set up introspection tasks and view the results. You
can find the introspection API operations under:
/Data Services/Web Services/system/admin/resource/operations/
For instructions on how to use these operations, open the operation in Studio and
click the Info tab, or see the TDV Application Programming Interfaces Guide.
Introspection is the process of collecting native data source metadata so that all
selected resources—like catalogs, schemas, tables, and procedures—are
represented in a standardized way in the TDV virtual data modeling layer. The
representations include capabilities and connection profiles. Composite views
and procedures can then be built on the data modeling layer.
Introspection can be performed any time after data source configuration. When
setting up introspection, you can also set filter criteria so that any new resources
of interest are automatically introspected.
Introspected resources are added to the Studio resource tree for use in the Studio
modeling layer, where you can then create views, procedures, and other resources
based on those data resources. The metadata layer can be further enhanced by
definition of indexes and primary keys, by discovery of data relationships, and by
gathering of statistics about the data. The TDV query engine uses introspection
data to generate efficient queries with source-aware algorithms.
See Retrieving Data Source Metadata, page 189 for more information.
After each data source instance is configured and introspected in Studio, you can
use that instance with any other data source instance to create views and
procedures that intermix them all.
Note: Privileges on objects in the Data Source are evaluated only at the time of
introspection or later when a FETCH is called during statement execution. So if
privileges on an object in the data source are lowered after introspection, the error
will not be caught until actual statement execution.
Reintrospection is the process of updating the data modeling layer after a data
source has changed. During reintrospection, new data resources can also be
automatically introspected for the first time.
Reintrospection can be initiated in any of these ways:
• Immediately, at any time.
• Automatically, according to a schedule.
• Triggered by conditions that you specify.
• Invoked by a TDV API call.
You can set filters that automatically reintrospect based on criteria applied to a
catalog, schema, node, or resource.
To invoke data source introspection at any time after the data source has
been created
Use any of these methods:
• Right-click the name of the data source in the Studio resource tree and select
the Add/Remove Resources option.
• Select the data source in the Studio resource tree. Open the Resource menu
and select Add/Remove Resources.
• Open the data source and click the Add/Remove Resources button.
If you request multiple introspection tasks for the same data source, the requests
are queued and executed sequentially, with all but the current task listed as
WAITING.
2. Select the resources you want to introspect by checking the boxes next to
them.
If you select a parent folder, all of the resources it contains are selected.
Note: If you select a catalog, schema, or parent node before the entire list of
contained resources is loaded, only those child resources that were loaded at
the time of selection are selected. In this case, you might want to recheck later
that the entire list has been loaded.
When you select a resource, a Properties tab on the right displays the boxes
that can be checked to control introspection of new resources in the future. See
Reintrospecting Data Sources, page 206.
3. Optionally, filter the display of detected resources as described below.
A count of the number of detected resources that are visible in the resource list
versus the total number of resources is shown at the top of the resource list
(for example, 5 Visible of 6 Total).
You can use the fields and buttons above the resource list to control what is
displayed.
When searching with the Find field, type part or all of the name of a resource
if you want to view only resources starting with this string. The Find field
search pattern has the following rules:
— Search patterns are applied to the entire resource path and name.
— Only matches are shown.
— The characters in the search pattern do not need to be adjacent—only in the
same order.
— The search pattern is not case-sensitive.
— You can use “/” (forward-slash) to control how containers match. Without
a slash, the search pattern spans catalog, schema, and resource names. (For
example: abc matches resource a/b/c.) With one or more slashes, matching
characters must be found in the same positions relative to the slashes.
— No wildcards are supported.
When searching with the Show field, filter what is displayed in the resource
list using one of the following options.
Changes Only the changes made since the dialog was opened, including new selections and
clears.
Adds Only resources that have been added since the dialog box was opened.
Removes Only resources that have been selected for removal from TDV since the dialog was
opened.
— When searching with the filter buttons, click any of the buttons to include
or not include empty containers, check or clear all resources, or expand or
collapse the list.
4. Optionally, click Refresh Resource List to refresh the list of resources that are
available for introspection.
Note: Any currently selected resources are unchecked during the refresh
process.
You must click Refresh Resource List if you have added new resources after
caching metadata and you want to bring those new resources into TDV. Studio
and the TDV Server cache the resource list, and this button forces both caches
to be refreshed. By default, the list of resources is cached within Studio for the
seven most recently introspected data sources during the current user session.
This cache exists only in memory and is cleared when Studio is closed.
Another cache exists within the TDV Server to improve resource load time.
The cache within TDV Server persists across Server restarts.
5. Select how you want TDV to run introspection based on the resources it
encounters using the check boxes at the bottom of the Data Source
Introspection dialog.
Allow partial introspection, omitting resources with errors    Check to allow the metadata of those resources that are introspected to be committed to the TDV repository even if other resources fail to introspect. If this check box is checked and dimmed then you are working with a data source adapter that has not yet been upgraded to the new introspection framework.
Stop introspection upon the first error    Check only if the data source adapter does not yet support the new introspection framework. If unchecked, you can view all errors and warnings in the Introspection Status report without prematurely ending the introspection of other resources. If this check box is checked and dimmed, you are working with a data source adapter that has not yet been upgraded to the new introspection framework.
Copy privileges from parent folder    Check if you want the data source resources to inherit the access privileges accorded to the parent resource.
6. Optionally, for SOAP data sources, expand and define properties for each of
the nodes. Properties can include any of the following.
Property Description
Detect New Resources During Re-Introspection    If you select this option, child resources added to the current resource subsequent to this introspection will be detected during reintrospection.
If the current resource is a catalog and you add a schema to it subsequent to this introspection, the added schema is detected during reintrospection. For details on reintrospection, see About Introspection and Reintrospection, page 190.
If you do not select this option, child resources added to the current resource subsequent to this introspection are not detected during reintrospection.
If the current resource is a catalog and a schema is added to it subsequent to this introspection, that schema is not detected during reintrospection. For details on reintrospection, see Reintrospecting Data Sources, page 206.
Binding Profile Type Allows specification of the HTTP transport protocol, literal encoding
scheme, and document message style.
Use Endpoint URL Check box that allows you to configure the endpoint URL. If it is not
checked, the URL defined in the WSDL is used.
The Endpoint URL displays the real URL of the SOAP data source,
which might be different than the WSDL URL.
If the specified URL is invalid, the introspection process fails for the
SOAP data source. The WSDL data source fails until the request is sent.
Endpoint URL The Endpoint URL displays the URL where clients can access the
SOAP data source, which might be different than the WSDL URL.
You can use this field to edit the endpoint URL.
Default Timeout (msec)    The number of milliseconds the connection waits before declaring a failure if the attempt is unsuccessful.
Property Description
Fast Infoset Enabled Select to allow conversion of XML files into XML Infoset. This option
provides for more efficient serialization than the text-based XML
format.
Timeout (msec)    The number of milliseconds the connection waits before declaring a failure if the attempt is unsuccessful.
Choose Input Envelope    Obsolete field for legacy WSDL data sources. If you do not select this option, the Choose Top Level Child Elements option appears.
Choose Top Level Child Elements    Select this option to map the operation’s request and response to input and output parameters, respectively. This option is configurable for every message part separately.
JMS related If the SOAP data source connects through JMS, your node might
display several fields that allow you to enter JMS information such as:
• JMS Connector
• JMS Destination
• JMS Delivery Mode
• JMS Expiry
• JMS Priority
Property Description
Detect New Resources During Re-Introspection    Detect child resources added to the current resource after this introspection.
If the current resource is a catalog and you add a schema to it after this
introspection, the added schema is detected during reintrospection.
See About Introspection and Reintrospection, page 190.
If you do not select this option, child resources added to the current
resource subsequent to this introspection are not detected during
reintrospection.
If the current resource is a catalog, and a schema is added to it after
this introspection, that schema is not detected during reintrospection.
See Reintrospecting Data Sources, page 206.
Property    Description
Wildcard Symbol for Single Character Match    Defaults to an underscore.
Filter in Case Sensitive Mode    Check box that indicates your preference for running filters.
New Resource <Schema|Catalog|Procedure|Table> Name Filter(s)    Allows you to save the filter settings with a name that you specify.
No Namespace Schema Location    The URL (one only) of a schema document that does not define a target namespace.
8. Optionally, you can set up filters for the resources, but they are applied only if
reintrospection is done. See Setting Introspection Filters on a Resource,
page 203, for more information.
9. Click Next.
Studio opens a Data Source introspection plan that lists the resources that are
to be introspected, based on your selections. The Data Source Introspection
plan lists all the introspection changes (additions, removals, and updates) that
will occur when you click the Finish button.
— Resources in black will be introspected.
— Resources in green will be added. (Their parent resources are also shown in
green.)
— Resources in gray will be removed.
10. Review the introspection summary, and if necessary, click Back to revise the
introspection plan and how it is to be done.
11. Click Finish to introspect the resources.
TDV introspects the selected resources and adds the metadata about those
resources to the TDV repository.
12. You can click OK to dismiss this window at any time; however, the
introspection process continues running in the background. The introspection
process continues even if you close the Studio session.
Panel    Shows
Running Tasks/Completed Tasks (left side)    The progress of introspection tasks and their status in the current user session.
Introspection Summary (upper right)    The status; introspection start and end times; number of warnings and errors; and the number of resources that were added, removed, updated, and skipped (plus an overall total).
Details (lower right)    The specific resources that were added, updated, removed, and skipped in the selected introspection. Those resources with no changes in the metadata used by TDV are added to the Skipped count.
You can:
• Show options—Filter the details displayed to view all details, only
warnings, or only errors.
• Show Details—Select any added resource then click show details
to get additional information about the resource.
• Export—Click to save a text introspection report that lists the
tables, columns, and indexes that were introspected. See Viewing
the Introspection Report for a Data Source, page 201 for an
example of this file.
13. You can view the introspection information for a data source at a later time.
See Tracking and Viewing the Introspection Process, page 200.
Tracking Introspection
You can track time usage by the introspection process with the support of log
files. A configuration parameter controls debug logging.
To diagnose introspection
1. Select Administration > Configuration from the main Studio menu.
2. Locate the Debug Output Enabled for Data Sources configuration parameter.
3. Set the parameter value to True.
4. After introspection, open cs_server_events.log to see how much time the
various introspection steps used.
Panel Shows
left The status of all completed introspection tasks during this user session. A status of
SUCCESS indicates that at least one resource was introspected. It does not mean that no
errors or warnings were logged, but that the attempt was successfully committed to the
data source.
right The status summary and details for the most recent introspection.
Refer to the TDV Reference Guide for details about the data source data types
that are supported by TDV.
4. Optionally, select a message in the Details list and click Show Details.
TDV shows details about that message.
5. Optionally, click Export to export a text file that contains a list of
introspected resources.
Data sources, such as Oracle and Teradata, include the ability to add annotations
or comments for added understanding. During the introspection process, TDV
can retrieve this information and add it to the annotations field for each resource.
The annotation field can be viewed on the Info tab of each resource.
Requirements
Requires one of the following data sources:
• Oracle
• DB2 LUW
• Teradata
• Netezza
• PostgreSQL
• DB2 mainframe
• Vertica, Composite
• SAP HANA
• HSQLDB
• Hbase
You can enable or disable the comments introspection for Oracle data sources.
1. Open the Oracle data source for which you want table and column level
metadata retrieved.
2. Select the Advanced tab.
3. Locate and select the Introspect comments check box.
4. Run the introspection process.
5. Review the comments by opening a resource editor and selecting the Info tab.
When you introspect or reintrospect a data source, you can set filters to control
how TDV runs the reintrospection process. You can set filters at the data source,
catalog, or schema level to:
• Filter by catalog, schema, table, or procedure name.
• Use wildcard characters to filter names.
• Detect new resources during introspection.
• Filter in case-sensitive mode.
The filters are applied during reintrospection, whether invoked by API or trigger,
at a scheduled time, or manually. Filters do not apply to the initial introspection
but are applied during reintrospection.
Note: These filters are only available on relational data sources.
Wildcard Symbol for Zero or More Characters Match    Symbol that stands for zero or more characters. The default symbol is the percentage sign (%) and this cannot be changed. For example, if you type ab% in the Schema Name Filter(s) field, it would match schema names such as abc, abd, ab, and any name with ab as prefix.
Escape Character for Wildcard Symbols    Character for escaping the symbols specified in the previous two fields (Wildcard Symbol for Single Character Match and Wildcard Symbol for Zero or More Characters Match). The default escape character is backslash ( \ ) and this cannot be changed. For example, the sequence ab\_c would match the resource-name ab_c by escaping the underscore ( _ ).
Separator for Each Filter    Symbol for separating filters. The default separator is a comma (,). For example, if you type orders, customers, it would match the table names orders and customers.
New Resource <resource type> Name Filter(s)    Where <resource type> can be a catalog, schema, procedure, or table depending on what is selected in the resource list. For example, if you select a data source that has no catalogs or schemas but does have procedures and tables, then only New Resource Table Name Filter(s) and New Resource Procedure Name Filter(s) are displayed on the Properties tab.
Enter filters on the Properties tab to introspect only the resources that
meet the filter criteria. See the preceding descriptions for wildcard
symbols, escape character for wildcard symbols, and separator for
filters.
If no filter is specified, all new catalogs/schemas/procedures/tables
(and their contents) are added during reintrospection. Existing
resources that were not selected during the initial introspection or
during a manual introspection are ignored during reintrospection,
even if they match the New Resource Catalog <resource type> Name
Filter(s) string. You can add previously ignored resources at any time
using Add/Remove Resources.
For example, if BZHANG is not selected for introspection and a
reintrospection filter of B% is set for the data source, subsequent
reintrospection finds any new schemas that begin with B but does not
introspect BZHANG.
The filters do not use the same serial character matching pattern as the
Find field, which applies to the Introspectable list only. Strings are
matched using the case sensitivity setting and the wildcard symbols
specified.
5. Click Next.
The Data Source Introspection summary page lists all the introspection
changes (additions, removals, and updates) that will occur when you click the
Finish button.
— Resources in black will be introspected.
— Resources in green will be added. (Their parent resources are also shown in
green.)
— Resources in gray will be removed.
The summary also lists at the bottom the rules that are applied during
introspection as a result of what you selected for the check boxes on the
previous panel.
6. Click Finish to initiate the introspection.
Scheduling Reintrospection
You can schedule reintrospection to occur at a specific time, date, or interval. The
scheduled reintrospection option is available only if the password was saved
when the data source was originally added to the server.
To schedule reintrospection
1. Right-click a data source name in the resource tree, and select Open.
2. Click the Re-Introspection tab at the bottom of the Studio workspace for the
open data source.
3. In the Scheduled Re-Introspection section, select the options for the schedule
you want reintrospection to follow.
a. Select Save Detected Changes if you want the changes detected in the data
source during reintrospection to persist (that is, be saved in the
repository). Uncheck this box if you want to inspect the changes prior to
accepting them and committing data source metadata changes to the TDV
metadata.
b. Select the frequency with which you want reintrospection to repeat:
— None (the default). Reintrospection is not scheduled.
— Repeat Every < > minute(s), and type the number of minutes in the text
field.
— Repeat Hourly for the reintrospection to recur every hour.
— Repeat Daily for the reintrospection to recur every day.
— Repeat Weekly for the reintrospection to recur every week.
c. Specify the starting time and date for the introspection in the
corresponding drop-down lists.
The date entered indicates the time at which the first occurrence of the
reintrospection is to occur. For example, if a daily event is set for 11:55 a.m.
three days in the future, it will run at 11:55 a.m. in three days and then every
day thereafter.
d. To send a reintrospection report to one or more email addresses, type and
confirm the email addresses of the recipients.
Separate multiple email addresses with commas (,) or semicolons (;).
4. Right-click on the resource panel’s upper tab and select Save to save your
settings.
Triggering Reintrospection
You can set up a trigger to reintrospect based on a JMS event, a system event,
timer event, or user-defined event. See Triggers, page 391 for more information
about triggers.
If the root path does not show the network mapped drives, the Root Path can be
specified as a UNC path without using the Browse button.
1. Define a new or edit an existing Microsoft Excel (non-ODBC) data source.
2. On the Basic tab for the Local File System value, if the Browse button does not
allow you to navigate to the machine or root directory that contains your
Excel data you can type the root directory location in the Root Path field. For
example:
R:/<mymachinename>/work/users/mega/excel-ds
Note: The direction of the slashes in the root path is important. If you know
your files are in a particular directory and they do not show in the
introspection window, modify the direction of the slashes in your root
pathname and save the data source.
3. Save the Microsoft Excel (non-ODBC) data source.
4. For an existing data source, on the Basic tab, scroll to and click the
Add/Remove Resources button.
5. Multiple files in the network share folder should now be available for
introspection.
Large data sources with complex schemas, and big tables with many columns, have
a correspondingly large amount of metadata that can pose challenges for TDV. For
TDV, the most important factor for performance, particularly during introspection,
is not the amount of data stored but the complexity and amount of metadata that
defines the data source.
Introspection can be broken down into three phases.
• Understanding the Resource ID Fetching Phase, page 211
• You choose Refresh Resource List in the Data Source Introspection dialog
when you choose Add/Remove Resources.
• Introspection (see Understanding the Introspection Plan Implementation
Phase, page 212) is executed with an introspection plan that requests detection
of new resources to be added.
• You invoke the UpdateDataSourceChildInfosWithFilter (or the deprecated
UpdateDataSourceChildInfos) Web services operation.
Reintrospecting existing resources does not recreate the cache unless the user has
requested that new resources be detected for adding.
• Increase the Connection Pool Maximum Size setting of the data source (be
aware that the physical data source can get impacted if this number is too
high). To change this setting, open the data source, and modify settings on the
Advanced tab.
• Set the introspect.override_parallelism_degree in the data source capabilities
file. The data source capabilities files for all supported data sources can be
found at:
<TDV_install_dir>\apps\dlm
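For example, to allow more introspection work to run in parallel for a particular data source, you might add an entry such as the following to that data source's capabilities file. The property name comes from the text above; the value 4 and the name=value form are illustrative, so match the format of the existing entries in the file:

introspect.override_parallelism_degree=4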
This topic describes the Studio user interface that you use to create and work with
composite views and table resources.
• About Composite Views, page 215
• Creating a New View, page 216
• Setting Default Query Options, page 217
• Commenting SQL, page 218
• Adding Column Level Annotations, page 219
• SQL Request Annotation Pass-Through, page 219
• Designing a View and Table Resource, page 220
• Generating a Model from the SQL Panel, page 241
• Working with Views and JSON, page 242
• Designing Column Projections Using the Columns Panel, page 245
• Obtaining View Details, page 246
• Executing a View, page 246
• Generating a Query Execution (Explain) Plan, page 247
• Generating an Explain Plan and Displaying it in a Client Application,
page 247
• Rebinding a View, page 248
• Displaying the Lineage of a View, page 249
• View Column Dependencies and References, page 251
• Creating a View from a Cube, page 261
• Ad Hoc SQL and the SQL Scratchpad Editor, page 261
A composite view is a virtual data table defined by SQL and TDV metadata. A
view defines a SQL query that comprises a SELECT statement and any
ANSI-standard SQL clauses and functions. A view can JOIN or UNION to any
resource defined in the TDV virtual data layer.
This section describes how to create a view that uses relational data source tables.
For details on selecting a location to create a resource, see Locating a Container for
a TDV Resource, page 40.
To create a view
1. Right-click an appropriate location in the resource tree, and select New View,
or use File > New > View.
2. In the Input dialog, type a name for the view.
3. Click OK.
The view is added to the specified location in the resource tree, and the view
editor opens the Model Panel in the right pane.
Studio provides a view editor that opens automatically when you create a
view. You can open the view editor at any time by double-clicking a view’s
name in the resource tree.
Query hints are options that you can include to optimize your SQL statement for
the TDV query engine. You can specify any of the options that are documented in
“TDV Query Engine Options” in the TDV Reference Guide.
Options specified in a query override any defaults set using the configuration
parameter values.
4. Type an option or hint type value for the Key. For example, INDEX.
5. Type a Key Value. For example, employees emp_dept_ix.
6. Click OK to save the changes and exit the Configuration screen.
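Query options can also be supplied inline in a view's SQL using the { OPTION ... } syntax that appears in the join examples later in this topic. The following sketch is illustrative only; the option names shown are assumptions, and the supported names are listed under "TDV Query Engine Options" in the TDV Reference Guide:

SELECT { OPTION MAX_ROWS_LIMIT="100", CASE_SENSITIVE="false" }
  products.ProductID,
  products.ProductName
FROM
  /shared/examples/ds_inventory/products products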
Commenting SQL
You can insert standard SQL comments by starting each line with a pair of
hyphens. You can also insert multiple-line C-style comments, using /* to start the
comment and */ to end the comment. Each of these methods is described below.
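For example, both comment styles can appear in the SQL of a view; the table path follows the examples used elsewhere in this guide:

-- Single-line comment: return product identifiers and names.
/* Multiple-line comment:
   this view supports the inventory reporting project. */
SELECT
  products.ProductID,
  products.ProductName
FROM
  /shared/examples/ds_inventory/products products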
Annotations that are added to tables and views are carried through when you
publish the resource. After publishing, the column level annotations can be
viewed in Business Directory or through the other client interfaces that you use to
access published TDV data.
SQL annotations (comments) sent in requests to data sources can be used for
customer implementations to enhance logging, traces, and tracking. By default,
TDV trims SQL annotations from client requests.
View design involves performing one or more of these tasks, depending on your
implementation requirements:
• Add resources for the view.
• Join tables by columns and specify join properties.
• Combine the output of multiple SQL selects.
• Specify global query options.
• Specify query options and query engine hints.
• Specify column output to include in or exclude from the view execution
result.
• Apply functions on columns to transform result set values.
• Supply aliases for column names.
• Specify the sorting order for columns.
• Specify GROUP BY, HAVING, or EXPRESSION options.
• Specify constraints for the WHERE clause.
• Specify indexes, primary keys, and foreign keys.
You can accomplish these tasks using the Model Panel, Grid Panel, SQL Panel,
Columns Panel, Indexes Panel, and the Foreign Keys Panel.
The following sections describe how to perform these tasks:
• Designing a View in the Model Panel, page 220
• Designing a View in the Grid Panel, page 230
• Designing SQL for a View in the SQL Panel, page 237
• Generating a Model from the SQL Panel, page 241
• Defining Primary Key for a View or Table in the Indexes Panel, page 238
• Defining a Foreign Key for a View or Table Resource, page 239
To design a view
1. Select data resources for the view by dragging and dropping tables,
procedures, and transformations into a view.
2. JOIN tables by linking table columns.
3. Configure JOIN properties:
a. Define the JOIN logical operators.
b. Specify the preferred join algorithms or suggest semijoin optimizations.
c. Provide estimates of table cardinality to help the query engine optimize
the execution plan.
d. Include all rows from the left, the right, or both sides of the join.
e. Force join ordering or swap the order of the tables in the join.
4. Create UNIONs of SELECT statements, bringing together rows from
matching tables.
5. Navigate among SELECT statements and the UNION of those SELECTs.
6. Use query hints to set the maximum number of rows to return, to set case
sensitivity, to ignore trailing spaces, to mark the query as STRICT (adhere to
SQL-92), or to force the query processing to disk if the query is likely to exceed
memory constraints.
The following topics have more information on designing views in Studio:
• Adding a Resource to the Model Panel, page 221
• Joining Tables in the Model Panel, page 222
• Enforcing Join Ordering, page 224
• Creating a Union, page 225
• Navigating between Tables in the Model Panel, page 227
• Specifying the DISTINCT Query Option, page 228
• Specifying Query Hints, page 228
3. Make sure that columns specified for JOINS are of compatible data types.
To check the data type of a column, click the Columns tab and open the
Columns Panel.
4. Right-click the join icon on the JOIN line and select Properties to open the Join
Properties window, or double-click the diamond graphic.
5. Select how the columns should be compared (=, <=, >, and so on) from the
drop-down list in the top center of the Join Properties window.
6. In the Include rows section, check the boxes to specify the rows you want to
include in the join:
— Select the upper box to specify the LEFT OUTER JOIN.
— Select the lower box to specify the RIGHT OUTER JOIN.
7. In the Join Details section:
a. From the Specify Join Algorithm drop-down list, select the algorithm to
use for the join.
For the descriptions of the different algorithms, see Semijoin Optimization
Option, page 579.
b. Specify the Left Cardinality constraint.
Provides cardinality hint for the left side of a join. It should be a positive
numerical value, for example 50.
c. Specify the Right Cardinality constraint.
Provides cardinality hint for the right side of a join. It should be a positive
numerical value, for example 500.
d. If you select the Semijoin Optimization check box, the TDV query engine
attempts to use the results from one side of the join to restrict the other
side, so that the number of rows to be processed for the join is minimized
and the query engine’s performance is enhanced.
e. Select one of the following order options:
— Default Ordering—Applies the default ordering.
— Swap Order—Swaps the left and right sides of a join.
— Force Join Ordering—Overrides join ordering optimization.
f. Click OK to save the join specification.
8. Click the SQL tab to verify how the join is specified in the SQL statement,
similar to the following example:
SELECT
products.ProductID,
orderdetails.UnitPrice,
orderdetails.Status
FROM
/shared/examples/ds_inventory/products products FULL OUTER {
OPTION SEMIJOIN="True", LEFT_CARDINALITY="50",
RIGHT_CARDINALITY="500" } JOIN
/shared/examples/ds_orders/orderdetails orderdetails
ON products.ProductID = orderdetails.ProductID INNER JOIN
/shared/examples/ds_orders/orders orders
ON orderdetails.OrderID = orders.OrderID
9. Click OK.
Creating a Union
Studio lets you graphically create a UNION between two or more tables.
3. Right-click again in the Studio design space and choose the Add Union option
to open a second UNION design space, called Union1.
5. Drag and drop resources into the Union1 side of the SELECT statement.
The definition of the two tables in the UNION must have the same number of
columns in the same order, and they must be of compatible data types, for the
UNION to function correctly.
6. After defining the second table for the UNION, right-click anywhere in the
design space and select UP to return to the top-level query view navigator.
By default the two tables are combined with a UNION ALL, but you can
change that to a UNION, EXCEPT, or INTERSECT. An example of the resulting
SQL follows.
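For example, after you define both sides of the UNION, the SQL panel shows a statement similar to the following sketch. The first table path appears in examples elsewhere in this guide; the second is hypothetical. Both SELECT lists have the same number of columns, in the same order, with compatible data types:

SELECT
  products.ProductID,
  products.ProductName
FROM
  /shared/examples/ds_inventory/products products
UNION ALL
SELECT
  products_archive.ProductID,
  products_archive.ProductName
FROM
  /shared/examples/ds_inventory_archive/products_archive products_archive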
To change the maximum number of flips between UNION ALL and JOIN
1. From Administration in the Studio main menu, select Configuration.
2. Navigate to Server > SQL Engine > Overrides.
3. Change the Value from its default of 2 to another integer.
4. Click OK at the bottom of the panel.
After you first click in the Navigator window, a blue rectangle appears,
showing what area is currently visible in the Model panel.
2. Click inside the blue rectangle and drag it over the area you want to see.
3. Select the SQL panel and verify that the SQL statement now contains the
DISTINCT option.
2. You can move tables to the right side in one of two ways:
— Click the double-headed arrow button to move all available tables to the
right side.
— Select one (or more, using Ctrl-click or Shift-click) of the available tables,
and click the right-arrow button to move the selected tables to the right
side.
You can move a table back to the left side by selecting it and clicking the
left-arrow button.
3. Optionally, if you want to add all columns from tables even if they are already
displayed in the Grid panel, uncheck the Skip Duplicate Columns check box.
4. Click OK.
Columns from the tables you selected are added to the bottom of the Grid
panel listing. The view editor automatically creates aliases for identical
column names, whether they came from the same table or different tables. You
can select any alias on the Grid panel and type a new alias for the column.
5. Select the SQL panel and see that the SELECT statement now includes all of the
columns you requested.
wrap_prod_id View
SELECT
ProductID,
a_prod_id,
ProductName,
ProductDescription
FROM
/shared/DEMO/VirtualColumn/for_procedures/requires_prod_id
WHERE
a_prod_id = 10
ProductID,
Discount,
OrderDate,
CompanyName,
CustomerContactFirstName,
CustomerContactLastName,
CustomerContactPhone
FROM
/shared/examples/ViewOrder ViewOrder
WHERE
(ProductID >= prod_id_begin)
and (ProductID <= prod_id_end)
;
END
requires_prod_id View
SELECT
products.ProductID,
{ DECLARE prod_id_begin INTEGER } prod_id_begin,
{ DECLARE prod_id_end INTEGER } prod_id_end,
products.ProductName
FROM /shared/examples/ds_inventory/tutorial/products products
WHERE
(ProductID >= prod_id_begin)
and (ProductID <= prod_id_end)
view_wrap_prod_id View
SELECT
ProductID,
ProductName
FROM
/shared/DEMO/VirtualColumn/range_of_values/requires_prod_id
where
prod_id_begin = 5
and prod_id_end = 15
There are some limitations and considerations when editing the SQL:
• The SQL for a view cannot contain an INSERT, UPDATE, or DELETE clause.
You have to include such clauses in a SQL script, as described in Procedures,
page 267
• If you want to use the Model panel later for the current view, first save the
SQL statement under a different name, and then use the SQL panel for your
hand-typed SQL code.
• You are responsible for the syntax of the SQL you type in the SQL panel.
Studio does not check the validity of the syntax.
• If table or column names contain special characters, such as @, $, or &, enclose
the names in double quotation marks (" ") in the query.
• You should not use OFFSET and FETCH in a composite view.
For details on the SQL features supported in TDV, see the TDV Reference Guide.
Foreign key relationships indicate that if you join A.X = B.Y, there is exactly one
row in Table B found for each row in Table A. This information is a useful hint for
join operations.
In Studio, the Foreign Keys panel in the view and table editor lets you create a
definition that acknowledges and allows use of any foreign keys in the data
source. For each foreign key you want to define, first you need to select the
column that would be the foreign key, and subsequently identify the
corresponding primary key column in the parent table.
The processes for defining a foreign key and testing the creation of a foreign key
are as follows:
7. Use the forward arrow button to move ProductID from Available Columns to
the Foreign Column section.
The columns in the Parent Table are displayed in a drop-down list in the
Primary Column section, and the primary key column is visible.
8. Select the primary key column (ProductID) in the drop-down list (in the
Primary Column section) and save the resource. Now, the view has a column
(ProductID) identified as a foreign key.
Repeating this process, you can define as many foreign keys as you need.
If you have typed or changed the SQL statement in the SQL panel, you can
generate a model based on the SQL currently displayed in that panel. The Model
and Grid panels reappear in the view editor when you generate a model.
The model generator does not support all of TDV SQL syntax. SQL features the
model generator does not support are as follows:
• UNION, INTERSECT, EXCEPT, EXISTS, scalar subqueries, derived tables, IN
clause with a subquery, quantified subquery, and the INSERT, UPDATE, and
DELETE operations.
• Error messages result for models generated from queries that include these
SQL features.
In the regenerated model, the tables would be joined at the top (joining the table
tiles), instead of at the appropriate columns, under the following circumstances:
• If the ON clause of the JOIN has one or more of the following items:
— A function
— An OR condition
— Any of the following predicates: IN, LIKE, BETWEEN, IS NULL
• If the join is a self-join and no columns are involved in the join, or only
columns from the same table are involved in the join.
The model generator accepts all INNER and OUTER JOINs. However, it might
not be able to match the columns originating from the left and right sides of the
join. When the model generator cannot identify both columns, the two tables in
the resulting model are joined by a line that stretches from the title bar of the first
table to the title bar of the second table.
JSON files are widely used to collect and transfer data, especially through the
web. The structure of a JSON file can readily be converted into a structure that
resembles a table. The ‘virtual table’ can be inserted into an existing database
table, or it can be queried using a SQL JOIN expression. After creating a view for
your JSON file, you can then reference and use the resulting JSON table in any
other view, or you can use JSON_TABLE syntax to define the JSON table structure
and populate it from within your view.
TDV and Studio support the use of the SQL JSON_TABLE function. For more
information about the SQL syntax, see the TDV Reference Guide.
You can use JSON_TABLE syntax in various ways within TDV. The following
sections describe two common scenarios:
• Using a Referenced JSON File within a TDV View, page 243
• Using a JSON Table Within a TDV View, page 243
• JSON Table Examples, page 244
},
{
"DepartmentID": 2,
"DepartmentName" : "Project"
},
{ "DepartmentID": 3,
"DepartmentName" : "Market"
},
{ "DepartmentID": 4,
"DepartmentName" : "HR"
}
]
}}',
'$.company.department'
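As a rough, hypothetical sketch, a JSON_TABLE call follows the general pattern below. The JSON document is abbreviated, and the COLUMNS clause is written in the generic SQL/JSON form; consult the TDV Reference Guide for the exact TDV column-definition syntax:
SELECT DepartmentID, DepartmentName
FROM JSON_TABLE(
  '{ "company": { "department": [
     { "DepartmentID": 2, "DepartmentName": "Project" },
     { "DepartmentID": 3, "DepartmentName": "Market" } ] } }',
  '$.company.department'
  COLUMNS (
    DepartmentID INTEGER PATH '$.DepartmentID',
    DepartmentName VARCHAR(255) PATH '$.DepartmentName'
  )
) departments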
2. From within this view you can now use standard SQL functions to manipulate
the composite view.
You can find out the details about a resource, such as its name, location in the
resource tree, type and owner, and whether it is locked or not, using the Info tab.
Executing a View
After you have designed and saved a view (refer to Designing a View and Table
Resource, page 220) you can execute its SQL from within Studio to see the result.
For executing a view’s SQL through client applications, see TDV Client Interfaces
Guide.
Executing a view might be affected if the view uses a data source that was
originally added to TDV Server using the pass-through mode without saving the
password. To execute such a view, you must log into TDV Server with the same
login credentials (username and password) that are necessary to log into the
data source. For further information on pass-through credentials, see the details
for the Save Password and Pass-through Login fields under Adding a Data
Source, page 68.
Executing a view from Studio is blocked if the view’s SQL is SELECT * on a view
that includes a table with column-based security (see “About Managing
Dependency Privileges” in the TDV Administration Guide).
You can show the query execution plan instead of executing the query.
An alternative way to show a query execution plan is to precede the query with the
keyword EXPLAIN. This feature can be used in Studio, although it is intended for
use in JDBC/ODBC client applications. Being able to analyze the execution plans
of a query in tools other than TDV can help you optimize your queries faster. The
text results are retrieved in a result set that can be consumed by Studio or by your
client application. The result set is one column of VARCHAR text.
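For example, a query preceded by the EXPLAIN keyword might look like the following sketch, which uses the ViewOrder example view; the returned rows contain the plan text:
EXPLAIN
SELECT ProductID, CompanyName
FROM /shared/examples/ViewOrder
WHERE ProductID >= 10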
The default column width of the execution plan is 100 characters. You can change
it from Studio using the TDV Explain text width configuration parameter.
Rebinding a View
A view depends on one or more underlying sources, and the view is considered
bound to those underlying sources. A view can be bound to tabular data or a
procedure. Rebinding is useful when:
• You create a view with its sources and later decide to rebind the view to
different sources.
• An execution error has occurred for a view because a source with which the
view was initially bound does not exist any more.
When you are working on a complex query, it is useful to know what resources
are involved so you can understand the query’s relationships to other resources.
The relationships displayed include the resources on which the view depends and
any resources that depend on the view. The Lineage panel in Studio shows a
view’s lineage. The Lineage panel is also available for cached views as well as
other resource types. See Exploring Data Lineage, page 445, for more information.
For example, the Lineage panel shows the resources (data sources and tables) on which the sample view ViewOrder depends and the resources that depend on this view.
In the graphical format, the view’s dependencies are shown on its left and the
resources that depend on it are shown on its right.
2. Select any resource in the Lineage panel to see details about the resource in the
right pane. For example, select a table in a view to display the SQL used to
generate that view with the selected table highlighted.
3. Use the buttons at the top of the Lineage panel to adjust the display. See
Lineage Panel Buttons and Controls, page 447 for more information.
See Working with the Data Lineage Graph, page 450 for more information about
the Lineage panel.
TDV provides API tools for you to ascertain the data lineage of the resources
involved in a query. You can determine:
• Column dependencies—How the columns are derived from a specified
composite view and going back one or more levels through the various
resources on which the view depends. You can find out what resources are
involved and what transformations might have occurred that affect the
results. See Getting Column Dependencies Using the API, page 252, for more
information.
• Column references—What resources directly depend on that column. You can
find out what other views refer to a specific column in a table or view. See
Getting Column References Using the API, page 258, for more information.
If the view depends on a procedure, web service, or packaged query, that is the
end of the dependency analysis for that data. No further analysis is done through
a procedure, web service, or packaged query.
In this example, the column lineage data results might be as shown here:
Parameter Description
resourcePath Required. Enter the path of the resource to be analyzed. The supported resource types are TDV SQL Views in plain or published form.
ignoreCaches Optional. Specify if the analysis should ignore whether depended resources are cached or not. Enter one of these values:
true—Do not return any caches as part of the column lineage.
false—The default. If blank or false, any existing caches in the column lineage are returned in the procedure results.
recursively Optional. Specify if the analysis should be done recursively all the way to the source level of dependency or only return one level of dependency. Enter one of these values:
true—Return all levels of dependencies down to the original source level.
false—The default. If blank or false, return only a single level of column dependency.
Column Description
columnName The name of the resource column having the column dependency encoded in the row.
dependencyDataSourcePath The path to the data source containing the resource owning the dependency. Empty if not applicable.
dependencyDataSourceType The type of the data source containing the resource owning the dependency. Empty if not applicable.
dependencyResourcePath The path to the resource owning the dependency. Empty if not applicable.
dependencyResourceType The type of the resource that owns the dependency. Empty if not applicable. The set of data source types consists of all the data source adapter names accepted by TDV.
The table and procedure resource types accepted by TDV are as follows:
Database Table
Delimited File
Excel Table
TDV SQL View
System Table
SAP RT Table
SAP RFC Table
SAP AQ Query Table
SAP Infoset Query Table
Siebel Table
Database Stored Procedure
Packaged Query
Java Procedure
Web Service Operation
Composite SQL Script Procedure
XQuery Procedure
Transform Procedure
Basic Transform Procedure
Stream Transform Procedure
XQuery Transform Procedure
XSLT Transform Procedure
Column Description
derivationKind One of the following:
• direct, to indicate that the value of the dependency is preserved by the
dependent column
• indirect, to indicate that the value of the dependency is transformed by the
dependent column
derivations When the dependent column is not a direct projection of the dependency, this
field denotes how the dependent column is derived.
8. To view the details of the column dependency, select a dependency and click
the Details button in the Result For columnDependencies panel. Studio
displays the full details for the selected column dependency as shown in this
example:
The example shows both the first and second levels of dependency.
10. Optionally, open any of the involved resources and check the resource Info tab to see who originally created the resource and when.
• After you have determined the column reference, you can run the
GetColumnReferences procedure again on the referenced view or table to
determine the next level of reference.
• Publish the GetColumnReferences procedure:
— As a new Data Service. You can then access it with the supported protocols
like JDBC, ODBC, ADO.NET.
— As a new Web service which you can access using REST, SOAP, and the
other supported protocols.
columnFilter Optional. Specify if the column lineage results should be filtered. If empty, no
filtering occurs. To filter the results, enter a comma-separated sequence of
case-insensitive column names, indicating the columns whose references should
be analyzed.
Note: The analysis is done ignoring whether the specified view or table is
cached or not.
7. Click OK.
TDV executes the procedure and displays the column references in the lower
Result For columnReferences panel.
Column Description
columnName The name of the resource column having the column reference encoded
in the row.
referentResourcePath The path to the resource containing the column reference. Empty if not
applicable.
cardinalityInfo When applicable, one of the following:
• Aggregate, to indicate that an aggregate function is involved in the
derivation of the dependent column.
• Analytic, to indicate that an analytic function is involved in the
derivation of the dependent column.
Populated only if referenceContext is in a SELECT clause as output;
otherwise, empty.
8. To view the details of a column reference, select the reference and click the
Details button in the Result For columnReferences panel. Studio displays the
full details for the selected column reference.
9. Optionally, run the GetColumnReferences procedure again on the referenced
view or table in the Result For columnReferences panel to determine the next
level of reference.
You can run the GetColumnReferences procedure as many times as needed to
discover the reference path.
10. Optionally, save the results as a Comma-Separated Values (*.csv) file by
clicking Save to File on the Result For columnReferences panel.
11. Optionally, open any of the involved resources and check the resource Info tab to see who originally created the resource and when.
These objects are only valid if you have the SAP BW adapter installed. For more
information on defining and using this type of Studio resource, see the TDV SAP
BW Adapter Guide.
TDV provides an ad hoc SQL editor, the SQL Scratchpad editor, that lets you execute
“anonymous” SQL statements that are not bound to a resource. You can define
and execute any SQL statements that can be executed for a composite view.
Ad hoc SQL content and history are saved on the local (Studio instance) machine,
and retrieved in the following Studio session.
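For example, you might type an ad hoc statement such as the following sketch into the work area; the resource path is one of the TDV examples and the column names are illustrative:
SELECT
  orders.OrderID,
  orders.OrderDate
FROM
  /shared/examples/ds_orders/orders orders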
These topics are covered:
• Opening the SQL Scratchpad Editor, page 262
• Composing and Executing an Ad Hoc SQL Query, page 263
• The SQL Scratchpad History Pane, page 264
• The SQL Scratchpad Result Panel, page 265
You can open the SQL Scratchpad from the Studio Modeler panel in one of
three ways:
• From the Studio menu bar, select File > Show SQL Scratchpad.
• On the Studio toolbar, click the Show SQL Scratchpad button.
• Press Ctrl+Q anywhere in the Studio window.
The SQL Scratchpad editor consists of a toolbar and a work area.
You can type queries in the work area and add resources to the queries by:
• Typing the paths and names of the resources
• Dragging the resources from the Studio resource tree and dropping them onto
the work area
• Using the Add Resources button
The following table describes the actions you can perform in the SQL Scratchpad
Editor toolbar.
Export to File Open a dialog box to save the SQL query as a file.
Import from File Open a dialog box to import a SQL query from a file.
Add Resources Open a window where you choose a resource to add to the SQL
Scratchpad editor.
Format Query Format the SQL text according to the settings made on the Editors
tab of the Studio Options window.
Toggle Syntax Highlighting Change between displaying all SQL text in the same font color (usually black) and displaying comments, keywords and literals in the colors you selected in the Editors tab of the Options dialog box.
Show/Hide History View Display or hide a History pane along the right side of the SQL Scratchpad editor. (See The SQL Scratchpad History Pane, page 264).
Show Option Dialog Open the Studio Options window to the SQL Scratchpad tab, where
you can change the options for this window.
• See the distinct queries you have executed. The history list retains the number
of queries you specify in the Studio Options window, with the most recently
used query at the top.
• Instantly display a previous query in the work area.
• Save all distinct queries automatically.
• Lock a query so it cannot be deleted.
You can Alt-click an unlocked query in the history list to lock it. You can lock
as many queries as you want.
When a query is locked, a lock icon appears next to it. Locked queries do not
count toward the Maximum History Size set in the Studio Options window.
• Unlock a query so it can be deleted. Alt-click a locked query in the history list
to unlock it.
• Delete a selected query from the history list (after unlocking it, if necessary).
• Change the maximum number of queries to save in the history list. Click
Maximum History Size in the Studio Options window and choose a number
from the drop-down list.
Changes to maximum history size do not take effect until you restart Studio.
Details Open a Result Details window that lists all of the column headings and values
for the selected row. Click Next to list details for the next row, or Prev to list
details for the previous row. To dismiss the window, click OK.
Show Execution Plan Open a tab, next to the Result tab, that shows the execution plan for the SQL executed.
If you click Show Execution Plan on the Result tab, an Execution Plan opens
adjacent to the Results tab. The Execution Plan tab shows the execution plan and
details for the SQL executed. The execution ID is shown on both tabs, so that the
two can be correlated. An Execution Plan tab to the left of a Result tab indicates a
stale plan.
Procedures
This topic describes how to design procedures to query and manipulate data
stored in a data source. It also describes transformations and how to use them to
map your data into different formats.
• About Procedures, page 267
• Creating a SQL Script Procedure, page 268
• Working with SQL Scripts, page 270
• Java Procedures, page 277
• XQuery Procedures, page 278
• XSLT Procedures, page 281
• Packaged Queries, page 284
• Parameterized Queries, page 293
• Physical Stored Procedures, page 298
• Using Stored Procedures, page 303
• Executing a Procedure or Parameterized Query, page 305
• Using Design Procedure By Example for Introspected Procedures, page 310
Transformations are a special class of procedures. For more information about
working with transformations, see Using Transformations, page 311.
About Procedures
Although different languages (TDV SQL script, Java, SQL native to data sources,
XSLT, XQuery) are used to formulate procedures, they all function alike in the
TDV system. You can invoke one procedure type from within another.
You can use any text editor to write a procedure, and use Studio to add it to the
server. You can also use Studio’s procedure editor to create and edit any type of
procedure, except Java procedures. You can add a procedure to any location
except Data Services in the resource tree. For details, see Locating a Container for
a TDV Resource, page 40.
For details on creating or adding procedures, see:
• Creating a SQL Script Procedure, page 268
• Adding a Custom Java Procedure, page 167
• Java Procedures, page 277
• Creating a Packaged Query, page 285
• Quick Steps for Creating a Parameterized Query, page 293
• Creating an XML, XSLT, or Streaming Transformation, page 314
• Creating an XQuery Transformation, page 320
To move a procedure to a Data Services container, or to make a procedure available
to client programs or as a Web service, you must publish that procedure. For
details, see Publishing Resources, page 407.
This section describes how to create a SQL script—a procedure written in TDV’s
SQL script language, as described in “TDV SQL Script” in the TDV Reference
Guide.
The parameters you define in the SQL script panel must match the parameters
that you define in the Parameters panel, including the order in which they are
defined.
As part of creating a SQL script, you might also want to do the following:
The left navigation button is enabled for a CURSOR parameter. The right
navigation button is enabled for a parameter in a row immediately after the
last parameter in a CURSOR.
10. After specifying all of the parameters you want to define, save the edits.
11. In the SQL Script panel, complete the procedure using the parameters
designed in the Parameters panel.
12. Save the script.
After the parameters in the design match the parameters in the script on the
SQL Script panel, the name of the script is displayed in black.
5. Click OK.
6. Determine which adapter file you want to modify using the following
guidelines and tips:
— Changes made under /conf/adapters remain after applying hotfixes, and
are included in any CAR file.
This is true for /conf/adapters/custom and /conf/adapters/system.
— Changes made to system adapters under /conf/adapters/system might
require slightly more editing. For example, for Oracle you would need to
define a <common:attribute> </common:attribute> for 11g.
Note: If you want a custom function to be pushed to all versions of a data
source, add the name and signature of that custom function to the “generic”
capabilities file for that data source.
— Changes should not be made under /apps/dlm.
— You can choose to create a custom adapter to manage your customizations,
but this might not be recommended if you are only adding missing capabilities.
7. Locate the capabilities file to customize.
For example, to add a custom function, ss10, intended for a custom Netezza
data source, open the file:
<TDV_install_dir>\conf\adapters\system\<netezza_ver>\<netezza>.xml
8. From a text editor, open and add XML with the name, type, value, and
configID of the custom function. For example, to add custom function ss10,
add XML similar to this:
<common:attribute
xmlns:common="http://www.compositesw.com/services/system/util/comm
on">
<common:name>/custom/ss10(~string,~string)</common:name>
<common:type>STRING</common:type>
<common:value>ss10($1,$2)</common:value>
<common:configID>ss10(~string,~string)</common:configID>
</common:attribute>
Pipes
A pipe is a cursor that can have data inserted into it from many sources. TDV only
sees pipes as OUT parameters for a procedure. You can use INSERT statements to
get the data into the pipes for processing. For example:
PROCEDURE q(OUT name VARCHAR)
BEGIN
DECLARE c CURSOR (name VARCHAR);
CALL /shared/rtnPipe(c);
FETCH c INTO name;
END
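For illustration, the called procedure might declare the pipe as an OUT parameter and insert rows into it. The sketch below is an assumption about how such a producer could be written; the PIPE declaration form and procedure name are illustrative, and the exact syntax is described in “TDV SQL Script” in the TDV Reference Guide:
PROCEDURE rtnPipe(OUT p PIPE (name VARCHAR))
BEGIN
  -- Each INSERT adds one row to the pipe for the caller to FETCH
  INSERT INTO p VALUES ('Widget');
  INSERT INTO p VALUES ('Gadget');
END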
Cursors
For TDV, a cursor is a result set returned from a single data source. A cursor can
be thought of as a pointer to one row in a set of rows. The cursor can only
reference one row at a time, but can move to other rows of the result set as
needed.
Cursors can be used to perform complex logic on individual rows.
To use cursors in SQL procedures, you need to do the following:
• Declare a cursor that defines a result set.
• Open the cursor to establish the result set.
• Fetch the data into local variables as needed from the cursor, one row at a
time.
• Close the cursor when done.
CLOSE c;
END
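A minimal sketch of the full pattern in TDV SQL Script is shown below, using the example products table. The procedure name is illustrative, and the exact cursor declaration and OPEN syntax should be checked against “TDV SQL Script” in the TDV Reference Guide:
PROCEDURE getFirstProductName(OUT firstName VARCHAR)
BEGIN
  -- Declare a cursor that defines a result set (one VARCHAR column)
  DECLARE c CURSOR (ProductName VARCHAR);
  -- Open the cursor to establish the result set (sketch; verify the exact form)
  OPEN c FOR SELECT ProductName FROM /shared/examples/ds_inventory/products;
  -- Fetch one row from the cursor into a local variable
  FETCH c INTO firstName;
  -- Close the cursor when done
  CLOSE c;
END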
Java Procedures
Java procedures are programs that you write in Java to access and use TDV
resources. You can add these procedures as TDV Server data sources, execute
them against a data source, use them in views or procedures, and publish them as
TDV database tables or Web services.
For details about the TDV APIs for custom Java procedures and examples of Java
procedures, see “Java APIs for Custom Procedures” and “Java for Custom
Procedures Examples” in the TDV Reference Guide.
For details about adding a Java procedure to TDV Server, see Adding a Custom
Java Procedure, page 167.
This topic contains:
• Viewing Java Procedure Parameters, page 277
For information on how to define caching for your procedures, see TDV Caching,
page 465.
XQuery Procedures
XQuery procedures are programs that you write in XML to access and use TDV
resources. You can add these procedures as TDV Server data sources, execute
them against a data source, use them in views or procedures, and publish them as
TDV database tables or Web services.
Note: The XQuery and XSLT procedures cannot be used as resources in the
Any-Any transformation editor.
This topic contains:
• Creating an XQuery Procedure, page 278
• Designing Parameters for an XQuery, page 279
For information on how to define caching for your procedures, see TDV Caching,
page 465.
— If you have a script stored in your file system, you can upload it using the
Insert from File button.
— The XQuery panel can highlight XML syntax elements (comments,
keywords, and so on).
— Insert standard XML comments by highlighting one or more lines and
typing:
— Ctrl-Hyphen to insert two dashes (--) at the start of each line.
5. Save the script.
For details on executing an XQuery procedure, see Executing a Procedure or
Parameterized Query, page 305.
XSLT Procedures
XSLT procedures are programs that you write in XML to modify XML
documents—for example, to convert them to other formats based on an XSL style
sheet. You can add these procedures as TDV Server data sources, execute them
against a data source, use them in views or procedures, and publish them as TDV
database tables or Web services.
The XSLT procedure editor is a stand-alone, language-specific processor that lets
you compose and execute XSLT scripts.
Note: The XQuery and XSLT procedures cannot be used as resources in the
Any-Any transformation editor.
This topic contains:
• Creating an XSLT Procedure, page 281
• Designing Parameters for an XSLT Procedure, page 282
For information on how to define caching for your procedures, see TDV Caching,
page 465.
The procedure editor inserts “<!--” at the start, and “-->” at the end, of each
line you have highlighted.
4. Save the script.
For details on executing an XSLT procedure, see Executing a Procedure or
Parameterized Query, page 305.
Parameter Entry
When you click the icon to execute an XSLT procedure with parameter input, an
Input Values for <procedure_name> dialog box appears. It displays parameter
names down the left side, and a prepared XML code fragment with a question mark
that you can replace with the input value.
You can replace the question mark with an input value, or check the Null box to
eliminate the XML code fragment.
Packaged Queries
Packaged queries let you use database-specific queries defined within the TDV
system and sent for execution on the targeted data source. Sometimes, you
already have a complex query written for a particular database and it might not
be feasible to rewrite that query in TDV SQL.
For example, your query might require a database-specific feature not available in
TDV or perhaps the query takes advantage of the database-specific feature for
performance reasons. In such cases, your database-specific query can be created
as a packaged query and subsequently be used in other queries.
Multiple database-specific SQL statements can be executed sequentially by the
same packaged query as long as only the last SQL statement returns a single
cursor with at least a single column output. See the section on Multiple SQL
Execution Statements in a Packaged Query, page 292.
Every packaged query is associated with a specific data source.
A packaged query is stored in TDV with the associated data source’s metadata,
and it functions like a stored procedure, accepting input parameters and
producing rows of data. It must have exactly one output parameter that is a
cursor with at least one column.
Because the TDV system cannot automatically determine the required inputs and
outputs of a database-specific query, it is necessary for the user to supply this
information.
For details on specifying parameters for a packaged query, see Specify Input
Parameters for a Packaged Query, page 287.
For details on executing a packaged query, see Executing a Procedure or
Parameterized Query, page 305.
This section includes:
• Creating a Packaged Query, page 285
• Specify Input Parameters for a Packaged Query, page 287
• Multiple SQL Execution Statements in a Packaged Query, page 292
This is the first input parameter in our example. If it is placed under result as a
part of the CURSOR output, use the left triangle button on the toolbar to move
it outside the output cursor. A right-arrow icon appears next to it.
A right-arrow icon denotes an input parameter; a left-arrow icon denotes an
output parameter.
f. Rename the parameter as ProductID, and press ENTER.
g. Add another input parameter named Status of the type String > CHAR.
7. Save the packaged query.
8. Execute the packaged query.
Executing a packaged query is similar to executing a parameterized query. For
details, see Executing a Parameterized Query, page 306.
Data Types
Packaged queries support the following input data types:
DECIMAL, DOUBLE, FLOAT, NUMERIC, BIGINT, BIT, INTEGER, SMALLINT,
TINYINT, CHAR, LONGVARCHAR, VARCHAR, DATE, TIME, TIMESTAMP
Input Substitution
The packaged query can define N inputs, with substitution patterns numbered
from {0} to {N-1}.
The pattern for a string is enclosed in single quotes. The first input value replaces
all the occurrences of {0}, the second input value replaces {1}, and so on.
Note: The backslash (\) is removed from the original query. Escaping the second
curly brace is optional.
WHERE
customer.name = {0:string-sql-literal} AND customer.id = {1}
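For example, a complete hypothetical packaged query that uses both substitutions, with the string input wrapped in single quotes as described above, might read:
SELECT customer.name, customer.balance
FROM customer
WHERE
  customer.name = '{0}' AND customer.id = {1}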
Invalid Query A
The following query is invalid because substitution pattern input {1} is missing
from the query.
SELECT customer.name, customer.balance
FROM customer
WHERE customer.id = {0} AND customer.status = {2}
Invalid Query B
The following query is invalid because the database-specific query defines four
inputs, but the packaged query has defined only three input substitutions. The
fourth SQL input placeholder {3} makes the query invalid. You need to define one
more input parameter. Also, customer.zip is an integer and does not need to be
enclosed in single quotes.
SELECT customer.name, customer.balance
FROM customer
WHERE
customer.id = {0} AND customer.zip = '{1}'
AND customer.status = {2} AND customer.email = {3}
This statement can be used to define other unique character sets to comply with
any special data source query language requirements, but make sure the chosen
delimiter is not used within any string, which can cause an unintended break.
Using a separator like “;” causes problems if a semicolon also shows up inside a
string value substituted into the query.
Given the definition of the multi-part separator, SQL statements are parsed and
executed sequentially until the last statement returns a result set.
Note: If you execute anything that modifies the state of the connection in a way
that survives beyond the current transaction, you can cause unexpected results
for another user when and if the connection is put back into the pool.
Parameterized Queries
Studio can create SQL SELECT statements that include named parameters in the
projections or selections. In the TDV system this feature is called a parameterized
query. Such queries are implemented as single-statement SQL scripts. The SQL
script code is automatically generated when you design a parameterized query
with the Model and Grid panels. For details on SQL script languages, see the TDV
Reference Guide.
The resources you can use in a parameterized query are tabular data and any
procedure that outputs either a set of scalars or one cursor.
A parameterized query lets you include {param NAME TYPE} structures in the
Grid panel to parameterize parts of the view. The resource itself is actually a SQL
script with a model, so it functions like a procedure within the system.
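For example, a sketch of a parameterized selection over the ViewOrder example view might place a {param NAME TYPE} structure in the WHERE clause; the parameter name desiredProduct is illustrative:
SELECT
  ProductID,
  CompanyName
FROM
  /shared/examples/ViewOrder ViewOrder
WHERE
  ProductID = {param desiredProduct INTEGER}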
The following sections describe how to create and execute a parameterized query:
• Quick Steps for Creating a Parameterized Query, page 293
• Add Input Parameters, page 294
• Adding a Parameter to a SELECT Clause, page 295
• Adding a Parameter to a WHERE Clause, page 296
• Adding a Parameter to a FROM Clause, page 296
• Executing a Parameterized Query, page 306
8. Type in the value for the parameter and click OK to execute the query.
3. Drag the transformation that requires input parameters and drop it onto the
Model panel. For information on transformations, see Using Transformations,
page 311.
The Input Parameters for <transformation name> window opens.
In this window, the upper section displays the input parameters. The icon
next to each parameter represents its type.
4. To display the parameter’s definition, click the Show Definition button.
The window displays the parameter’s name and type.
In the Value group box, all options are mutually exclusive. Null specifies a
NULL value; if it is disabled, the parameter cannot be NULL. Literal applies a
value selected in the upper section. Query Parameter accepts a name for the
parameter.
Each parameter in the upper section is displayed with its initial value. If the
parameter is nullable, the default is null. If the parameter is not nullable, the
default value depends on type: zero for numeric, an empty string for a
character, and the current date for a date.
5. Select the Query Parameter radio button, type the name, and click OK.
Do not include string values in quotation marks. The parameter name must be
unique and begin with a letter. It can contain any number of alphanumeric
characters and can include an underscore. It must not contain any spaces.
6. Click OK to save the entry and close the window.
The name of the parameter appears as the input parameter (IN) and is also
included in the FROM clause.
7. If you want to view the auto-generated SQL script, click the SQL Script tab in
Studio.
8. Save the parameterized query.
The name of the parameter (ticker) appears as the input parameter (IN) and is
also included in the FROM clause.
TDV supports the introspection of data sources that contain stored procedures
written in SQL. Procedures stored in data sources are referred to as physical stored
procedures.
Note: In this document, the terms “physical stored procedure” and “stored
procedure” are used interchangeably.
Physical stored procedures have certain capabilities and limitations:
• They can have scalar parameters, but the direction of scalar parameters might
not always be detected.
• They can return one or more cursors that represent tabular output.
• Array-type parameters are not allowed in published stored procedures.
• If they are in a physical data source, you can manually specify cardinality for
them. See Creating Cardinality Statistics on a Data Source, page 568.
• A stored procedure that returns only one cursor can be used within a view.
• A stored procedure that returns multiple cursors or scalar parameters cannot
be included in a view.
• TDV supports the introspection of some stored procedures for some data
sources.
• Cursors are not introspected for DB2, SQL Server, and Sybase databases.
• In Oracle data sources, the cursor type is introspected as type OTHER. If you
do not change the cursor’s signature but execute the Oracle stored procedure,
cursor values are interpreted as binary data. However, if you edit the stored
procedure and define the cursor’s signature, its output is displayed in tabular
form.
To edit stored procedures and define cursor signatures, see:
• Editing a Stored Procedure in an Introspected Data Source, page 299
• Editing a Stored Procedure in Microsoft SQL Server, page 302
• Editing a Stored Procedure in DB2, page 302
7. Add more child parameter names and types to cursorParam one at a time, in
the following order, and specify each parameter’s type as noted.
ord_num (type NUMERIC)
qty (type SMALLINT)
title_id (type CHAR)
discount (type DOUBLE)
price (type DECIMAL)
total (type DOUBLE)
4. Click Design By Example on the toolbar, and use the scroll bar in the Design
By Example window to view the two cursors.
Use the expand/collapse symbol on the left of the parameter name to
expand/collapse cursor display.
5. Click OK to include both the cursors for projection.
6. Save the stored procedure.
For details on executing stored procedures, see Executing a Stored Procedure,
page 307.
Rebinding a Procedure
A procedure depends on one or more underlying sources, and the procedure is
considered “bound” to those underlying sources.
The following procedure bindings are available:
• A SQL script can be bound to a table or a procedure.
• A transformation can be bound to hierarchical data (XML or WSDL).
• A packaged query can be bound to a data source.
Rebinding a procedure is useful:
• If after creating a procedure you decide to rebind the procedure to different
sources.
• If an execution error occurs because a source no longer exists and you want to
rebind the procedure with a new source.
If you modify a procedure or its dependencies, the binding is automatically
updated when you save the script, and the Rebind panel reflects the change. For
example, if your procedure has a data source dependency named ds_orders, and
you rename it ds_orders_renamed and save the script, the Rebind panel lists the
dependency as ds_orders_renamed.
If you call this procedure for the first time in a transaction with a given input
value, the procedure runs as usual. If you call the same procedure again with the
same input value in the same transaction, Studio returns the results of the
previous call instead of re-executing the procedure.
For this example, supply a value for the defined input parameter,
desiredProduct.
3. Click OK.
When the parameterized query is executed, the product name, ID and
description are displayed.
4. In the Result For result panel:
a. To view the next fifty rows in a large data set, click Next.
b. To view the details of a specific row in a Result Details dialog, select the
row and click Details. To see details for an adjacent result, click Prev or
Next. For some extremely large results, such as those generated by using
EXTEND on an array to generate BIGINT number values, some of the data
might be truncated without a visual indication by Studio.
c. To save the results, use Save to File.
d. To clear what is displayed in the tab, click Clear.
Note: Result for result appears on the tab of the Result panel because result
was the name used for the output parameter in the query. If you assign a
different name to the output cursor, such as productFound, the tab would
show Result for productFound.
2. Write and compile the Java code, and save the class file in a directory.
See Sample Code Using MultipleCursors, page 308.
3. From the directory where you saved the class file, run the following
command:
java -classpath <path_to_csjdbc.jar_file>/csjdbc.jar\;.
<MultipleCursors>
MultipleCursors is the name of the class file, and csjdbc.jar is the name of the
JAR file containing the class file.
"jdbc:compositesw:dbapi@localhost:9401?domain=composite&dataSource
=ds";
"cs.jdbc.driver.CompositeDriver";
try {
executeProcedure();
} catch (SQLException ex) {
ex.printStackTrace();
return;
}
}
printResultSet((ResultSet)stmt.getObject(1));
printResultSet((ResultSet)stmt.getObject(2));
stmt.close();
conn.close();
}
You can use Studio or you can use the command line to complete the steps. These
steps help make use of database procedures that have cursor outputs that might
not be detectable by TDV until after the procedure has been executed by TDV.
These steps are helpful if you have Oracle procedures with cursor outputs that
you want to use in further data customizations within TDV.
• Provide the path to the procedure with the output cursors that you want TDV
to execute.
• Provide valid input values for the procedure that you want TDV to execute.
• Set commitChanges to TRUE to save the output cursor metadata to the TDV
repository for the procedure.
Using Transformations
This topic describes how to design transformations that are saved as procedures
to query and manipulate data stored in a data source.
The data for modeling can come from a mix of relational (tabular) and
hierarchical (such as XML) sources. Having tools to help you transform your data
is helpful when designing your data virtualization layer.
This section assumes that you understand XML, XQuery, and WSDL.
• Studio Transformation Types, page 312
• Creating an XML, XSLT, or Streaming Transformation, page 314
• Creating an XQuery Transformation, page 320
• Upgrading XML to Tabular Mapping Transforms, page 329
• Executing a Transformation, page 331
• Converting XML to JSON using Design By Example, page 332
• JSON XML List Formatting Example, page 333
Information about the Any-Any Transformation Editor is in:
• Using the Any-Any Transformation Editor, page 335
Type Description
Any-Any Transformation The Transformation Editor allows you to transform data from any type
of data source structure to any type of target.
The transformation of the data will happen in real time. Where
possible the queries are pushed to the data source for execution. Data
will be streamed, if possible, from the source to the target.
XML to Tabular Mapping This transformation accepts an XML or WSDL source as input, and
generates a tabular mapping to the elements in the source schema.
The mapping can be thought of as producing a set of rows, each of
which represents a node in the original XML document.
Type Description
XSLT Transformation The XSLT transformation lets you define how the XML data should be
transformed using a graphical editor in Studio. It accepts an XML or
WSDL source as input. Complex transformations are possible by
writing custom XSLT. The only requirement is that the XSLT must
produce tabular data with the following structure:
<results>
<result>
<column_one>a</column_one>
<column_two>b</column_two>
<column_three>c</column_three>
</result>
<result>
<column_one>d</column_one>
<column_two>e</column_two>
<column_three>f</column_three>
</result>
</results>
Streaming Transformation The Streaming transformation lets you define how the XML data
should be transformed using a graphical editor in Studio. This
transformation is useful for transforming a large amount of XML data.
This transformation does not require the entire XML source document
to be realized in memory.
XQuery Transformation The term XQuery stands for XML Query, a query that returns results in
the form of an XML document.
An XQuery transformation is a TDV resource that reads data from
related sources, such as tables, views, and procedures, and assembles
that data into an XML document that conforms to a user-specified
schema called the target schema. An XQuery transformation looks and
acts like any other procedure in the TDV system.
Underlying an XQuery transformation is an XQuery procedure that is
run by an XQuery processor.
The target schema for the output XML document can be created or
edited in the definition set editor.
Studio also provides a graphical editor for mapping the target schema
and tabular sources that are to supply the data for the XQuery in the
transformation.
4. Locate and select an appropriate XML source for the transformation in the
displayed tree.
5. Type a name for the transformation in the Transformation Name field.
6. Click Finish.
When you click Finish, the transformation is added to the resource tree, and
the editor for the transformation opens in the right pane.
If the transformation is of the type XML to Tabular Mapping, it is ready to be used
like any other procedure. In other types of transformations—Streaming and
XSLT—the source elements and target output columns need to be mapped. (See
Mapping Source Data to Target Output Columns (XSLT or Streaming), page 316.)
To manually create target columns and map them to source XML data
1. If necessary, open the transformation.
The transformation editor displays the Data Map panel.
2. If necessary, expand the nodes in the Source column to display all XML source
definitions.
3. In the Target column, select the outputs node or an existing column, click Add
in the toolbar, and select the data type for the output column from the
drop-down list.
An output column with a default name is created. You can create as many
target columns as you need.
4. Connect a source and a target by selecting them individually and clicking the
Create Link button.
5. Save your edits.
To automatically create target columns and map them to source XML data
1. If necessary, open the transformation.
The transformation editor displays the Data Map panel.
2. If necessary, expand the nodes in the Source column to display all XML source
definitions.
3. Select the source in the Source column.
4. Click the Create Link And Target button.
A target with the same name and data type of the source is created in the
Target column, and a link is also created between the source and the target.
Selecting multiple sources and clicking Create Link And Target creates a
separate target and link for each source. Selecting a source and clicking Create
Link And Target several times creates that many targets and links pointing to
the same source.
5. Save your edits.
To add target output columns through the XSLT and Outputs panels
1. If necessary, open the transformation.
2. Edit the XSLT in the XSLT panel, as needed.
If you edit the XSLT in the XSLT panel, the Data Map panel is permanently
disabled, but the Outputs panel becomes editable (Change Type, Delete and
Rename on a context menu) and the Add button in the Outputs panel toolbar
is enabled.
3. Optionally, on the Outputs panel add target output columns, the same way
you would design parameters for a SQL script on the Parameters panel.
4. Update the XSLT in the XSLT panel to correspond to any output elements you
added.
Tabs open in the lower part of the editor showing the settings for a source: a
join XPath expression (Join tab), a filter XPath expression (Filter tab), a sort
order (Sort Order tab), or an input parameter (Inputs tab) as in a procedure.
The Schema tab shows the schema for the selected source.
5. Specify a join relation between the current source and one of its direct
ancestors in the Join panel.
For example, PurchaseOrderID = $INV/PurchaseOrderID means that
PurchaseOrderID in PurchaseOrder is joined to PurchaseOrderID in the
ancestor InventoryTransactions.
6. Specify a filter on the current source (the equivalent of a WHERE clause in a
SELECT statement) in the Filter panel.
8. Specify the values for a source’s inputs if the source is a procedure with input
parameters in the Inputs panel.
9. Click the icon in the Source Settings column at any time to view or edit the
setting.
The Source column (available for tables) lets you specify the tabular sources that
provide data for the resulting XML document. Each source specified in this
column corresponds to a top-level element (non-leaf node) in the target schema.
Resources specified in the Source column exist in a “scope” relative to the target
XML document to which they provide data. At a particular location in the
document, the source scope is defined as:
• The current resource (table, view, or procedure)
• All the resources that are directly above this resource in the document (direct
ancestors)
• The input to the XQuery
The scope defines what resources are available to the value expression.
The scope for Transaction is its own source, the inventorytransactions table, and
the global input SupplierName.
The scope for PurchaseOrder is its own source, the purchaseorders table, its direct
ancestor, the inventorytransactions table, and the global input SupplierName
(line #2).
The scope for Supplier is its own source, the suppliers table, its direct ancestors,
the purchaseorders and inventorytransactions tables, and the global input
SupplierName (line #3).
The scope for Product is its own source the LookupProduct procedure, its direct
ancestor inventorytransactions table, and the global input SupplierName (line
#4). The sources (purchaseorders and suppliers) in the scope of its sibling or peer
resource (PurchaseOrder) are excluded from the scope of Product.
The Source Alias column (available for tables) lets you specify aliases for the
sources in the Source column. The aliases are used when join conditions, filters,
and inputs are supplied for a source value. This column is populated
automatically when a resource is added to the Source column, but it can also be
edited as a text field.
The Source Settings column (available for tables) displays icons indicating that
certain settings have been specified for the corresponding source. When clicked,
these icons display the corresponding settings in the lower section of the editor.
For example, you can click the Schema icon to display the schema for the
corresponding source.
Multiple input parameters are shown at the end of the drop-down list, in the
order in which they were defined in the Inputs panel.
To specify the sources, values, and other settings for target sources in the
XQuery transformation
1. In the Model panel, specify the sources, target values, and source aliases to
provide the data and constraints for the output XML document.
In the current example, the top-level elements in the Target column are
Inventory Transactions, Transaction, PurchaseOrder, Supplier, and Product.
You can specify a source for each of these elements.
2. To specify a source:
— Double-click the Source cell corresponding to the top-level element in the
Target column.
Or
— Select the top-level element in the Target column and click the Choose
Source toolbar button.
3. In the window that opens, select the source table and click OK.
4. If you want to change the automatically assigned alias, type a different alias in
the Source Alias column for the source you just added.
The aliases are used when join conditions, filters, and inputs are supplied for a
source value.
5. To specify a value for a target, click the Target Value cell corresponding to the
target in the Target column, and select a value from the drop-down list, as
follows.
For a target value, you can supply a literal value or a system function to be
evaluated at runtime. Literal values must be enclosed in single or double
quotes.
6. To specify a source setting, click the appropriate icon in the Source Settings
column, or click the Show Source Settings toolbar icon to display the source
settings panel in the lower section if it is not already visible. Then open the
Join, Filter, Sort Order, or Inputs panel and supply the settings.
PurchaseOrderID = $INV/PurchaseOrderID (in the Join panel) means that
PurchaseOrderID in PurchaseOrder is joined to PurchaseOrderID in the
ancestor InventoryTransactions. The syntax for the value expression
$INV/PurchaseOrderID contains a reference to the alias (INV) for the parent
source inventorytransactions preceded by the dollar sign ($).
7. To supply a value for an input parameter, if the source is a procedure that has
input parameters, use the Inputs panel (in the source settings panel).
a. Click the Inputs tab to open the Inputs panel.
b. Click the Inputs icon in the Source Settings column.
In the Inputs panel, the names of the inputs are displayed, and the Is NULL
check box is selected by default.
c. Click the row in the Value column, and select an appropriate value for the
input parameter in the drop-down list.
If you have existing XML to Tabular Mapping transforms that you would like to
update to be compatible and editable using the Any-Any Transformation Editor,
you can follow the steps in this section to accomplish that. There are two different
upgrade paths:
• Upgrading and Creating a Backup of an XML to Tabular Mapping Transform,
page 330
• Creating an Upgraded XML to Tabular Mapping Transform, page 330
5. Select the top radio button, accept or rename the transform to a name that will
be easy to understand later.
6. Click OK.
The Studio navigation tree is updated. Your transform is converted and a new
backup of the old XML to Tabular Mapping transform is added to the tree.
5. Select the bottom radio button, and accept or rename the transform to a name
that will be easy to understand later.
6. Click OK.
7. Click Refresh All to refresh the Studio navigation tree with the upgraded
transform.
8. Optionally, open the new transform and edit it for your current needs.
Executing a Transformation
To execute a transformation
1. In Studio, open the transformation.
2. Click the Execute toolbar button.
If the standard transform methods do not work for you for some reason, you
might be able to convert XML to JSON using the Design By Example
functionality that exists within TDV.
Formatting lists using JSON can be done in many ways. Typically, they are done
using a nested element in XML. The following example gives one option. There
are many others. Consulting a good JSON reference can help you determine how
best to format the list that you need to create.
"https://www.facebook.com/sacramentosports?ref=br_tf": {
"about": "Welcome to the official Sacramento Sport Facebook Page!",
"affiliation": "National Basketball Association, Western
Conference, Pacific Division",
"category": "Professional sports team",
"category_list": [
{ "id": "109976259083543", "name": "Sports Venue & Stadium" }
,
{ "id": "189018581118681", "name": "Sports Club" }
],
"checkins": 7149,
"is_published": true,
"location":
{ "street": "One Sports Way", "city": "Sacramento", "state": "CA",
"country": "United States", "zip": "95834", "latitude":
38.64906488403, "longitude": -121.51807577132 }
Before using the Transformation Editor there are a few concepts that you can
familiarize yourself with. This topic also covers the basic usage of the
Transformation Editor.
• Transformation Editor Concepts, page 335
• Using the Transformation Editor, page 340
• Caching Transform Data, page 375
Tutorials that walk you through the creation of several different types of Any-Any
transforms can be found in the TDV Tutorials Guide.
The Transformation Editor is an editor within TDV that can be used to define the
transformation rules for the source data sets. It can transform data from any
source structure to any target structure. The transformation of the data happens in
real time as data is streamed, if possible, from the source to the target. The source
and target are not persistent structures.
A transformation is a procedure in TDV and it can be used by other resources in
TDV as a procedure. When caching is enabled on a transform, it is cached in the
same way as any other procedure in TDV. It produces XQuery code that is used to
execute the transformation. The XQuery code is not directly editable.
The Transformation Editor can map and transform data to and from the
following:
• Relational (Cursors)
• XML (Hierarchies)
• Scalar (Procedures)
You can also browse to and drag existing resources defined in Studio into the
Transformation Editor. Concepts in other sections under this topic include:
• The Transformation Editor Window, page 336
• Transformation Editor Terminology, page 337
• About Transformation Editor Operations and Parameter Containers, page 337
• Transformation Editor Limitations, page 338
The Transformation Editor window consists of a palette and a workspace. On the workspace you work with parameter containers, parameters, operations, and operation connectors.
Name Description
Cast Use to convert one data type to another data type.
in Container for the input parameters for the transform. This container can be
empty and not connected to any object on the workspace.
out Container for the transformation outputs. This container cannot be empty.
Query Use to join, filter, and group data.
Resource Use to add a table, view, XML file, SQL script, or other valid TDV resource to the
Transformation editor.
Union Use to combine the result sets of two or more items in your transform. All the
source inputs must have the same data type. Unions can be used to remove
duplicate rows between the items that are input.
not have the associated definition set, you can try re-introspecting or
recreating them to get it.
— XML unions (element-based) are not supported. However, XML choices are
supported and appear as a canonical Union type within the Transformation
Editor.
• Code Generation
— Query and Join hints are only applied to queries that will be optimized into
SQL.
— SQL push optimizations are only applied to query operations and the
sources they directly depend upon.
— There is no indicator of whether a query will be generated as XQuery or
SQL. You need to look at the generated source code to see whether it was
optimized. In general, if a query's sources are relational, SQL is generated.
— No runtime schema validation is performed.
• Other Limitations
— There is no technique to perform element selection. Transformations that
make decisions based on element names are not supported.
— Cross joins are not supported.
— There is no way to handle case-sensitivity mismatches.
— There is no literal format for durations.
— The cast operation is not supported for casting to a complex structure. If
you need to cast the children of a structure, you must assign the structure
field by field and insert casts as appropriate for each of the fields. If the
structure is a sequence, you can insert a loop operation into your mapping
to iterate over all the elements and then provide the individual assignments
(with appropriate casting) to the target structure.
— There are several SQL functions that do not work well. These include:
CAST, CORR, COVAR_SAMP, STDDEV, STDDEV_POP, STDDEV_SAMP,
SUM_FLOAT, VAR_POP, VAR_SAMP, VARIANCE, VARIANCE_POP,
VARIANCE_SAMP, XMLATTRIBUTES, XMLFOREST, XMLQUERY
— Creating placeholders for intermediate values is not allowed. The
workaround is to invoke a child transformation and use the outputs of that
transformation as a reference for the intermediate values you want to reuse.
This topic covers basic usage and some common use cases for the Transformation
Editor. Transforms can be used to convert data from one type of structure to
another, to insert calculations, and to otherwise change the data in ways that are
typical for data processing and analysis.
Data can be transformed between relational and hierarchical structures.
This section describes how to perform basic actions within the Transformation
Editor:
• Creating a New Transform, page 341
• Undoing and Redoing Actions in the Transformation Editor, page 342
• Zooming and Arranging the Model, page 342
• Editing Operations in the Transformation Editor Model, page 343
• Adding Resources, page 345
• Working with Connections and Loops, page 345
• Working with Operations, page 355
• Working with Operation and Parameter Container Details, page 362
• Deleting Operations in the Transformation Editor, page 368
• Working with Messages, page 369
• Editing Parameters on the Parameters Tab, page 371
• Viewing and Copying the XQuery Code From Your Transform, page 374
• Validating the Transform XQuery Code Using Studio, page 374
• Rebinding Transformation Operations, page 372
• Working with the Transformation Editor XQuery Tab, page 373
Use-case tutorials that describe using the Transformation Editor to combine the
various elements to create a procedure are included in the TDV Tutorials Guide.
— Zoom In
— Zoom Out
— Zoom to Fit Contents
— Full Screen
Your mouse scroll wheel can also be used to zoom in and out.
3. Continue working with the model.
• Expressions: When writing expressions, use the inputs of the operation that
contains the expression.
• Loops (Loop Iterator Editor): Provides the name given to the loop. You can add
levels, define item filters, and use the arrows to arrange the nesting of the levels.
• Queries (Query Editor): Provides a complex visual modeling tool with three tabs
to help you define the query used within your transform.
• Resources (Invoke Resource Operation Editor): Provides the name of the resource,
the path to the resource in Studio, and a button that can be used to navigate
through a Studio resource tree of valid resources for selection.
• Switches (Switch Case Editor): You can use the Switch operation with one or more
pairs of expressions. The Switch function evaluates each pair of expressions and
returns the result corresponding to the first boolean expression in the list that
evaluates to True.
• Cast (Choose Data Type): You can use the CAST function to convert a value from
one data type to another.
3. Use Close, OK, or other mechanisms to close the editors when you are
finished editing.
Adding Resources
The Transformation Editor can map and transform data to and from relational,
XML, or scalar data structures.
You can browse to and drag existing resources defined in TDV into the
Transformation Editor.
Note: XQuery and XSLT procedures cannot be used as resources in the
Any-Any Transformation Editor.
2. Click an operation handle, and while holding your mouse button down, drag
the line to the row you want to connect it to.
3. Save your transform.
2. Click an operation handle, and while holding your mouse button down, drag
the line to the row you want to connect it to.
3. Save your transform.
A new box is drawn on the model to represent the loop. Use this new
operation to define the behavior of the loop logic.
TDV maps the two structures according to the names that match. For example:
TDV maps the two structures according to the names that match and adds any
extra parameters to the target of the auto map. For example:
'\\', '[', ']', '-', '(', ')', '.','^','$','+', '*','?', '{', '}',
'|', ':', '<', '>', '<', '='
Depending on what iterator rows you have connected, the loop editor might
have one or more iterator levels already defined.
2. To add a new iterator, click the green circle with a plus symbol in it.
The Name is the name given to the iteration level of the loop; it matches the
name of the source input of the loop. The Level indicates the nesting level
associated with an iterator.
3. To define the filter logic, type text into the Item Filter Field. The item filter is
processed every iteration.
Value  Description
True  If the item filter evaluates to true, then the current loop entry is used to generate a target sequence item. If the item filter is always true, the target sequence contains as many items as the source sequence.
Complex Expressions  If you provide a more complex expression, output is constrained to data that meets the criteria specified. For example, for totalPrice > 50.0, the loop only outputs entries where the total price is greater than 50. Expressions can refer only to the input of the loop.
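For instance, a slightly richer item filter could combine conditions with the
supported comparison and boolean operators (the column names here are
hypothetical, not part of the tutorial data):
totalPrice > 50.0 AND quantity <= 10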
4. Use the arrows to the right of the Item Filter fields to arrange the Filter Name
fields in the order that makes sense for the given loop logic that you need to
define.
5. Click Close to save your definitions and close the Loop Iterator Editor.
6. Save your transform.
6. Select the data type that you want your data to become.
7. Specify any additional data type details.
8. Click OK.
9. Save your transform.
About Expressions
An expression is a combination of one or more values, operators, and functions
that resolve to a value, like a formula. You can use expressions to return a specific
set of data. The Transformation Editor can accommodate expressions in queries,
loops, switches, and in the stand-alone Expression operation.
When writing expressions, use the inputs of the operation that contains the
expression. For example, if the operation has an input column named EMP, then
when you are defining the expression for that operation you can use EMP as a
symbol within the expression.
Expression Components  Description
Operators  Expression operators are used to compute values, for example, with + or -. Operators take one or two arguments. The supported operators are: NOT, AND, OR, <=, >, <, <>, >=, =, +, *, -, /, and function calls.
Names  Examples: "customer", "{http://biz.com}customer", "biz:customer"
Paths Paths can be used to refer to hierarchical elements of input parameters.
Paths are names separated by slashes (/).
Function Calls The Transformation Editor makes use of the following categories of
functions:
• Canonical—A function type that can be used regardless of the target
language.
• SQL—A function type that can be used only within SQL code.
Queries can be generated into SQL. Use the sql: prefix to specify a SQL
function in an expression.
• XQuery—A function type that can be used only within XQuery code.
Most operators are generated into XQuery. Use the xquery: prefix to
specify an XQuery function in an expression.
• Custom—A function type that is defined as custom within TDV.
Use the custom: prefix to specify a custom function in an expression.
The following canonical functions are available for use within expressions:
• Aggregate: AVG, MIN, MAX, SUM, COUNT
• Character: CONCAT, SUBSTRING, UPPER, LOWER, LENGTH,
TRANSLATE, REPLACE, MATCHES, CHARACTER_LENGTH
• Numeric: ABS, CEIL, FLOOR, ROUND
• Date: CURRENT_DATE, CURRENT_TIME
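For instance, a brief expression that combines canonical functions with a path
into an input (the hierarchical input named customer and its firstName and
lastName children are assumptions used only for illustration):
CONCAT(UPPER(customer/lastName), customer/firstName)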
When creating a custom function, the Transformation Editor follows the same
rules as TDV SQL. You create a procedure with one output and promote it as a
custom function within TDV. Using it within a transform requires the "custom"
prefix. For example, if you made a custom function called "amazing", you would
invoke it within a transform expression using "custom:amazing(x,y,z)".
XQuery functions must be invoked with the "xquery" prefix. For example, to
invoke the XQuery concat function, you would use "xquery:concat()". XPath
expressions are not supported.
Transform expressions don't allow dashes within symbol names. Replace dashes
(-) with underscores (_).
of the resources that have been added. For more information on similar
functionality, see the View editor documentation in the TDV User’s Guide.
8. Select the data type that you want your data to become.
9. Specify any additional data type details.
10. Click OK.
11. Connect the result side of the Cast operation with the column where the data
will be consumed next.
12. Save your transform.
The following model diagram shows a transform using Cast operations:
4. Connect at least one parameter from somewhere else in your transform to the
Switch operation.
5. Connect the Switch operation’s results to other operations within your
transformation.
6. (Optional) Add more source parameters by:
a. Click the Switch operation to select it.
b. Right-click and select Add Source or Add Parameter Insert from the
menu.
7. (Optional) Rename source or result parameters by:
a. Select the operation handle of the parameter you want to rename.
b. Right-click and select Rename “<parm_name>”.
Or, click Rename from the Transformation Editor Model tab toolbar.
c. Type the new name.
8. Define the conditions on the switch. Each condition is called a Case. Each
Case is evaluated in the order specified and the first case to evaluate to true is
returned in the result.
2. Review and modify the values for Min and Max occurrence.
3. Determine the setting you want specified for Nullable:
— Unknown
— True
— False
4. Click OK.
If the min and max values were changed, they are displayed next to the parameter
name.
3. Click Import Paths to obtain a selection window that you can use to navigate
through a list of valid Studio resources available to add as an imported
definition set.
4. Use the Select Resources dialog to select the additional definition set that you
want added to the Imported Paths list.
5. Click OK.
6. Determine if the order of the list should be changed and if any included paths
need deletion.
— To reorder paths, select a path and use the up and down arrows.
— To delete a path from the list, select the path and click the red X.
7. Close the Edit Imported Definition Sets dialog.
8. Save your transform.
The XML Namespace represents the root URL path that is used for XML-based
names. The Namespace is augmented with the specific names of things that are
then created.
For example, http://masterschool.com/transform could be the root path
assigned to your specific business_school_transform, which would result in the
following:
http://masterschool.com/transform/business_school_transform
3. To add a new prefix, use the green arrow button to create a new Namespace
prefix.
4. To edit an existing prefix or namespace, double-click the cell that you want to
edit and begin typing to edit the text.
If you changed the text in the Prefix column and that name had been shown as
part of your transform, the new name should now appear in your transform
on the Model tab.
If you changed the text in the Namespace column, the actual namespace that
is used when the transform is executed is changed.
5. Click OK to save your changes.
Viewing Messages
To view messages
1. Open your transform.
2. Click Show Messages on the toolbar.
The messages pane opens in the split panel and displays messages for the
transform you are editing. In the following example, the messages button was
selected while on the Parameters tab, but messages can be viewed from other
portions of the Transformation Editor.
To view messages
1. Open your transform.
2. Click Show Messages on the toolbar.
3. Select a message in the message tab.
Use the up and down arrow buttons to move up or down the error message
list and change the focus in the Model tab.
The following graphic shows several parameters that exist for the in and out
containers.
To rebind a transform
1. Open the transform that appears to need rebinding.
Typically, Studio knows that a rebind issue has occurred and displays the Rebind
tab for the transform.
2. If you need to modify a data source or other resource to point to the location
of the data on your machine, do so.
3. Click the Rebind button.
4. Navigate to the valid location for the resource in your Studio tree.
5. Click OK.
The Transformation Editor’s Caching tab gives you access to set up several of the
standard TDV caching features. Because TDV treats transforms as if they were
procedures, the caching features that you can enable are the features that are
generally supported for other TDV procedure resources. For information on
caching with TDV, see TDV Caching, page 465.
The procedure caching process uses one storage table for each output cursor and
an additional storage table for any scalar outputs. For example, a procedure with
two INTEGER outputs and two CURSOR outputs would use three tables:
• One for the pair of scalars
• One for the first cursor
• One for the second cursor
If a procedure has input parameters, the cached results are tracked separately for
each unique set of input values. Each unique set of input parameter values is
called a variant.
A procedure cache that uses non-null input parameters must be seeded with at
least one variant from a client application other than Studio, for the Cache Status
to change from NOT LOADED to UP. Using the Refresh Now button does not
change the status of a cache that is not loaded. Even if procedure caching
configuration is correct, the status does not show that it is loaded until a client
seeds the cache.
When a procedure cache is refreshed, all the cached variants already in the table
are refreshed. If no variants have been cached yet, then either nothing is refreshed
or only the null-input variant is refreshed. You can refresh a procedure cache from
Studio or using the RefreshResourceCache procedure. (See the TDV Application
Programming Interfaces Guide.)
Definition Sets
This topic describes definition sets including what they are, how to define them,
and how to use them in TDV.
These topics are covered:
• About Definition Sets, page 377
• Creating a Definition Set, page 378
• Using a SQL Definition Set, page 385
• Using an XML Definition Set, page 386
• Using a WSDL Definition Set, page 387
A definition set is a set of definitions you create that can be shared by other TDV
resources. A definition set includes SQL data types, XML types, XML elements,
exceptions, and constants. You can refer to the resources in a definition set from
any number of procedures or views. The changes you make in a definition set are
propagated to all the procedures or views using the definition set. You cannot
publish a definition set. The definition sets you create are reusable and can be
shared with any user with appropriate access privileges in the system.
TDV supports three types of definition sets:
• XML-based
• SQL-based
• WSDL-based
TDV supports these definition sets with the following data types and arrays of
those types:
BINARY, BLOB, VARBINARY, DECIMAL, DOUBLE, FLOAT, NUMERIC, BIGINT,
BIT, INTEGER, SMALLINT, TINYINT, CHAR, CLOB, LONGVARCHAR, VARCHAR,
DATE, TIME, TIMESTAMP, CURSOR, ROW, XML
Studio has an editor for building and editing definition sets. The definition set
editor opens automatically when you create a definition set. For details, see
Creating a Definition Set, page 378. At any time, you can open the definition set in
the editor by double-clicking the definition set in the resource tree.
You can create a definition set anywhere in the Studio resource tree except within
Data Services. For other details, see Locating a Container for a TDV Resource,
page 40.
This section describes how to create SQL, XML, or WSDL definition sets. An XML
definition set can be created from either a schema or a raw XML file instance.
Studio adds a definition of the specified type, giving it the default name
NewTypeDefinition. For example, if you select String > CHAR, the new
definition would look like this:
b. Rename the newly added type definition by clicking in the Name field and
typing the new name. Or, you can right-click and select Rename from the
menu.
c. If the type is Complex and you have created multiple xmlElements, you
can move or indent a type definition by using the navigation arrows in the
editor’s toolbar.
d. Repeat to create as many type definitions as you need.
e. Save the definition set.
4. Optionally, add an existing definition set from the Studio resource tree by
following these steps:
a. Click the Add button and select Browse from the menu.
6. To add constants, select the Constants tab, click the Add button, and add
constants the same way you added types and exceptions.
The Constants panel with an example constant is shown below.
You can now reference the XML definition set and all of the XML types defined in
the definition set from stored procedures, XQuery procedures, and XSLT
procedures. See Using an XML Definition Set, page 386.
You can now reference the WSDL definition set and all of the XML types defined
in the definition set from stored procedures, XQuery procedures, and XSLT
procedures. See Using a WSDL Definition Set, page 387.
You can reference a SQL definition set and all of the data types defined in the
definition set from SQL scripts, stored procedures, XQuery procedures, and XSLT
procedures.
The procedure described here refers to the data types defined in a SQL definition
set named mySQLDef:
lastname CHAR(255)
The procedure used in this section uses an XML definition set named
XMLLib_Composite, which has the following XML schema and elements:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema elementFormDefault="qualified"
  targetNamespace="http://www.compositesw.com/services/webservices/system/admin/resource"
  xmlns:ns="http://www.compositesw.com/services/webservices/system/admin/resource"
  xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="Location">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>
  <xs:complexType name="Address">
    <xs:sequence>
      <xs:element name="street" type="ns:Location"/>
      <xs:element name="number" type="xs:integer"/>
      <xs:element name="suite" type="xs:integer"/>
      <xs:element name="phone" type="xs:integer"/>
      <xs:element name="fax" type="xs:integer"/>
    </xs:sequence>
  </xs:complexType>
  <xs:element name="CompositeAddress" type="ns:Address"/>
</xs:schema>
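For reference, a complete instance element that conforms to the Address type
defined above might look like the following (the values are illustrative only):
<ns:CompositeAddress xmlns:ns="http://www.compositesw.com/services/webservices/system/admin/resource">
  <ns:street>Campus Drive</ns:street>
  <ns:number>2655</ns:number>
  <ns:suite>200</ns:suite>
  <ns:phone>6502778200</ns:phone>
  <ns:fax>6502278199</ns:fax>
</ns:CompositeAddress>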
The procedure described here supplies values to the elements in the XML schema
in the definition set XMLLib_Composite (page 389).
<ns:number>2655</ns:number>
<ns:street>Campus Drive</ns:street>
<ns:suite>Suite 200</ns:suite>
<ns:city>San Mateo</ns:city>
<ns:zip>94403</ns:zip>
<ns:phone>650-277-8200</ns:phone>
<ns:fax>650-227-8199</ns:fax>
</ns:CompositeAddress>';
END
You can use a WSDL definition set in a procedure or a view. The example here
shows how a WSDL definition set can be used in a procedure.
For example, if you reference the wsdltest WSDL definition set shown above,
you would reference it in a procedure as shown in this example, where it is
used for both input and output parameters.
Note: The type elements defined in the wsdltest definition set like
TradePriceRequest and TradePrice are also referenced.
Triggers
This topic describes how to define triggers that cause an action to be performed
under specified conditions.
• About Triggers, page 391
• Creating a JMS Event Trigger, page 396
• Creating a System Event Trigger, page 397
• Creating a Timer Event Trigger, page 399
• Creating a User-Defined Event Trigger, page 402
• Creating Email Alerts, page 403
About Triggers
• Views and procedures can be executed and the results can be sent to one or
more recipients.
ClusterServerShunned Generated when a cluster node fails to send a heartbeat signal to the
cluster timekeeper and has been shunned (temporarily excluded) from
the cluster.
FailedLoginSpike Any ten failed sign-in attempts within a ten-minute period generate a
FailedLoginSpike event.
MetricRestoreFailure Generated when an attempt to restore the metrics fails.
ResourceLock Generated when resource locks are set on a resource so that only one
person can modify the locked set.
ResourceUnlock Generated when a resource is unlocked so that any user with sufficient
rights and privileges can edit it.
Making a trigger based on a system event does not enable a disabled event.
All system events are enabled by default, but if they are disabled they require
a manual server configuration change to enable them. For more information,
see the TDV Administration Guide.
Triggers cannot enable a disabled data source.
• Timer Event—Lets you schedule periodic timed trigger events. Actions that
can be timed to start are: executing a procedure, gathering relevant data from
a data source for query processing, re-introspecting a data source, or sending
email with the results of a procedure/view execution.
To set a timer event, you define the start time, optionally specify a period or
window of daily or periodic activity, and set the frequency of the trigger
refresh to get the wanted periodicity. Triggers can be set to begin operation in
the future and end after a predefined activity period.
See Action Pane, page 672 for more information about the options for each
condition type.
Field Description
Connector The name of your JMS connector (containing the configuration for initial context
factory, JNDI Provider URL, and Queue Connection Factory) as it appears on the
CONNECTOR MANAGEMENT page in Web Manager.
Selector A JMS message selector allows a client to specify, by header field references and
property references, the messages it is interested in. Only messages whose header
and property values match the selector are delivered. For example, a selector
condition using SQL WHERE clause syntax might look like: 'uname'='admin'
If the selector value is an empty string, it is treated as a null and indicates that
there is no message selector for the message consumer.
7. Specify the action type and the options for the action as described in Action
Types for a Trigger, page 395 and Action Pane, page 672.
8. On the trigger Info tab, specify the trigger properties. See Getting Information
about a Resource, page 42 for all fields except Maximum Number In Queue,
which is described below and works differently for JMS triggers:
Maximum Number In Queue—For JMS triggers, this field is always set to 0
by the TDV server and you cannot change it. This is done to indicate that each
JMS trigger can hold at most one message in memory at a time until it has
been fully processed before receiving another pending message from the JMS
server. That is, no message queuing is done inside TDV.
9. Save the trigger. If the Save option is not enabled, select File > Refresh All to
refresh the resources and enable the Save option.
There are many system events that can be set as conditions for a trigger action.
The APIs—GetEnvironment and SetEnvironment—that are available in
/lib/util/ can be called at any time to listen for a system-generated event.
These steps describe how to create a sample procedure, create the system event
trigger, and then test the trigger.
END||': '||CASE
WHEN (tname IS NULL) THEN 'NULL'
ELSE tname
END||' = '||CASE
WHEN (tvalue IS NULL) THEN 'NULL'
ELSE tvalue
END;
CALL Log(result);
END
When this procedure is executed by a system event, the results are written out
to the log.
Field Description
Execute Procedure To see how it works, specify the path to the procedure
CallsGetEnvironment in the Action section. Select Exhaust output
cursors, if the procedure is a SQL script that contains a PIPE cursor.
Reintrospect Data Source To see how it works, select the data source
/shared/examples/ds_orders as the data source to be re-introspected.
Send E-mail To see how it works, specify the path to the procedure
CallsGetEnvironment in the Action section.
7. In the Action section, enter the options for the action type. The options are
described in the section Action Types for a Trigger, page 395 and Action Pane,
page 672.
For example, if you want email notifications sent, type and confirm the email
addresses of the recipients to receive notification, as needed. You can specify a
list of email addresses in the address fields by separating each one with a
comma (,) or semicolon (;).
8. Click the Info tab.
9. Set the Maximum Number In Queue field to the maximum number of
times this trigger is to be fired if multiple periods have elapsed while a
recurrence restriction prevents the trigger from firing. Most triggers should
fire only once within the next available active operation window, but there
might be cases when more than one trigger action should be initiated at the
next opening of the recurrence restriction. This number should not be
negative.
10. Save the trigger.
If the Save option is not enabled, select File > Refresh All to refresh the
resources and enable the Save option.
Most triggers should fire only once within the next available active operation
window, but there might be cases when more than one trigger action should
be initiated at the next opening of the recurrence restriction. This number
should not be negative.
9. Save the trigger.
If the Save option is not enabled, select File > Refresh All to refresh the resources
and enable the Save option.
When creating a user-defined event trigger, you first need to create a user-defined
event that can function as a condition for a trigger action. Then you create the
user-defined event trigger. Finally, you test the trigger.
Triggers are created like other TDV resources, but they cannot be published.
User-defined events are generated when the built-in procedure GenerateEvent (at
/lib/util/) is called with an event name and value as its arguments.
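A minimal sketch of a SQL script that raises such an event is shown below; the
procedure name, event name, and value are hypothetical, while GenerateEvent
and its two arguments are the built-in procedure described above:
PROCEDURE fireRunAReport()
BEGIN
    -- Generate a user-defined event named runAReport with an arbitrary value.
    CALL /lib/util/GenerateEvent('runAReport', 'nightly');
END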
User-defined event triggers execute once per cluster. Timer-based triggers can be
designated as being cluster-aware and they fire on all cluster nodes on which the
user-defined event is generated.
3. In the trigger editor that opens on the right, enable the trigger by selecting the
Enable Trigger check box.
4. Select User Defined Event from the drop-down list in the Condition Type
section.
5. In the User Defined Event Name field, supply a name for the event.
You can use a regular expression, for example, run*, to match more than one
similar user-defined event name. For this example, you would supply the
name runAReport.
6. For Action Type, select the type of action to be triggered from the drop-down
list.
7. In the Action section, enter the options for the action type. The options are
described in the section Action Pane, page 672.
Select Exhaust output cursors, if the procedure is a SQL script that contains a
PIPE cursor.
For example, if you want email notifications sent, type and confirm the email
addresses of the recipients to receive notification, as needed. You can specify a
list of email addresses in the address fields by separating each one with a
comma (,) or semicolon (;).
Click the Info tab, and set the Maximum Number In Queue field to the
maximum number of times this trigger is to be fired if multiple periods have
elapsed while a recurrence restriction prevents the trigger from firing. Most
triggers should fire only once within the next available active operation
window, but there might be cases when more than one trigger action should
be initiated at the next opening of the recurrence restriction. This number
should not be negative.
8. Save the trigger.
If the Save option is not enabled, select File > Refresh All to refresh the
resources and enable the Save option.
Tip from an expert: The following configuration parameters are left over from
functionality that does not send email alerts: Email Addresses for CC, Enable
Email Events, Email Addresses.
Configuration
Parameter Description of Value Example
Choose System Event to collect information for typical TDV events, including
metrics collection, caching actions, and request spikes.
Caching CacheRefreshFailure
CacheRefreshSuccess
Publishing Resources
This topic describes publishing of TDV-defined resources. You can publish almost
any object that you create in TDV, including views and procedures.
• About Publishing Resources, page 408
• About the TDV DDL Feature, page 409
• About the OData Feature, page 410
• Accessing REST-Based Web Services Using JSON, page 414
• Publishing Resources (Creation, Access, and Deletion), page 414
• Publishing Resources to a Database Service, page 416
• Publishing Resources to a Database Service with OData, page 417
• Configuring DDL for a Data Service, page 419
• Configuring TEMP Table Space for a Data Service, page 420
• Publishing Resources to a Web Service, page 421
• Web Services Security, page 439
• Security and Privileges on Published Resources, page 443
• Disable OData for Published Data Sources, page 444
For details on how client applications can query and consume published
resources, see the TDV Client Interfaces Guide. Connectors must be defined prior to
publishing resources over JMS. See “Configuring TDV Data Connections” in the
TDV Administration Guide.
In addition to publishing to a database data service, you can use the TDV DDL
(Data Definition Language) feature to manipulate tables in a physical data source.
Also included in this topic are:
• Managing Data Type Mappings for the TDV DDL Feature, page 409
• TDV DDL Limitations, page 410
• Netezza DDL Creation Limitations, page 410
When your client application creates a table in a data source, it uses the
connections and protocols established with TDV. A created table is immediately
added to the TDV resource tree and can be manipulated like any other table object
within TDV. TDV DDL CREATE TABLE supports the following features (as
supported by the underlying data source):
• Column name specification
• Column type specification (after consulting the capabilities framework to
determine type mapping appropriate to the data source)
• Column nullability specification
• Primary key specification
If your client application needs to DROP a table, the table is dropped from both
the data source and the TDV resource tree directly. No re-introspection of the data
source is needed to see the change within TDV.
This feature can be used to improve the performance of queries or reports,
because it lets you create and drop temporary tables as needed to optimize
actions.
Based on testing, we recommend that you use the following data type mappings
for custom adapters that you intend to use with MicroStrategy.
OData is a Web protocol that provides JDBC-like access for querying and
updating data from a variety of sources, including relational databases, file
systems, content management systems and websites. OData allows you to query a
data source through the HTTP protocol. Optionally, you can configure
transport-level security and authentication methods. TDV uses odata4j to parse
OData queries.
TDV supports the reformatting of results, and several system query options.
These standard OData syntax options are discussed on the OData website,
odata.org.
• OData Support Limitations, page 411
Reformatted Results
By default in TDV, all resources that are published as database data services are
available through OData in the XML-based Atom format. Client interfaces (the
applications requesting data) can tell OData to reformat the Atom results to JSON
using $format=json (instead of the default $format=atom) in the client interface
definitions. For example:
http://services.odata.org/OData/OData.svc/ds_orders?$format=json
A client that wants only JSON responses might set the header to something
similar to this:
GET /OData/OData.svc/ds_orders HTTP/1.1
host: services.odata.org
accept: application/json
Expand Option
The expand option retrieves records and includes their associated objects. For
example,
http://services.odata.org/OData/OData.svc/ds_orders/?$expand=orders
For each entity within the product’s entity set, the values of all associated orders
are represented in-line.
The corresponding TDV SQL query would be:
SELECT * from /shared/OData/ODataDemo/DemoService/Products('?$expand=orders')
Filter Option
The filter option retrieves only records that meet certain criteria. For example,
http://services.odata.org/OData/OData.svc/ds_orders/?$filter=orderid gt 1000
This request queries for and returns only records whose orderid value is greater
than 1000.
The corresponding TDV SQL query would be:
SELECT * from /shared/OData/ODataDemo/DemoService/Products('$filter=orderid gt 1000')
OrderBy Option
The orderby option arranges the returned records in the order of a designated
entity’s values. For example,
http://services.odata.org/OData/OData.svc/ds_orders/?$orderby=customerid
Skip Option
The skip option retrieves records by skipping a specific number of rows. For
example,
http://services.odata.org/OData/OData.svc/ds_orders/?$skip=2
This request returns all records except the first two rows.
Top Option
The top option retrieves the first n records. For example,
http://services.odata.org/OData/OData.svc/ds_orders/?$top=4
Select Option
The select option retrieves values in the column with the specified name. For
example,
http://services.odata.org/OData/OData.svc/ds_orders/?$select=shipname
All returned records display only the values in the column named shipname. The
other columns display [NULL].
The corresponding TDV SQL query would be:
SELECT * from /shared/OData/ODataDemo/DemoService/Products('?$select=shipname')
Combinations of Options
You can concatenate system query options by putting ampersands between them.
For example,
http://services.odata.org/OData/OData.svc/ds_orders/?$filter=orderid gt 1000&$top=2
This request returns the first two rows whose orderid is greater than 1000.
The corresponding TDV SQL query would be:
SELECT * from /shared/OData/ODataDemo/DemoService/Products('?$filter=orderid gt 1000&$top=2')
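For instance, a request that chains three options (the option values shown are
illustrative) might look like this:
http://services.odata.org/OData/OData.svc/ds_orders/?$filter=orderid gt 1000&$orderby=customerid&$top=5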
JSON data objects, including array objects, must be fully created, and any nested
array objects must be present, before the results can be transferred. Arrays in
JSON require that the data be held in memory, which for extremely large arrays
can cause out-of-memory errors. If there are no arrays, then TDV streams the data
for better performance.
Typically, resources that contain XML data types might involve hierarchical data
resulting in JSON arrays. So, you should avoid consuming the data from these
resources through Web services using REST with JSON format if the resource you
are consuming contains one of the following:
• The type of output parameter, or the column type in the cursor binding in one
of output parameters is an XML type that might result in JSON array if the
amount of data is large enough to require an array.
• Extremely large amounts of data from an XML formatted object.
For example, the following procedure is likely to perform better if it is consumed
through an XML format rather than a JSON format:
proc (out MyFavFunct XML)
Review the views and procedures that you want to publish to determine if they
contain XML types that will result in JSON arrays or cursors with XML data
types. Depending on what you find, determine the best way for you to consume
your resources.
See Publishing Resources to a Web Service, page 421 for more information about
consuming resources using JSON.
When selecting names for your data services and resources, consider that special
characters can make it difficult or impossible for clients (including Business
Directory or Deployment Manager) to display or consume the published
resources. For example, Business Directory cannot see the data for a data source
with the name “dsInv/publish”.
You can do the following for published resources:
• Access its underlying source as it exists in Studio.
• View the resource content if it is a table or view.
• Rebind to a different source. See Rebinding a View, page 248 for details.
To publish resources
1. From the resource tree, select one or more resources that you want to publish.
2. Right-click and select Publish.
3. Select from the list of existing data services or create a new data service:
a. Select Databases or Web Services.
b. Click Add Composite Service or Add Composite Web Service. Depending
on what is selected, the buttons are active or unavailable.
c. Type a name for the new data service that you want to use to hold your
published resource.
d. Click OK. The new data service is created.
e. Click OK to save the resource into the new data service.
4. Finish definition of your data service by using the instructions in one of the
following sections:
— Publishing Resources to a Database Service, page 416
— Publishing Resources to a Web Service, page 421
To make tabular data and procedures available for querying and manipulation as
a TDV database service, you need to publish them to a container (folder) under
Data Services/Databases.
You can create as many TDV databases and schemas as you need, and organize
them in multiple folders. The published tabular data can be queried like any
normal database.
This section describes general publishing instructions. If you would like to define
one of the optional TDV DDL or OData features for your database service, see the
following sections:
• Publishing Resources to a Database Service with OData, page 417
• Configuring DDL for a Data Service, page 419
Considerations
If you plan to consume your published data from an Excel spreadsheet, you must
publish the data tables using catalogs. For example, create a catalog under Data
Service/Databases/<catalog> so that the data from the resources there can be
used through an Excel client interface.
When selecting names for your data services and resources, consider that special
characters can make it difficult or impossible for clients (including Business
Directory or Deployment Manager) to display or consume the published
resources. For example, Business Directory cannot see the data for a data source
with the name “dsInv/publish”.
Validate Client Certificate  When this is checked, the client must present a certificate to the server during authentication, and the server must validate it.
Authentication Options  Description
Basic  Allows users to present plain-text username and password credentials for authentication.
Negotiate Protocol  Enables client and server to negotiate whether to use NTLM or Kerberos (thus making Kerberos available).
5. In the Unsupported section, review the list of resources that are not available
through OData.
The Reason column explains why resources are unsupported. Determine if
you want to fix any issues that caused them to appear on this list. For
example, you could add primary keys to resources that do not have them
(Defining Primary Key for a View or Table in the Indexes Panel, page 238).
6. Save your changes.
7. Optionally to use multiple columns for filtering, specify one or more columns
for filtering, see Defining Primary Key for a View or Table in the Indexes
Panel, page 238.
The TDV DDL feature allows TDV to use DDL to directly create or drop tables in
a physical data source. On the TDV DDL tab, you can specify the path to a
relational container (catalog or schema) for DDL submission. The DDL is
submitted by TDV clients (that is, by clients that use the JDBC, ODBC, or
ADO.NET protocols).
Having multiple locations within a data source for the manipulation of data
helps avoid conflicts when several users are attempting to access the same tables.
This allows temp table creation in one or more data sources where related query
work is being performed. In this way you can take full advantage of the TDV
optimizer’s capability to push an entire query down to a database when all joined
objects reside in a common data source. Data sources have a variety of proprietary
optimizations, so being able to determine where a query runs lets you choose the
most suitable data source for that query.
For information on Netezza DDL creation limitations, see About the TDV DDL
Feature, page 409.
Data services support the following DDL statements:
• CREATE TABLE AS SELECT
• DROP TABLE
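A hedged sketch of the kind of DDL a client might submit to a data service
configured this way; the container path, table name, column names, and filter
are hypothetical and depend on how your data service is published:
CREATE TABLE /shared/examples/ds_orders/tmp_big_orders AS
SELECT orderid, customerid, totalprice
FROM /shared/examples/ds_orders/orders
WHERE totalprice > 1000;

-- Drop the temporary table when the working session is finished.
DROP TABLE /shared/examples/ds_orders/tmp_big_orders;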
Temporary tables come in handy for a variety of business integration tools.
Because temporary tables are simple, the tools can use them to store filters for
visualizations. Temporary tables also create space to accommodate DDL capabilities
that you might also configure for your data service. Specifically, the temporary
tables can streamline the creation and removal of the table for use during a
working session.
You can use Studio to configure a location where you would prefer to have
temporary tables created for your data services.
You can create the temporary tables in Oracle, Netezza, SQL Server, Teradata,
MySQL, and DB2. Support for creation of other TEMP tables depends on the
physical data source where you want to create them.
For information on limitations, see About the TDV DDL Feature, page 409.
To make tabular data and procedures available for querying and manipulation as
a TDV Web service, you need to publish them to a container (folder) under Data
Services/Web Services.
You can create as many TDV databases and schemas as you need, and organize
them in multiple folders.
This section describes how to publish views and procedures as WSDL (Web
Services Definition Language) services. The following topics are covered:
• About WSDL and Web Service Data Services, page 421
• Publishing a SOAP Data Service, page 422
• Publishing a Contract-First WSDL and SOAP Data Service, page 428
• Moving the Namespace Declaration From Elements to the Soap Header,
page 431
• Publishing a REST Service, page 432
Web services are Web-based applications that dynamically interact with other
Web applications using an XML message protocol such as SOAP or REST. Web
services can be bound to any message format and protocol. The binding profile
allows specification of the HTTP transport protocol, literal encoding scheme, and
document message style.
Considerations
For WSDL data services use explicit cursor definition. The cursor should name
the column.
When selecting names for your data services and resources, consider that special
characters can make it difficult or impossible for clients (including Business
Directory or Deployment Manager) to display or consume the published
resources. For example, Business Directory cannot see the data for a data source
with the name “dsInv/publish”.
If, for example, you published CompositeView to the Web service you just
created, and the Web service is still open to the SOAP panel, the panel is
updated to display the Web service properties, operations, and parameters.
7. In the Service portion of the screen, type or select values for the following
properties.
If a property requires a value, a default value is already assigned to it.
Properties  Required  Description
Enabled  Yes  True (the default value) tells the system you want to use the information that you have defined on this tab.
Target Namespace  Yes  Replace the default value with the location of your SOAP installation.
Binding Name  Yes  A unique binding name that can be referenced from elsewhere in the WSDL definition.
Enable Contract First  Yes  The default value is false. Select true to point to your predefined WSDL and WSDL operations. For information on how to create the service, see Publishing a Contract-First WSDL and SOAP Data Service, page 428.
Enable MTOM  Yes  Valid values are true or false. Set to true to enable the use of Message Transmission Optimization Mechanism (MTOM), a method of sending binary data to and from Web services. After it is enabled, you must complete the definition of binary parameter types.
Security Policy  Yes  Lists all of the available policies that are defined under localhost/policy/security.
SAML Validator Class  No  Name of the Java class that lets users validate SAML assertions.
Endpoint URL Path  Yes  The path of the endpoint URL for this Web service. Default is the name (with spaces removed) you used for the Web service.
SOAP11 Context Url Path    The path in which a SOAP object should appear; it associates the object with a certain page on your site.
SOAP12 Context Url Path    The path in which a SOAP object should appear; it associates the object with a certain page on your site.
HTTP/SOAP 1.1    The URL of the HTTP SOAP 1.1 WSDL file that can be used to access the operation.
HTTP/SOAP 1.2    The URL of the HTTP SOAP 1.2 WSDL file that can be used to access the operation.
HTTPS/SOAP 1.1    The URL of the HTTPS SOAP 1.1 WSDL file that can be used to access the operation.
HTTPS/SOAP 1.2    The URL of the HTTPS SOAP 1.2 WSDL file that can be used to access the operation.
If you have published views or procedures to this service, you can perform the
following steps.
8. From the upper part of the Operations section, select a Web service table,
view, or procedure that is available for use.
9. In the lower part of the Operations section, type or select values for the
properties listed in the following table.
Property  Description
Security Policy  Lists all available policies defined under localhost/policy/security.
SOAP Action  Typically, this field is unused. If this value is defined as part of your contract-first WSDL, the value is displayed in this field.
Input Message
  Wrapper Element Name  Unique name to give to the wrapper element. For example, if you type wrappedOrderParams here, the WSDL XML includes an element: <wrappedOrderParams> </wrappedOrderParams>
  Part Name  Name of the message part that represents the wrapper element.
Output Message
  Wrapper Element Name  Unique name to give to the wrapper element. For example, if you type wrappedOrderParams here, the WSDL XML includes an element: <wrappedOrderParams> xxx </wrappedOrderParams>
  Part Name  Name of the message part that represents the wrapper element.
10. Repeat for each operation or view you want to define for the Web service.
11. If your operation has parameters, go to the upper part of the Parameters
portion of the screen and select a parameter from the list.
12. In the lower part of the Parameters portion of the screen, you can view and
edit the properties and values of the parameter you selected in the upper part
of the Parameters section.
Property  Description
Name  Name of the parameter. Cannot be edited.
Element Name  Fully qualified name to give to the input parameter, output parameter, or output cursor; or name (relative to the cursor path) to give to the column.
Direction  (Not listed for columns) Indicates the direction of the cursor. Cannot be edited.
Binding Location  When the selected operation is a procedure, you can define binding locations. If the message is BARE, exactly one parameter can use the BODY binding location. The available locations for input are: BODY, HEADER. The available locations for output are: BODY, FAULT, HEADER, HEADERFAULT.
Row Element Name  (Cursor only) Unique name to give to the row element of the cursor. For example, if you type myFavoriteRow here, the WSDL XML includes an element: <myFavoriteRow> </myFavoriteRow>
Row Type Name  (Cursor only) Type value of this row element.
Cursor Type Name  (Cursor only) Unique name for the type that defines the global cursor element.
MTOM CONTENT TYPE  If the parameter is binary, you should set the parameter values to VALUE_TYPE_BINARY, VALUE_TYPE_BLOB, or VALUE_TYPE_CLOB. The content-type values are available from a list of values. The default is application/octet-stream (a binary file). This header indicates the Internet media type of the message content, consisting of a type and subtype, that is accepted.
13. Repeat for each parameter that is to be defined for the Web service.
14. Save your definition.
3. Select the Web Services node under the Data Services node in the resource
tree.
4. Right-click and select New Composite Web Service.
The Add Composite Web Service dialog opens.
5. For Data Service Name, enter a name for your web service.
6. Click OK.
7. From the resource tree, open the web service that you just made.
The Web service opens on the SOAP panel.
8. In the Service section, click Enable Contract First and set its value to true.
Select true to point to your predefined WSDL and WSDL operations. After
TDV retrieves your WSDL definitions, you must finish defining the WSDL
operations using the Studio WSDL Definition Editor.
A new property, Contract, appears below Enable Contract First.
9. Expand Contract.
10. Double-click WSDL Contract under Contract and type the resource path, or
browse to your WSDL definition set.
Visible if Enable Contract First is set to true. Type or use the browse button to
specify the WSDL definition set that you have defined within TDV.
Properties Description
Enabled True (the default value) tells the system to use the information that you have
defined on this tab.
Target Namespace Displays the location of your SOAP installation. If your WSDL contract has a
target namespace defined, it will be displayed in this field.
Binding Name A unique binding name that can be referenced from elsewhere in the WSDL
definition. This name signifies that the binding is bound to the SOAP
protocol format: envelope, header and body.
Contract Property heading that is visible if Enable Contract First is set to true.
Contract Style Visible if Enable Contract First is set to true. Select ABSTRACT or
CONCRETE. Use concrete if your WSDL definitions contain bindings that
you have defined to control the security and transportation of WSDL
messages.
If the WSDL that you have selected in the WSDL Contract field does not
contain concrete bindings, you will not be able to select the CONCRETE
Contract Style.
Service Name Visible if Enable Contract First is set to true and if Contract Style is set to
CONCRETE. Values are populated from the WSDL imported through your
definition set.
Port Name Visible if Enable Contract First is set to true and if Contract Style is set to
CONCRETE. Select the port from the list of values. The port does not have to
be unique within the WSDL, but it must be unique within the Web Service.
Port Type Name Visible if Enable Contract First is set to true. Select the port from the list of
values. The port does not have to be unique within the WSDL, but it must be
unique within the Web Service.
Implementation Folder  Visible if Enable Contract First is set to true. Points to a TDV folder where the implementation procedures are generated. Double-click to edit the location.
Enable MTOM Valid values are true or false. Set to true to enable the use of Message
Transmission Optimization Mechanism (MTOM), a method of sending
binary data to and from Web services.
SAML Validator Class  Type the name of the Java class that allows users to validate SAML assertions.
Subject Mapper Class  Define a Java class for your SAML content using TDV’s extended Java API Subject Mapper, which can be found in:
<TDV_install_dir>\apps\extension\docs\com\compositesw\extension\security\SubjectMapper.html
Use method samlSubjectToCompositeUser to map samlSubject to an existing TDV user, so that this SAML principal has the same privileges as that user. After you create the Java class, save the JAR file to the <TDV_install_dir>\apps\server\lib folder.
Endpoint URL Path The path of the endpoint URL for this Web service. Default is the name (with
spaces removed) you used for the Web service.
HTTP/SOAP 1.1 Displays the HTTP SOAP 1.1 URL that can be used to access the operation.
HTTP/SOAP 1.2 Displays the HTTP SOAP 1.2 URL that can be used to access the operation.
HTTPS/SOAP 1.1 Displays the HTTPS SOAP 1.1 URL that can be used to access the operation.
HTTPS/SOAP 1.2 Displays the HTTPS SOAP 1.2 URL that can be used to access the operation.
13. From the upper part of the Operations portion of the screen, select a Web
service view or procedure that is available for use.
This portion of the screen is automatically populated with the WSDL
operations that were imported or defined in the WSDL definition set. If the
definition set changes, you might need to save your Web service to see the
latest list of WSDL operations. If you publish views or other procedures to the
Contract First Web service, they should automatically appear in the
Operations list.
14. To edit the Operations and Parameters portions of the screen, see the steps in
Publishing a SOAP Data Service, page 422.
15. In the lower part of the Parameters portion of the screen, if you have chosen to
define a concrete WSDL, you can view some additional properties and values
of the chosen parameter.
Element Name and Message Name for concrete WSDLs can be viewed, but
not edited, from this Studio editor.
16. Save your definition.
For Example
When the value of the configuration parameter is false, each element would have
something like this:
<soap-env:Envelope xmlns:soap-env="http://schemas.xmlsoap.org/soap/envelope/">
  <soap-env:Header>
    <ns1:TestSQLScriptCustomevalueOutput xmlns:ns1="http://tempuri.org/">custom</ns1:TestSQLScriptCustomevalueOutput>
  </soap-env:Header>
For Example
Assume that the Output parameter for the following code snippet is cust_data
and that the data type is a generic XML type.
<ns1: cust_addr>
<mail_addr>
. . .
</mail_addr>
</ns1:cust_addr>
Set the Generic XML REST Output Parameter Wrapper parameter to true, to
obtain output similar to:
<cust_data>
<ns1: cust_addr>
<mail_addr>
. . .
</mail_addr>
</ns1:cust_addr>
</cust_data>
If you do not want the parameter name wrapper in your output, set the
configuration parameter to false. For example:
<ns1: cust_addr>
<mail_addr>
. . .
</mail_addr>
</ns1:cust_addr>
Property  Required  Description
Enable  Yes  True (the default) tells the system you want to use the information defined on this tab.
Target Namespace  Yes  Replace the default value with the location of your REST installation.
JSON Format  Yes for JSON
• JSON Package Name: Default name of your JSON service. The input and output data use this package name.
• Use Parameter Name of Procedure: If true, the JSON object name of a parameter is the same as the original name defined in the procedure. The default value is false. If false, the format is: <package name> + '.' + <procedure name> + <parameter name>
For the wrapped options, the default value is true. The default value is consistent with an XML definition.
Enable HTTP Basic  Yes  Valid values are true or false. Set to true to enable a method to provide a username and password (when making a request) that results in a string encoded using a Base64 algorithm.
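When Enable HTTP Basic is true, a client supplies its credentials in a standard
HTTP Authorization header; for example, for the hypothetical credentials
user:password, the Base64-encoded header would be:
Authorization: Basic dXNlcjpwYXNzd29yZA==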
9. From the upper part of the Operations section, select a Web service view or
procedure that is available for use.
If you publish a new view or procedure to the Web service while this window
is open, you might need to close and reopen this window for it to appear in
the list.
10. In the lower part of the Operations section, type or select values for the
properties listed in the following table.
Property Description
HTTP Method Select the verb associated with the procedure, for example GET, POST,
PUT, or DELETE. Depending on the choices you have made for the
operations field, these options might already be defined.
Operation URL Path Specify the operation URL path as defined for your operation. Follow
endpoint URL guidelines to configure your values. You can use brace tags.
You can specify a portion of the endpoint URL to be used in each Web
service module. In a published WSDL file, the URL defining the target
endpoint address is found in the location attribute of the port's
soap:address element.
Input Message
Parameter Style  Use this field to determine the shape of the input argument or message. This value applies to all parameters.
• Select WRAPPED when there might be multiple parameters that use the BODY location. The wrapper element is the child of the REST BODY and the parameter elements are children of the wrapper element.
• Select BARE when exactly one parameter has a location of BODY and its element is the only child of the REST BODY. All other parameters of the same direction must be mapped to another location—for example, HEADER.
Wrapper Element Name  Unique name for the wrapper element. For example, if you type wrappedOrderParams here, the WSDL XML includes an element:
<wrappedOrderParams> </wrappedOrderParams>
Output Message
Parameter Style  This value applies to all parameters.
• Select WRAPPED when there might be multiple parameters that use the BODY location. The wrapper element is the child of the REST BODY and the parameter elements are children of the wrapper element.
• Select BARE when there is exactly one parameter with a location of BODY and its element is the only child of the REST BODY. All other parameters of the same direction must be mapped to another location, for example, HEADER.
Wrapper Element Name  Unique name to give to the wrapper element. For example, if you type wrappedOrderParams here, the WSDL XML includes an element:
<wrappedOrderParams> </wrappedOrderParams>
Endpoint URLs
HTTPS/JSON  Displays the HTTPS/JSON endpoint URL for this operation. Present only if Enable SSL is true in the Service section of this panel.
HTTPS/XML  Displays the HTTPS/XML endpoint URL for this operation. Present only if Enable SSL is true in the Service section of this panel.
14. In the lower part, type or select values for the following properties for input or
output parameters.
Property  Resource Type  Description
Row Element Name  Table, View  Fully qualified name to give to the input parameter, output parameter, or output cursor; or name (relative to the cursor path) to give to the column.
Row Type Name  Table, View  (Cursor only) Type value of this row element.
Cursor Type Name  Table, View  (Cursor only) Unique name for the type that defines the global cursor element.
Element Name  All  Can be NULL. The default value of the Element Name field specifies the schema of the REST message that is encrypted. This procedure can be used to encrypt the REST message body.
Binding Location  Procedure  When the selected operation is a procedure, you can define binding locations. If the message is BARE, exactly one parameter must use the BODY binding location.
The available locations for input are: ENTITY (not available for GET), QUERY, HEADER.
The available locations for output are: ENTITY.
Query  Procedure  The parameter-name part of a query string. The query argument can accept NULL values. For example, the following query arguments are both valid:
• http://localhost:9410/json/goody/data?name_arg={name_arg_value}
• http://localhost:9410/json/goody/data
Default Value  Procedure  Optional. Default value to use for the parameter if a parameter value is not passed through the request.
Nullable  Procedure  Whether this field is allowed to have no value. True specifies that the parameter can be passed to the server with no value. Without a value for the parameter, all data is returned from the query.
Direction  All  Indicates the direction of the input parameter, output parameter, or output cursor. Cannot be edited.
15. Repeat the steps for each parameter to be defined for the Web service.
16. Save your definition.
TDV offers Web service security at two levels: transport-layer level and message
level. Secure Web services are published by TDV using Web Services Security
(WSS). TDV enables signing of messages with digital certificates to identify the
source of the message, and encryption of the message for secure delivery to the
client.
• Supported Web Service Security Standards, page 439
• Using a Predefined Security Policy for a Web Service, page 441
• Creating and Using a Custom Security Policy for a Web Service, page 442
Transport or System Standard  Security Policy  Description
HTTP  Http-Basic-Authentication.xml  Policy that requires a username and password when making a request.
HTTP  Http-Negotiate-Authentication.xml  Policy that enables Kerberos authentication.
6. Left-click in the Value column next to Security Policy in the Services or the
Operations part of the SOAP tab and choose one of the predefined or custom
security policies from the drop-down list.
7. Save your changes.
To control what is returned for a column for which the user lacks
appropriate permissions
1. From the Studio main menu, select Administration > Configuration.
2. In the Configuration window, navigate to Server > Configuration > Security >
Enable Exception For Column Permission Deny.
3. Set this parameter to true if you want to raise an exception when column-level
permissions are missing.
4. Set or leave this parameter at false if you want columns for which the user does not have permissions to contain NULLs or to be omitted from the results entirely.
You can globally disable OData for all published data sources.
This topic describes how to display and explore the data lineage from the data
sources through to the published resources defined in TDV.
• About Data Lineage, page 445
• Displaying Data Lineage, page 448
• Working with the Data Lineage Graph, page 450
• Lineage Information for Different Resource Types, page 461
When you work with resources in Studio, there are often many relationships and
dependencies between resources. A composite view, for example, depends on
things like data sources and procedures which might then depend on a different
data source. Other views might reference the view and thus be dependent on it.
Security policies, triggers, and other resources might be dependencies.
Understanding these interdependencies can be useful in analyzing where data
comes from, how it is derived, and how it is used.
To help you understand resource dependencies, Studio provides a data lineage
graph that supports TDV modeling and data visualization capabilities. The TDV
data lineage feature helps you:
• Trace where data originates—From data sources through published resources,
you can trace which specific data sources contribute to each column.
• Understand how data is used and/or transformed—You can determine how
the data is used or transformed by TDV resources and which data contributes
to the published resources.
• Discover the impact of change and analyze dependencies—If you delete or
rename a resource, the data lineage graph for related resources graphically
indicates which dependent resources are impacted.
You can obtain the data lineage for all resource types, but lineage for these
resources can be especially useful:
— Data sources
— Tables
— Composite views
— Models
— Procedures
— Definition sets
— Published services
— System resources
See Lineage Information for Different Resource Types, page 461 for what is
displayed for each resource type along with examples.
• See that a projection is concatenated in a prior view, then CAST to a date, then
used in a GROUP BY clause.
• See that a column is used in a JOIN or is used in a filter.
• See that a column is used in an aggregate or analytical function.
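For example, a view whose SQL looks like the following hypothetical sketch (all paths and column names here are invented for illustration) would produce exactly these kinds of lineage links:
-- Illustrative view SQL only; not one of the shipped sample resources.
-- Lineage would show order_month as a projection built by CONCAT and CAST,
-- the JOIN and WHERE columns as indirect links, and SUM as an aggregate use.
SELECT
  CAST(CONCAT(o.order_year, '-', o.order_month, '-01') AS DATE) AS order_month,
  SUM(o.order_total) AS monthly_total
FROM /shared/examples/ds_orders/orders o
INNER JOIN /shared/examples/ds_orders/customers c
  ON o.customer_id = c.customer_id
WHERE c.country = 'UK'
GROUP BY CAST(CONCAT(o.order_year, '-', o.order_month, '-01') AS DATE)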
Save to File Open a dialog box to save the dependencies in a file. See Exporting Lineage
Reports to a Comma-Separated-Values File, page 459.
Hide/Show A toggle button that hides or shows dependencies for a selected resource
Dependencies when enabled. By default, dependencies are displayed.
Hide/Show A toggle button that hides or shows references for a selected resource when
References enabled. By default, references are displayed.
Show/Hide A toggle button that shows or hides WHERE/GROUP BY/FROM and other
Indirect Links indirect links when enabled. By default, indirect links are hidden.
You can display the data lineage for most TDV resource types. The kinds of
interdependencies displayed depend on the resource type as illustrated in
Lineage Information for Different Resource Types, page 461.
Studio displays the lineage for the selected resource including all resource
dependencies and all reference resources. This example shows the data
lineage for the ViewSales example view:
The data lineage panel gives you a relatively large area in which to graphically
view all of the dependencies and dependent resources. See Lineage Panel Buttons
and Controls, page 447 for the purpose of the various buttons and controls in the
data lineage panel.
You can navigate and explore the relationships and resource dependencies as
follows:
To ...  Do this...
Change the focus to a different resource and display its lineage.  Double-click a resource or use the right-click menu. See Changing the Resource Focus, page 451 for more information.
Expand and collapse what is displayed.  Use the expand and collapse buttons to see a resource’s contents. See Changing What Is Displayed, page 452.
Filter the visible columns.  Use the funnel button on the view title bar. See Filtering the Visible Columns, page 453.
Go back to a previous data lineage display.  Use the navigation arrow buttons or select a resource in the history drop list. See Using the History Feature, page 456 for more information.
Get details about the resource definition.  Click the Details button. Displays the resource definition in the Detail panel beneath the lineage graph. See Getting Resource Details, page 454 for more information.
Discover impacted resources.  Impacted resources are indicated with a warning icon. You can hover over the resource to get more information. See Impacted Resources, page 47 for more information.
Refresh the graph.  See Refreshing the Lineage Graph, page 457 for more information.
The default is to hide indirect links. The following example shows four
indirect links that reference specific columns.
This is the same information displayed in the resource tree tip when you
hover over a resource.
You can also right-click any resource and choose Open to open the resource’s
editor in a new tab.
4. Optionally, select a different resource to display its details in the same tab.
The type of information displayed depends on the resource type:
View SQL that defines the view. A selected column is highlighted in yellow.
For example, when you double-click a view to display its columns, the
columns and their relationships are displayed and the Detail panel shows the
SQL that defines the view. If you select a column, then the references to the
column are highlighted in yellow as shown in this example:
3. Select the path for the resource you want to be the central focus.
Studio displays the resource graph that was previously viewed.
Note: You can also use the Navigate Backward and Navigate Forward arrow
buttons to step through the lineage graphs.
Note: If your lineage graph is especially large, you can create a poster print of
it using the Options button.
3. Use the buttons at the top to control the printed output as follows:
Use... To...
Page Change the orientation between portrait and landscape and to adjust the margins.
Print Print the model diagram using your standard print dialog.
Zoom In Zoom in on the print image. This does not increase the size of the image on the page.
Zoom Out Zoom out on the print image. This does not decrease the size of the image on the
page.
Options Displays a Print Options dialog with three tabs:
• General tab options let you create a large poster-style print of the model diagram
by dividing the model into sections that can be pieced together.
Poster Rows—divides the image vertically into the number of rows entered and
displays how it would appear on that number of pages if printed.
Poster Columns—divides the image horizontally into the number of columns
entered and displays how it would appear on that number of pages if printed.
Add Poster Coordinates—prints the location of each subsection of the model
diagram so that you can figure out how the sections go together.
Clip Area—Two options are provided: Graph enlarges the model image as much
as possible within the margins; View increases the blank space around the model
image.
• Title tab options let you specify a title for the printed model.
Text—Specify the title you want to appear at the top of the model diagram.
Titlebar Color—Click to select the color of the title bar which will appear behind
the title text.
Text Color—Click to select the color of the title text.
• Footer tab options let you specify a footer for the printed model. If you are
printing the model poster style, the footer will span the pages appropriately. You
can specify text, footer and text color.
If you have hidden dependencies or references in the lineage graph, they are not included in the exported file.
3. On the lineage button toolbar, click the Save To File button.
4. In the Save dialog, navigate to the location where you want to save the
information.
5. Name the file that will contain the dependency information.
6. Click Save.
7. Open the file in Excel or other similar application.
The lineage report contains one row for every connecting line in the lineage
graph. A connecting line represents a dependency or reference. The
information for each dependency or reference is as follows:
SourcePath  The path to the source column.
SourceType  The source type: TABLE, DATA_SOURCE, PROCEDURE, MODEL, and so on.
SourceColumnName  For column dependencies, the name of the column.
ReferenceByPath  The path to the reference.
ReferenceByType  The resource type: TABLE, DATA_SOURCE, PROCEDURE, MODEL, and so on.
ReferencedByColumnName  For column references, the name of the column.
You can obtain the lineage information for most resource types. The information
is displayed graphically in the resource lineage graph. The relationships and
connections between different resources can be especially useful. Also, you can
change the resource focus to navigate between resources and understand their
relationships.
This table illustrates examples of the lineage information that is displayed for
many resource types (listed in alphabetical order).
Data source Displays all resources that depend on the selected .csv data source. For
(.csv) example:
Data source Displays all resources that depend on the selected relational data source.
(relational) Example:
Data source Displays all resources that depend on the selected XML data source. For
(XML) example:
Definition Set Displays all resources that depend on the selected definition set. For example:
Model Displays all resources on which the selected model depends. For example:
Policy Displays all resources that depend on the selected policy. For example:
Procedure Displays all resources on which the procedure depends and all resources that
(Script) reference the script. For example:
Published Displays all dependencies for the selected resource. For example:
resource
Table Displays the parent data source and all resources dependent on a table and its
individual columns. Example:
Transformation Displays the dependencies and references for the XSLT transformation. For
(XSLT) example:
View Displays the dependencies, references, and cached tables/views for a view
(by default). For example:
You can also toggle the indirect links button to view the references to columns
in a view’s WHERE, GROUP BY, HAVING, FROM, or other non-SELECT
clauses as shown here:
TDV Caching
Caching makes a copy of frequently accessed data and stores it for quicker, more
convenient access. The following topics are covered:
• Overview of TDV Caching, page 466
• Cache Requirements and Limitations, page 483
• Setting Up Caching, page 494
— Caching Transaction Results of a Procedure, page 494
— Caching to the Default Database Target, page 495
— Caching to a File Target, page 496
— Pre-Creating Caching Objects for Database Caching, page 497
— Caching to a Single-Table Database Target, page 500
— Creating a Multiple Table Cache on a Database Target, page 501
— Enabling Cache Data Storage to Multiple Data Sources, page 505
— Setting Up Native (Bulk) Caching Options, page 507
— Setting Up the Parallel Cache Option, page 519
— Enabling JDBC-Only Cache Loading, page 520
— Canceling a Cache Refresh that Is Using Native or Parallel Loading,
page 521
• Defining Cache Refresh Behavior, page 521
• Cache Maintenance, page 536
— Indexing Your Cache, page 537
— Managing Configuration Changes and Cache Behavior, page 538
— Displaying the Dependencies of a Cached View, page 539
— Displaying the Dependencies of a Cache Policy, page 540
• Caching Tips from an Expert, page 542
— Destroying a File or Default Cache, page 542
— Managing Cache Status Table Probe Row Conflicts, page 543
— Managing Unresponsive Cache Tables, page 543
— Managing Open Connection Threads, page 544
TDV caching is a feature that can improve the responsiveness of data access in
your enterprise. Caching features can help you quickly build data caches that
complement data virtualization and improve reporting. Working within TDV and
Studio, you can access data and metadata from across data repositories, whether
they be applications, databases, or data warehouses. You can then use TDV to
create data caches. From the data caches, you can consume the cached data for
client activities such as data analysis.
TDV caching is a feature of TDV and uses the Studio user interface. TDV caching
is delivered with TDV, and contains all of the software needed to implement TDV
caching in your IT environment.
• TDV Caching Concepts, page 467
• How Does TDV Caching Work?, page 474
• TDV-Created Caching Objects, page 479
• What Incremental Caching Options Are Available in TDV?, page 481
• About Native and Parallel (Bulk) Caching, page 482
• Cache Status and Events Reporting, page 483
The following illustration provides a simplified representation of caching with
TDV.
What Is a Cache?
A cache is a location that holds a copy of frequently accessed data that is otherwise
expensive to retrieve or compute.
Consider what cache target is most suitable for the type of data to cache and how
you want to interact with it. For example, a cache target might not support
important data types in the source. Review the Cache Requirements and
Limitations, page 483. Also check the Cache Data Type Mapping section of the
TDV Reference Guide.
Cache policies also enhance cache performance. With cache policies, TDV can calculate the optimal cache load strategy because the list of resources that are part of the cache refresh action for a given policy is well known.
With cache policies, cached data consistency is all or nothing. If one of the
resources included in the cache policy fails during refresh action, all of the caches
are rolled back to the previous successful snapshot of data. For this reason, cache
policies and incremental caching are not supported for use together.
When Does Cache Data Expire and Get Removed From the Cache?
A cache refresh is used to create or update the cache contents. How you have configured your cache determines how the data from a previous refresh expires and potentially gets removed from the cache.
Expired cache versions are not deleted until all transaction holds using that
version are completed or rolled back.
Typical expiration schedules are:
• Never—Keep the current and all previous data snapshots in the cache.
• Periodic—Allows you to define a regular recurring cache expiration time. For
example, after a specified number of weeks or years.
• According to Custom Criteria—You can control cache data refresh and
clearing by defining custom SQL scripts, Java programs, or web services.
You can also control whether the cache data clears:
• Immediately—Clear the cache using the Clear Now button. This marks the
old cached version of the data as expired so that new requests cannot use it;
then the system initiates a background task to clear the expired cache version
when it is released.
• On Failure—Selecting this option would clear the cache if a refresh fails. This
option allows access to previously cached data during a refresh.
• Before the Refresh—Selecting this option automatically clears the cache before starting a refresh, so any clients attempting to read from the cached data must wait for the new data. The older cached version is marked as expired (but not actually cleared), and all new requests for the cached resource use the latest data provided by the cache refresh.
• Periodically At Expiration Time—On cache expiration, if a time is defined in
the Expiration Schedule portion of the screen, the old cache data is cleared
out.
What is Cache Versioning, Transaction Integrity, and When Are Old Cache Contents Cleared
Out?
TDV cache versioning preserves transaction integrity by maintaining old cache
versions until existing transactions are completed. Expired cache versions are not
deleted until all transaction holds using that version are completed or rolled back.
Even when you manually initiate cache clearing, that action only marks the old
cached version as expired so that new requests cannot use it; then the system
initiates a background task to clear the expired cache version when it is released.
TDV can be configured to gracefully handle long-running cache refreshes by
retaining all previous cache data snapshots, which preserves usability of the data
while a cache is actively being refreshed in the background.
To preserve read consistency, long-running queries that use data from a cached
view are allowed to complete computations with the original cached view that it
was using at the time the cache refresh started.
How Does Error Handling Work for Resources that use Cache Policies?
If a cache refresh is initiated by a cache policy and it fails at any point before
completion, all the caches are rolled back to the last successful snapshot.
How Does Error Handling Work for Resources with Individually Defined Cache Refresh and
Expiration Schedules?
If the data sources providing the data to be cached or the data source used to store
the cached data fail during the refresh, the cache is marked as failed or stale
depending on whether the previous snapshot is available.
If the TDV server fails during the refresh the cache is marked as failed or stale
when the server is restarted.
For caches that perform a full refresh, if the server dies in the middle of a refresh,
the refresh attempt is marked as Failed on server restart. Any partial data is
eventually processed by the garbage collector.
For caches that perform incremental refreshes, if the server dies in the middle of a
refresh, the cache status is marked as DOWN when the server is restarted.
The WHERE clause and the SQL in the UK_Sales_View are pushed to the Oracle
database for processing. Only 100,000 rows and three columns are extracted from
Oracle, sent over the network, and placed in the cache. This is 500 times less data
for the same result. The cache is smaller, the refresh is quicker, and there are fewer
loads on both the Oracle database and the network during the refresh.
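As a rough sketch of the SQL behind such a view (the data source path and column names below are assumptions for illustration, not the actual sample resources), both the three-column projection and the filter are pushed to Oracle:
-- Illustrative only: a filtered, three-column projection. The WHERE clause is
-- pushed down to Oracle, so only the matching rows cross the network into the cache.
SELECT order_id, customer_id, order_total
FROM /shared/examples/ds_oracle/SALES/ORDERS
WHERE country_code = 'UK'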
In the following example, Composite_view joins a file, an Oracle table, and an
SAP table. Report_proc uses Composite_view and pushes a predicate against it.
Because File_1 does not support pushes, the predicate requires a full scan of File_1
and SAP_Table_1 each time, slowing performance with the joins and the fetches.
If Composite_view is cached, the join is performed only once.
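A minimal sketch of what Composite_view might look like (resource paths and join columns are hypothetical); caching it means the three-way join is computed once per refresh instead of on every call to Report_proc:
-- Hypothetical definition of Composite_view joining a file, an Oracle table,
-- and an SAP table. File_1 cannot accept pushed predicates, so without the cache
-- every request would rescan File_1 and SAP_Table_1.
SELECT f.item_id, o.order_total, s.plant_code
FROM /shared/examples/File_1 f
INNER JOIN /shared/examples/ds_oracle/Oracle_Table_1 o ON f.item_id = o.item_id
INNER JOIN /shared/examples/ds_sap/SAP_Table_1 s ON f.item_id = s.item_id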
If Proc_1 is called with a variant that is not currently cached, like A = 3, a “cache
miss” occurs, and the procedure is executed with A = 3. The result set is directly
returned to the caller, and both the variant and the result set are written to the
cache. If proc_1 (3) is called again prior to expiration of the result set, where A = 3,
the cached result set is returned immediately.
Because there can be many variants, it might not be possible to store them all. By
default, the maximum number of stored variants is 32, but you can change the
value in the Maximum number of procedure variants field in the Advanced
section of the Caching panel. If the cache is full, the least recently used variant is
swapped out for the latest variant.
If a procedure has input parameters, the cached results are tracked separately for
each unique set of input values. Each unique set of input parameter values is
called a variant.
For example, a procedure with an INTEGER input parameter would have
separate results cached for inputs of 1, 3, and 5, or whatever value. A procedure
with two INTEGER input parameters would have separate results cached for
inputs (1,1), (1,2), and (x,y).
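For illustration, a SQL Script procedure such as the following hypothetical sketch would accumulate one cached variant per distinct region_id value passed by callers:
-- Hypothetical SQL Script procedure; each distinct region_id value
-- (for example 1, 3, or 5) becomes a separate cached variant.
PROCEDURE getOrdersByRegion(
  IN  region_id INTEGER,
  OUT result CURSOR (order_id INTEGER, order_total DECIMAL(12,2))
)
BEGIN
  OPEN result FOR
    SELECT order_id, order_total
    FROM /shared/examples/ds_orders/orders
    WHERE region = region_id;
END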
Note: A procedure cache that uses non-null input parameters must be seeded
with at least one variant from a client application other than Studio, for the Cache
Status to change from NOT LOADED to UP. Using the Refresh Now button does
not change the status of a cache that is not loaded. Even if procedure caching
configuration is correct, the status does not show that it is loaded until a client
seeds the cache.
When a procedure cache is refreshed, all the cached variants already in the table are refreshed. If no variants have yet been cached, then either nothing is refreshed or only the null input variant is refreshed. You can refresh a procedure cache from
Studio or using the RefreshResourceCache procedure. (See the TDV Application
Programming Interfaces Guide.)
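As a hedged illustration only (verify the procedure's location and parameter list in the TDV Application Programming Interfaces Guide; the argument shown here is an assumption), a scripted refresh might look like:
-- Sketch: refresh the cache of a view by calling the built-in procedure with the
-- resource path. Check the API Guide for the actual signature before using this.
CALL /lib/resource/RefreshResourceCache('/shared/examples/CachedOrdersView');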
To enable transaction result caching, check the “Execute only once per transaction
for each unique set of input values” check box in the Transaction Options section
of the Info tab for any procedure.
In a cluster environment, any attempt to refresh a cache uses the TDV cache status
table to determine if any other refreshes are in progress. If no others are in
progress, the refresh starts and all other servers are notified to reread the TDV
cache status table (for information on what the cache status table is, see
TDV-Created Caching Objects, page 479). If a refresh is already in progress, the
server contacts the server that is currently refreshing and waits for that refresh to
complete.
In a cluster environment, an attempt to clear a cache marks the cache data for
clearing in the TDV cache status table. The background task to actually delete the
data contacts other cluster members to determine what cache keys are no longer
in use so the task can safely remove only rows that no server in the cluster is
accessing. After the clear task completes, all other servers are notified to reread
the TDV cache status table.
A scheduled refresh in a cluster relies on the trigger system cluster feature to
ensure that triggers are processed only the appropriate number of times. File
cache schedules are configured to fire separately on each server. Database cache
schedules are configured to fire only once per cluster. You can create trigger
resources, instead of using the schedule options provided on the Caching panel,
to control this behavior. (See Triggers, page 391.)
Cache Status
The status table (or file) is used to track which views and procedures currently
have cached data stored, when they were last refreshed, and so on.
Each Studio data source used as a caching target must have its own status table.
Cache Tracking
The tracking table (or file) is used to track the tables, views, and procedures that
are using a particular cache target. It is also used to track what tables in the cache
target are in use.
Each Studio data source used as a caching target must have its own tracking table.
Cache Data
The name of the object you want to store cached data for is appended with cache.
These are the file or database tables that will hold the cached data. Resources
cannot share the same data cache.
The cache metadata tables contain some abbreviated values. For example:
Metadata Value  Meaning  Description
K  key  A special value used to indicate the key row in the table.
Microsoft SQL Server bcp.exe utility The bcp.exe path must be specified in
TDV.
Data caching has many important requirements and limitations to consider. This
section describes:
• Supported Data Sources for Caching, page 484
• Supported Cache Target Storage Types, page 484
• Privileges and Caching, page 487
• Caching Limitations, page 488
Cache Target  TDV Cache Target Support  Parallel Cache Target Support  Native Cache Target Support  Notes
File Active Active Typically best for
demonstrations or caching of
a few hundred rows.
IBM DB2 LUW v10.5 Active Active Active Native load with insert and
select, and DB2 Load are
supported.
Microsoft SQL Server 2008 Active Active Active The DBO schema must be
selected and introspected as
a resource prior to
attempting to cache data.
Microsoft SQL Server 2012 Active Active Active The DBO schema must be
selected and introspected as
a resource prior to
attempting to cache data.
Microsoft SQL Server 2014 Active Active Active The DBO schema must be
selected and introspected as
a resource prior to
attempting to cache data.
Microsoft SQL Server 2016 Active Active Active The DBO schema must be
selected and introspected as
a resource prior to
attempting to cache data.
Netezza 6.0 Active Active Active Native load with insert and
select is supported. Parallel
cache processing is achieved
using the native
DISTRIBUTE syntax.
Procedure caching is
supported.
Netezza 7.0 Active Active Active Native load with insert and
select is supported. Parallel
cache processing is achieved
using the native
DISTRIBUTE syntax.
Procedure caching is
supported.
Sybase ASE 12.5 Active
Not every cache target supports the data types of every supported data source, so views that contain certain data types cannot be cached to that database. When a data type mismatch occurs, a warning is issued to the developer to prevent further problems.
Not all data sources that could be used for cache storage can support the
generation or execution of DDL statements. In some cases, DDL can be generated
and presented, but use of an external tool might be required to create the table
metadata. In other cases, the DDL is not shown, because it is not applicable.
TDV caching typically requires the creation of objects on the data source that is
acting as the cache target. To create and delete the data and objects necessary for
caching with TDV you must have the necessary permissions set up for you in the
cache target. Performing a cache refresh or clear requires the READ, SELECT,
INSERT, UPDATE, and DELETE privileges on the status table, tracking table, and
the data tables. For example, if you use an Oracle database as your caching target,
the TDV user that is defining the cache must have privileges on the Oracle
database to READ, SELECT, INSERT, UPDATE, and DELETE data and objects
directly in the database. Those privileges can be granted by explicit assignment,
or by membership in a group with those privileges (implicit assignment).
For someone to use the view or procedure that is being cached to the cache target,
they must also have privileges granted to access the cache status and data tables.
For someone to read from the cache, they must be granted the SELECT privilege
on these tables and READ privileges on folders above these tables.
Privileges are managed automatically when using the Automatic mode of
caching.
Caching Limitations
TDV can cache data from many different sources to many different cache targets.
Occasionally, there are subtle limitations when dealing with so many systems. For
data caching, those limitations are discussed here.
• The view being cached must NOT have a column called cachekey.
• If the cache target does not support a data type available in the source, then any attempt to store the unsupported data type in the cache results in an error. However, it may be possible to cast the unsupported data type to a supported one (for example, cast a BOOLEAN to an INTEGER type) before sending it to the cache target, as shown in the sketch after this list.
• When caching to a table, ARRAY data types are not supported.
• The following resources cannot be cached without being wrapped in a view or
procedure:
— Procedures with no outputs; that is, no data to cache.
— Procedures with input parameters that require the use of cache policies.
— XML files that have been introspected.
— System tables.
— Non-data sources such as folders, and definition sets.
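The CAST workaround mentioned above can be applied directly in the SQL of the view being cached. A minimal sketch, assuming a hypothetical view with a BOOLEAN column that the cache target cannot store:
-- Illustrative only: cast the unsupported BOOLEAN to an INTEGER in the view SQL
-- so the cache target receives a type it can store.
SELECT
  order_id,
  CAST(is_priority AS INTEGER) AS is_priority_flag
FROM /shared/examples/ds_orders/orders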
recreate the cache_status table and refresh the cache. This solution does not
work for data that contains multi-byte international characters because the
characters are not saved or retrieved correctly.
To solve just the caching problem, cache data in a different data source (instead of
Teradata).
To solve just the query problem, provide query hints (see Specifying Query Hints,
page 228) on queries against Teradata where filters are on CHAR columns:
{ OPTION IGNORE_TRAILING_SPACES="True" }
Data Source  Cache Target  Data Types Not Supported  Functions Not Supported
Sybase  Teradata  BINARY, IMAGE, TEXT, VARBINARY
Data Source  Cache Target  Data Types Not Supported  Notes
SQL Server 2008  Netezza 5 or Netezza 6  VARBINARY, TIMESTAMP, BINARY, TEXT, IMAGE, NTEXT
Setting Up Caching
This topic describes how to use data caching in TDV to improve performance:
• Caching Transaction Results of a Procedure, page 494
• Caching to the Default Database Target, page 495
• Caching to a File Target, page 496
• Pre-Creating Caching Objects for Database Caching, page 497
• Caching to a Single-Table Database Target, page 500
• Creating a Multiple Table Cache on a Database Target, page 501
• Enabling Cache Data Storage to Multiple Data Sources, page 505
• Setting Up Native (Bulk) Caching Options, page 507
• Setting Up the Parallel Cache Option, page 519
• Enabling JDBC-Only Cache Loading, page 520
• Canceling a Cache Refresh that Is Using Native or Parallel Loading, page 521
For transactional caching, TDV caching does not directly use memory to store the
cached result set, but instead persists to disk or a database. Memory-based tables
have been used in Oracle and MySQL to achieve even better performance. When
the refresh occurs, the object view or procedure is executed and the result set is
written to the cache.
Limitations
Cache policies cannot be stored in the default cache database.
When a cache is disabled, all existing cache settings are ignored. The view or
procedure is used as if caching did not exist. Toggling between the enabled
and disabled state does not cause refreshing of the data or resetting of the
expiration date for the data.
5. Save your changes.
Several fields will display with non-editable information.
The Data Source, Table Schema (Optional), and Table Prefix fields can be used
to determine the location of your cached data within localhost.
6. Set cache policy, cache refresh, expiration, Number of Buckets, Drop and
Create indexes on load, and advanced options as necessary.
After data is added to the cache tables by clicking Refresh Now, you can navigate to
the cached data source, execute the view where caching is enabled, and review
the cached data. By default, the tables are created on the PostgreSQL database
that was defined during the installation of TDV.
and disabled state does not cause refreshing of the data or resetting of the
expiration date for the data.
5. Under Storage, specify Automatic to store the cached data in a system-created
file accessible through Studio:
<HostName>/lib/sources/cacheDataSource
The physical files are stored by default in the TDV Server installation
directory at: <TDV_install_dir>/tmp/cache
Each cached resource has a cache table in the cacheDataSource data source.
The cache table’s name contains the resource type (view or procedure) and a
system-generated ID.
6. Save the cache settings.
resource that you are caching, and whether you want to index the data or not.
In the following table, UID stands for user identification. You can name the tables anything you want; <resource_metadata> is used to indicate that you must specify the names, data types, and other required table-creation metadata for the columns you want to create in that table.
7. For multiple table caches, create tables with the following structures for each TDV resource for which you want to define a cache. The DDL might need to be altered depending on the version of TDV you are using, the database type where you are creating the table, the TDV resource that you are caching, and whether you want to index the data or not.
...
8. Re-introspect the data source to make sure that the additional tables you have created are available through TDV.
9. Set up your cache and select these tables as the status, tracking, and data tables.
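The exact DDL depends on your database, the TDV version, and the resource being cached, but as a rough, hypothetical sketch (table name, column names, and types here are assumptions), a cache data table for a two-column view might be created like this:
-- Illustrative cache data table for a view with two columns. TDV reserves the
-- column name cachekey for snapshot bookkeeping, which is why the view being
-- cached must not already contain a column with that name.
CREATE TABLE myprefix_data_orders (
  cachekey     BIGINT NOT NULL,
  order_id     INTEGER,
  order_total  DECIMAL(12,2)
);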
data source, or a schema. Follow the instructions on the screen to get through
the screens that might appear.
For example, when you select a data source, the right side of the screen might
say Create Table. When you type a name and click Create, a DDL for result
window appears showing the DDL generated by Studio. Click Execute in that
window to create the table in the data source. Even though a dialog box might
say DDL execution was successful, a Status of CONFIG ERROR plus an Error
message may appear on the Caching panel. Click Details for a full description.
11. Save the cache settings. After cache settings are saved, each cached resource
appears with a lightning-bolt icon in the Studio resource tree to show that the
resource is cached.
12. Optionally, if you would like to have actions happen before or after the cache
is refreshed, see Defining Pre- and Post-Actions for a Full Refresh Mode
Cache, page 530.
13. If you want to define pull-based incremental caching for your file-based data
cache, see Setting Up Pull-Based Incremental Cache, page 531.
14. If you have data type incompatibilities between your view and your data
storage type, see “Cache Data Type Mapping” in the TDV Reference Guide.
Field  Description
Enable FastLoad/FastExport for large tables  True means to use Teradata’s FastLoad or FastExport utility to speed up query times. Query cardinality is used to decide whether to use Fastpath or JDBC default loading.
FastExport Session Count  The number of FastExport sessions to use for Teradata.
FastLoad Session Count  The number of FastLoad sessions to use for Teradata.
Field  Description
Create Tables Automatically  Provides selections for indicating a table prefix, number of buckets, and index creation for your required cache objects.
Choose Tables for Caching  Provides a mechanism for you to select predefined cache tables. Your predefined cache tables must adhere to the metadata requirements of the cache tables. For data integrity, each resource that is cached should have its own cache data tables.
12. If you selected Create Tables Automatically, you must make choices in the
fields listed in the table.
Fields  Description
Table Catalog (Optional)  Depending on how your data source was defined and introspected, you might have one or two levels of folders under the data source in the Studio resource tree. The table catalog refers to the first level of folder under the data source. If your data source contains subfolders, type the name of the first level here. The name is case-sensitive.
Table Schema (Optional)  The table schema refers to the second level of folder under the data source. If your data source contains subfolders, type the name of the secondary level here. The name is case-sensitive.
Table Prefix  Prefix to add to the cache tables and any indexes that you want to create. A unique prefix is required.
Number of Caching Tables  Use to indicate the number of snapshots of the cache to keep. Start with a value of 3 and then determine if that number needs to be increased to accommodate your specific caching needs.
The number of cache tables needed depends on how often data is cached, how much data is cached, how often refresh occurs, and the longest running transaction that accesses this cache. For example, if the longest running transaction is 4 hours and refreshes happen every hour, you should define at least 4 cache tables. If a cache is to refresh every 2 hours but some transactions run for 6 hours, configure at least 4 cache tables.
Each row in the cache status table has the name of the cache table containing the snapshot.
A cache table is not eligible for garbage collection as long as there is an active transaction that uses it.
TDV attempts to detect cache table sharing, but it is your responsibility to ensure that each cache table is used by only one resource.
Create Cache Tables  Runs the DDL to create the tables needed for caching.
Drop indexes before load and create indexes after load  When selected, all indexes listed in the Indexes tab of the view are part of the generated DDL and are created. Enabling this check box makes TDV drop indexes before loading data and recreate them afterward, to improve load time.
13. If you want to have the multi-table caching option to drop and create indexes
after the cache loads, select the Indexes tab and review or define an index for
the view or table. For information on how to create the index using the tab, see
Defining Primary Key for a View or Table in the Indexes Panel, page 238.
14. If you selected Choose Tables For Caching, you must make choices in the
following fields:
Fields Description
Add Table Use to add additional tables for cache storage.
Table Name Browse to select or create the table for a multi-table cache. You can also type the
resource tree path.
This table must not be used to store data for other caches.
15. Save the cache settings. After cache settings are saved, each cached resource
appears with a lightning-bolt icon in the Studio resource tree to show that the
resource is cached.
16. Optionally, if you would like to have actions happen before or after the cache
is refreshed, see Defining Pre- and Post-Actions for a Full Refresh Mode
Cache, page 530.
17. Optionally, if you want to define pull-based incremental caching for your
file-based data cache, see Setting Up Pull-Based Incremental Cache, page 531.
18. If you have data type incompatibilities between your view and your data
storage type, see “Cache Data Type Mapping” in the TDV Reference Guide.
Requirements
• The cache data source and the data source that holds the cache status and
tracking tables must be owned by the same user.
• The cache policy requires the use of the same cached data source for status
and tracking tables.
• System data sources (such as the file cache or default cache) are not eligible to hold the cache status and tracking tables.
• The data source being used as the cache data source is not eligible to hold the
status and tracking tables when attempting to enable this solution.
15. For the Data Source field, Browse to the data source which you want to use to
hold the cache data. For example, ds_orders.
16. Enable the cache.
17. Save the data source configuration.
18. Refresh the data source that you want to use to hold the cache data. For example, ds_orders.
19. Enable the cache.
20. Save the data source configuration.
21. To test your configuration, select Refresh Now to populate the cache tables.
Open the cache data, cache status and cache tracking tables, validate that each
of them contains data and have timestamps that are consistent with the time
that you ran the latest cache refresh.
If both the native database loading and the TDV cache loading fail, the refresh
status is set to failed.
Configuring TDV Native Loading Option for Microsoft SQL Server, page 510: Uses the bcp utility (bcp.exe) for bulk import and export.
Platform  DB2 Command-Line Utility Name  Execution Details
Requirements
• The data being loaded must be local to the server.
• Requires advanced DB2 database level configuration
Limitations
This feature is not valid for:
• Binary and BLOB data types
• Kerberos implementations
To configure the DB2 LOAD utility to work with TDV for caching
1. Consult the IBM documentation on configuring and using the DB2 LOAD
utility.
2. Install and configure all relevant parts of the DB2 LOAD utility according to
IBM’s instructions for your platform.
The client drivers might need to be installed on all the machines that are part
of your TDV environment.
3. Verify the full path to the DB2 command-line utility that you want to use to
run the DB2 LOAD utility. For example, locate the path to your DB2 SQL
command-line client.
4. Open Studio.
5. Select Administration > Configuration.
6. Locate the Enable Bulk Data Loading configuration parameter.
7. Set the value to True.
8. Click Apply.
9. Click OK.
10. Locate the DB2 Command-Line Utility Path configuration parameter.
11. For Value, type the full directory path to the DB2 command-line program. If
there are spaces in the pathname, be sure to enclose the path in double quotes.
For example:
“C:\Program Files\IBM\DB2\Tools\Bin\DB2CMD.exe”
When the cache source and target exist on the same SQL Server data source,
native loading of cache data using a direct SELECT and INSERT can provide
significant performance gains. The feature requires that the value of the Enable
Bulk Data Loading Studio configuration parameter be set to True.
When configuring data ship or a cache for Microsoft SQL Server, you can
configure the bcp utility and then set up access permissions. Both of these
processes are described below.
The configuration steps are similar for caching and data ship, for more
information see Configuring Data Ship for Microsoft SQL Server, page 611.
Note: In the bcp utility, NULL values are interpreted as EMPTY and EMPTY
values are interpreted as NULL. TDV passes the ‘\0’ value (which represents
EMPTY) through the bcp utility to insert an EMPTY value into the cache or data
ship target. Similarly, TDV passes an EMPTY string (“”) to the bcp utility to insert
a NULL value into the cache or data ship target.
To configure the Microsoft bcp utility to work with TDV for caching
1. Verify that the bcp.exe utility has been installed and is located in a directory
that you can access. Note the full path to the bcp.exe file.
2. Open Studio.
3. Select Administration > Configuration > Data Sources > MS SQLServer
Sources.
4. Select the Microsoft BCP utility parameter.
5. For Value, type the full directory path to the bcp.exe. For example:
C:\Program Files\Microsoft SQL Server\100\Tools\Binn\bcp.exe
3. You can now complete the setup in Caching to a File Target, page 496.
To tune the TDV caching behavior when Microsoft SQL Server is the cache
target
1. Open Studio.
2. Select Administration > Configuration from the Studio menu bar.
3. Expand Data Sources > Common to Multiple Source Types > Data Sources Data Transfer.
4. Determine the best value for the Column Delimiter parameter.
The default is |.
5. Click Apply.
6. Click OK.
7. Select Administration > Configuration.
8. Navigate to TDV Server > Configuration > Debugging > Enable Bulk Data
Loading.
Or, search for ‘bulk’ using the Configuration Find field.
9. Set the value to True.
10. Click Apply.
11. Click OK.
12. You can now complete the setup of your Netezza cache data target.
Platform Instructions
Unix 1. Create an odbc.ini file. We recommend that you create it under
<TDV_install_dir>.
2. Create a Sybase IQ data source name (DSN) in the odbc.ini file. For work with
TDV, the Driver variable is the most important setting, because it points to the
SQL Anywhere client that you have installed. For example, data sources named test1 and test2 would look similar to the following:
#####################################
SybDB@machine64:cat odbc.ini
[test1]
Driver=/opt/<TDV_install_dir>/sqlanywhere<ver>/lib<ver>/libdbodbc<v
er>_r.so
host=10.5.3.73
port=2638
uid=dba
PWD=password
DatabaseName=asiqdemo
PreventNotCapable=YES
[test2]
Driver=/opt/<TDV_install_dir>/sqlanywhere<ver>/lib<ver>/libdbodbc<v
er>_r.so
host=10.5.3.74
port=2638
uid=dba
PWD=password
DatabaseName=asiqdemo
PreventNotCapable=YES
######################################
Platform Instructions
Windows 1. Start the ODBC Administrator, select Sybase > Data Access > ODBC Data
Source Administrator.
2. Click Add on the User DSN tab.
3. Select the Sybase IQ driver and click Finish.
4. The Configuration dialog box appears.
5. Type the Data Source Name in the appropriate text box. Type a Description of
the data source in the Description text box if necessary. Do not click OK yet.
6. Click the Login tab. Type the username and password. If the data source is on a
remote machine, type a server name and database filename (with the .DB
suffix).
7. If the data source is on your local machine, type a start line and database name
(without the DB suffix).
8. If the data source is on a remote system, click the Network tab. Click the check
box for a protocol and type the options in the text box.
9. Click OK when you have finished defining your data source.
10. On Unix, use the following commands to verify that the DSN is set up
successfully:
cd sqlanywhere<ver>/bin<ver>
./dbping -m -c "DSN=test1"
A “Ping server successful” message means that the DSN is working and any
application can use the DSN to contact the Sybase IQ through the ODBC
Driver.
11. On Unix, set the global ODBCINI environment variable to point at the SQL
Anywhere odbc.ini file. For example, to temporarily set the variable, type:
export ODBCINI=/opt/<TDV_install_dir>/odbc.ini
Sybase iAnywhere JDBC Driver  Select this check box to enable better performance. This option enables the Sybase IQ specific ODBC LOAD TABLE SQL tool to import data into TDV.
Requirements
This performance option is available as listed for the Native Cache Target Support
column of the table in Supported Cache Target Storage Types, page 484.
It requires some setup beyond TDV configuration parameters. The parallel cache
loading option requires that the view that you are caching has:
• One or more user-defined or auto-inferred, simple or composite primary keys of numeric or character type.
• For numeric key columns, BOUNDARY cardinality statistics or DETAILED
cardinality statistics are needed.
• For character key columns, DETAILED cardinality statistics are needed.
Because partitioning is being determined on the key column, the cardinality
statistics help determine how many unique partitions can be loaded in parallel.
The number of unique partitions that can be found by the statistics collection
determines how many threads can be used to load the data into the cache.
To enable JDBC-only cache loading for Oracle and Netezza data sources
1. Open Studio.
2. Select Administration > Configuration.
3. Search for the Enable Native Loading parameter and set its Value to False.
If this parameter is disabled, TDV uses JDBC to perform the inserts into the
cache table.
By default this parameter is enabled. For cache targets saved to Oracle,
database links are used. For cache targets saved to Netezza, bulk loading
options are used to move data.
4. Select the Enable Parallel Loading parameter and set Value to False.
5. Click Apply.
6. Click OK.
Cache refresh behavior can be defined to suit your needs. The following topics
cover most of the TDV cache refresh behaviors that you can customize:
• Refresh Owner, page 522
• Controlling Cache Refresh Behavior for Individual Resources, page 522
• Controlling Cache Refresh Behavior for Multiple Resources, page 525
• Defining Pre- and Post-Actions for a Full Refresh Mode Cache, page 530
• Setting Up Pull-Based Incremental Cache, page 531
Refresh Owner
For Cache Refreshes Started From Studio—TDV always uses the user credentials
saved against the data source to perform all runtime cache refresh operations
including refresh of the cache data and update of user status in the status table
and the internal SYS_CACHES table. This is true even if you have pass-through logins enabled.
For Cache Refreshes Started From Triggers—If you have triggers defined that you
use to start your cache refreshes, the user that initiates the table refreshes is the
user that owns the refresh.
The user that is noted for specific actions within the refresh varies depending on
your full TDV environment setup, on how many times you have run the refresh,
and on whether you look in the Studio Manager, TDV log files, or your database
log files. Implementation of TDV is highly customizable. If you are interested in
determining which users are involved in the refresh actions at your site, perform
testing and track the results.
Refreshing and Clearing the Cache for One Resource Using Studio
You can use Studio to refresh your cache data immediately, or you can use Studio
to configure regularly scheduled refreshes of your cache data.
For views and tables, the expiration period applies to the whole result set. For
procedures, each input variants data has its expiration tracked separately.
Note: If you have multi-table caching defined, TDV first attempts to clear the
cache using a TRUNCATE command. If the TRUNCATE command is not
supported, a DELETE command is attempted.
Option Description
Manual The resource owner or an administrator must manually refresh the cache using
Refresh Now or programmatically by invoking the RefreshResourceCache
procedure. See the TDV Application Programming Interfaces Guide.
Any user with the ACCESS TOOLS right who is given appropriate privileges can
also refresh the cache manually.
If your cache is controlled by a cache policy, using a periodic or programmatic
refresh is suggested.
Exactly Once  To refresh the cache just once at a specific time, select Exactly Once and specify the time to start caching in the set of drop-down boxes in the section labeled Start on. The Hour field accepts a typed entry of 0; when you save the resource, Studio automatically converts the hour to 12 and AM/PM to AM.
Periodic To refresh the cache periodically, select Periodic and specify in the Refresh every
section how often to execute the resource for caching: every x number of
seconds, minutes, hours, days, weeks, months, or years. In the Start on fields,
specify the time and date to begin periodic refresh mode.
4. Under Expiration Schedule, select Never expire, or select Expire after and
specify the time period. The expiration period applies from the end of a
refresh. The Expire after option is disabled if you are using incremental
caching.
when refresh fails (For full refresh caching only) Selecting this option would clear the cache
if a refresh fails. This option allows access to previously cached data
during a refresh.
when refresh begins (For full refresh caching only) Selecting this option would automatically
clear the cache before starting a refresh. Any clients attempting to read
from the cached data must wait for the new data.
When load fails (For push-based incremental caching only) Selecting this option would
automatically clear the cache if the loading of data did not complete.
When initial load (For push-based incremental caching only) Selecting this option would
begins automatically clear the cache before starting a refresh. Any clients
attempting to read from the cached data must wait for the new data.
7. (For procedures only) Define the maximum number of variants you want
stored in the cache from the results of the procedure you are caching.
Maximum number of unique sets of input parameter values. (Default is 32.)
8. Save the cache settings.
7. Select the Enable check box to enable this cache policy. If you decide to leave it
cleared, you can continue to define the policy, but it will not be active until
you select the Enable check box.
8. Accept the default or specify the data source where you want data about the
cache policy to be stored. You can use the Browse button to browse through
the Studio resource tree of available data sources.
9. Under Refresh Mode, specify the cache refresh mode. Refreshing a cached
resource retrieves data from the sources and clears stale data as specified.
You can refresh the cache immediately by clicking Refresh Now.
Option Description
Manual The resource owner or an administrator must manually refresh the cache using
Refresh Now or programmatically by invoking the RefreshResourceCache
procedure.
Any user with the ACCESS TOOLS right who is given appropriate privileges can
also refresh the cache manually.
If your cache is controlled by a cache policy, using a periodic or programmatic
refresh is suggested.
Exactly Once To refresh the cache just once at a specific time, select Exactly Once, and
specify the time to start caching in the set of drop-down boxes in the section labeled
Start on. The Hour field accepts a typed entry of 0; when you save the resource, Studio
automatically converts the hour to 12 and sets AM/PM to AM.
Periodic To refresh the cache periodically, select Periodic and specify in the Refresh every
section how often to execute the resource for caching: every x number of seconds,
minutes, hours, days, weeks, months, or years. In the Start on fields, specify the
time and date to begin periodic refresh mode.
10. Under Expiration Schedule, select Never expire, or select Expire after and
specify the time period. The expiration period applies from the end of a
refresh. The Expire after option is disabled if you are using incremental
caching.
11. Optionally, you can define pre- and post-refresh actions, as described in
Defining Pre- and Post-Actions for a Full Refresh Mode Cache, page 530.
When refresh fails: (For full refresh caching only) Selecting this option clears the cache
if a refresh fails. This option allows access to previously cached data during a refresh.
When refresh begins: (For full refresh caching only) Selecting this option automatically
clears the cache before starting a refresh. Any clients attempting to read from the
cached data must wait for the new data.
Studio validates the policy definitions for the resources in the list and issues
messages if anything needs to be fixed.
6. Fix resource definitions if prompted by any Studio messages.
7. Save your changes.
Note: The Refresh Now button appears enabled for each resource that is not
associated with a cache policy. If the resource is associated with a cache policy,
however, the button will appear disabled.
4. Review other functions under the lib directory to determine if they can help
you achieve the action that you want for your pre- or post-action. Other
functions of possible interest might include:
— CreateResourceCacheKey is used to create a cache key for a given resource.
— TestDataSourceConnection is used to test to see if a data source's
connection is operational.
— ClearResourceCache(path, type) is used to clear the cache on a resource.
— UpdateResourceCacheKeyStatus(path, type, cacheKey, status, startTime,
message) is used to update the cache key for the specified resource.
— GetResourceCacheStatusProcedure is used with CreateResourceCacheKey
and UpdateResourceCacheKeyStatus to support external cache loading.
Returns Cache status information for a given cache key.
5. Save the script or procedure.
6. In Studio, open a view or procedure that has had caching enabled for which
you want to assign pre- and or post-actions.
7. Select the Caching tab.
8. Under Advanced, choose Full Refresh Mode.
9. For the Pre-Refresh or Post-Refresh Action fields, or both, specify a procedure
that exists in the Studio resource tree and has no parameters. (A sketch of such a
procedure follows these steps.)
10. Save the cache settings. After cache settings are saved, each cached resource
appears with a lightning-bolt icon in the Studio resource tree to show that the
resource is cached.
11. If you have data type incompatibilities between your view and your data
storage type, see “Cache Data Type Mapping” in the TDV Reference Guide.
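The following is a minimal sketch of a post-refresh action: a SQL Script procedure with
no parameters that writes a message to the TDV log. The procedure name and the view
path are hypothetical; /lib/debug/log is the built-in logging procedure shown in the
exception-handling example later in this chapter.
PROCEDURE postRefreshNotify()
BEGIN
    -- Hypothetical post-refresh action: record that the cache refresh finished.
    -- Replace the view path with the resource you are caching.
    CALL /lib/debug/log('Post-refresh action ran for /shared/views/MyCachedView');
END
A pre-refresh action has the same shape; only the work done inside the BEGIN/END
block differs.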
5. Review other functions under the lib directory to determine if they can help
you achieve the action that you want for your pull-based incremental caching.
Other functions of possible interest might include:
— CreateResourceCacheKey() is called to generate a new cachekey value.
— LoadResourceCacheStatus() is called to inform the server of the in-progress
refresh.
— LoadResourceCacheStatus() is called to inform the server of the newly
active data.
— TestDataSourceConnection is used to test to see if a data source's
connection is operational.
— ClearResourceCache(path, type) is used to clear the cache on a resource.
— UpdateResourceCacheKeyStatus(path, type, cacheKey, status, startTime,
message) is used to update the cache key for the specified resource.
— GetResourceCacheStatusProcedure is used with CreateResourceCacheKey
and UpdateResourceCacheKeyStatus to support external cache loading.
Returns Cache status information for a given cache key.
6. Consider using logic like the following to control cache loading (a SQL sketch follows this list):
— The status table is queried to find the currently active cachekey. The
cachekey is in the row with the server ID, resource path, and status column
with value 'A'. The INSERT, UPDATE, and DELETE operations can be
performed on the data table using the cachekey as a filter to avoid updating
any other keys.
— Perform INSERTs and other operations to create new rows in the storage
table as appropriate, making sure all such rows have the new cachekey
value. This should be performed on an independent transaction so it can be
committed when done.
— Perform INSERTs to the status table with the server ID, resource path,
cachekey, the status set to 'I', and the starttime set to the current time to
indicate an in-progress refresh. This should be performed on an
independent transaction so it can be committed immediately.
— UPDATE all rows in the status table with the server ID and resource path
that have status 'A' to have status 'C'. This marks the previous active cache
data for clearing. Then UPDATE the row into the status table with the
serverID, resource path, and cachekey to set the status set to 'A' and
finishtime to the current time. This will indicate that this cache key is the
new active one. This should be performed on an independent transaction
so it can be committed immediately.
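The following sketch puts the steps above together. The table paths
(/shared/cache/cache_status and /shared/cache/cache_data), the source view, and the
variables v_serverid, v_path, and v_newkey are hypothetical placeholders for values
obtained earlier in your script (for example, the cache key returned by
CreateResourceCacheKey).
-- Record the new cache key as in progress ('I'); commit this on its own transaction.
INSERT INTO /shared/cache/cache_status
    (serverid, resourcepath, cachekey, status, starttime)
VALUES (v_serverid, v_path, v_newkey, 'I', CURRENT_TIMESTAMP);

-- Load the new rows into the data table, tagging each row with the new cache key.
INSERT INTO /shared/cache/cache_data
SELECT v_newkey, S.* FROM /shared/views/SourceView S;

-- Retire the previously active key ('A' becomes 'C'), then activate the new key.
UPDATE /shared/cache/cache_status
SET status = 'C'
WHERE serverid = v_serverid AND resourcepath = v_path AND status = 'A';

UPDATE /shared/cache/cache_status
SET status = 'A', finishtime = CURRENT_TIMESTAMP
WHERE serverid = v_serverid AND resourcepath = v_path AND cachekey = v_newkey;
As the steps above note, the status-table statements should be committed on
independent transactions so that other sessions see the in-progress and active markers
immediately.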
7. Save the script or procedure.
8. In Studio, open a view or procedure that has had caching enabled for which
you want to define Pull-Based incremental caching.
Note: If your cache objects were created prior to the current version of TDV,
you might need to recreate them.
9. Select the Caching tab.
10. Under Advanced, choose Incremental Refresh Mode to enable incremental
caching that can be refreshed on demand.
Incremental refresh caching only updates the client version of the caches
when the client requests the update.
11. Specify values for the following fields:
— Initialize the cache using—Specify a procedure or script that exists in the
Studio resource tree and has one output parameter. This script will be used
to create the initial cache.
— Refresh the cache using—Specify a procedure or script that exists in the
Studio resource tree and has one output parameter. The output parameter
should be of the VARCHAR data type.
12. Save the cache settings. After cache settings are saved, each cached resource
appears with a lightning-bolt icon in the Studio resource tree to show that the
resource is cached.
13. If you have data type incompatibilities between your view and your data
storage type, see “Cache Data Type Mapping” in the TDV Reference Guide.
/shared/INCREMENTAL_CACHING/QA/"db-lab-9"/QAN/incr_cache_test_target
SELECT {option disable_data_cache}
    cacheKey, S.*
FROM
    /shared/INCREMENTAL_CACHING/QA/"db-lab-9"/QAN/INCR_CACHE_TEST S
WHERE
    i > CAST(IncrementalMaintenanceLevel AS BIGINT) AND i <= maxI;
EXCEPTION
    ELSE
        -- Log the exception
        CALL /lib/debug/log('Exception raised in the delta loader script');
        CALL /lib/debug/log('Exception is : ' || CURRENT_EXCEPTION.NAME || ': ' || CURRENT_EXCEPTION.MESSAGE);
END
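For reference, the refresh script named in the Refresh the cache using field is a SQL
Script procedure with a single VARCHAR output parameter; delta-loading logic such as
the fragment above goes in its body. A minimal skeleton follows (the procedure name
and the output value are hypothetical; consult the incremental caching documentation
for what your configuration expects the output parameter to contain):
PROCEDURE deltaLoader(OUT result VARCHAR)
BEGIN
    -- Delta-loading logic (for example, the SELECT shown above) goes here.
    SET result = 'refresh complete';
END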
Cache Maintenance
Over time your caching needs might change. This section details some of the
common cache maintenance topics:
• Indexing Your Cache, page 537
• Managing Configuration Changes and Cache Behavior, page 538
• Displaying the Dependencies of a Cached View, page 539
• Displaying the Dependencies of a Cache Policy, page 540
• If you are using the automatic caching option, Studio drops the old files and
creates any necessary new files automatically.
If you change any of the following, be sure to fully reevaluate your caching
configuration:
• The storage data source type, physical location, name, number of columns, or
the column data types.
• The cache status table physical location, name, or other metadata.
• The number of columns or a column data type definition for the view you are
caching.
• The name or number of parameters for a procedure you are caching.
Method Description
Right-click a view that has caching enabled in the resource tree and select Open
Lineage: The Lineage panel opens in the workspace to the right. It displays all the
resources related to the view in a graphical format. Use this method if the view has lots
of objects because the entire workspace is used to display the dependencies and
references. This lineage display also allows you to easily navigate the related resources
and saves the history of the navigation.
Open a view that has caching enabled and click the Open Lineage Panel toolbar button:
The Lineage panel opens in the lower pane of the view’s editor and displays all the
resources involved with this view in a graphical format.
2. For more information on how to interact with the lineage graph editor, see
Working with the Data Lineage Graph, page 450.
5. For more information on how to interact with the lineage graph editor, see
Working with the Data Lineage Graph, page 450.
To destroy a cache
1. Locate and open the resource for which you want to destroy the associated
cache.
2. Select the Caching tab.
3. Clear the Enable check box and Save to disable the cache.
4. On the upper right corner of Caching tab, locate the Destroy Cache icon.
5. Click it and click Yes.
The file or tables specific to the resource that you selected are destroyed.
However, the metadata associated with the files might continue to display in
Studio.
Set to a value high enough to work well for your environment and network
characteristics. Setting it too low renders caches unusable by TDV.
3. Save your changes.
The setting will take effect on the next server restart.
Managing Buckets
Occasionally when performing a cache refresh, you might encounter a situation
where you receive a message stating:
All buckets for the cached resource are in use
Typically this message is displayed because there are several open client
connections that are accessing the data in the cache buckets.
The Configuration window lets you view all of the configuration parameters for
TDV.
Special characteristics of parameters are provided in their names or descriptions:
• Configuration parameters with (Current) in their names are read-only and
exist only to display current settings.
• Configuration parameters with (On Server Restart) in their names change to
your new values only when TDV is restarted.
• Some configuration parameters have descriptions that contain “This value is
locally defined. It will not be altered when restoring a backup and will not be
replicated in a cluster.” See the Active Cluster Guide for more information
about replication in a cluster.
The left panel shows a list of folders indicated by a folder icon. You can
expand the folders to view the resource tree of configuration parameters. The
right panel displays properties for the currently selected parameter.
3. Click the Expand All button to display the full hierarchy of folders and
parameters, or click the triangle to the left of a folder name to expand only
that folder.
You can click the Collapse All button to return the display to the five top-level
folders.
4. Select a parameter in the left pane to view its properties in the right pane.
If the parameter value cannot be changed, its icon is dimmed.
List: Click the add button in the Value area to add a value to the list. Click the delete
button adjacent to a value to remove it.
Map: Click the plus button in the Value area to add a key-value pair: one field for a Key,
the other field for a Key Value. Click the delete button adjacent to a key-value pair to
remove it.
Optional password: Type a password in the Password field and the identical password
in the Confirm field. Click Apply or OK to make the password required. The Confirm
field returns to blank for the user to type a matching password.
Required password: Type the required password in the Confirm field. The Password
Value is a string of asterisks (*) representing the password that has been set.
5. If you want to show only the parameters that have pending changes, check
the second check box from the right under the Search button.
The check box tooltip says Show only modified files.
An asterisk (*) is displayed between a parameter icon and its name if a new
value is pending.
6. To apply none, some, or all of the pending changes, do one of the following.
To... Do This...
Clear the pending change to the highlighted parameter: Click Reset.
The Configuration window contains a Find field and Search button to help you
find parameters.
To... Do This...
Display only parameters of a specific type: Select that type from the Show Type
drop-down list. The Show Type drop-down list is explained earlier in this topic.
Return to displaying all parameters: Select All from the Show Type drop-down list.
Search by parameter name: Type part or all of the name in the Find field and click Search.
Search by parameter type: Select the parameter type from the Show Type drop-down list,
or select All to show all types.
Search by parameter value or description: Check the Search through descriptions instead
of names check box.
Display only parameters that have pending changes: Check the Search Only Modified
Parameters check box.
Display the folders and parameters immediately below a folder: Click the triangle to the
left of the folder, or double-click the folder name.
Performance Tuning
You can tune TDV performance in several ways. Some performance tuning topics
are covered in this topic, some elsewhere. The following table summarizes and
links to the sections that describe performance tuning.
Defining indexes, and primary and foreign keys: An index, or a primary or foreign key,
can make data easier to find. See Use Indexes, and Primary and Foreign Keys, to Improve
Query Performance, page 573.
Using SQL hints: You can use these when the default behavior of the query optimizer
does not result in the best query performance. See Using Oracle SQL Optimizer Hints,
page 588.
TDV provides ways for you to generate and evaluate the execution plans for SQL
queries. The execution plan displays characteristics of the overall request, and of
each request node, depending on which you highlight. The request hierarchy is
displayed on the left (the Tree Panel), and the plan information is displayed on the
right (the Details Panel).
When you execute a particular SQL query for the first time, the TDV Server
caches the query execution plan so it can reuse it if you resubmit the query later.
Working with the SQL execution plan is described in these sections:
• Generating and Displaying a SQL Execution Plan, page 554
• Execution Plan Nodes in the Tree Panel, page 555
• Execution Plan Query Attributes in the Details Panel, page 558
• Execution Plan Node Attributes in the Details Panel, page 560
• Updating the Query Execution Plan, page 562
• Viewing How Much Data was Processed by a Query, page 563
• Refreshing All Execution Plan Caches, page 565
You can also explicitly gather and cache the statistics for a query, such as number
of rows in each table. The execution plan helps the TDV Server make better
decisions about how to process the query. However, if the statistics change, you
should regenerate the query plan so TDV uses up-to-date information. See
Creating Cardinality Statistics for Cost-Based Optimization, page 565.
3. Highlight the view name in the Tree Panel to see details for the overall query
in the Details Panel.
4. Highlight a node (SELECT, one of the JOINs, and so on in this example) in the
Tree Panel to see details for that node in the Details Panel.
Node Functionality
AGGREGATION Shows plans for aggregate functions that are not part of a GROUP BY clause.
CROSS JOIN Merges two streams of incoming rows and produces one stream of rows that
is the Cartesian product of the two streams.
Pre Data Ship Plan Represents the query execution analysis prior to the shipment of one or
multiple nodes to the data ship target. After a data ship is executed, each of
the Pre Data Ship Plan nodes states:
Execution Status:
NOT EXECUTED. This operation was determined unnecessary or was
cancelled before any rows were retrieved.
The Pre Data Ship Plan nodes are not executed because the query is reworked
to execute the operation on the data ship target.
FETCH Produces the rows resulting from execution of a query on a data source. The
information that can be displayed includes:
• Estimated rows returned (or “Unknown”)
• Data source path, type, and driver name
• Data ship target, if data ship is being used
• Data ship notes, if any
• SQL of the fetch
• SAP BW OLAP View runtime notes with estimated rows returned
The FETCH node and the SQL submitted to the data ship target reveals how a
query that uses data shipment is rewritten to select data that was shipped into
a local temp table to make it available for a collocated operation compared to
the operation that was evaluated in the Pre Data Ship Plan.
FILTER Passes through only the incoming rows that satisfy the filter criterion.
FULL OUTER JOIN Merges two streams of incoming rows and produces one stream
containing the SQL FULL OUTER JOIN of the two streams.
Refer to SQL reference materials for a description of FULL OUTER JOIN.
GROUP BY Reorders the incoming rows so that they are grouped by some criterion. For
example, if the rows are grouped by a name, all rows with the same name are
combined into a single row.
JOIN Merges two streams of incoming rows and produces one stream containing
rows that satisfy a criterion that applies to both streams. The information
displayed includes:
• Estimated rows returned (or “Unknown”)
• Criterion applied
• Algorithm used
• Algorithm notes
• Estimated left and right cardinality
PROCEDURE Produces rows resulting from the execution of a query or stored procedure call
on a data source. The information displayed includes:
• Estimated rows returned (or “Unknown”)
• Location of the SQL statement
• Reason for not pushing if that is the case
SELECT Applies functions on the column values of the rows. This node produces the
same number of rows that it reads. The information displayed includes:
• Estimated rows returned (or “Unknown”)
• Projection of the SELECT statement
• Data ship notes, if any
UNION Combines two streams of incoming rows and produces a single stream. The
cardinality of produced rows equals the sum of the cardinality of the
incoming streams. The order in which the node produces rows is undefined.
FULL OUTER PROCEDURE JOIN A full outer join performed as part of a stored procedure.
MERGE Merges data into a data source, typically a table. Merge inserts a row if it is
not present, and updates a row if it already exists.
INTERSECT
EXCEPT
Field Description
Background data source read time: Time spent by background threads in all FETCH,
page 556 and PROCEDURE, page 557 nodes in the execution plan.
Background server processing time: Time spent by background threads in all the nodes
(except for FETCH, page 556 and PROCEDURE, page 557) in the execution plan.
Data ship data transfer time: Time required to transfer data from one data source to another.
Elapsed execution time: Amount of wall-clock time that the server used to execute the
query. This time is the total of Query initialization time, page 559, Foreground server
processing time, page 559, and Foreground data source read time, page 561.
Foreground server processing time: Fraction of the Elapsed execution time, page 559 that
the server used in the actual execution of the query; that is, the processing time of the
nodes in the execution plan. This time does not include the time used to read rows from
the data sources. By comparing this time with Foreground data source read time, page 561,
you can determine how much time was spent by the server versus the time spent in the
data sources.
Query initialization time: Time the server used to analyze the query, create and optimize
an execution plan, rewrite the query, and establish connections (from the pool if pooling
is configured) to the data sources. If data ship is involved, time is spent creating a data
ship temp table at the data ship target, and shipping the data. With multiple data
shipments to multiple temp tables, these tasks are done in parallel by separate threads,
and the shipped segment that takes the longest keeps the query initialization time clock
running.
Reset count
Rows returned: Number of rows produced by an execution node. If you want to know
how many rows were read by the node, look at the returned row counts of the node’s
children.
Speed up due to concurrency: Estimate of how much faster the query ran because of
threading. For example, 100% means the query ran twice as fast with threading.
Field Description
Algorithm: Name of the algorithm used by the node; for example HASH JOIN or
SORT/MERGE JOIN.
Background data source read time: For SCAN or PROCEDURE: how much time the
child thread spent on reading from the data source.
Background node processing time: For nodes other than SCAN, PROCEDURE, INSERT,
UPDATE, DELETE, or MERGE: time spent processing this node by a background thread.
This time is zero if the node was processed by a foreground thread.
Criteria: Shows the criteria used by a JOIN, FILTER, GROUP BY, or ORDER BY node.
Condition or predicate applied during an operation.
Data ship notes: For SCAN or SELECT. A report of the estimated number of rows to be
shipped, and the amount of time consumed to get the cost estimate. When a condition
blocks the use of data ship, this field usually describes the condition, case, or option that
blocked optimization.
Data ship target: For SCAN. Shows where the results of SQL execution will be sent. The
presence of Data ship target indicates that data ship will be performed.
Estimated left cardinality: Estimated cardinality for the left input in a binary join.
Pre-execution (Show Execution Plan button) Details Panel only.
Estimated right cardinality: Estimated cardinality for the right input in a binary join.
Pre-execution (Show Execution Plan button) Details Panel only.
Estimated rows returned: Pre-execution (Show Execution Plan button) Details Panel only.
Logged execution plan records actual Rows Returned.
Execution status: If the operation for the node is still running, status can be PARTIAL
RESULTS or NOT EXECUTED.
Foreground data source read time: For SCAN or PROCEDURE. Fraction of the Elapsed
execution time, page 559 that the main thread used to read data from the data sources. By
comparing this time with Foreground server processing time, page 559, you can
determine how much time was spent by the server vs. the time spent in the data sources.
Foreground node processing time: For nodes other than SCAN, PROCEDURE, INSERT,
UPDATE, DELETE, or MERGE: fraction of the elapsed time used by the node. This time is
zero if the node was processed by a background thread.
Name
No push reason: Why the node was not pushed to the data source.
Peak memory reserved: Peak memory reserved for current node.
Rows deleted
Rows inserted
Rows merged
Rows returned
Rows updated
Runtime notes
SQL: SQL text executed by the node. In a FETCH, page 556 or PROCEDURE, page 557
node, this field contains the actual data-source-specific query that was pushed.
A simple way to force a refresh of an execution plan is to add a space to the query.
You can also flush the query plan cache for all views by temporarily toggling the
Query Plan Cache Enabled key in the TDV Configuration window as described in
Refreshing All Execution Plan Caches, page 565.
Each node name is followed by values in parentheses like (n) or (n, m%),
where n is the number of rows produced by that node, and m is the percentage
of elapsed time that the node used to process the data. If m is not shown, it is
0.
For example, if the elapsed time was 60 seconds and m is 20, the node
accounted for 12 seconds of the elapsed time (20% of 60 seconds). If m is 0, processing at
the node did not contribute to elapsed time—for example, if the node was
processed by a background thread and its processing was completed before
the rows were needed by the parent node. The m percentages help determine
which nodes to focus on to improve performance.
5. Optionally, set a refresh rate for the execution plan.
6. Optionally, click Refresh Now during the execution of the query.
This updates the statistics being gathered on data sources and tables, which
can take a long time for long-running queries.
For Example
Name:
SELECT
Rows Returned:
220
Estimated Rows Returned:
Unknown
Total execute time which include children time and wait time.:
121.4 msecs
Foreground Node Processing Time:
2.5 msecs
Peak memory reserved:
4000000
Projection:
orderdetails0.orderid OrderID, orderdetails0.productid ProductID,
orderdetails0.discount Discount, orders.orderdate OrderDate,
customers.companyname CompanyName, customers.contactfirstname
CustomerContactFirstName, customers.contactlastname
CustomerContactLastName, customers.phonenumber
CustomerContactPhone, productCatalog_xform.ProductName
ProductName, inventorytransactions.transactionid TransactionID,
purchaseorders.daterequired DateRequired,
purchaseorders.datepromised DatePromised, purchaseorders.shipdate
ShipDate, suppliers.supplierid SupplierID, suppliers.suppliername
SupplierName, suppliers.contactname SupplierContactName,
suppliers.phonenumber SupplierPhoneNumber
Data Ship Notes:
Data Ship Query is not possible. No suitable source scan or ship-to
target found.
The TDV SQL Query Engine might use statistics gathered from the data sources,
when they are available, to create an efficient SQL query execution plan for joins
or unions across tables, or to optimize caching. Statistics gathered on tables or
views provide the SQL Query Engine with estimates on the table cardinality (the
number of unique values or rows) so that optimizations can be applied.
By default, no statistics are gathered for any data source, and statistics storage
tables must be created explicitly at the level of the data source to enable selected
table and column scans.
Statistical data associated with a data source can be exported and imported with
the data source, or you can just back up the configurations.
Any user with Access Tools and Modify All Resources rights, and with READ and
WRITE privileges on a data source, can set up the gathering of cardinality
statistics on the data source and one or more of its tables.
This section includes:
b. Use one or more of the following fields to override and avoid processing
of statistics in the database.
This can be a valuable way to avoid gathering column boundary statistics and
have TDV use the values you specify when the queries are run.
When TDV gathers statistics, it retrieves all data from the source and builds a
histogram and string index based on this data. This operation can be
expensive, depending on the number of rows and columns, network
bandwidth, and so on. If you have limited time for statistics gathering,
overriding the values can be useful.
These column values are used to create a uniform distribution at run time for
the evaluation of the selections. If a value is specified for a column, statistics
are not gathered for that column.
— Minimum Value—An option to override the minimum value for the
column.
— Maximum Value—An option to override the maximum value for the
column.
— Distinct Count—An option to override the number of distinct values for the
column.
— Num of Buckets—An option to increase or decrease the granularity of the
statistics.
13. Save your selections.
The following built-in procedures and system resources are available for working with resource statistics (an example query follows the list):
• localhost/lib/resource/RefreshResourceStatistics
• localhost/lib/resource/CancelResourceStatistics
• services/databases/system/SYS_STATISTICS
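For example, you can inspect the gathered statistics by querying the statistics system
table. This is a minimal sketch; it assumes you have the privileges required to read
system tables, and the available columns depend on your TDV release.
SELECT * FROM /services/databases/system/SYS_STATISTICS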
Types of Joins
Inner and outer joins can use an equality condition (equijoins) or not
(non-equi-joins). Most table joins are equi-joins, comparing columns and rows
with like values. The nested loop algorithm is used for non-equi-joins.
The following table lists algorithms used and the types of joins that they can be
used with.
Algorithm: Inner equi-join / Non-equi-join / Left/right outer join / Full outer join / Semijoin optimization
Hash: yes / no / yes / yes / yes
TDV uses several algorithms to execute joins on table columns. By default, the
TDV query engine attempts to find the best algorithm and join optimization to
use for your SQL if none is specified. The Join Properties editor offers the
following join algorithm options:
• Automatic Option, page 574
• Hash Join Option, page 575
• NESTEDLOOP Join Option, page 575
• SORTMERGE Join Option, page 575
• Semijoin Optimization Option, page 579
• Star Schema Semijoin, page 587
Automatic Option
By default, the TDV system automatically optimizes the SQL execution plans for
any query, given estimates of left and right cardinality and other statistical
information. Automatic is not a join algorithm or a join optimization, but it lets
the TDV query engine optimize based on an analysis of the SQL using known
database statistics.
You can specify the SQL execution plan explicitly with one of the other options;
however, specification of a join option only influences what join algorithm or
optimization is first attempted, not necessarily the one ultimately used. If the
option specification is incompatible with the data sources or incorrectly specified,
the execution plan might change to a compatible join that was not specified.
To perform the SORTMERGE join, the left side has to be put into ascending order
by A and B, and the right side has to be put into ascending order by X and Y.
If a side is not ordered or does not have compatible ordering, the query engine
automatically checks whether a compatible ORDER BY can be pushed to the data
source so that the SORTMERGE join can proceed.
SORTMERGE is not compatible with the SQL when ORDER BY cannot be pushed
to the data sources. Ordering must be performed on both sides for a
SORTMERGE join.
To prevent the query engine from choosing SORTMERGE over HASH, specify the
value of the SORTMERGE option as false, as follows:
{option SORTMERGE="false"}
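For example (hypothetical view paths; the join columns A, B, X, and Y are the ones
described above), the option is placed in the join hint immediately before the JOIN
keyword:
SELECT A, B, X, Y
FROM /shared/views/LeftView INNER {OPTION SORTMERGE="false"} JOIN /shared/views/RightView
ON A = X AND B = Y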
TDV automatically assesses SQL Join blocks (inner joins and all types of outer
joins) to determine suitability for rewriting the SQL join order. A Join block is a set
of contiguous joins. The TDV query engine analyzes each join block of the query
independently, and attempts to rewrite the query to maximize the number of joins
pushed down to data sources. While inner joins can be reassociated without
restriction, the TDV query optimizer checks a number of technical rules to
determine if it is safe to reorder a block that contains outer joins.
For queries that meet complex criteria, TDV rewrites the SQL join order if it
expects that to yield a performance gain. Joins are reordered so that tables from
the same data source are performed first, which maximizes push down and
makes the query more efficient.
The following examples show how SQL join ordering works.
SELECT O10BIN_ID, O100BIN_ID, O1KBIN_ID
FROM /users/composite/test/sources/oracleA/QABIN/O10BIN RIGHT
OUTER JOIN
/users/composite/test/sources/oracle/QABIN/O100BIN
ON O10BIN_NUMBER1 = O100BIN_NUMBER1 INNER JOIN
/users/composite/test/sources/oracle/QABIN/O1KBIN
ON O100BIN_FLOAT = O1KBIN_FLOAT
The query as written would perform first the outer join and then the inner join,
because by default joins are left-associative. The plan for this query looks like this:
[1] Request #2511
[2] + SELECT (13)
[3] + JOIN (13)
[4] + RIGHT OUTER JOIN (100)
[5] | + FETCH (10) [Oracle2]
[6] | + FETCH (100) [Oracle]
[7] + FETCH (1000) [Oracle ]
ON O100BIN_FLOAT = O1KBIN_FLOAT)
ON O10BIN_NUMBER1 = O100BIN_NUMBER1
The join between the two tables from the same data source is performed first, and
the optimizer is able to push down this join to the data sources. The
corresponding query plan looks like this:
[1] Request #2533
[2] + SELECT (13)
[3] + RIGHT OUTER JOIN (13, 9%)
[4] + FETCH (10, 36%) [Oracle 2]
[5] + FETCH (13, 59%) [Oracle]
TDV has to perform only one join. Most of the data is filtered out at the data
source, reducing data transfer. TDV automatically rewrites join blocks
that include inner joins and all types of outer joins.
A join block is a set of contiguous joins. The query engine analyzes each join block
of the query independently and attempts to rewrite the query to maximize the
number of joins pushed down to data sources. While inner joins can be
reassociated without restriction, the optimizer checks a number of technical rules
to determine if it is safe to reorder a block that contains outer joins. Cross-joins are
not eligible for reordering. Join reordering is not performed if it introduces new
cross joins. A join node with hints is not considered for reordering.
A precondition of the join ordering algorithm is that all outer joins can be
simplified. Here is an example:
SELECT O10BIN_ID, O100BIN_ID, O1KBIN_ID
FROM /users/composite/test/sources/oracle2/QABIN/O10BIN LEFT OUTER
JOIN
/users/composite/test/sources/oracle/QABIN/O100BIN
ON O10BIN_NUMBER1 = O100BIN_NUMBER1 INNER {OPTION SEMIJOIN}
JOIN
/users/composite/test/sources/oracle/QABIN/O1KBIN
ON O100BIN_FLOAT = O1KBIN_FLOAT
The left outer join can be converted to an inner join because the inner join above it
has a null-rejecting predicate that applies to the null-filled side of the outer join.
The ON clause of the top join rejects any rows produced by the left outer join that
have NULL for the O100BIN_FLOAT column.
The plan for this query looks like this:
[1] Request #3040
[2] + SELECT (2)
[3] + JOIN (2)
[4] + JOIN (1)
[5] | + FETCH (10) [Oracle2]
[6] | + FETCH (89) [Oracle]
[7] + FETCH (2) [Oracle]
This shows that left outer join was converted to inner join but that join reordering
did not take place because the join hint implies that the inner join (in the input
query) cannot be reordered.
TDV performs the SQL join reordering analysis automatically. Special
configurations or SQL options are needed only if you want to enforce SQL
processing exactly as written, for example to preserve a specific table
construction or set of table relationships.
You can join the tables in any order you want. (See Views and Table Resources,
page 215.)
If the source side has many rows and produces a prohibitively long SQL string on
the target side, the TDV query engine deals with it in one of these ways:
• Partitioned semijoin—The TDV query engine breaks up the target-side
predicate into multiple chunks, and submits one query for each chunk to the
target side. This technique applies regardless of whether the ON clause uses
conjunctions or is just a single predicate, and regardless of whether the target
side SQL is using the IN clause or OR syntax. The partitioned semijoin is also
applied to semijoin queries with non-equality conditions.
• Query predicate rewriting—If a join has conjunctions in the ON clause and the
target-side data source does not support the row-constructor form of the IN
clause, then instead of generating a target side row-constructor form using an
IN clause like:
(t1,t2) IN ((1,2), (6,3), (8,4),(93,3)…)
The TDV Server might generate a predicate with two or more scalar IN
clauses like this:
t1 IN (1,6,8,93,…) and t2 IN (2,3,4,3,…).
The scalar IN clauses syntax is more compact than the OR syntax, and it
reduces the chance of partitioned semijoin and load on the target data source.
However, this predicate is not as selective as the OR form, so it could bring
more rows into TDV. Because of this trade-off, you can configure the number
of rows on the source side that causes the engine to switch from the OR syntax
to multiple scalar IN clause syntax.
In this case, the TDV Server query engine might generate an additional
predicate (the row-constructor form of the IN clause) on the T-side that takes
the form:
(t1,t2) IN ((1,2), (6,3), (8,4), (93,3)…)
The second form is logically identical to the IN-clause form, but if the right
side produces many rows, it could generate a long SQL string on the target
side, potentially exceeding the capabilities of DS2.
• Pushing left outer joins with multiple predicates—If the query engine can
predetermine what is selectable based on an ON clause and a WHERE clause,
it pushes the query to the data source. For example:
SELECT
O10_Id,
O10_CHAR,
S_Id,
S_CHAR
FROM /users/eg/t1 left outer { option semijoin="false", hash } join
/users/sqlsr/5kk
ON O10_Id = S_Id
WHERE O10_Id = 9
Here, the WHERE clause predicate is the same as the ON clause predicate, so
O10_Id = 9 can be pushed to the data source, which can dramatically reduce
the amount of data returned.
A semijoin between two tables returns rows from the first table whenever one or
more matches are found in the second table. The difference between a semijoin
and a conventional join is that rows in the first table are returned at most
once, even if the second table contains two matches for a row in the first table.
This section contains:
For inner and left outer joins, the semijoin optimization always uses rows from
the LHS to constrain the RHS—for example, where LHS is the source side and
RHS is the target. For a right outer join the situation is reversed: RHS is the source
and LHS is the target.
As another example, consider the tables Employee and Dept and their semijoin.
The semijoin of the Employee and Dept tables results in a table, Employee joined to
Dept, with the columns Name, EmpId, and DeptName; for example, it contains rows
for Thomas in Sales and for Katie and Mark in Production.
Consider another query like this:
SELECT * FROM DS1.R INNER JOIN DS2.T ON R.r = T.t
The usual evaluation would hash the R table expression and then iterate over T
table expressions, and do look-ups into the hash table to find matching rows. If
the join in the example is such that the number of rows from the R-side is much
smaller than the number of rows from the T-side, additional optimization is
possible. The rows from R are buffered in TDV, but an additional predicate of the
form “t IN (1,6,8,93…)” is added to the SQL generated for the RHS. The values on
RHS of the IN clause are all the ‘r’ values from LHS of the join. This additional
predicate ensures that TDV retrieves the smallest possible number of rows from
the T-side to perform the join.
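In this example, the SQL pushed to DS2 would look something like the following
sketch, where the values in the IN list are the buffered r values from DS1.R:
SELECT T.* FROM T WHERE T.t IN (1, 6, 8, 93)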
A query hint, enclosed in curly braces, immediately precedes the JOIN keyword.
The default value of OPTION SEMIJOIN is True, so the value of semijoin does not
have to be explicitly set. You can also specify something more specific like:
… INNER {OPTION SEMIJOIN, PARTITION_SIZE=20} JOIN …
This option forces the semijoin to be partitioned, with each partition having no
more than 20 elements.
The cardinality of both sides of a potential semijoin are evaluated, and the side
with smaller estimated cardinality is loaded into memory as the LHS. When the
cardinality is small enough, one IN clause or an OR expression is created
containing all the values in the join criteria from the LHS, which is then added to
the SQL sent to the RHS.
Semijoin is limited for databases whose vendors restrict how large a SQL
statement or IN/OR clause can be. If the cardinality exceeds specific data source
limitations on the size of the IN clause or the OR expression, the query engine
creates an execution plan that attempts a partitioned semijoin. The partition
breaks the IN list into chunks of 100 or fewer unique values, and multiple queries
are executed against the RHS source. If the cardinality is still too large, the system
uses the HASH algorithm.
Another restriction is set to keep the LHS from inordinately burdening the join.
You can configure the LHS cardinality to determine the row count that triggers an
automatic semijoin.
For example, if Max Source Side Cardinality Estimate is 200 and Ratio is 10, a
query where source cardinality is estimated at 50 and target at 600 triggers use of
the semijoin automatically; a query with source estimate 100 and target estimate
900 does not.
If the TDV query engine can estimate the source side but not the target side, it
assumes that the target-side cardinality is large and sets up a semijoin using the
value of Max Source Side Cardinality Estimate.
Parameter Description
Max Source Side Cardinality Estimate: Put an upper bound on the size of the predicate
generated by the source side. If the cardinality estimate is greater than this setting, the
TDV query engine does not automatically choose a semijoin.
Min Ratio of Target Cardinality to Source Cardinality: Derive a minimum cardinality of
the target to trigger automatic semijoin. For example, if the estimate for the source side is
50 and the ratio is set to 12, the target side must be estimated to be at least 600 rows to
trigger use of the semijoin optimization. If the cardinality of the target is not specified,
the target cardinality is assumed to be very large.
Ideally the TDV query engine generates a predicate in the WHERE clause against
DS2 that looks like this:
(s1, s2) IN ((2,3),(1,5), …)
• Use the multiple scalar IN clause syntax to generate a query predicate that looks
like:
s1 IN (2, 1, …) and s2 IN (3, 5, …)
Field Description
Max Source Side Cardinality Estimate: Change the number of LHS rows that triggers an
automatic semijoin. If the left side cardinality is less than this value, and the product of
the left side cardinality and the Min Ratio of Target Cardinality to Source Cardinality
(next row of this table) is less than the estimated RHS cardinality, the TDV query engine
attempts to rewrite the query execution plan to use the semijoin optimization.
Min Ratio of Target Cardinality to Source Cardinality: Sets a minimum ratio of RHS to
LHS to trigger use of semijoin optimization.
If the query optimizer knows that LHS cardinality is small enough compared
to RHS, it attempts a semijoin.
If LHS cardinality is less than the value of Max Source Side Cardinality
Estimate, and the product of LHS cardinality and the value of Min Ratio of
Target Cardinality to Source Cardinality is less than the estimated RHS
cardinality, the TDV query engine attempts to rewrite the query execution
plan to use the semijoin optimization.
3. Define the RIGHT_CARDINALITY query hint and its value.
The optimizer uses the hint to choose a better query plan. Specifically, it is
used when checking LHS cardinality against the minimum ratio set in Min
Ratio of Target Cardinality to Source Cardinality.
SELECT column1 FROM table1 INNER {OPTION RIGHT_CARDINALITY=10000}
JOIN table2 ON table1.id = table2.id
If the RHS cardinality of a table is not known or not specified, the TDV query
analyzer assumes that it is large, and determines whether to use semijoin
according to the specified cardinality of the LHS.
4. If statistics have not been collected on the data source, optionally use Studio to
specify the minimum, maximum, and expected cardinalities.
The target side predicate always uses the OR form, and looks like this:
(1 > t1 and 1 <= t2) OR (6 > t1 and 6 <= t2) OR …
Two semijoins are possible. J2 can have S as source and T as target, and J1 can
have R as source and S as target. Both joins can use the semijoin optimization at
the same time. The definition of source and target of a semijoin needs to be
refined.
For an inner join or a left outer join with semijoin, the immediate LHS child of the
join is the source, and the target is a FETCH against a data source found in the
RHS sub-tree of the join. The ON clauses that define table connections determine
the specific FETCH node target. For right outer join, the definition is the reverse.
A join with a semijoin optimization can have more than one target. For example:
SELECT * from DS1.D1 INNER JOIN DS2.F INNER JOIN DS1.D2 on ds1=f1
on ds2=ds1
The join with D1 as source can target both F and D2. Because multiple nodes
target DS1, this join can be evaluated as a star schema semijoin.
A star schema semijoin is like a query with multiple joins. Consider this query
topology.
Here the ON clauses of the joins are such that both D1 and D2 are connected to F,
and both joins are inner joins. In this case two semijoins are possible—one from
D1 to F and the other from D2 to F—because they both target the same data
source.
If the data source is sufficiently robust, the data source can be marked in TDV
Server so that the query engine is aware that the data source supports star schema
semijoin optimization for multiple join nodes. In Studio, the Info > Advanced tab
of the data source configuration pane has a Supports Star Schema property.
When the Supports Star Schema check box is enabled, the TDV query engine is
made aware that both joins can be run using the semijoin optimization.
The reason this needs to be explicitly enabled is that even one semijoin with a
large source side can place a significant burden on the target data source, and if
several joins target the same data source at the same time the burden can
overwhelm some data sources. It is easier for star schema semijoin to generate
SQL strings that exceed the capabilities of the target data source. Thus this setting
is useful when the target data source is very powerful or when all source sides are
fairly small.
If the SQL sent to the target side in a star schema query exceeds the maximum
length of SQL for the data source, the engine runs with a partitioned semijoin if
possible, or it disables semijoin optimization for some of the joins.
The partitioned semijoin might not always apply to a star schema semijoin. If
there are n joins targeting F and all but one of them generate a short IN clause but
one generates a long IN clause, the query engine can still partition the long side of
the semijoin. If all the joins produce IN clauses of approximately equal size
partitioning might not be useful. In this case, the TDV query engine disables the
semijoin optimization for the join that generates the longest SQL string. It
continues to disable semijoin optimization until the total SQL for all remaining
joins is within the capabilities of the target data source or partitioning becomes an
option.
Because of the star schema semijoin, at most one of the predicates can be
partitioned. If the query specifies more than one join with the partition_size
option, at most one of these requests can be satisfied.
For information about settings, see Setting Semijoin Configuration Parameters,
page 582.
You can override settings for specific data sources.
Data-source overrides apply when a given data source is the target of a semijoin.
Typically these values are set conservatively at the server level, and overridden if
necessary for specific data sources.
The target data source has the burden of processing the potentially large
predicates generated by a semijoin, and the target data source must impose limits
on how big the query predicate from the source side can be.
Several data sources support nested aggregate functions, but some do not.
Checking for these functions, however, can significantly slow performance. You
can use a TDV configuration parameter to control whether TDV checks SQL
queries for nested aggregate functions.
For example, Netezza does not support nested aggregate functions, so you can
improve performance by setting the Check if nested aggregates is supported by
datasource parameter to False for queries pushed to a Netezza data source.
A nested aggregate function looks like this:
SELECT AVG(MAX(salary)) FROM employees GROUP BY department_id;
The DB channel queue size is a setting that specifies the number of buffers within
the TDV Server for a client request to prefetch from the database. By default this
value is set to 0. If set to 1, the TDV Server can prefetch from the database and put
it in a buffer to store it while the TDV JDBC driver is still fetching data from the
TDV Server. When the TDV JDBC client asks for the next batch, it is already
available in the buffer.
You can control the amount of data fetched in each batch of rows retrieved from
an Oracle or MS SQL Server data source. In the data source capabilities file, the
fetch_size attribute controls the number of rows that are being fetched from the
data source on every ResultSet.next call.
Adjusting fetch_size is optional and for performance tuning only. Be sure to test
the effects of these adjustments to see if performance improves. If the configured
value of fetch_size is zero or negative, the default value for the driver is used.
Note: A fetch size that is too small can degrade performance.
2. Using a text editor like Notepad, open the capabilities file in the directory for
your data source:
microsoft_sql_server_2008_values.xml
oracle_11g_oci_driver_values.xml
oracle_11g_thin_driver_values.xml
3. Uncomment the fetch_size attribute and set its value. For example:
<ns5:attribute xmlns:ns5="http://www.compositesw.com/services/system/util/common">
  <ns5:name>/runtime/iud/fetchSize</ns5:name>
  <ns5:type>INTEGER</ns5:type>
  <ns5:value>12345</ns5:value>
  <ns5:configID>jdbc.fetch_size</ns5:configID>
</ns5:attribute>
Subquery Optimizations
TDV can analyze your subqueries and rewrite them as joins, if they meet the
requirements for this subquery optimization. Joins often perform better than
correlated subqueries. (An example rewrite follows the requirements list.)
Requirements
• Subquery starts with IN or =.
• Subquery starts with ‘=’ and includes an aggregation.
• The subquery can contain only one selectable of type COLUMN, LITERAL,
FUNCTION, or AGGREGATE FUNCTION.
• The subquery can include SELECT, DISTINCT, AGGREGATE, FROM,
RELATION and WHERE.
• Correlation in the subquery can occur only in the WHERE clause for this
feature. If correlation occurs anywhere else, the rewrite to a join does not happen.
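As an illustration of the kind of rewrite this enables (the table and column names are
hypothetical), a subquery such as:
SELECT o.orderid
FROM /shared/views/orders o
WHERE o.customerid IN (SELECT c.customerid FROM /shared/views/customers c WHERE c.country = 'USA')
can be executed as the equivalent join, assuming customerid is unique in customers so
the join does not introduce duplicate rows:
SELECT o.orderid
FROM /shared/views/orders o INNER JOIN /shared/views/customers c
ON o.customerid = c.customerid
WHERE c.country = 'USA'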
Sometimes, for partitioned data, you want to run a query against a specific
partition. This behavior is often referred to as join pruning. TDV performs join
pruning for you automatically. If you want to adjust or confirm the settings, use
the TDV configuration parameters that are accessed through Studio.
Requirements
• Null values are not allowed in the tables for join pruning consideration.
DATA_SHIP_MODE Values
DBChannel Prefetch Optimization: Improves the performance for read/write operations.
By default, this parameter is set to TRUE.
DBChannel Queue Size: Allows you to tune the number of buckets. By default, this
parameter is set to 0. The appropriate starting value is 10.
When a query is submitted to TDV, TDV evaluates the query and produces an
execution plan without using the data ship optimization. This plan is a pre-data
ship plan. TDV goes on to examine the fetch nodes of this plan and decides
whether to use the data ship optimization. The data ship optimization is used if:
• The data source corresponding to a fetch node supports data ship.
• Data ship is not disabled for the query using query hints.
• The data source is configured as a data ship source or target.
Among all the nodes involved, TDV dedicates one node as the target and all
others as sources. TDV creates temp tables in the target data source and moves
data from other nodes into this temp table. An alternate query plan is generated
that transfers the federated SQL operation of a prior plan to the targeted data
source.
Determining the target node is based on:
• Whether the data source corresponding to the fetch node is configured as a source or target.
• The cost of retrieving the data within the lower and upper bounds set for data ship.
The data ship optimization feature requires that at least one data source
participating in a join, union, intersect, or except in an eligible SQL execution be
configured to act as a target for the data ship, and that at least one other data
source participating in the data ship act as the data ship source. The data ship
source and the data ship target can be the same type of database or of different
database types. For example, you can configure data ship to have an Oracle data
ship source and an Oracle data ship target, or you can configure data ship to have
an Oracle data ship source and a Teradata data ship target.
Note: Netezza and DB2 assume that if they are used as a data ship target, they
can also be the data ship source. Therefore, if TDV determines that better
performance can be achieved by reversing the direction of the data ship, it does
so regardless of what you may have defined.
The columns of the following table are: Data Source Type, Data Ship Source Support,
Data Ship Target Support, Performance Option, and Notes.
DB2 v10.5 LUW: Source Active; Target Active. Performance option: Bulk Load using the
LOAD utility.
Oracle 11g: Source Active; Target Active. Performance option: Database Links. To use an
Oracle data source for data ship, the DBA must install the DBMS_XPLAN package in the
database and create an area for temporary tables. For this data source to participate in
data ship, it must be specified as a data ship source. Participation as a data ship target is
optional. If Oracle is both source and target, a DB Link needs to be set up between the
Oracle databases.
Teradata 13.00 and 13.10: Source Active; Target Active. Performance option:
FastLoad/FastExport. For this data source to participate in data ship, it must be specified
as a data ship source. Participation as a data ship target is optional.
Teradata 14.10: Source Active; Target Active. Performance option: FastLoad/FastExport.
Teradata FastLoad mode doesn't work correctly using the 14.10 JDBC driver when
Teradata is the Target Data Source. To work around this issue, use the Teradata JDBC 15
driver.
Teradata 15: Source Active; Target Active. Performance option: FastLoad.
Vertica: Performance option: Export to another Vertica database.
Some resource constraints limit the use of data ship optimization. Most
limitations are related to data type mismatches. Data type mismatches are
handled by a transformation of the data, but certain data types from different
sources are incompatible for a selected target. Typically, these data type
mismatches are most notable for numeric precision.
The following is a list of cases where use of the Data Ship optimization is not
recommended.
• Netezza cannot be the data ship target for tables with LONGVARCHAR or
VARCHAR columns more than 64 KB long, because creation of the temporary
table fails.
• When Netezza is the data ship source, data types of FLOAT or DOUBLE might
lose precision because of rounding of values sent to a target of a different type.
• Netezza to Sybase IQ: With data ship from Netezza to Sybase IQ, NULL values
are replaced with zeroes, which results in different query results than when data
ship is disabled.
• Sybase IQ or Netezza to Oracle: When Sybase IQ or Netezza data sources use
Oracle as a data ship target, trailing spaces sent in the shipped table are trimmed
in the result sets with the Oracle table.
• Sybase IQ to Netezza: When Sybase IQ data sources use Netezza as the data
ship target, a data mismatch can occur because the Netezza database appends a
padding space character in result set data.
• To Oracle: If an Oracle database is a data ship target and the transferred data
contains UTF-8-encoded East Asian characters, a column length limitation
exception can occur. Oracle databases with a UTF-16 character set do not have
this problem.
• To Sybase IQ: If you are moving data of type FLOAT to a Sybase IQ database,
the scale of the data can be lost because of the way that the Sybase IQ JDBC
driver handles the FLOAT data type.
• Sybase IQ Type4 Driver: The Sybase IQ Type4 driver appears to lose the
precision of time stamp columns by promoting or demoting it. To avoid this
issue, use the Sybase IQ Type2 driver.
• A row fetch size bigger than 64 KB causes a Teradata error. Refer to Teradata
documentation for the best solution to this problem.
• Teradata FastLoad requires that the target table be empty. If the target Teradata
table is not empty, JDBC is used to load data into the table.
3. Navigate to and determine the best settings for the following parameters.
Parameter Description
Execution Mode Default server handling for data ship queries:
• EXECUTE_FULL_SHIP_ONLY—(Default) Generates an error to
alert you that the full data ship was not successful.
Allows data-ship execution only when the query can be performed
without federation after shipment. In other words, after shipment
the query must be pass-through.
With EXECUTE_FULL_SHIP_ONLY, if parts of the query are
federated after the data ship query optimization has been applied,
the query execution plan Data Ship Notes contain a message like
the following:
Data Ship Query is not possible because after ship query is still
federated and Data Ship Mode is EXECUTE_FULL_SHIP_ONLY.
• EXECUTE_PARTIAL_SHIP – This option allows both data ship
and federated queries to coexist and proceed to execution even if
they cannot be fully resolved into a full data ship.
EXECUTE_PARTIAL_SHIP allows a query to proceed without
throwing an error, even if a federated query is still required to
complete query execution.
• EXECUTE_ORIGINAL – If certain nodes cannot be pushed and
shipped because some predicates cannot be resolved prior to data
pass-through, the original (pre-data ship) query plan is executed.
EXECUTE_ORIGINAL causes query execution using the
pre-data-ship execution plan whenever a query cannot be
completely pushed to the data sources. The Data Ship Notes reveal
that fact on execution.
When a data ship query is not possible because of dependency on
external results or to support something that cannot be pushed to
the data source, the original pre-data-ship execution plan is used
without shipping results to the targeted table to complete the
invocation.
• DISABLED — causes the query execution plan to process the
request invocations without data ship. DISABLED mode is useful
for debugging.
Maximum Number of Concurrent Data Transfers: Limits the number of concurrent data transfers (default 100,000), to avoid affecting the performance of other processes. Beyond this limit, new queries requiring data transfers are queued until an existing transfer is completed.
4. Click Apply.
5. (Optional) Navigate to and determine the best settings for the following
parameters.
Parameter Description
Buffer Flush Threshold: Limits the size of the buffer. (Default is 10000.) Certain types of data ship SQL executions buffer large tables before delivery to the data ship target, and the buffer size can exceed available memory. The buffer is flushed when this limit is reached.
DataShip Keep Temp File or Table: The default value is false. When set to true, the server does not delete the temp table or file generated for each data ship request. This option should only be enabled when requested by a support team member.
This value is locally defined. It is not altered when restoring a backup and is not replicated in a cluster.
This parameter is ignored if you are using data ship and check the 'Use global temporary space for temporary tables' check box for the data ship target.
6. (Netezza only) Navigate to and determine the best setting for the following
parameters.
Parameter Description
Keep nzload Log File: The Netezza driver generates a log file for each nzload operation. If true, all log files are kept. If false (default), no log files are saved to disk.
This value is locally defined. It is not altered when restoring a backup and is not replicated in a cluster.
nzload Log Directory: The directory to save the Netezza driver nzload log file for data ship. The default log directory is $TDV/tmp/dataship/netezza.
This value is locally defined. It is not altered when restoring a backup and is not replicated in a cluster.
Escape Character: The escape character to use while exporting contents to or importing contents from a flat file. It defaults to backslash (\).
Truncate Long Strings during native loading: Boolean. True causes any string value that exceeds its declared CHAR or VARCHAR storage to be truncated to fit. False (default) causes the system to report an error when a string exceeds its declared storage.
7. Click Apply.
8. (Teradata only) Navigate to and determine the best settings for the following
parameter:
Parameter Description
DataShip FastExport/FastLoad Threshold: Row-count limit for the FastLoad and FastExport processes within Teradata. If the number of rows exceeds this limit, FastLoad or FastExport is used.
This value is locally defined. It is not altered when restoring a backup and is not replicated in a cluster.
Depending on the type of your TDV data source, one or more of the following
configuration tasks might be required:
• Configuring Data Ship for DB2, page 608
• Configuring Data Ship Bulk Loading Option for DB2, page 609
• Configuring Data Ship for Microsoft SQL Server, page 611
• Configuring Data Ship for Netezza, page 613
• Configuring Data Ship for Oracle, page 614
• Configuring Data Ship for Sybase IQ, page 615
• Configuring Data Ship for Sybase IQ Targets with Location, page 618
• Configuring Data Ship for PostgreSQL and Vertica, page 618
Requirements
• The DB2 database must have EXPLAIN plan tables created within the DB2
SYSTOOLS schema on the database that will participate in TDV data ship.
Limitations
This feature is not valid for:
• Binary and BLOB data types
• Kerberos implementations
To configure the DB2 data sources that might participate in data ship
1. Log into DB2 system and start DB2 command processor db2 and call a system
procedure:
$ db2
2. Connect as a user with privileges to create and modify tables within the DB2
SYSTOOLS schema on the database that will participate in TDV data ship. For
example:
db2 => CONNECT TO <db-name> USER '<db-user>' USING '<db-pass>'
WHERE TYPE = 'T' AND (NAME LIKE 'ADVISE\_%' ESCAPE '\' OR NAME
LIKE 'EXPLAIN\_%' ESCAPE '\')
6. Verify that the tables were created by using the following syntax:
db2 => list tables for schema systools
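If the EXPLAIN tables do not already exist in the SYSTOOLS schema, DB2 provides a system procedure that creates them. A minimal sketch of the call, assuming an active connection to the participating database (verify the arguments against the IBM documentation for your DB2 version):
db2 => CALL SYSPROC.SYSINSTALLOBJECTS('EXPLAIN', 'C', CAST(NULL AS VARCHAR(128)), 'SYSTOOLS')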
Platform: Windows
DB2 Command-Line Utility Name: DB2CMD.exe
Execution Details: When it runs, a command window pops up, requires a password, and stays active until the upload completes. While the window is open, the password might be visible.
Requirements
• The data being loaded must be local to the server.
• Requires advanced DB2 database level configuration
Limitations
This feature is not valid for:
• Binary and BLOB data types
• Kerberos implementations
To configure the DB2 LOAD utility to work with TDV for data ship
1. Consult the IBM documentation on configuring and using the DB2 LOAD
utility.
2. Install and configure all relevant parts of the DB2 LOAD utility according to
IBM’s instructions for your platform.
The client drivers might need to be installed on all the machines that are part
of your TDV environment.
3. Verify the full path to the DB2 command-line utility that you want to use to
run the DB2 LOAD utility. For example, locate the path to your DB2 SQL
command-line client.
4. Open Studio.
5. Select Administration > Configuration.
6. Locate the DB2 Command-Line Utility Path configuration parameter.
7. For Value, type the full directory path to the DB2 LOAD command-line program. For example, for Windows, set the value to "C:/Program Files/IBM/SQLLIB/BIN/db2cmd.exe"; for UNIX, set it to /opt/ibm/db2/V10.5/bin/db2. If there are spaces or special characters in the pathname, be sure to enclose the whole string in double quotes.
8. Locate the Debug Output Enabled for Data Sources configuration parameter
and set the value to True.
9. Locate the Enable Bulk Data Loading configuration parameter.
10. Set the value to True.
11. Click Apply.
12. Click OK.
Note: In the bcp utility, NULL values are interpreted as EMPTY and EMPTY values are interpreted as NULL. TDV passes the '\0' value (which represents EMPTY) through the bcp utility to insert an EMPTY value into the cache or data ship target. Similarly, TDV passes an EMPTY string ("") to the bcp utility to insert a NULL value into the cache or data ship target.
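For reference, a generic bcp character-mode load of a flat file into a target table resembles the following sketch; the server, database, table, and file names are placeholders, and TDV builds the actual invocation itself:
bcp TargetDB.dbo.ship_temp in C:\temp\ship_data.dat -S sqlserver01 -U loader -P password -c -t ","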
To configure the Microsoft bcp utility to work with TDV for data ship
1. Verify that bcp.exe has been installed in an accessible directory.
Note the full path to the bcp.exe file.
2. Open Studio.
3. Select Administration > Configuration.
4. Locate and select the Microsoft BCP utility parameter.
5. For Value, type the full directory path to the bcp.exe. For example:
C:\Program Files\Microsoft SQL Server\100\Tools\Binn\bcp.exe
3. You can now complete the setup in Finishing Data Ship Configuration for
Data Sources, page 618.
2. Create a database link in the Oracle client using syntax similar to the
following on the Oracle instance:
CREATE DATABASE LINK oral2_dblink connect to devdata identified by
password using '(DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL
4. Continue with the instructions in Finishing Data Ship Configuration for Data
Sources, page 618.
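The CREATE DATABASE LINK statement in step 2 is shown in abbreviated form; a complete definition generally takes the following shape, where the host, port, and service name are placeholders for illustration:
CREATE DATABASE LINK oral2_dblink CONNECT TO devdata IDENTIFIED BY password
USING '(DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = ora-host.example.com)(PORT = 1521)))(CONNECT_DATA = (SERVICE_NAME = ORA11G)))';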
Platform Instructions
UNIX 1. Create an odbc.ini file. We recommend that you create it under
<TDV_install_dir>.
2. Create a Sybase IQ data source name (DSN) in the odbc.ini file. For use with TDV, the Driver variable is the most important setting, because it points to the SQL Anywhere client that you have installed. For example, data sources named test1 and test2 would look similar to the following:
#####################################
SybDB@machine64:cat odbc.ini
[test1]
Driver=/opt/<TDV_install_dir>/sqlanywhere<ver>/lib<ver>/libdbo
dbc<ver>_r.so
host=10.5.3.73
port=2638
uid=dba
PWD=password
DatabaseName=asiqdemo
PreventNotCapable=YES
[test2]
Driver=/opt/<TDV_install_dir>/sqlanywhere<ver>/lib<ver>/libdbo
dbc<ver>_r.so
host=10.5.3.74
port=2638
uid=dba
PWD=password
DatabaseName=asiqdemo
PreventNotCapable=YES
######################################
Windows 1. Start the ODBC Administrator, select Sybase > Data Access > ODBC Data
Source Administrator.
2. Click Add on the User DSN tab.
3. Select the Sybase IQ driver and click Finish.
4. The Configuration dialog box appears.
5. Type the Data Source Name in the appropriate text box. Type a
Description of the data source in the Description text box if necessary. Do
not click OK yet.
6. Click the Login tab. Type the username and password. If the data source
is on a remote machine, type a server name and database filename (with
the .DB suffix).
7. If the data source is on your local machine, type a start line and database
name (without the DB suffix).
8. If the data source is on a remote system, click the Network tab. Click the
check box for a protocol and type the options in the text box.
9. Click OK when you have finished defining your data source.
10. On Unix, use the following commands to verify that the DSN is set up
successfully:
cd sqlanywhere<ver>/bin<ver>
./dbping -m -c "DSN=test1"
A “Ping server successful” message means that the DSN is working. At this
point any application can use the DSN to contact the Sybase IQ through the
ODBC Driver.
11. On Unix, set the global ODBCINI environment variable to point at the SQL
Anywhere odbc.ini file. For example, to temporarily set the variable, type:
export ODBCINI=/opt/<TDV_install_dir>/odbc.ini
12. Set the QUERY_PLAN_AS_HTML Sybase option to off. For example, run the
following SQL command:
SET OPTION <USER_NAME>.QUERY_PLAN_AS_HTML = OFF;
This setting prevents Sybase from creating extra HTML plan files that can take
up too much disk space.
13. Set the QUERY_NAME option to specify a name for each of the queries
generated to help with future database cleanup.
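For example, mirroring the syntax of the QUERY_PLAN_AS_HTML command above (the query name shown is only an illustrative placeholder):
SET OPTION <USER_NAME>.QUERY_NAME = 'tdv_dataship';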
3. Configure the added server info in the interfaces file for the Sybase IQ server.
4. Repeat the configuration for each Sybase IQ server, pointing the SQL location
at each of the other Sybase IQ instances.
5. Continue with the instructions in Finishing Data Ship Configuration for Data
Sources, page 618.
All data sources that are to participate in data ship should be configured as data
ship targets. If only one node is configured to be a data ship target, the relative
sizes of the nodes are not considered; the SQL operation can only be performed
on the data source that can be the data ship target.
At least one data source in a federated join must be configured to accept data
shipments of external tables as the target; otherwise, data ship optimization is not
possible.
Data can only be shipped in the direction of the data ship target.
When all participating data sources are configured to act as a data ship target, the
estimated costs to obtain results from each of the nodes is analyzed. The node that
is likely to cost the most to retrieve is designated as the target and nodes with
smaller cardinality are fetched to ship to the target node. The SQL operation is
performed locally on the temporary tables in the data source to gain the
efficiencies available on the target.
3. Select the Configuration tab, and provide values for the following on the Basic
tab:
— Pass-through Login—For non-Netezza data sources, set to Disabled so
that data ship functionality can work properly.
— Transaction Isolation level—Set to Serializable for data ship functionality.
This prevents dirty reads, non-repeatable reads, and phantom reads.
4. Use the Advanced tab to set values for the following fields. The fields that you
see vary depending on the type of data source with which you are working.
For example, a Sybase IQ data source might have the Sybase iAnywhere JDBC
driver check box displayed, but a Teradata data source would not.
Is dataship target (All data sources): This must be checked if the physical data source might be used to receive shipped tables from another data ship enabled data source. Check Is dataship target for all data sources so that the TDV Server can analyze the query and determine the side that would be best to ship to based on expected or estimated query node cardinality.
Lower bound for data ship and Upper bound for data ship (All data sources): TDV uses Explain Plan to arrive at a numeric estimate of the cost of shipping data from a node to the Data Virtualizer. When the cost of shipping a federated query node falls between the limits of the Lower bound and Upper bound, it is considered eligible for shipment so that it can be processed locally. These fields correspond to LowerBoundForDataShip and UpperBoundForDataShip.
Schema path for Temp Tables (All data sources): A relative path to set the location of the temp tables on the data source. It is the name of a schema in the data source.
Required for DB2; make sure that this name matches a schema name known to TDV. Case should match exactly.
Enable Bulk Import/Export (MS SQL Server): Check box that enables the bulk movement of data using the bcp utility. You must have finished the steps in Configuring Data Ship for Microsoft SQL Server, page 611.
Database Link List (Oracle with database links): Add your database links separated with semicolons, using the following syntax:
[DBLINK NAME]@[DBLINK OWNER DS PATH]
For example:
oral2_dblink@/users/composite/test/sources/dship/DEV-DSJ-ORA11G-2
Enable Sybase IQ SQL Location (Sybase IQ with Location): Select this check box to enable this option for the data source.
SQL Location Name (Sybase IQ with Location): The SQL Location name should take the form:
<server_name>.<database_name>
Path of data source (Sybase IQ with Location): The TDV full pathname to the other Sybase data source. For example:
/shared/sybase_iq_1
Add Sql Location (Sybase IQ with Location): If there are more than two Sybase IQ instances, you can add multiple locations by using the green plus button.
Enable Teradata FastLoad/FastExport for large tables (Teradata): Setting this option indicates that you want to use Teradata's FastLoad or FastExport utility to speed up your query times. For a given query, cardinality information is used to decide whether to use Fastpath or JDBC default loading.
Enable Bulk Load (Vertica): Setting this option indicates that you want to use Vertica's Bulk Load utility to speed up your query times. For a given query, cardinality information is used to decide whether to use Bulk Load or JDBC default loading.
Exported Databases (Vertica): Only available if you have selected the Enable Export To Another Vertica Database option.
Exported database name (Vertica): Name of the Vertica database to which you want to export data.
Path of data source (Vertica): The TDV full pathname to the other Vertica data source. For example:
/shared/vertica
Studio execution plans can tell you whether a query can make use of data ship or
not. When you create views and procedures that involve data from tables that
exist on separate sources the execution plan reveals whether the data ship
optimization can improve performance of the query.
During the query development phase, verify that the data ship query plan
performs better than a standard query plan without the data ship. Sometimes, the
time cost of shipping a very large table across the network between two different
kinds of data sources can cost almost as much as processing the query normally.
After configuring your data source to use the data ship optimization, you need to
evaluate the queries that are run by that data source to determine if they benefit
from the data ship. If the queries benefit from the data ship optimization, there is
nothing further that you need to do. If the queries do not benefit from the data
ship, you need to disable the feature in the hint of the query SQL.
For more information about the execution plan fields and what they represent, see
Performance Tuning, page 551.
Detail Description
Estimated Rows Returned: A value estimated from table cardinalities, which are not generally used to estimate the cost of shipping SQL requests. For Netezza, determine the cost of a query with an Explain Plan function; the Estimated Rows Returned field will have a value of Unknown.
Data Ship target: Where to send the results of SQL execution. The presence of a data ship target indicates that data ship will be performed.
Data Ship Notes: A report of the estimated number of rows to be shipped, and the amount of time consumed to get the cost estimate. When a condition blocks the use of data ship, this field usually describes the condition, case, or option that ruled out optimization.
6. Execute other queries to show statistics like costs of table creation and other
query initialization costs that add to the overall retrieval time.
See Execution Plan Panel, page 664 for information about the buttons and controls
on the Execution Plan. See Working with the SQL Execution Plan, page 553 for
information about using the execution plan and evaluating the results.
There are a few special steps to consider when defining views that include Vertica
time series functions. The following steps are guidelines for how you can make
use of the following Vertica time series functions within a data ship target view
where the SQL will be run entirely on the Vertica database:
• TIMESERIES SELECT clause which names a time slice
• TS_FIRST_VALUE
• TS_LAST_VALUE
Restrictions
• Running the Vertica time series functions in any data source besides Vertica
results in processing errors.
• Time functions with time zone and timestamp functions with time zone, do
not have micro- and nanosecond precision, nor time zone information. A
workaround for that limitation includes using the CAST function to have the
time zone value converted to a VARCHAR value.
/users/composite/test/sources/netezza50/TPCDS100_2/TPCDS100/PROD/C
ALL_CENTER
ON cc_call_center_sk = store_key
timeseries ts as '3 hour'
over (partition by store_key, first_open_date, cc_class
order by cast(cc_rec_start_date as timestamp))
order by case when ts is not null then ts end , store_key
5. Test the view to make sure that all the SQL is pushed and processed on the
Vertica database.
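For reference, a minimal, self-contained query using these time series functions might look like the following sketch; the sales_ticks table and its ts, store_key, and price columns are hypothetical:
SELECT slice_time, store_key,
TS_FIRST_VALUE(price) AS first_price,
TS_LAST_VALUE(price) AS last_price
FROM sales_ticks
TIMESERIES slice_time AS '3 hours' OVER (PARTITION BY store_key ORDER BY ts)
ORDER BY slice_time, store_key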
The data ship optimization is configured for each data source. A data source might have hundreds of queries that it runs. Some of those queries benefit from having the data ship optimization enabled, so you configure data ship for the data source. However, a few queries might not benefit from the optimization; for those queries, you must disable the optimization. For more information on data ship options, see the TDV Reference Guide.
The driver properties are used to specify connection timeout settings as required
by the specific driver. By specifying the properties that are used by your specific
data source, you can avoid situations where connections are left open indefinitely.
This will prevent TDV threads from locking up on requests that need to establish
a database connection.
Because each database has different connection properties, you will need to
become familiar with them to determine which property to set and what values
are appropriate for your environment.
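For illustration only, a few connection-timeout-related properties accepted by common JDBC drivers are shown below; property names and units are driver-specific, so confirm them against the documentation for the driver version you deploy:
oracle.net.CONNECT_TIMEOUT=10000 (Oracle, milliseconds)
loginTimeout=30 (Microsoft SQL Server, seconds)
socketTimeout=60000 (Microsoft SQL Server, milliseconds)
connectTimeout=10 (PostgreSQL, seconds)
socketTimeout=60 (PostgreSQL, seconds)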
If data ship has been enabled and a query is submitted using a prepared
statement where the query is static but input variables can change, TDV evaluates
it for data ship. For queries, TDV computes an execution plan. The execution plan
considers data ship for each invocation of a prepared statement after the
parameter values are known and presented to TDV.
For queries with prepared statements, the value of each parameter for a given
execution is determined, the query is evaluated to build an execution plan, and
data ship is considered. If TDV determines that data ship can be used to optimize
the performance of the query, then TDV will process the query using data ship.
Push-based incremental caching can be set up using the native TDV caching
mechanism, plus the TDV Cache Management Service (CMS), to configure a
Central Event Server. If you have subscribed to notices from objects that store data
in an incremental cache, subscribed clients are notified whenever the result sets
are updated. After they are notified, clients can choose to update their result sets
or refresh the affected views. Each client instance of TDV can be configured to
process the messages and automatically update its incremental cache.
The following resources cannot be cached without being wrapped in a view or
procedure:
• Procedures with no outputs. There is no data to cache in this case.
• XML files that have been introspected.
• System tables.
• Non-data sources such as folders and definition sets.
The following topics are covered and should be performed in order:
• View Requirements and Restrictions for Push-Based Incremental Caching,
page 629
• Requirements for Push-Based Incremental Caching, page 630
• Adding the Change Notification Custom Java Procedures, page 636
• Defining Connectors, page 637
• Install the Central Server, page 638
• Configuring Push-Based Incremental Caching and Change Management,
page 638
• Publishing the cms_call_propagator Procedure, page 639
• Configuring Oracle Data Sources for Change Notification, page 640
• Setting Up a Push-Based Incremental Cache in Studio, page 641
• Publish Push-Based Incremental Caches, page 642
• Recovering from a Messaging Error, page 642
• Push-Based Incremental Caching Configuration Parameters, page 644
View Requirements and Restrictions for Push-Based Incremental Caching
When you decide to use a view for push-based incremental caching, you must
consider how the view is defined and how the data source is monitored for
change capture.
Subscribed views must have explicitly defined or implicitly inferred keys.
Composite views (subqueries) are supported if they occur in the FROM clause.
Incremental caches do not support INTERVALDAYTOSECOND and
INTERVALYEARTOMONTH data types as key values.
The following view operators are supported.
Operator Supported
Renaming Yes.
Example: C1 AS C2
Projection Yes, for creating a view from two table columns with scalar function support.
Example: SELECT UPPER(C1)
Left Outer Join Yes, if only the left side is monitored by GoldenGate. Otherwise, no. The
restriction applies because processing of the data change for left outer joins
could result in inaccurate data.
Right Outer Join Yes, if only the right operand is monitored by GoldenGate. Otherwise, no. The
restriction applies because processing of the data change for right outer joins
could result in inaccurate data.
• You must install and configure the following products, using your own
licensing and customer agreements with the companies that distribute the
products:
— TIBCO Enterprise Message Service (EMS) and JMS
— Oracle database
— Oracle GoldenGate change data capture
— Apache Geronimo
This section provides some guidelines for configuring these systems to work
with the TDV push-based incremental caching feature, but you must follow
the requirements and instructions as detailed in the documentation for each
product.
• Install a valid Active Cluster instance.
• Before you configure push-based incremental caching, you must locate the
following:
— cscms.jar, composite_cdc.jar, cscms_call_propagator.jar
— tibjms.jar, tibjmsadmin.jar
— axis2-adb-1.3.jar
• You must also complete the steps and address the suggestions in the following
sections:
— Configuring Oracle Database for Push-based Incremental Caching,
page 631
— Installing and Configuring Oracle GoldenGate, page 633
— Installing the TIBCO JAR File, page 635
— Configuring JMS Pump for Oracle GoldenGate Using TIBCO EMS,
page 635
“Change” messaging from the change-data capture (CDC) service to the specified
data source topic is required; without it, no view updates are possible. Refer to
Installing and Configuring Oracle GoldenGate, page 633 for information on
making the CDC service compatible for use with the TDV Software Change
Management Service.
Push-based incremental caching requires an Oracle account that grants the
following permissions to connect with and use the targeted Oracle database as a
subscription store:
• Create subscription-related tables and indexes.
• Introspect required tables. Both READ and SELECT access are required.
• Execute regular SQL queries against data sources and tables of interest.
• Grant EXECUTE, INSERT, UPDATE, and DELETE permissions against those tables.
• Execute flashback queries against data sources and tables of interest.
Oracle databases used to store incrementally cached views and views involving
joins of monitored tables must be configured with cursor_sharing set to EXACT.
Other settings for cursor_sharing result in degradation of performance.
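A sketch of the corresponding statement, to be run with appropriate DBA privileges and scope for your environment:
SQL> ALTER SYSTEM SET cursor_sharing = 'EXACT';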
Or, log in to the database server as the schema owner for the tables that are to
be monitored and change the supplemental logging for individual tables.
SQL> ALTER TABLE <tableName> ADD SUPPLEMENTAL LOG DATA (ALL)
COLUMNS;
3. Make sure that the Oracle databases have archivelog and flashback enabled.
To enable flashback, get the system change number (SCN) value using:
SELECT dbms_flashback.get_system_change_number FROM dual;
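As a quick check that both features are enabled, the following standard query against V$DATABASE reports the current modes (a sketch; your verification procedure may differ):
SQL> SELECT log_mode, flashback_on FROM v$database;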
To configure GoldenGate
1. Install GoldenGate co-located with the Oracle data source that is to be
monitored. Refer to the GoldenGate documents for a description of the
requirements and installation considerations.
2. Copy the composite_cdc.jar file from the TDV installation directory to the
directory:
<$GoldenGateHome>/GGIEJ_304/javaue/resources/lib
The TDV CDC JAR inserts a header into the GoldenGate messages that includes the SCN value for the transaction, and must be on the classpath of the Java UE.
3. Log in to the Oracle GoldenGate server and navigate to: $GGHOME/dirprm.
4. Using the GGSCI command line utility, create an EXTRACT group.
GGSCI (comp-name)
1> ADD EXTRACT <ExtractGroupName>, TRANLOG, BEGIN NOW
XX is the two-character name of the exttrail filename group. All trail files have
this prefix with a six-digit incrementing numeral, and they are placed in the
$GGHOME/dirdat directory. The GGSCI utility creates a new exttrail and an
extract parameter file. This file defines the table resources for monitoring and
extraction.
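The exttrail itself is registered with a GGSCI command along the following lines; this is an illustrative sketch, and the authoritative steps are in the GoldenGate documentation:
GGSCI (comp-name) 2> ADD EXTTRAIL ./dirdat/XX, EXTRACT <ExtractGroupName>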
7. Edit the <ExtractGroupName>.prm parameter file, using a text editor, with
the following command:
GGSCI (comp-name)
3> EDIT PARAMS <ExtractGroupName>
--GROUPTRANSOPS 10
REPORTCOUNT EVERY 1 SECONDS, RATE
-- The following three parameters are required to ensure
compatibility with CMS:
GETUPDATEBEFORES
NOCOMPRESSUPDATES
NOCOMPRESSDELETES
-- Monitored tables are listed here. The following tables are used
as examples.
table qacntrial.address;
table qacntrial.bid;
table qacntrial.category;
table qacntrial.creditcard;
table qacntrial.favoritesearch;
table qacntrial.item;
table qacntrial.sale;
table qacntrial.users;
2. Copy the composite_cdc.jar file from the TDV cscms.jar installation directory
to the directory:
<$GoldenGateHome>/GGIEJ_304/javaue/resources/lib
The TDV CDC JAR inserts a header into GoldenGate messages that includes the SCN value for the transaction, and must be on the classpath of the Java user environment.
3. Restart the Oracle GoldenGate CDC processes.
Load the Java procedures to set up change notification and incremental caching.
The CMS Change Engine (CE) provides core functionality for the Change
Notification feature. For example, it implements and manages the Virtual View
Resolver, incremental cache updating, and Change Data Notification (CDN).
These features are provided with a TDV Custom Java Procedure (CJP).
5. In Additional classpath (semicolon separated), type the paths to the JAR files,
using forward-slashes (“/”) in the path:
— <TDV_install_dir>/../axis2-adb-1.3.jar;
— <TDV_install_dir>/../cscms_call_propagator.jar;
— <TDV_install_dir>/../tibjmsadmin.jar
6. Click Create and Introspect.
7. Select all the procedures listed.
8. Click Next, Finish, and OK.
Defining Connectors
You need to define connectors to establish messaging between the servers, and to
ensure fault-tolerance. The connector settings should be identical between the
servers in your environment. The Servers require Connector Management
configurations after the TDV Server is restarted with the TIBCO interface JAR.
The Input EMS and Output EMS could be the same connector on the same TIBCO
instance, but because of the amount of traffic expected, and for scalability, input
and output messages are best handled by separate queue connectors. The Input
EMS is the default Connector handling messages from GoldenGate JMS Pump.
The Output EMS handles TDV-to-client change message notifications for
subscribing clients.
To define connectors
1. Follow the steps in “Configuring TDV for Using a JMS Broker” in the TDV
Administration Guide to use Manager to configure TDV for use with a JMS
broker.
2. Still in Manager, select CONFIGURATION > Connectors.
You are going to create a JMS Connector to connect with the TIBCO Queue
and a Queue Connection Factory (QCF).
3. Click Add Connector.
4. Type a name for the connector.
5. Select the JMS via JNDI tab.
6. Define the Input EMS JMS connector information.
This connector takes the messages from GoldenGate and passes them to TDV
for processing. Your TIBCO EMS and deployment environment dictates the
values you need to enter for the Connector parameters. Type c in Initial
Context Factory and select
com.tibco.tibjms.naming.TibjmsInitialContextFactory from the list of
factories. Type the JNDI Provider URL (for example,
tibjmsnaming://dev-tibco-6:7222).
7. Record the name you gave to the Input EMS Connector.
The following Central Event Server settings are required for CMS functionality.
To configure messaging
1. Open the TDV Configuration panel by selecting Administration >
Configuration from the Studio menu bar, and navigate to Change
Management Service > Central Function > Messaging.
2. Configure Input EMS > Default Connector.
This is the name of the Input EMS connector you defined using Manager. It is
the messaging connection that the Central Event Server uses to receive
Change Data Capture data source monitoring change notifications.
3. Configure Internal EMS > Queue Connector.
4. Configure Output EMS > Default Connector.
This is the name of the Output EMS connector you defined using Manager. It
is the messaging connection that the Central Event Server uses to forward
change notifications to subscribing clients.
5. Configure Output EMS > Security Domain.
6. Review the configuration parameters under Internal EMS and change them
based on the information in Push-Based Incremental Caching Configuration
Parameters, page 644.
5. Verify that you can see the procedure in the resource tree.
Data sources must have a change notification connector and a topic name
specified.
Derived views must be available to be subscribed to, or have cached views that
are incrementally maintained. The topic name must be specified on the Change
Notification tab of the data source; otherwise, the data source will not receive data
source change messages from the Oracle GoldenGate process.
Note: The Change Notification tab for the data source only appears after the
Central Event Server has been configured.
The data source Change Notification tab defines the Topic Name to which Oracle
GoldenGate is configured to send change notification messages.
Each watched data source must specify a topic on which to receive change
notification messaging from the Oracle GoldenGate utility. The topic can use the
Input EMS Default Connector or a different JMS Connector specific for that data
source.
If a single instance of Oracle GoldenGate has been sending nonconforming
messages, it is possible that only the topics affected by that Oracle GoldenGate
instance need to be purged rather than all of the data sources topics.
For the resource being cached, storage depends on the result set output, and on
retrieval of the entire result set or retrieval of a filtered subset. Decide on a storage
type for caching the execution result. For information on how TDV cached data
types are mapped to data source data types, see “Cache Data Type Mapping” in
the TDV Reference Guide.
Note: For tables, the expiration period applies to the whole table. For procedures,
each input variant’s data has its expiration tracked separately.
If a resource uses a data source that was originally added to TDV Server using the
pass-through mode without saving the password, row-based security can affect
the cache refresh functionality. For details on Save Password and Pass-through
Login, see Adding a Data Source, page 68.
For information about the objects created by the cache, see Caching to a File
Target, page 496.
To enable caching, select the cache storage type, and schedule caching
1. Make sure the data source and tables on which the view is based are set up to
be monitored for changes and the data source has a connector and topic
specified on which change notification messages will be received.
2. In Studio, open a view or procedure.
3. Select the Caching tab.
4. Click Create Cache.
5. Under Status, select the Enable check box.
6. Under Storage, specify the storage type as User Specified.
7. Use Browse to locate and specify the data source. After you select the data
source, its full path is displayed in the Data Source field.
8. Use the Open button to open the data source to create two tables, one for
storing cache status data and the other for storing cache tracking data.
9. (Optionally) If you want to rename the cache status table, in the Caching
section of the data source Configuration tab, select Browse to the right of
Status Table.
10. Close the data source Configuration tab. Navigate back to the view or
procedure with the open Caching panel.
11. Use Browse to the right of result in the Table for Caching section to specify a
location within the data source to create a table for caching.
This location can be the root level of the data source or a schema.
12. In the Select cache table for result window, select the location and type a name
under Create Table.
13. Click Create, examine the DDL for result code, and click Execute.
14. Open the table you created, go to its Caching panel, and click Create Cache.
15. On the Caching panel, select Incrementally Maintained.
16. Save the cache settings. After cache settings are saved the resource appears
with a lightning-bolt icon in the Studio resource tree to show that the resource
is cached.
After a brief initial data load task, the cache view is actively updated. Status
for the cache is UP when the resource is saved. If cache initialization fails for
any reason, clear the Incrementally Maintained check box, save the view, and
then check the box again and save the view again. Because cache initialization
is not automatically fault-tolerant, a retry re-initializes the cache.
When a view cache is marked as incrementally maintained and the connection to
the source table or the target table caching database instance is down, the time it
takes to report the connection problem depends on operating system network
settings, such as the TCP default timeout.
If you have data type incompatibilities between your view and your data storage
type, see “Cache Data Type Mapping” in the TDV Reference Guide to determine the
best data storage option for your cache.
The cs_server.log should be checked for an exception message like the following
example:
Exception: java.lang.Exception: Invalid message element: expecting
'before' element, got 'missing'...
Invalid message element indicates that the messaging format was broken and
outside of schema requirements. CMS issues “<ops/>” messages to subscribing
clients, invalidates caches, and stops so that CMS and messaging can be restarted
from a configured state.
If the exception is a data type mismatch, it is likely that the source of the error was
not the Oracle GoldenGate process, and so those processes do not need to be
recreated. However, queues and topics might still have to be purged to clear the
system of corrupt data.
8. Use the TIBCO EMS administration utility command purge topic <topic_name> to purge the input EMS topics specified on the Change Notification panels of all data sources configured for incremental caching (a sample session is sketched below).
9. Restart the TDV Change Messaging Service on the Central Events Server.
10. Restart all disabled caches with the following for each incrementally
maintained cache.
a. Clear the Incrementally Maintained check box on the Caching tab of the
view.
b. Save the view.
c. Select the Incrementally Maintained check box and then save the view
again. Wait a few moments while the cache re-initializes itself by creating a
materialized view based on the SCN timestamp reported by the
GoldenGate monitoring utility.
Incrementally maintained caches that were marked as Disabled should then show
a status of Up, and those caches will be kept in sync, mirroring all changes made
to the watched data sources.
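For step 8 above, a typical tibemsadmin session to purge one of the input topics looks like the following sketch; the server URL, credentials, and topic name are placeholders for your environment:
tibemsadmin -server "tcp://dev-tibco-6:7222" -user admin
> purge topic cdc.source.topic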
Outbound EMS topics with subscribers listening for change notifications on
subscribed views automatically begin receiving change notification messaging
when changes to the watched sources are reported.
Parameter/Group Description
Administration For a description, see under parameter group Logging Options.
Attempt Count Under Central Function > Fault Tolerance. The number of times CMS attempts
connection recovery after a failure. With the default setting of “0” CMS
attempts connection recovery with data sources, EMS, and other active cluster
members until all connections are restored.
Attempt Delay Under Central Function > Fault Tolerance. This value denotes the delay, in
milliseconds, between attempts to recover from a connection failure. The
default value is 1000.
Change Inference: For a description, see under parameter group Logging Options.
Change Inference Messages: For a description, see under parameter group Logging Options.
TDV-Managed Destinations: Under Central Function > Messaging. Typically, TDV is set to manage output messaging queue and topic destinations. TDV creates queues and topics according to the needs of the client subscriptions, in transit events, and transient cache buffers. If TDV-Managed Destinations is set to false, all of the Messaging > Internal EMS properties must be set.
Connection Attempt Delay: For a description, see under parameter group Tibco.
Connection Attempt Timeout: For a description, see under parameter group Tibco.
Data Sources For a description, see under parameter group Logging Options.
Default Connector Under Central Function > Messaging > Input EMS. The name of the EMS topic
connection factory for receiving change notification messages from Oracle
GoldenGate through TIBCO EMS. Data source specific connectors and topics
can be defined to override the default connector. The value of the Input EMS
Default Connector server property is the name of the JMS via JNDI Connector.
JMS connector attributes like initial context factory and JNDI provider URL
are defined on the CONNECTOR MANAGEMENT page of the Manager
(Web) interface.
Default Connector Under Central Function > Messaging > Output EMS. The name of the EMS
topic connection factory to be used for creating topics to send change
notification messages from TDV to subscribed clients. Individual topics are
created dynamically based on client subscription requests. Data source
specific EMS Connectors and topics can be defined to override the default
connector. The value of the Output EMS Default Connector is the name of the
JMS via JNDI Connector.
Error Handling Procedure: The name of the procedure to call in case an invalid message has been received. The procedure must accept three arguments: a string value for the error, a string value for the offending message, and a timestamp for the time at which the error was reported.
Expiration Period The number of hours a client subscription remains valid without renewal to
monitor a view for changes. Client subscription renewals before the expiration
period has elapsed result in extensions that last for another expiration period.
The expiration period begins when the client subscription is created and it is
reported to the client subscriber.
In Transit Event Queue: Under Central Function > Messaging > Internal EMS. If TDV-Managed Destinations is set to false, this parameter must be set to a valid queue created to host in-transit change events. The In Transit Event Queue is used to ensure fault tolerance so that if a server instance crashes, event processing can resume, starting from the last successfully processed message.
Include Column Names: Under Central Function > Messaging > Message Format. The following example shows an update message ("U") with a format that includes both the rank ("r") and the column names ("n"):
<U a="true">
<k r="0" n="Column Name 0">value0</k>
<k r="1" n="Column Name 1">value1</k>
<c r="2" n="Column Name 2">
<n>newvalue2</n>
<o>oldvalue2</o>
</c>
<c r="3" n="Column Name 3">
<n>newvalue3</n>
<o>oldvalue3</o>
</c>
<c r="4" n="Column Name 4">
<n>newvalue4</n>
</c>
<c r="5" n="Column Name 5">
<n>newvalue5</n>
</c>
</U>
Include non-Key Deleted Values: Under Central Function > Messaging > Message Format. Change notification messages always include a primary key that might have one or many columns to ensure a unique key on which create, update, and delete operations were detected. For Delete operations, the message setting can be set to include only key identifying values, or all values, of the row that has been deleted.
If set to true, extraneous column information is included in the change notification message, as in the following example where a delete operation, <D>, includes the key columns ("k") and non-key values {value3, value4, xsi:nil="true"}:
<ops>
<D>
<k r="0" n="Column Name 0">value1</k>
<k r="1" n="Column Name 1">value2</k>
<c r="2" n="Column Name 2">value3</c>
<c r="3" n="Column Name 3">value4</c>
<c r="4" n="Column Name 4" xsi:nil="true"></c>
</D>
</ops>
If set to false, the “c” columns would not appear in the message.
Include non-Updated Values: Under Central Function > Messaging > Message Format. When set to false, change notification messages exclude values that are not changed since the last checkpoint of the table being monitored.
For example:
<U a="true">
<k r="0" n="Column Name 0">value0</k>
<k r="1" n="Column Name 1">value1</k>
<c r="2" n="Column Name 2">
<n>newvalue2</n>
<o>oldvalue2</o>
</c>
<c r="3" n="Column Name 3">
<n>newvalue3</n>
<o>oldvalue3</o>
</c>
</U>
Include Old Updated Values: Under Central Function > Messaging > Message Format. If set to true, change notification messages include the old values that are being replaced.
For example, this message highlights the old values (<o> </o>) that would be omitted if the property were set to false.
<U a="true">
<k r="0" n="Column Name 0">value0</k>
<k r="1" n="Column Name 1">value1</k>
<c r="2" n="Column Name 2">
<n>newvalue2</n>
<o>oldvalue2</o>
</c>
<c r="3" n="Column Name 3">
<n>newvalue3</n>
<o>oldvalue3</o>
</c>
</U>
Logging Options Parameter group. Typically, logging options are set to false. Log entries and
events can be found in the TDV events log (cs_server_events.log) and/or
server log (cs_server.log) files present in the server <TDV_install_dir>/logs
directory. Each of the following logging options can be turned on (true) or off
(false):
Administration—CMS installation or uninstallation, CMS start or stop.
Caching—Logging for incrementally maintained caching of views.
Change Inference—This setting logs messages received through connected
EMS topics. The Input EMS Default Connector and change messages are
passed from CDC when triggered by changes to watched data source tables.
This setting does not show change data in the logs.
Change Inference Messages—This setting logs more detailed messages
(including data changes) received through connected EMS topics. The
messages logged by this setting can help verify whether change data capture
messaging is arriving to the EMS topics and connectors currently listening for
change messaging.
Data Sources—This setting logs activities on the push-based incremental caching data sources.
Metadata Inference—Information captured by this log setting shows dependency analysis messages.
Subscriptions—Subscription requests, unsubscribe, and renew subscription
requests are logged when this setting is true.
Message Format All of the messaging parameters can be turned on (true) or off (false). The
parameter settings affect the content of the change notification message
format. Message formatting is a global setting that affects all change
notification messaging received by subscribed clients.
Message Handling Correct change notification is dependent on correct change messaging sent
from the monitored data source tables. Messaging ensures that incrementally
maintained caches and subscribed users have notification of changes in the
source data. If messages from the monitored data sources are malformed or
lacking in content, the Change Management Service might treat the message
as a systemic error requiring manual assessment and reconfiguration,
generate logging errors (in cs_server.log), and possibly abort. If messaging
errors occur, check the CMS server log file, and check the Oracle GoldenGate
*.prm parameter file for configuration.
PROCEDURE OnError(IN ex VARCHAR(2147483647), IN msg
VARCHAR(2147483647), IN t TIMESTAMP)
BEGIN
-- Add your code here
CALL /lib/debug/Log(concat('M5 Test[GG erroneous
message]---Exception: ',ex));
CALL /lib/debug/Log(concat('M5 Test[GG erroneous
message]---Received message: ',msg));
CALL /lib/debug/Log(CAST(concat('M5 Test[GG erroneous
message]---Time: ',t) AS VARCHAR(255)));
END
On Error Procedure: Sets a procedure to execute when a messaging error occurs. Specify the full path name of the TDV Server procedure to call if an invalid message is encountered. The procedure specified must accept three input arguments: a string value for the error, a string value for the offending message, and a timestamp for the time at which the error is reported.
Persistence Store The TDV resource path name that points to a relational data source container
that should be used to host a subscription store table, which is created and
maintained by CMS. On application of a valid TDV persistence store location,
CMS creates the cis_cms_appl_subs table. That table records client invocations
of cms_client_subscribe with data such as the Subscription ID, Application ID,
an encoded expiration, the full path of the subscribed view, and subscribed
columns if different from the full view.
Because the persistence store, cis_cms_appl_subs table, should not be shared
between TDV nodes, the Persistence Store target setting should be changed
after synchronization by backup server import.
Queue Connector Under Central Function > Messaging > Internal EMS. The name of the JMS
Connector to be used by CMS to create an in-transit event queue. The
in-transit event queue is used to maintain in-transit change events in a
fault-tolerant manner, while they are being processed by the Central Event
server. The value of the Internal EMS Queue Connector server property is the
name of the JMS via JNDI Connector. A new or a changed Queue Connector
setting requires a CMS restart to initialize the in-transit queue.
Security Domain Under Central Function > Messaging > Output EMS. The LDAP domain value
that can be used to restrict topic connection to permitted groups and users for
the subscribed view.
Started Under Central Function. This setting is an indicator that shows whether CMS
is started or not. It should not be modified because the value is not a switch. It
is an indicator provided to show CMS status. CMS can be started and stopped
with the procedures cms_admin_start and cms_admin_stop.
Stop on Any Error Under Central Function > Fault Tolerance > Message Handling. CMS aborts
on receiving source change messages with an invalid message structure. This
flag specifies whether CMS aborts if any other types of errors are detected in a
source change message notification. Invalid GG configuration is a systemic
error.
Partial compliance with the expected change notification message schema is a
more serious error than inadvertent messages. Some data type mismatches are
handled, but a mismatch invalidates all derived incremental caches and sends
a notification to all subscribed clients (<ops/>, set with a message property of
aborted="true") to signify a break in the monitored data.
When Stop on Any Error is set to false, most messaging errors are still serious,
and such errors invalidate all incrementally maintained caches, subscription
failure messages, and a purposeful stop of CMS. If a message significantly
deviates from the expected CMS schema and is inserted into a monitored
input topic or queue, that message is ignored.
Tibco Parameter group. These optional settings override the TIBCO EMS connection
factory settings, so that CMS can manage timing and number of connection
attempts. The switch from a primary to a backup EMS can initiate a large
number of connection and reconnection attempts that can exceed the number
allowed.
Connection Attempt Count—Number of attempts made to connect to the JMS
server.
Connection Attempt Delay—Milliseconds to wait before the next connection is
made.
Connection Attempt Timeout—Milliseconds the connection factory waits
before declaring a failure if the attempt is unsuccessful.
Reconnection Attempt Count—Number of attempts to make to reconnect to
the JMS server. (TIBCO has separate connect and reconnect attempt
parameters.)
Reconnection Attempt Delay—Milliseconds to wait before the next connection
is made.
Reconnection Attempt Timeout—Milliseconds the connection factory waits
before declaring a failure.
For hard connection failures, problems can be reported earlier than these JMS
vendor settings indicate.
Topic Connector Under Central Function > Messaging > Internal EMS. If TDV-Managed
Destinations is set to false, this parameter must be set to a valid topic
connector for TDV to manage transient cache buffers, which might be needed
while an incrementally maintained data cache is being initialized.
Transient Cache Buffering Topic: Under Central Function > Messaging > Internal EMS. If TDV-Managed Destinations is set to false, this parameter must be set to a valid topic so that a transient cache buffer can be set up to provide a temporary topic to handle change messages during cache initialization. The Central Event Server uses this topic to handle change notifications that occur while the cache table is being created and loaded for the first time.
This topic describes managing legacy web services. The following topics are covered:
• Limitations for Legacy Web Service Conversion, page 653
• Converting Legacy Web Service, page 654
If the data service contains a service or operation that TDV does not support, the
resource tree does not display that service or operation. For every object that is
not supported in TDV, an error is generated in the cs_server.log.
After migration, a report dialog displays messages indicating what needs to be
fixed for the web service to be fully functional within TDV. Resolve all warning
messages. Consider reconfiguring any older communication protocols between
the converted web service and any client applications that consumed the legacy
web service.
Limitations Include
• JMS transport is not supported.
• WSDL access authentication method is not supported.
• Combined authentication methods for service endpoint URLs are not
supported.
• Multiple services and multiple ports are not supported.
They are converted to different services, and ports are split into separate web
services after migration.
• Wild XML output types are not supported.
If present they cause conversion of the legacy web service to fail.
• Enclosed keystores are not supported.
Before upgrade, the enclosed keystore of the legacy web service needs to be
integrated with a server key store.
Legacy Web services that you might have can be converted, with some
limitations, to the currently used Web services paradigm within Studio. In TDV, a
legacy WSDL Web service is an older way to create WSDL that has been retained
to support existing implementations.
3. Right click the legacy web service that you want to convert.
4. Select Upgrade Legacy Web Service.
5. Type a name you want to give to the copy of your old web service.
If warnings are listed, your web service might not convert using the wizard.
Consider deleting or rewriting your older web services to comply with the updated TDV web services paradigm.
6. Click OK twice to close the window and complete the action.
7. Right click the legacy web service that you want to convert.
8. Select Upgrade Legacy Web Service
9. Select the radio button underneath Web Service Name:.
10. Type a name for the converted web service.
11. Click OK twice. Depending on the original structure of your legacy web
service you might get more than one new web service. For example:
This topic contains panel, toolbar, menu, and other user interface reference material. It includes the following topics:
• Studio Menus, page 657
• Studio Status Bar, page 659
• Studio Code Editing Keyboard Shortcuts, page 660
• View and Table Editor Panel Reference, page 661
• Procedure Editor Panel Reference, page 664
• Trigger Editor Panel Reference, page 670
• SQL Definition Set Editor, page 675
• XML Definition Set Editor, page 675
• WSDL Definition Set Editor, page 677
Studio Menus
The Studio menus contain options applicable to the current window and context.
These menus are subject to change and this section might not include listings for
options that you see in your version of Studio.
Almost every container or resource in the Modeler resource tree has a right-click
menu. The exact listing of options varies by resource type and version of Studio
that you are running. See these sections for a quick summary:
• File Menu, page 658
• Edit Menu, page 658
• View Menu, page 658
• Resource Menu, page 659
File Menu
Available menu options vary depending on where you are in Studio. Many of the
options standard to most user interfaces have been omitted.
Refresh “<resource>” F5
Save As Alt+S
Edit Menu
Many of the options standard to most user interfaces have been omitted.
Shift-Left Alt+Left
Shift-Right Alt+Right
View Menu
Many of the options standard to most user interfaces have been omitted.
Resource Menu
Available options vary depending on where you are in Studio.
The Studio status bar is displayed in the lower right corner of the Studio window.
It displays the encryption status of the connection between Studio and TDV. It
also displays the current Studio memory usage, and enables you to run JVM
garbage collection to free up memory space no longer in use.
Object Description
Garbage Collection Button: This button initiates a JVM (Java Virtual Machine) garbage collection cycle to free up unused Studio memory. If the TDV Server is on the same computer as the Studio client, this button initiates a garbage collection cycle on the shared JVM.
Studio Memory Use Indicator Bar: This bar shows the total memory available to Studio and the memory it is currently using. Even if TDV Server is on the same computer, this indicator shows only the Studio usage.
Studio-Server Connection Type: A closed-lock icon indicates that SSL protection is being applied to the connection between Studio and the TDV Server.
Encrypted connection / Unencrypted connection: If Studio and TDV are on the same computer, or if they are both within a secured LAN, it is less important to encrypt the connection between the two. If the connection must be secured, use the Encrypt check box at login to initiate an SSL connection between Studio and TDV.
Studio keyboard shortcuts help you write and edit code in the SQL, XML, XSLT,
or XQuery editors.
Redo Ctrl+Y
Searching
Search text in panel (Ctrl+F): Opens a Search field at the top of the text editor. All searches wrap at the end to include the entire text. If you type a character that creates a string that is not present in the SQL text, the search field turns red. Up and down arrow icons let you determine search direction. Check or uncheck the Case sensitive box to set case sensitivity.
Search and replace text in panel (Ctrl+R): Opens a window to find and replace a string within the text editor. Options are the same as in the Search field and its controls.
Selecting
Select text to left of cursor (Shift+Home): Selects the text from the cursor position to the beginning of the line.
Select text back to start (Ctrl+Shift+Home): Selects all text from the cursor to the start of the text.
Select all to end of text (Ctrl+Shift+End): Selects from the cursor to the end of the text.
Select word left (Ctrl+Shift+Left_arrow): Selects from the cursor back to the beginning of the word the cursor is in.
Select word right (Ctrl+Shift+Right_arrow): Selects from the cursor forward to the end of the word the cursor is in.
Select one character (Shift+Left_arrow or Shift+Right_arrow): Selects one character to the left or right of the cursor or selection.
Moving
Move cursor home Ctrl+Home Moves cursor to the start of text in the editor.
Move cursor to end Ctrl+End Moves cursor to the end of the text in the editor.
of text
Shift lines Alt+ Shifts selected lines left or right one tab setting.
Left_arrow
Alt+
Right_arrow
Deleting
Delete selection Ctrl+D Deletes selected text or lines.
Delete one word to Ctrl+ Deletes one word (or the remainder of the word) to the left
left of cursor Backspace of the cursor.
Delete one word to Ctrl+Delete Deletes one word (or the remainder of the word) to the
right of cursor right of the cursor.
View and Table Editor Panel Reference
The view and table editor has a collection of right-pane panels for viewing and
editing view properties, depending on what is available for the resource you are
working with. You can open these panels by clicking the tabs at the bottom of the
right pane:
Model Panel
Use the Model panel as a graphical tool to add resources to a view, and to
jump-start design of the query that defines the view. For further details, see
Designing a View and Table Resource, page 220.
Grid Panel
For details on using the Grid panel, see Designing a View in the Grid Panel,
page 230.
SQL Panel
For details on using the SQL panel, see Designing SQL for a View in the SQL
Panel, page 237.
Columns Panel
See Designing Column Projections Using the Columns Panel, page 245 for more
information.
Indexes Panel
For details about using this panel, see Defining Primary Key for a View or Table in
the Indexes Panel, page 238.
Rebind Panel
The Rebind panel allows you to rebind the view resources to the appropriate
sources if the names or locations of those sources have changed. Display the
Rebind panel by clicking the Show Rebind Panel button on the editor toolbar. The
Rebind panel displays the schema of the resources that the given view depends
on.
For details about using this panel, see Rebinding a View, page 248.
Result Panel
The Result panel appears in the bottom area of a view or procedure editor panel
when you click the Execute button on the editor toolbar. For queries, the Result
panel displays up to 50 lines of execution results. For some extremely large
results, such as those generated by using EXTEND on an array to generate
BIGINT number values, some of the data might be truncated without a visual
indication by Studio.
For transformations, the Result panel displays the target schema, XML text, or
query output.
When you close the Result panel, any query or procedure execution in progress
for the Studio display is canceled.
For details, see Generating a Query Execution (Explain) Plan, page 247 and
Executing a Procedure in Studio, page 305.
Note: The words “Identity Simulated” appear in the upper right corner of the
Execution Plan if row-based security is in use and you are using a simulated
identity to test that the results are as expected. This is done using the Test Identity
tab in the View editor.
See Working with the SQL Execution Plan, page 553 for information about
generating and understanding the execution plan.
Procedure Editor Panel Reference
Studio provides a procedure editor for creating and modifying various types of
procedures.
This section introduces the following panels:
• Introduction to Procedure Panels and Tabs, page 665
• Model Panel, page 666
Introduction to Procedure Panels and Tabs
The procedure editor displays a different combination of panels depending on the
procedure type. The available panels are drawn from the following set: Info, SQL,
Grid, XSLT, Model, Inputs, XQuery, Outputs, Caching, Data Map, SQL Script, and
Parameters. Procedure types include SQL script, custom Java procedure, packaged
query, parameterized query, physical stored procedure, XML to tabular mapping,
XSLT transformation, XSLT procedure, streaming transformation, XQuery
transformation, and XQuery procedure. The sections that follow describe the panels
that are shared by several procedure types.
Model Panel
The Model panel is available for Parameterized Queries and XQuery
Transformations.
• For a parameterized query, use the Model panel to add resources. You can add
any type of table (relational, LDAP, delimited file) or a procedure that outputs
either a set of scalars or a single cursor. This panel works with the Grid Panel,
where you can specify the output column projections and the constraints on a
query. The Model panel is permanently disabled if you edit the script in the
SQL Script Panel for a parameterized query. A conceptual sketch of this kind of
query appears at the end of this section.
• For an XQuery transformation, use this panel for specifying the sources and
target values that provide the data for the output XML document. The Model
panel is permanently disabled if you edit the script in the XQuery panel.
For information about using the Model panel for XQuery transformations, see
Creating an XQuery Transformation, page 320.
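To make the parameterized-query case above concrete, the following is a conceptual
sketch only, using a hypothetical orders table and column names; it illustrates the
kind of query that a parameterized query represents (a SELECT whose constraint
references a declared input parameter), not the exact script that Studio generates.

    -- Conceptual sketch with hypothetical names; not Studio-generated text.
    -- desired_customer_id stands for an input parameter defined on the Parameters panel.
    SELECT order_id, order_date, total
    FROM /shared/examples/orders
    WHERE customer_id = desired_customer_id

In the Grid panel, order_id, order_date, and total would be the output column
projections, and the WHERE condition would be the constraint on the query.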
Grid Panel
The Grid panel in the procedure editor is available for Parameterized Queries.
Use this panel for including columns in the output that is obtained when a
parameterized query is executed. This panel works with the Model Panel. The
Grid panel is permanently disabled if you edit the script in the SQL Script Panel
for the parameterized query.
SQL Panel
The SQL panel is available for Packaged Queries. Use this panel to formulate and
edit the SQL native to the data source targeted for your packaged query.
This panel works with the Parameters Panel. The parameters you use in this panel
must match the parameters designed in the Parameters panel, including the order
in which they are provided.
For details about using this panel, see Creating a Packaged Query, page 285.
Parameters Panel
The Parameters panel is available for custom Java Procedures, SQL scripts, certain
Transformations, Packaged Queries, and Parameterized Queries. This panel
displays the input and output parameters for a procedure, and works with the
SQL Script panel (or the SQL panel). The parameters you design in this panel
must match the parameters defined in the SQL Script panel (or the SQL panel),
including the order in which they are provided.
• For Java procedures and transformations, you can only view the parameters
in the Parameters panel, because the parameters are defined in the source Java
code (for a Java procedure) and in the XML or WSDL (for a transformation).
Each parameter is displayed with its TDV JDBC data type and the data type
native to the corresponding data source. The output parameters shown in this
panel are rendered as columns in the result set when you execute the
procedure.
• For a SQL script, packaged query, or parameterized query, use the Parameters
panel to design and edit the parameters, as shown in the sketch below.
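For a SQL script, a minimal sketch follows, assuming a hypothetical resource path
(/shared/examples/customers) and column names; the IN and OUT declarations in the
script header are what the Parameters panel reflects, in the same order.

    -- Hypothetical example; the path and column names are placeholders.
    PROCEDURE LookupCustomerName(IN customer_id INTEGER, OUT customer_name VARCHAR(255))
    BEGIN
        SELECT name INTO customer_name
        FROM /shared/examples/customers
        WHERE id = customer_id;
    END

Here customer_id is listed on the Parameters panel as an input parameter and
customer_name as an output parameter, in that order.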
XQuery Panel
The XQuery panel is available for XQuery transformations and procedures. Use
this panel to view an XQuery that returns an XML document when the
transformation is executed. The system auto-generates the XQuery when you use
the Model Panel to design the output for the transformation.
You can edit the XQuery in this panel, but after you start editing the XQuery, the
Model panel is permanently disabled. When editing XQuery directly, you can use
the same keyboard shortcuts as those used to edit SQL. See Studio Code Editing
Keyboard Shortcuts, page 660.
Inputs Panel
The Inputs panel is available for certain types of transformations (XSLT,
streaming, and XQuery). Use this panel to:
• View the input columns if you have used the Data Map panel to generate the
XSLT for an XSLT transformation.
• Supply global input parameters for an XQuery transformation.
Outputs Panel
The Outputs panel is available for certain types of transformations (XSLT,
streaming, and XQuery). Use this panel to:
• View the output columns if you have used the Data Map panel to generate the
XSLT for an XSLT transformation. If you edit the XSLT, disabling the Data
Map panel, you must use the Outputs panel to manually design the output
columns for the transformation.
• Supply global output parameters for an XQuery transformation. If you edit
the XQuery, disabling the Model panel, you must choose an appropriate
schema for the output XML document.
The output parameters shown in the Outputs panel are rendered as columns in
the result set when you execute the transformation.
For details about using the Outputs panel, see Adding Target Columns through
the Outputs Panel (XSLT), page 319.
Trigger Editor Panel Reference
Condition Tab
The Enable Trigger check box is at the top of the Condition tab. This check box
must be selected to activate a trigger.
The Trigger Condition tab has four panes in which you specify the trigger
information:
Condition Pane
The contents of the Condition pane depend on the condition type selected.
Action Pane
The contents of the Action pane depend on the action type (described in Action
Types for a Trigger, page 395). The options for each action type are described
below.
Execute Procedure
This action lets you trigger execution of a procedure without sending any
notification. The Action pane lists the fields to fill in:
Procedure Path—Specify the procedure to execute:
• Type the procedure path, or open the procedure and copy the path and name
from the Name field at the top of the procedure’s Info tab.
• Click the Browse button to open a dialog box that lists the resource tree folders
that contain procedures. When you make a valid selection, the OK button becomes
available.
Parameter Values—Specify any parameters and values that the procedure or
view uses:
• Click the Edit button to open a dialog box if the procedure takes parameters.
Type values in the fields. For fields you leave blank, check (or leave checked) the
Null box to the right of the field. When you click OK, an ordered, comma-separated
list of parameter values is posted to the Parameter Values field.
• Type (or edit after copy-pasting a starter set from the Parameter Values field)
an ordered, comma-separated list of input parameter values.
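For example, for a hypothetical procedure whose first two input parameters are an
integer order ID and an integer quantity, the Parameter Values field might contain:

    1001, 25

The values must be listed in the same order as the procedure's input parameters.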
Note: Only static input parameters can be used by the trigger, but the procedure
invoked can use whatever variables it might require.
Exhaust output cursors—If the procedure is a SQL script that contains a PIPE
cursor, and if the pipe needs to fully execute and load all rows to complete
execution of a trigger (for example, to email a result set), checking this box forces
the trigger to wait until all the rows are buffered and read before completing. If
you leave this box unchecked, the trigger returns while the pipe is still buffering,
cutting off the buffering of the pipe.
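To illustrate, here is a rough sketch of a SQL script with a PIPE output cursor,
using hypothetical names; rows inserted into the pipe stream to the caller as they
are produced, which is why a trigger may need to exhaust the cursor before it can
complete.

    -- Hypothetical sketch of a procedure with a PIPE output parameter.
    PROCEDURE ListRecentOrders(OUT recent_orders PIPE (order_id INTEGER, total DECIMAL(10,2)))
    BEGIN
        -- Rows stream to the caller as they are inserted into the pipe.
        INSERT INTO recent_orders VALUES (1001, 250.00);
        INSERT INTO recent_orders VALUES (1002, 99.95);
    END

With Exhaust output cursors checked, the trigger waits until all such rows have
been buffered and read before it completes.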
Note: As soon as you begin typing text in any field of the dialog box, the NULL
box for that field is unchecked. If you decide to delete all text from the field, be
sure to recheck its NULL box.
Gather Statistics
This action lets you specify a data source for which to gather statistics data.
Data Source Path—Specify the data source path and name:
• Type the data source path; or open the data source, copy the contents from the
Name field at the top of the Info tab, and paste it here.
• Click the Browse button to open a dialog box that lists the resource tree folders
that contain data sources. When you make a valid selection, the OK button becomes
available.
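For example, for a hypothetical data source introspected into Studio at
/shared/datasources/ds_orders, the Data Source Path field would contain:

    /shared/datasources/ds_orders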
To see how this action works, select the data source to be scanned for data
statistics. After creating the trigger, view the Triggers console in Studio Manager
to check if the trigger fired as scheduled.
Send E-mail
This action lets you specify a procedure or view to execute, and then send the
results through email to one or more recipients. The Action pane lists the fields to
fill in.
SQL Definition Set Editor
The SQL definition set editor lets you create your own SQL type definitions,
exceptions, and constants.
• SQL Definition Set Editor Toolbars, page 675
See Creating a SQL Definition Set, page 379 for more information.
SQL Definition Set Editor Toolbars
Add: Opens a window where you can choose a data type for a type definition you want to add to a definition set. The new type definition is automatically assigned a unique, descriptive name that you can change by selecting Rename from the context menu in the Name column. Available data types are listed in About Definition Sets, page 377.
Change Type: (Types panel only) Opens a Choose Data Type dialog box in which you can change the column type. Available data types are listed in About Definition Sets, page 377.
Move Parameter Up, Move Parameter Down, Move Parameter Out, Move Parameter In: Move a column up, down, to the left, or to the right, respectively. These buttons are enabled only if an entry in the definition set is complex, such as a CURSOR, and it has non-complex base types (for example, binary, decimal, integer, string, time, or array) under it that can be reordered.
XML Definition Set Editor
The XML definition set editor lets you create XML schema definitions. You can
import definitions from existing XML files, validate definitions, format
definitions, and create definitions from an XML instance.
This editor has two panels—XML Schema and Info—which are described in the
following sections:
• XML Schema Panel (XML Definition Set), page 676
• XML Schema Panel Toolbar, page 676
See Creating an XML Definition Set, page 382 for more information.
Import XML Schema Definitions from File(s): Opens a dialog box in which to type or browse to the location of a file containing an XML schema definition.
Validate XML Schema Definitions: Requests validation from the definition set editor. If the definition set is not valid, an error box describes the error and its line and column. (The current cursor location is shown in the lower right corner of the panel in line:column format.)
Format XML Schema Definitions: Formats the definition set with hierarchical indentation, and with keywords in orange type and literal definitions in blue type.
Create XML Schema Definitions from XML Instance: Opens a window in which to navigate to an existing XML instance. When you click Open, the XML file appears on the XML Schema panel of the editor.
Add Definition Source: Opens an Add Definition Set Source window where you can type a name for a definition source. The name is added to the Select Definition Source drop-down list, from which it can be referenced from other sources within the definition set.
Delete Definition Source: Deletes the definition source currently displayed in the Select Definition Source drop-down list (except XML DefinitionSet).
Rename Definition Source: Opens a dialog box in which to enter a new name for the definition source named in the Select Definition Source drop-down list. All references to the old name are updated.
WSDL Definition Set Editor
The WSDL definition set editor lets you create your own WSDL definition set.
You can import WSDL and XML schema definitions, validate definitions, and
format definitions. This editor has two panels—WSDL and Info—which are
described in the following sections:
• WSDL Panel, page 677
• WSDL Panel Toolbar, page 678
See Creating a WSDL Definition Set, page 384 for more information.
WSDL Panel
The WSDL panel displays the XML schema defined for the WSDL, including type
definitions, message definitions, port type definitions, binding definitions, and
service definitions.
Studio displays keywords in orange type and literal definitions in blue type.
In the WSDL panel, you can type the definitions or import WSDL and XML
schema definitions from a file. You can also add definition source files that can be
referenced from other sources in the definition set.
WSDL Panel Toolbar
Import WSDL or XML Schema Definitions from File(s): Opens a dialog box in which to type or browse to the location of a file containing an XML schema definition.
Validate WSDL or XML Schema Definitions: Requests that the definition set editor validate the definition set. If it is valid, a dialog box says so; if it is not valid, an error dialog box describes the error and the line and column of the definition where the error was found. (The current cursor location is shown in the lower right corner of the panel in line:column format; for example, 1:100 for line 1, column 100.)
Format WSDL or XML Schema Definitions: Formats the definition set with hierarchical indentation, and with keywords in orange type and literal definitions in blue type.
Add Definition Source: Opens an Add Definition Set Source window where you can type a name for a definition source. The name is added to the Select Definition Source drop-down list, from which it can be referenced from other sources within the definition set.
Delete Definition Source: Deletes the definition source currently displayed in the Select Definition Source drop-down list (except WSDL DefinitionSet).
Rename Definition Source: Opens a dialog box in which to enter a new name for the definition source named in the Select Definition Source drop-down list. All references to the old name are updated.
Select Definition Source: Provides a drop-down list of available definition sources (WSDL DefinitionSet plus any you have added).