Migrating Databases from Oracle to Teradata
Paper 91
ABSTRACT
We carefully planned and provisioned our database migration from Oracle to Teradata. We had timelines, weekly
progress and planning meetings, presentations comparing current state to the future state, detailed project plans for
each task, Oracle DBAs, consultants from Teradata, and support from all departments. It was an ideal situation for
moving the Enterprise to a new system of record based upon Teradata.
Our team had delays and data issues that no one could have anticipated. I had researched every issue that might arise
with SAS® upgrades, Teradata, and our particular environment, but the literature and support did not prepare us to
anticipate or solve our migration problems. Instead of 6 months, we had only 6 weeks to finish the project.
We had no time to hire an army of experts, so we had to solve our own issues. I will describe those issues, our
solutions, and our tricks that facilitated rapid development. I'm still surprised at our rapid progress, and I am thankful
for the team's efforts and ingenuity. Our experiences should help others who are planning or performing database
and SAS migrations.
Our industry is financial services. We are regulated by the federal government, and we must keep records of every
change. Our software environment is SAS on UNIX, SAS on PC, Teradata, Oracle, and Data Integration Studio with
the framework of SAS Credit Scoring for Banking. The UNIX hosts are multi-tiered with separate development and
production platforms.
INTRODUCTION
As you plan and execute your conversion project, you may need extra time to respond to the 19 items on this list.
Most of these items will be discussed later in more detail.
1. Getting the SAS contracts signed.
2. Getting the correct SAS Depot delivered.
3. Finding the UNIX software utilities for SAS/Access® software for Teradata.
4. Getting SAS/Access to Teradata working on the UNIX servers.
5. Getting SAS® software and Data Integration Studio installed on the laptop (64-bit, Windows 7).
6. Teradata views initially had many spool space errors.
7. Teradata engineers did not consult SAS experts on numeric precision in SAS.
8. Teradata numeric data had to be cast using SAS pass-through SQL.
9. BIGINT data was not visible to SAS/Access without casting the columns.
10. Very big or very small DECIMAL data in Teradata could cause runtime errors in SAS.
11. Data Integration Studio 4.9 did not include a Teradata custom transform for pass-through SQL.
12. Our development environment had recurring issues with Teradata libnames in Data Integration Studio.
13. Data Integration Studio objects were too large to export, and had to be broken into 10 pieces.
14. Java may need to be updated to prevent export errors in Data Integration Studio.
15. Validation of the unit test results took a long time.
16. A few omissions and errors occurred after conversion.
17. Training on Teradata.
18. Presentations to users on SAS/Access to Teradata and What's New in SAS.
19. Projects with deadlines in November, December, or January.
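Items 8 and 9 can be illustrated with a minimal pass-through sketch. This is not the project's code: the server name, the credentials, and the column names acct_key and balance are hypothetical placeholders, and the target precisions are assumptions that you would choose after checking the actual ranges of your data.

```sas
/* Sketch only: server, credentials, table, and column names are hypothetical. */
%let td_user = my_user;        /* hypothetical user ID  */
%let td_pass = my_password;    /* hypothetical password */

proc sql;
   connect to teradata (server="tdserver" user="&td_user" password="&td_pass");

   /* The inner query runs in Teradata, so the casts happen before SAS sees the data. */
   /* acct_key is assumed to be BIGINT; balance is assumed to be a wide DECIMAL.      */
   create table work.accounts as
   select *
   from connection to teradata
      ( select cast(acct_key as decimal(15,0)) as acct_key
             , cast(acct_key as varchar(20))   as acct_key_chr
             , cast(balance  as decimal(15,2)) as balance
        from Your_Database.Table1
      );

   disconnect from teradata;
quit;
```

The DECIMAL(15,0) cast only suits keys that are known to fit in 15 digits, since Teradata reports an overflow otherwise. The VARCHAR(20) cast keeps every digit as text when the value is used only as an identifier.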
Migrating Databases from Oracle to Teradata, continued SESUG 2015
When using SAS Data Integration Studio, back up everything before you change it, and don't limit yourself to active
jobs and tables. Also create backup SPKs before every major change in Data Integration Studio, in case you need to
revert that change. When you create SPK exports, save the SPK export logs in case there are any issues with creating
the SPKs. Keep all backup SPKs and SPK logs for 6+ months, in case you discover conversion issues or omissions
later.
Keep all code, logs, and listings for all deployed jobs and for the SAS scheduler. If possible, keep historical data for
the last 4 or 6 months to cover errors that are not detected until end of quarter or end of year processing.
Data Integration Studio backups may take some time to complete. You may need to export Data Integration Studio
structures in smaller pieces, since the export may fail when too many objects are exported at one time. You may
have Java memory issues, which require customization by setting Java options or upgrading Java. This may be
difficult in a regulated environment where you have no administrative rights to your computer.
When data is imported into a new version of Data Integration Studio, you may have to import Data Integration Studio
packages in a specific sequence, because some packages may be dependent on metadata that is in another
package.
A Windows 7 and 64-bit upgrade was required before PC SAS 9.3 or SAS/Access to Teradata or Data Integration
Studio 4.9 would run. The Windows 7 install was a packaged release, and it was a piece of cake. The SAS software
was not a packaged release, and we had to download the SAS Depot to the laptop – all 30 GB! It looked impossible
after several days of waiting for the download to finish.
Network throughput was about 1% to 1.5% for one FTP session, which was too slow. So I aimed for 15% throughput
by running 8 simultaneous FTP sessions to get different parts of the SAS Depot. Eventually I downloaded all 30 GB
to my laptop in about one day. Then I was able to get all my SAS programs and utilities installed, so that I could start
working on the Data Integration Studio tasks.
We had the interactive client tools for Teradata and SAS. But SAS/Access for Teradata required a special set of
server tools (Teradata Utilities) that could only be obtained from Teradata customer support. SAS/Access to Teradata
would not run without them.
It was difficult to know what to ask for. SAS admin documentation had two different names for the Teradata tools, but
the names did not match the Teradata documentation. To find the correct set of tools, you may need to talk to
Teradata and SAS representatives, and get them to specify and deliver the required software by name and by
version. This was a critical issue for us, and delayed our project, so we had to get it solved quickly.
SAS COULD GET TO TERADATA, BUT DATA INTEGRATION STUDIO COULD NOT
Development environments are always imperfect and challenging. Early on, we only had SAS with SAS/Access to
Teradata in command-line mode and in batch mode on UNIX, with a temporary SAS license. But you work with
whatever you have at the time, and make as much progress as possible towards the final goal.
Since we could not test with Data Integration Studio, we ran prototype Teradata queries in SAS/Access on UNIX to
see if there were issues with our tables and views.
• Uncovered issues with spool space.
• Uncovered issues with poorly written views.
  o "Explain plan" revealed the issues in Teradata.
Here is the generic Teradata SQL query, which is run in Teradata Studio to create the input for the SAS program.
Insert your own database name expressions at the bottom for “Your_TD_Database_%” and “Your_TD_DB2_%”.
select * from (select DatabaseName, TableName, ColumnName, ColumnType,
CASE ColumnType
WHEN 'BF' THEN 'BYTE(' || TRIM(CAST(ColumnLength AS INTEGER)) || ')'
WHEN 'BV' THEN 'VARBYTE(' || TRIM(CAST(ColumnLength AS INTEGER)) || ')'
WHEN 'CF' THEN 'CHAR(' || TRIM(CAST(ColumnLength AS INTEGER)) || ')'
WHEN 'CV' THEN 'VARCHAR(' || TRIM(CAST(ColumnLength AS INTEGER)) || ')'
WHEN 'D ' THEN 'DECIMAL(' || TRIM(DecimalTotalDigits) || ',' || TRIM(DecimalFractionalDigits) || ')'
WHEN 'DA' THEN 'DATE'
WHEN 'F ' THEN 'FLOAT'
WHEN 'I1' THEN 'BYTEINT'
WHEN 'I2' THEN 'SMALLINT'
WHEN 'I8' THEN 'BIGINT'
WHEN 'I ' THEN 'INTEGER'
WHEN 'AT' THEN 'TIME(' || TRIM(DecimalFractionalDigits) || ')'
WHEN 'TS' THEN 'TIMESTAMP(' || TRIM(DecimalFractionalDigits) || ')'
WHEN 'TZ' THEN 'TIME(' || TRIM(DecimalFractionalDigits) || ')' || ' WITH TIME ZONE'
WHEN 'SZ' THEN 'TIMESTAMP(' || TRIM(DecimalFractionalDigits) || ')' || ' WITH TIME ZONE'
WHEN 'YR' THEN 'INTERVAL YEAR(' || TRIM(DecimalTotalDigits) || ')'
WHEN 'YM' THEN 'INTERVAL YEAR(' || TRIM(DecimalTotalDigits) || ')' || ' TO MONTH'
WHEN 'MO' THEN 'INTERVAL MONTH(' || TRIM(DecimalTotalDigits) || ')'
WHEN 'DY' THEN 'INTERVAL DAY(' || TRIM(DecimalTotalDigits) || ')'
WHEN 'DH' THEN 'INTERVAL DAY(' || TRIM(DecimalTotalDigits) || ')' || ' TO HOUR'
WHEN 'DM' THEN 'INTERVAL DAY(' || TRIM(DecimalTotalDigits) || ')' || ' TO MINUTE'
WHEN 'DS' THEN 'INTERVAL DAY(' || TRIM(DecimalTotalDigits) || ')' || ' TO SECOND(' ||
TRIM(DecimalFractionalDigits) || ')'
WHEN 'HR' THEN 'INTERVAL HOUR(' || TRIM(DecimalTotalDigits) || ')'
WHEN 'HM' THEN 'INTERVAL HOUR(' || TRIM(DecimalTotalDigits) || ')' || ' TO MINUTE'
WHEN 'HS' THEN 'INTERVAL HOUR(' || TRIM(DecimalTotalDigits) || ')' || ' TO SECOND(' ||
TRIM(DecimalFractionalDigits) || ')'
WHEN 'MI' THEN 'INTERVAL MINUTE(' || TRIM(DecimalTotalDigits) || ')'
WHEN 'MS' THEN 'INTERVAL MINUTE(' || TRIM(DecimalTotalDigits) || ')' || ' TO SECOND(' ||
TRIM(DecimalFractionalDigits) || ')'
WHEN 'SC' THEN 'INTERVAL SECOND(' || TRIM(DecimalTotalDigits) || ',' || TRIM(DecimalFractionalDigits)
|| ')'
WHEN 'BO' THEN 'BLOB(' || TRIM(CAST(ColumnLength AS INTEGER)) || ')'
WHEN 'CO' THEN 'CLOB(' || TRIM(CAST(ColumnLength AS INTEGER)) || ')'
WHEN 'PD' THEN 'PERIOD(DATE)'
WHEN 'PM' THEN 'PERIOD(TIMESTAMP(' || TRIM(DecimalFractionalDigits) || ')' || ' WITH TIME ZONE)'
WHEN 'PS' THEN 'PERIOD(TIMESTAMP(' || TRIM(DecimalFractionalDigits) || '))'
WHEN 'PT' THEN 'PERIOD(TIME(' || TRIM(DecimalFractionalDigits) || '))'
WHEN 'PZ' THEN 'PERIOD(TIME(' || TRIM(DecimalFractionalDigits) || ')' || ' WITH TIME ZONE)'
WHEN 'UT' THEN COALESCE(ColumnUDTName, '<Unknown> ' || ColumnType)
WHEN '++' THEN 'TD_ANYTYPE'
WHEN 'N' THEN 'NUMBER(' ||
CASE WHEN DecimalTotalDigits = -128 THEN '*'
ELSE TRIM(DecimalTotalDigits) END
|| CASE WHEN DecimalFractionalDigits IN (0, -128) THEN ''
ELSE ',' || TRIM(DecimalFractionalDigits) END || ')'
WHEN 'A1' THEN COALESCE('SYSUDTLIB.' || ColumnUDTName, '<Unknown> ' || ColumnType)
WHEN 'AN' THEN COALESCE('SYSUDTLIB.' || ColumnUDTName, '<Unknown> ' || ColumnType)
ELSE '<Unknown> ' || ColumnType
END
||
CASE WHEN ColumnType IN ('CV', 'CF', 'CO') THEN
CASE CharType
WHEN 1 THEN ' CHARACTER SET LATIN'
WHEN 2 THEN ' CHARACTER SET UNICODE'
WHEN 3 THEN ' CHARACTER SET KANJISJIS'
WHEN 4 THEN ' CHARACTER SET GRAPHIC'
WHEN 5 THEN ' CHARACTER SET KANJI1'
ELSE ''
END
ELSE ''
END as Column_Type_Length,
CASE WHEN (DecimalTotalDigits > 17 OR DecimalFractionalDigits > 17) THEN 'CAST'
WHEN ColumnType = 'I8' THEN 'BIGINT'
WHEN (DecimalTotalDigits between 16 and 17 OR
DecimalFractionalDigits between 16 and 17) THEN 'MAYBE'
END as Cast_Required
from DBC.ColumnsV
where (DatabaseName like ('Your_TD_Database_%') OR DatabaseName like any ('Your_TD_DB2_%')) and
(ColumnType in ('I8') or DecimalTotalDigits > 15 or
DecimalFractionalDigits > 15 or Cast_Required is not NULL)) Z
order by 1, 2, 3, 4;
Once you have created the CSV file, the SAS program inputs the CSV data using these data attributes:
format DatabaseName $50. ;
format TableName $40. ;
format ColumnName $40. ;
format ColumnType $8. ;
format Column_Type_Length $20. ;
format Cast_Required $8. ;
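A minimal sketch of the import step follows. It is an assumption-laden illustration, not the original program: the file name td_columns.csv is hypothetical, the export is assumed to have a header row, and fields that contain commas, such as DECIMAL(18,2), are assumed to be quoted. LENGTH is used here in place of the FORMAT statements to set the same widths.

```sas
/* Sketch only: the CSV file name and layout are assumptions. */
data WORK.TD_Dictionary;
   /* DSD honors quoted fields, so the comma inside "DECIMAL(18,2)" is kept */
   infile "td_columns.csv" dsd dlm=',' firstobs=2 truncover;
   length DatabaseName       $50
          TableName          $40
          ColumnName         $40
          ColumnType         $8
          Column_Type_Length $20
          Cast_Required      $8;
   /* The variables are already character, so no $ is needed on INPUT */
   input DatabaseName TableName ColumnName ColumnType
         Column_Type_Length Cast_Required;
run;
```

If your export tool does not quote fields, a different delimiter such as a tab would avoid splitting the DECIMAL type strings.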
Now sort the query output, and process the file to produce the min/max query. Database names have been altered.
proc sort data=WORK.TD_Dictionary;
by DatabaseName TableName ColumnName;
run;
data MinMax_Cmds;
set WORK.TD_Dictionary (where=(DatabaseName in ('Your_Name1', 'Your_Name_2'))) end=Done;
by DatabaseName TableName ColumnName;
retain From_Line From_Line2 DatabaseName2 TableName2;
length From_Line From_Line2 DatabaseName2 TableName2 $ 100
lineA lineB lineC lineD lineC2 lineD2 lineF $ 1000;
The min/max query output looks like this, where it performs a union of all column ranges per table. You can run this
query in Teradata Studio, and it should produce one table that you could export to Excel for analysis. One table is
produced because every SELECT returns the same five columns, and the UNION takes its column names from the first
SELECT. This information could be used to create Teradata views that respect the 15-digit limit in SAS.
select
'Your_Database' as DB
,'Table1' as TableName
,'Var1' as ColumnName
,min(Var1) as min_Var1
,max(Var1) as max_Var1
from Your_Database.Table1
UNION
select
'Your_Database' as DB
,'Table1' as TableName
,'Var2' as ColumnName
,min(Var2) as min_Var2
,max(Var2) as max_Var2
from Your_Database.Table1 ;
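The 15-digit limit comes from the way SAS stores numbers: every numeric value is an 8-byte floating-point number, and on UNIX and Windows hosts the largest integer that can be counted to without gaps is 2**53, a 16-digit value. The short DATA step below is only an illustration of that behavior, with no connection to the project data.

```sas
/* Illustration only: shows where consecutive integers stop being distinct. */
data _null_;
   x = 2**53;              /* 9,007,199,254,740,992 has 16 digits           */
   y = x + 1;              /* cannot be stored exactly in 8 bytes           */
   if x = y then
      put 'x and x + 1 are stored as the same value';

   z = 999999999999999;    /* largest 15-digit integer                      */
   if z - 1 ne z then
      put 'A 15-digit integer is still distinct from its neighbor';
run;
```

This is why the dictionary query flags columns with 16 or 17 digits as MAYBE: whether they are safe depends on the actual minimum and maximum values, which the min/max query reports.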
Every test required evidence, so artifacts had to be created and kept for every test. If you forgot to create a test log,
then you had to rerun the program to get the SAS log.
When three people are developing and testing so many programs, the lone testing resource can get overloaded. So
we made his job easier by doing our own testing and initial QA. If we found errors, then we corrected them before
passing them on to testing.
Any differences had to be explained, and data research was needed for any unresolved difference. We had more
than our share of differences to resolve because our surrogate keys would not be in synch until our project ended.
Since we used Proc Compare, we had to find good keys to use instead of our surrogate keys.
You may start your key search by sorting with all columns, and then trimming the list until you find a sort order that
gives good results. If you still have issues, then look for a match in the common subset between the pair of files. If
you still have no match, then you have a problem that should be corrected or explained. So we really missed the
surrogate keys, and we needed a good SAS macro to make our testing easier.
When the macro finished, you had an RTF file on UNIX with the comparison results, which could be sent to the tester
for analysis. I ran this through SAS/CONNECT, but you can run it any way you like. It will run slowly on large files
because it does several sorts and several subsets.
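The key search described above can be sketched in two steps: test whether a candidate key is unique, then compare by that key. The data sets, the column names acct, as_of, and amt, and the planted difference are all hypothetical; this is a small stand-alone illustration and not the testing macro itself.

```sas
/* Hypothetical data and column names, for illustration only. */
data work.base;
   input acct $ as_of :date9. amt;
   format as_of date9.;
   datalines;
A 01JAN2015 10
A 01FEB2015 20
B 01JAN2015 30
;
run;

data work.comp;
   set work.base;
   if acct = 'B' then amt = 31;    /* plant one difference */
run;

/* Step 1: test a candidate key. Rows with a repeated key go to WORK.BASE_DUPS. */
proc sort data=work.base out=work.base_srt nodupkey dupout=work.base_dups;
   by acct as_of;
run;

proc sort data=work.comp out=work.comp_srt;
   by acct as_of;
run;

/* Step 2: match rows on the candidate key instead of the surrogate key. */
proc compare base=work.base_srt compare=work.comp_srt listcompvar;
   id acct as_of;
run;
```

If WORK.BASE_DUPS is empty, the candidate key identifies each row and can be used in the ID statement. If it is not empty, add a column to the BY list or drop back to the common subset.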
Here is the macro that was used for testing, with a few names hidden to protect directory paths:
/*-- Define all libraries here!! These libnames are used in the macro --*/
libname ext_1D "xyz";
/*-- Library for Teradata test results --*/
libname ext_1TD "xyz_teradata";
/*-- Library for the sorted datasets for Proc Compare analysis --*/
/*-- CHANGE THIS TO YOUR OWN LIBRARY FOR DATASETS AND RTF REPORTS --*/
libname R2_Home "wherever/you/live";
/*-- Set Compare options to print more columns with differences: 1.11E-11 --*/
%let Cmp_Options = MAXPRINT=(10, 32627) Method=Absolute Criterion=-100000
ListCompvar;
%macro compare_datasets (
base=ext_1D, /*-- Oct 2014 saved datasets --*/
compare=ext_1TD,/*-- Teradata test output datasets --*/
out=R2_Home, /*-- UNIX location for sorted dataset to compare --*/
tbl=Tbl_Name, /*-- Table to compare --*/
Common=N, /*-- Pick the common subset to compare --*/
/*-- Set to N when Base and Compare have same # of OBS --*/
BD2=); /*-- Make this equal to _BD2 for BD2 comparisons --*/
/*-- Reset the length of the name so it's <= 32 characters --*/
%if %length(&tbl) < 25 %then %let strlen = %length(&tbl);
%else %let strlen = 25;
%let tbl_Short = %substr(&tbl,1,&strlen);
options linesize=110;
ods rtf file="/Some_local_UNIX_Dir/Where_to_Save_files/dat/&tbl.&BD2..rtf";
See the macro definition for documentation of the parameters. Define all libnames at the top of the code.
Next, back up the *.sas files, rename each *.sas_new file to *.sas, and copy the edited files back to your
production directory to run them. Restore the backup *.sas files after your testing is completed.
• Create and deliver presentations on using Teradata with SAS, including examples of Teradata pass-through
  SQL, the Teradata libname access method, and Teradata In-Database technology.
• Create and deliver a presentation on What's New in SAS.
  o Cover What's New for all intervening versions, since SAS does not repeat earlier changes in each
    What's New document.
  o Present information, references, working examples, and an indexed list of relevant SAS papers to
    the user community. Post the papers on a shared storage device.
  o Make sure that new features work before you announce those features.
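A presentation example for the libname access method and In-Database processing might look like the sketch below. The server name, credentials, and the Table1 and Var1 names are hypothetical placeholders. Whether a given procedure actually runs inside Teradata depends on your SAS release and licensed products, so treat this as a starting point to verify in your own environment.

```sas
/* Sketch only: connection values, table, and column names are hypothetical. */
%let td_user = my_user;
%let td_pass = my_password;

/* LIBNAME access method: a Teradata database appears as a SAS library */
libname td teradata server="tdserver" user="&td_user" password="&td_pass"
        database=Your_Database;

/* The table is referenced like a SAS data set */
proc sql;
   select count(*) as n_rows
   from td.Table1;
quit;

/* In-Database processing: ask eligible procedures to generate SQL for the DBMS */
options sqlgeneration=dbms;

proc freq data=td.Table1;
   tables Var1;
run;

libname td clear;
```

Showing the same table through a libname and through pass-through SQL in one session helps users see when each method is the better fit.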
WHAT'S NEXT?
• More conversions and more data to learn, of course.
• Converting our modeling database to a new platform.
• Creating our Analytics Competency Center for Enterprise Data Management.
CONTACT INFORMATION
Your comments and questions are valued and encouraged. Contact the author at:
Phillip Julian
PNC Financial Services Group, Inc.
131 N. Church St.
Rocky Mount, NC 27804
(252) 454-3604
SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS
Institute Inc. in the USA and other countries. ® indicates USA registration.
Other brand and product names are trademarks of their respective companies.
© 2015 The PNC Financial Services Group, Inc. All Rights Reserved.
The information contained herein: (1) is proprietary to The PNC Financial Services Group and/or its subsidiaries,
affiliates or content providers; (2) may not be copied or distributed without the express written permission of The PNC
Financial Services Group, Inc.; and (3) is for general information purposes only and is not warranted to be accurate,
complete or timely. Neither PNC nor its subsidiaries, affiliates or content providers are responsible for any damages
or losses arising from any use of the information contained in this article.
The opinions and views expressed by the authors do not necessarily reflect the opinions and views of The PNC
Financial Services Group or any of its subsidiaries or affiliates, nor does the reference to third party products and
services in this article constitute endorsements by the authors or The PNC Financial Services Group or any of its
subsidiaries or affiliates of any of the products or services of others referenced herein. The statements contained
herein are based upon the data available as of the date of this article and are subject to change at any time without
notice. The information presented in this article has been obtained from, and is based upon, sources believed to be
reliable; however, no representations, guarantee or warranty, express or implied, can be made as to its accuracy,
completeness or correctness.