Teradata Fast Load Utility
Fastload is the Teradata utility used to load large amounts of data into an empty table
on a Teradata system. What sets Fastload apart is the speed at which it loads huge
volumes of data into tables.
You can load data from:
Disk or tape drives on a channel-attached client system.
Files on a network-attached workstation.
Any other device that can provide properly formatted source data.
Fastload uses multiple sessions and block-level operations to load data into a table, which is
why it is much faster than the other loading utilities.
Tip - One Fastload job loads data into one table only. To load data into more than one table,
you must submit multiple Fastload jobs, one for each table.
Three requirements for Fastload to execute
LOG TABLE - This table is used to record progress during the execution of Fastload. It exists
in the SYSADMIN database under the name FASTLOG. The structure of the FASTLOG
table is as follows
EMPTY TABLE - Fastload needs the target table to be empty before it inserts rows. It does not
care how this is achieved, but the table must contain no prior records. If the table is not
empty, Fastload pauses and reports an error.
TWO ERROR TABLES - Fastload requires two error tables to catch the errors that occur during
execution. It creates the error tables itself. Each error table records errors of a
specific type.
The first error table records translation and constraint violation errors. For example, if a row
has data that does not match the data type of its column, such as CHAR data destined for an
INTEGER column, the row is captured in error table 1.
The second error table captures errors related to duplicate values for Unique Primary Indexes
(UPI). Fastload loads only one occurrence of the value and stores the duplicate occurrences in
the second error table. However, if the entire row is duplicated, Fastload counts it but does
not store the row.
These tables are analyzed later for error handling.
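The empty-table check and the error-table analysis described above can be run as ordinary SQL. The sketch below is illustrative only. It reuses the names retail.emp_test, retail.emp_test_er1 and retail.emp_test_er2 from the sample script later in this document, and it assumes that the first error table has the columns ErrorCode, ErrorFieldName and DataParcel and that the second error table has the same columns as the target table. Check both assumptions against your Teradata release.

```sql
/* Before the load: confirm that the target table is empty. */
SELECT COUNT(*) FROM retail.emp_test;

/* One way to empty the table if it is not (this removes every row). */
DELETE FROM retail.emp_test ALL;

/* After the load: summarize the conversion and constraint errors
   captured in error table 1 (column names are an assumption). */
SELECT ErrorCode, ErrorFieldName, COUNT(*) AS error_rows
FROM retail.emp_test_er1
GROUP BY ErrorCode, ErrorFieldName
ORDER BY error_rows DESC;

/* After the load: list the rows rejected as UPI duplicates. */
SELECT * FROM retail.emp_test_er2;
```

An error table that ends up empty may be removed by Fastload at the end of the job, so after a clean load these queries can fail with a missing-object error.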
Data Conversion Capabilities
Fastload also allows data conversion on a column value before it is inserted into the target
table. Following are the conversions allowed by Fastload.
Phases of Fastload
Fastload divides its job into two phases:
1) Phase 1 Acquisition
2) Phase 2 Application
Phase 1 Acquisition
The main objective of this phase is to send the rows of the data file from the host computer
to the Teradata AMPs as fast as possible.
Rows of the data file are packed into 64K blocks and sent to the PE.
The PE parses the SQL of the Fastload job and sends the Explain plan to each AMP. By default
Fastload creates one session per AMP, so on a system with 200 AMPs a single Fastload job
creates 200 sessions.
Tip - It is advisable to limit the number of sessions with the .SESSIONS command so that a
Fastload job does not end up taking all the available resources of the system.
After the sessions are created, the 64K blocks of data are passed to the AMPs through the PE
and the BYNET, where each row is quickly hashed according to its PI value.
Based on this row hash value, the rows are then redistributed to their proper AMPs. This
internal redistribution takes place among the AMPs, so that each AMP gets the correct rows.
To know more about this redistribution based on row hash, please refer to Primary Index in
Teradata.
Now each row is placed on its proper AMP, but the rows are not yet sorted.
Any error in this phase will be recorded in the Error table 1.
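The outcome of this hash distribution can be inspected once a load has finished by applying Teradata's hash functions to the primary index column. A minimal sketch, assuming the table retail.emp_test with primary index emp_id from the sample script later in this document:

```sql
/* Count the loaded rows per AMP.
   HASHROW computes the row hash of the PI value, HASHBUCKET maps the
   row hash to a hash bucket, and HASHAMP maps the bucket to its AMP. */
SELECT HASHAMP(HASHBUCKET(HASHROW(emp_id))) AS amp_no,
       COUNT(*) AS row_count
FROM retail.emp_test
GROUP BY 1
ORDER BY 1;
```

A roughly even row_count across the AMPs suggests that the rows were spread evenly. A heavy skew means that a few AMPs received most of the rows.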
Phase 2 Application
The main objective of this phase is to store each row into the actual target table.
Each AMP sorts the rows that it stored temporarily on its disk during phase 1.
The sorted rows are then written to the actual target table, where they reside permanently.
All these operations are block-level operations, which makes them faster than
row-level operations.
Any error in this phase will be stored in Error table 2.
NO SECONDARY INDEXES ARE ALLOWED ON TARGET TABLE - Fastload can load only tables whose
sole index is the primary index. If the table has a secondary index, Fastload will not
load it and returns an error message.
NO REFERENTIAL INTEGRITY IS ALLOWED - Fastload cannot load data into tables that are
defined with Referential Integrity (RI). Enforcing the referential constraints against a
different table would require too much checking by the system.
NO TRIGGERS ARE ALLOWED AT LOAD TIME - Fastload is much too focused on speed to
pay attention to the needs of other tables, which is what Triggers are all about.
Additionally, these require more than one AMP and more than one table. Fastload does
one table only. Simply ALTER the Triggers to the DISABLED status prior to using Fastload.
DUPLICATE ROWS ARE NOT SUPPORTED - A multiset table is a table that allows duplicate
rows, that is, rows in which the values in every column are identical. When Fastload finds
duplicate rows, it discards them. So although Fastload can load data into a multiset table,
it will not load duplicate rows into it.
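The restrictions above usually translate into a few preparation steps before the load and the matching clean-up steps afterwards. The following is a sketch only. The secondary index on dept_id and the trigger name retail.emp_test_audit_trg are hypothetical and stand in for whatever objects exist on your own table.

```sql
/* Before the load: drop the (hypothetical) secondary index on dept_id. */
DROP INDEX (dept_id) ON retail.emp_test;

/* Before the load: disable the (hypothetical) trigger. */
ALTER TRIGGER retail.emp_test_audit_trg DISABLED;

/* ... run the Fastload job here ... */

/* After the load: recreate the index and re-enable the trigger. */
CREATE INDEX (dept_id) ON retail.emp_test;
ALTER TRIGGER retail.emp_test_audit_trg ENABLED;
```

Recreating the index after the load keeps the load itself a pure block-level operation and builds the index once, over the final set of rows.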
A sample Fastload script
Below is a sample Fastload script designed to run on Windows. To run Fastload on your system,
install the Fastload utility, open a command prompt, and type:
fastload < c:\fload_1.txt
where c:\fload_1.txt is the path and name of the Fastload script that you want to execute.
/***** Section 1 *****/
/* This section gives the logon credentials required to connect to the TD system.
The SESSIONS command restricts the number of sessions Fastload opens to TD.
The default is one session per AMP. */
.SESSIONS 4;
.LOGON 127.0.0.1/tduser,tduser;
/***** Section 2 *****/
/* This section defines the table that we want to load with Fastload. The DROP
commands are optional. There is no need to define the structure of the error tables; they
are created by Fastload itself. */
drop table retail.emp_test;
drop table retail.emp_test_er1;
drop table retail.emp_test_er2;
create table retail.emp_test
(
emp_id integer not null,
emp_name varchar(50),
dept_id integer,
salary integer,
dob date format 'yyyy-mm-dd'
)
unique primary index(emp_id);
/***** Section 3 *****/
/* In this section we give the BEGIN loading statement. As soon as Fastload receives this
statement it starts PHASE 1. */
BEGIN LOADING
retail.emp_test
ERRORFILES
retail.emp_test_er1, retail.emp_test_er2;
/***** Section 4 *****/
/* The RECORD command specifies the record at which Fastload starts reading the data file,
which lets you skip leading rows. RECORD THRU specifies the last record to read, which lets
you skip trailing rows. SET RECORD defines the record layout; "," is the delimiter that
separates the columns in our data file. */
.RECORD 1;
.RECORD THRU 3;
SET RECORD VARTEXT ",";
/***** Section 5 *****/
/* The DEFINE statement describes the structure of the data file. It should match the
structure of the actual target table. With the VARTEXT record format, the DEFINE statement
allows only VARCHAR fields. */
DEFINE
emp_id (VARCHAR(9)),
emp_name (VARCHAR(50)),
dept_id (VARCHAR(9)),
salary (VARCHAR(9)),
dob (VARCHAR(50))
/***** Section 6 *****/
/* FILE command defines the data file path and name. */
FILE = C:\fload_data.txt;
/***** Section 7 *****/
/* The INSERT statement loads the data file into the actual target table. NOTE: For DATE
columns we can apply a data conversion with the syntax shown below. */
INSERT INTO retail.emp_test
VALUES
(
:emp_id ,
:emp_name,
:dept_id ,
:salary,
:dob (format 'yyyy-mm-dd')
);
/***** Section 8 *****/
/* END LOADING ends PHASE 1 and starts the execution of PHASE 2. LOGOFF is required to close
all the sessions created by Fastload. */
END LOADING;
.LOGOFF;
This is a simple Fastload script that can be used to understand the concepts of Fastload.
Besides the commands used here there are other commands, which you can see in the
next post.
The script can always be tweaked to suit your requirements.
Fastload Commands
Given below is the list of Fastload commands that are used extensively when writing Fastload
scripts.
ERRLIMIT
LOGON/LOGOFF or QUIT
NOTIFY
The NOTIFY command is used to inform the job that follows that some event has
occurred. It is often used for detailed reporting on the Fastload job's success.
RECORD
Specifies the beginning record number (or, with THRU, the ending record
number) of the input data source to be read by Fastload. Syntactically, this
command is placed before the INSERT keyword.
SET RECORD
Used only in the LAN environment, this command states the format of the data
coming from the input file: Fastload, Unformatted, Binary, Text, or
Variable Text. The default is the Teradata RDBMS standard, Fastload.
SESSIONS
TENACITY
Suppose Fastload is not able to obtain all the sessions it requires. In that
case only two options are left: either terminate the Fastload job or retry to
obtain the sessions. TENACITY specifies the amount of time, in hours, for which
Fastload will keep retrying to obtain the sessions. The default for Fastload is
no tenacity, meaning that it will not retry at all.
SLEEP
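The session-related commands listed above are typically placed together at the top of a script, before the logon. The fragment below is a sketch rather than a tuned configuration. It assumes that SESSIONS accepts a maximum followed by an optional minimum and that SLEEP is expressed in minutes between logon attempts. The numeric values are arbitrary, and the logon string is the one from the sample script.

```sql
/* Ask for at most 8 sessions and accept as few as 4. */
.SESSIONS 8 4;

/* Keep retrying for up to 4 hours if the sessions cannot be obtained ... */
.TENACITY 4;

/* ... and wait 10 minutes between attempts (unit is an assumption). */
.SLEEP 10;

/* These commands are given before the logon. */
.LOGON 127.0.0.1/tduser,tduser;
```

Limiting the sessions protects the rest of the system, while TENACITY and SLEEP let a scheduled job wait for resources instead of failing at once.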