You are on page 1of 2

Teradata Utilities are specialised tool for loading or exporting huge volume of

data as compared to conventional SQL tools like SQL Assistant & BTEQ. I wish to
learn the fundamental performance gain reason between them. I have provided my u
nderstanding on them below. I would appreciate any addition/correction to them.
(a) FastExport Vs BTEQ Export: BTEQ process each row individually while exportin
g whereas FastExport deals in block's export (63 or 64 KB).
The Block size is an indication of the message size which carries the rows.
Large the block size, more is the message size and hence, more number of rows ex
ported per message. The Block size (63 or 64 KB) is
not user modifiable in FastExport.
(b) FastLoad Vs BTEQ Import: Same as above. BTEQ Import process 1 row at a time,
meaning it import 1 row from the source file to the table.
FastLoad uses the buffer's size (63 or 64 KB as indicated by the Buffer Size).
This size of 63 or 64 KB as indicated by Buffer Size is defaulted to the maximum
with user's modification allowed.
No Transient Journaling is maintained for FastLoad, whereas BTEQ uses Transient
journaling.
(c) FastLoad Vs MultiLoad: MLoad allows journaling, NUSIs, fallback whereas FLoa
d doesn't allow any of them.
This is point # 1 for performance gain. To cater to duplicate records, MLoad use
s SEQUENCE NUMBER.
This sequence number is used to avoid loading the same row twice when restart ha
ppens.
As FLoad doesn't allow duplicate records, the SEQUENCE NUMBER isn't maintained i
n FLoad.
This is point # 2 for performance gain.
05 phases of MLoad and 02 phases of FLoad is point # 3.
MLoad works via Work Table whereas FLoad works directly on the Target Table, whi
ch is point # 4
for performance gain factor. The BLOCK SIZE as indicated by BUFFER SIZE pointin
g the message size
carrying rows is SAME for FastLoad and MultiLoad, and hence there is no discrep
ancy in this context.
===========================================
A few clarifications: (as this is read, all cases of Multiload can be read as TP
T Update, Fastload as TPT load, TPT INSERT--Bteq insert
TPump as TPT Streams - TPT should be being leveraged at this point for any new d
evelopmet rather than the legacy utilities)

a1) BTEQ returns rows in blocks from the database to the underlying CLI in block
s as well. Depending on your setup 64K may be the default
but I believe this can be adjusted upwrd to 1MB (CLI parameter not BTEQ paramet
er). This should be changed if exporting medium to large result sets.
a2) The primary difference between fastexport and BTEQ export is the ability to
ship data over multiple session connections simultaneously
thereby leveraging the total connectivity available between the client platform
and the database engine.
In order to do this, Fastexport spends more resources on executing the query in
order to prepare the blocks in such a way that when
they are exported over multiple sessions they can easily be reassembled in the r
ight order by the client without additiona sorting or
processing of the rows.

b1) BTEQ import does process a row at a time. Thus it is generally appropriate o
nly for very small imports.
b2) Fastload does use a buffered approach. But its real differentiation comes fr
om its use of multiple simultaneous sessions
which allow it to leverage all of the parallelism of the parallel database engin
e.
Multiple AMPs are working on receiving, deblocking, data type transformation, h
ashing and redistribution of the data.
b3) Fastload is only capable of inserting data into an empty table. This obviate
s the need for journaling because if it doesn't finish properly,
the proper recovery is to simply delete the table and start over.
BTEQ import uses journaling because it can load into an existing table and
needs to be able to roll back an unfinished import row.
c1) Fastload only allows insert into empty tables. Thus the thought is that it i
s easy to add indexes and other
table attributes after the table has been loaded with the data. No need for jour
nals because the data is either loaded or it is not.
No need to roll forward or back.
c2) The primary reason Fastload does not allow duplicate records is that at the
time it was designed and built,
multi-set tables did not exist.
Teradata's early days allowed only SET tables so Fastload did not have to be des
igned for being able to load duplicate records.
It was deemed a use friendly feature to automatically eliminate them rather tha
n report them as errors.
When Multiload was designed it had a very different set of requirements includin
g Multiset and the ability to insert, update and delete
in existing populated tables. This in turn led to very different design includi
ng the sequence numbers in the incoming rows to allow ordered apply,
insert of duplicate rows and the checkpoint/restart case while still preserving
the duplicate rows.
It was decided that we would not go back and redesign Fastload to cover all of
these cases but rather to leave it for the simple case of insert
into empty tables and have Multiload handle any other case not supported by Fas
tload.
Choosing between the bulk tools should be less about relative performance and mo
re about matching the required functionality to the use case.
If inserting data into an empty table (without dups) then Fastload else Multilo
ad.
BTEQ import and TPump are chosen if the import data is small or medium respectiv
ely and it is desirable to avoid the overhead of using a bulk
utility slot and the overhead of startup and shutdown of the bulk utility for s
mall data sets.
BTEQ export likewise is appropriate for small to medium result sets.
FastExport is designed to handle th high volume exports in an expeditious way.
And of course TPump can be used when the requirement is to load continuously.

You might also like