Indexes

Objectives
After completing this module, you should be able to:

Define primary and secondary indexes and their purposes.


Distinguish between a primary index and a primary key.
Distinguish between a UPI and a NUPI.
Define a Partitioned Primary Index and its purpose.
Distinguish between a USI and a NUSI.
Explain the makeup of the Row-ID and its role in row storage.
Describe the sequence of events for locating a row.
Explain the roles of the hashing algorithm and hash map in locating a
row.

Describe the operation of full table scans in Teradata.

Indexes in Teradata
Indexes are used to access rows from a table without having to search the whole
table. In the Teradata RDBMS, an index is made up of one or more columns in a
table. Once Teradata indexes are selected, they are maintained by the system.
While other vendors may require data partitioning or index maintenance, these
tasks are unnecessary with Teradata.
In the Teradata RDBMS, there are two types of indexes:

Primary Indexes define the way the data is distributed.


Primary Indexes and Secondary Indexes are used to locate the data
rows more efficiently than scanning the whole table.

You specify which column(s) are used as the Primary Index when you create a
table. Secondary Index column(s) can be specified when you create a table or at
any time during the life of the table.

Data Distribution
When the Primary Index for a table is well chosen, the table rows are evenly
distributed across the AMPs for the best performance. The way to guarantee
even distribution of data is by choosing a Primary Index whose columns contain
unique values. The values do not have to be evenly spaced, or even "truly
random"; they just have to be unique to be evenly distributed.
The even distribution enables each AMP to be responsible for only a subset of
the rows in a table. If the data is evenly distributed, the work is evenly divided
among the AMPs so they can work in parallel and complete their processing
at about the same time. Even data distribution is critical to performance because it
optimizes the parallel access to the data.

Teradata Indexes - Workshop

Unevenly distributed data, also called "skewed data," causes slower response
time as the system waits for the AMP(s) with the most data to finish their
processing. The slowest AMP becomes a bottleneck.

When data is loaded into the Teradata RDBMS:

The system automatically distributes the data across the AMPs based on
row content (the Primary Index values).

The distribution is the same regardless of the data volume being loaded.
In other words, large tables are distributed the same way as small tables.

Data is not distributed in any particular order. The automatic, unordered
distribution of data eliminates tasks for a Teradata DBA that are necessary with
some other relational database systems. The DBA does not waste time on
labor-intensive data maintenance tasks. Some benefits of unordered data include:

Prior to loading the data, no initial data ordering or sorting is necessary.


Once data is loaded, no data maintenance is necessary to preserve the
order.
SQL requests can be formulated without regard to the data order.

A Teradata system provides high performance because it distributes the data
evenly across the AMPs for parallel processing.
Question
Which of the following statements do you think are true about data distribution and
Teradata indexes? (Choose two answers.)
A. If a table has 103 rows but there are 4 AMPs in the system, each AMP will not
have exactly the same number of rows from that table. However, if the Primary
Index is chosen well, each AMP still will contain some rows from that table.
B. The rows of a table are stored on a single disk for best access performance.
C. Skewed data leads to poor performance in processing data access requests.
D. Teradata RDBMS performance can be increased by maintaining the indexes and
conducting periodic data partitioning and sorting.

Primary Index (PI)


A Primary Index is the mechanism for assigning a data row to an AMP and a
location on the AMP's disks. It is also used to access rows without having to
search the entire table. You specify the column(s) that comprise the Primary
Index for a table when the table is created. For a given row, the Primary Index
value is the combination of the data values in the Primary Index columns.
Choosing a Primary Index for a table is perhaps the most critical decision a
database designer makes, because this choice affects both data distribution and
access.

Primary Index Rules

The following rules govern how Primary Indexes in a Teradata system must be
defined and how they function:
Rule 1: One Primary Index per table.
Rule 2: A Primary Index value can be unique or non-unique.
Rule 3: The Primary Index value can be NULL.
Rule 4: The Primary Index value can be modified.
Rule 5: The Primary Index of a table cannot be modified.
Rule 6: A Primary Index has a limit of 16 columns.

Rule 1: One PI Per Table


Each table must have a Primary Index. The Primary Index is the only way for
the system to determine where a row will be physically stored. While a Primary
Index may be composed of multiple columns, the table can have only one
(single- or multiple-column) Primary Index.

Rule 2: Unique or Non-Unique PI


There are two types of Primary Index:

Unique Primary Index (UPI) - For a given row, the combination of the
data values in the columns of a Unique Primary Index is not duplicated
in other rows within the table. This uniqueness guarantees uniform data
distribution and direct access. For example, in the case where old
employee numbers are sometimes recycled, the combination of the Last
Name and Employee Number columns would be a UPI.

Non-Unique Primary Index (NUPI) - For a given row, the combination
of the data values in the columns of a Non-Unique Primary Index can be
duplicated in other rows within the table. A NUPI can cause skewed
data, but in specific instances can still be a good Primary Index choice.
For example, either the Department Number column or the Hire Date
column might be a good choice for a NUPI if you will be accessing the
table most often via these columns.

Rule 3: PI Can Be NULL

If the Primary Index is unique, you could have one row with a null value. If you
have multiple rows with a null value, the Primary Index must be Non-Unique.

Rule 4: PI Value Can Be Modified


The Primary Index value can be modified. In the table below, if Loretta Ryan
changes departments, the Primary Index value for her row changes.
When you update the index value in a row, Teradata re-hashes it and
redistributes the row to its new location based on its new index value.

Rule 5: PI Cannot Be Modified


The Primary Index of a table cannot be modified.
In the event that you need a new Primary Index, you must drop the table,
recreate it with the new Primary Index, and reload the table.
In Teradata RDBMS V2R5, the ALTER TABLE statement allows you to change
the PI of a table if the table is empty.

Rule 6: PI Has 16-Column Limit


You can designate a Primary Index that is composed of 1 to 16 columns.
In Teradata RDBMS V2R5, the maximum number of columns in an index is
increased to 64.

SQL Syntax for Creating a Primary Index


Every table must have a Primary Index. You normally specify it in the SQL
CREATE TABLE statement.
If you do not specify a Primary Index in the CREATE TABLE statement, the
system will use the Primary Key as the Primary Index. If a Primary Key has not
been specified, the system will choose the first unique column. If there are no
unique columns, the system will use the first column in the table and designate
it as a Non-Unique Primary Index.
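
The defaulting rules above can be sketched with three hypothetical tables. The names sample_pk, sample_uniq, and sample_none are assumptions made for illustration, not part of the course material. SHOW TABLE returns the table definition as the system stored it, which is one way to check which Primary Index was chosen. Defaults can vary with release and system settings, so treat the comments as what the rules above predict rather than guaranteed results.

```sql
-- Hypothetical tables illustrating the defaulting rules described above.

-- No PRIMARY INDEX clause, but a Primary Key is declared:
-- per the rules above, col_a is expected to become a Unique Primary Index.
CREATE TABLE sample_pk
(col_a INT NOT NULL PRIMARY KEY
,col_b INT);

-- No Primary Index and no Primary Key, but col_b is declared UNIQUE:
-- per the rules above, col_b is expected to become the Primary Index.
CREATE TABLE sample_uniq
(col_a INT
,col_b INT NOT NULL UNIQUE);

-- No Primary Index, Primary Key, or unique column:
-- per the rules above, col_a (the first column) is expected to become a NUPI.
CREATE TABLE sample_none
(col_a INT
,col_b INT);

-- Display the stored table definition to confirm the index the system chose.
SHOW TABLE sample_none;
```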

Creating a Unique Primary Index


The SQL syntax to create a Unique Primary Index is:
CREATE TABLE sample_1
(col_a INT
,col_b INT
,col_c INT)
UNIQUE PRIMARY INDEX (col_b);

Creating a Non-Unique Primary Index


The SQL syntax to create a Non-Unique Primary Index is:
CREATE TABLE sample_2
(col_x INT
,col_y INT
,col_z INT)
PRIMARY INDEX (col_x);

Modifying the Primary Index of a Table


As mentioned in the Primary Index rules, you cannot modify the Primary Index
of a table. In the event that you need a new Primary Index, you must drop the
table, recreate it with the new Primary Index, and reload the table.
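
A minimal sketch of two alternative routes, reusing sample_2 from the earlier example. The name sample_2_new is hypothetical. The copy-then-rename sequence is one common variation on drop, recreate, and reload, shown here as an assumption rather than the only approach. Route 1 applies only to an empty table on V2R5 or later.

```sql
-- Route 1 (V2R5 and later, table must be empty): change the PI in place.
ALTER TABLE sample_2 MODIFY PRIMARY INDEX (col_y);

-- Route 2 (populated table): build a copy with the new PI, then swap names.
-- Creating the copy WITH DATA rehashes every row on col_y, so rows are
-- redistributed to the AMPs selected by the new Primary Index.
CREATE TABLE sample_2_new AS sample_2 WITH DATA
PRIMARY INDEX (col_y);

DROP TABLE sample_2;
RENAME TABLE sample_2_new TO sample_2;
```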

Data Mechanics of Primary Indexes


This section describes how Primary Indexes are used in:

Data distribution

Data access

Distributing Rows to AMPs


Rows are distributed to AMPs during the following operations:

Loading data into a table (one or more rows, using a data loading utility)
Inserting or updating rows (one or more rows, using SQL)
Changing the system configuration (redistribution of data, caused by
reconfigurations to add or delete AMPs)

When loading data or inserting rows, the data being affected by the load or insert is not
available to other users until the transaction is complete. During a reconfiguration, no
data is accessible to users until the system is operational in its new configuration.
Row Distribution Process
The process the system uses for inserting a row on an AMP is described below:

1. The system uses the Primary Index value in each row as input to the hashing
algorithm.
2. The output of the hashing algorithm is the row hash value (in this example, 646).
3. The system looks at the hash map, which identifies the specific AMP where the
row should be stored (in this example, AMP 3).
4. The row is stored on the target AMP.
o UPI: The system automatically checks for duplicate UPI values when rows
are loaded or inserted. If a row already exists with the UPI value, the new
row is not added.
o NUPI: The system does not check for duplicate NUPI values. If a row
already exists with the NUPI value, the new row is added to the same
AMP.
Hash Map
A hash map is an array that associates hash bucket numbers with specific AMPs. While it
has a limited number of hash buckets, there are enough hash buckets to minimize the
number of hash collisions (when the hashing algorithm calculates the same row hash
value for two different rows).
The hash map is a GDO (globally distributed object), which is a file that is copied and
distributed to every node in the system. If an AMP is executing a request that requires
information in a GDO, it can access the copy of the GDO on its node.
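
Teradata SQL exposes these steps through the HASHROW, HASHBUCKET, and HASHAMP functions, so the path from Primary Index value to AMP can be inspected directly. The sketch below runs against sample_1 from the earlier example and assumes the table holds some rows. The values returned depend on the system configuration, so no sample output is shown.

```sql
-- Trace each Primary Index value of sample_1 (UPI on col_b) through the
-- distribution steps: value -> row hash -> hash bucket -> AMP.
SELECT col_b
      ,HASHROW(col_b)                      AS row_hash
      ,HASHBUCKET(HASHROW(col_b))          AS hash_bucket
      ,HASHAMP(HASHBUCKET(HASHROW(col_b))) AS amp_number
FROM sample_1
ORDER BY amp_number;
```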

Duplicate Row Hash Values


It is possible for the hashing algorithm to end up with the same row hash value for two
different rows. There are two ways this could happen:

Duplicate NUPI values: If a Non-Unique Primary Index is used, duplicate NUPI
values will produce the same row hash value.
Hash synonym: Also called a hash collision, this occurs when the hashing
algorithm calculates an identical row hash value for two different Primary Index
values. Hash synonyms are very rare. When using a Unique Primary Index, you
will still get uniform data distribution.

To differentiate each row in a table, every row is assigned a unique Row ID. The Row ID
is the combination of the row hash value and a uniqueness value.

Row ID = Row Hash Value + Uniqueness Value


The uniqueness value is used to differentiate between rows whose Primary Index values
generate identical row hash values. In most cases, only the row hash value portion of the
Row ID is needed to locate the row.

When each row is inserted, the AMP adds the row ID, stored as a prefix of the row. The
first row inserted with a particular row hash value is assigned a uniqueness value of 1.
The uniqueness value is incremented by 1 for any additional rows inserted with the same
row hash value.

Duplicate Rows
A duplicate row is a row in a table whose column values are identical to another
row in the same table. In other words, the entire row is the same, not just an
index. Although duplicate rows are not allowed in the relational model (because
every Primary Key must be unique), Teradata does allow duplicate rows
because the capability is a part of the ANSI standard.
Because duplicate rows are allowed in Teradata, how does it affect the UPI,
which, by definition, is unique? When you create a table, the following
definitions determine whether or not it can contain duplicate rows:

MULTISET tables: May contain duplicate rows. Teradata will not check
for duplicate rows.

SET tables: The default. Teradata checks for and does not permit
duplicate rows. If a SET table is created with a Unique Primary Index,
the check for duplicate rows is replaced by a check for duplicate index
values.
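
The two table types can be sketched as follows. All table and column names here are hypothetical, and the comments on the INSERT statements state what the definitions above predict.

```sql
-- Hypothetical tables; only the SET / MULTISET keyword and the index differ.

-- MULTISET with a NUPI: duplicate rows are accepted without any check.
CREATE MULTISET TABLE orders_multi
(order_id INT
,item     CHAR(10))
PRIMARY INDEX (order_id);

-- SET with a NUPI: Teradata checks for and does not permit duplicate rows.
CREATE SET TABLE orders_set
(order_id INT
,item     CHAR(10))
PRIMARY INDEX (order_id);

-- SET with a UPI: the check for duplicate index values replaces the
-- check for duplicate rows.
CREATE SET TABLE orders_set_upi
(order_id INT
,item     CHAR(10))
UNIQUE PRIMARY INDEX (order_id);

INSERT INTO orders_multi VALUES (1, 'widget');
INSERT INTO orders_multi VALUES (1, 'widget');  -- accepted: duplicate row allowed
INSERT INTO orders_set   VALUES (1, 'widget');
INSERT INTO orders_set   VALUES (1, 'widget');  -- rejected: duplicate row
```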

Accessing a Row With a Primary Index


When a user submits an SQL request using the table name and Primary Index,
the request becomes a one-AMP operation, which is the most direct and
efficient way for the system to find a row. The process is explained below.

Hashing Process
1. The primary index value goes into the hashing algorithm.
2. The output of the hashing algorithm is the row hash value.
3. The hash map points to the specific AMP where the row resides.
4. The PE sends the request directly to the identified AMP.
5. The AMP locates the row(s) on its vdisk.
6. The row data is sent over the BYNET to the PE, and the PE sends the
answer set on to the client application.

Choosing a Unique or Non-Unique Primary Index


Criteria for choosing a Primary Index include:

Uniqueness: A UPI guarantees even data distribution, so it is often a good
choice. A NUPI with few duplicate values could provide good (if not
perfectly uniform) distribution, and might meet the other criteria better.

Use in value access: Retrievals, updates, and deletes that specify the
Primary Index are much faster than those that do not. Because a Primary
Index is a known access path to the data, it is best to choose column(s)
that will be frequently used for access. For example, the following SQL
statement would directly access a row based on the equality WHERE
clause:
SELECT * FROM employee WHERE employee_ID = 'ABC456789';

A NUPI may be a better choice if access is usually based on a different, mostly
unique column. For example, the table may be used by the Mail Room to track
package delivery. In that case, a column containing room numbers or mail stops
may not be unique if employees share offices, but it may be a better choice for
access.

Use in join access: SQL requests that use a JOIN statement perform the
best when the join is done on a Primary Index. Consider Primary Key
and Foreign Key columns as potential candidates for Primary Indexes.
For example, if the Employee table and the Payroll table are related by
the Employee ID column, then the Employee ID column could be a good
Primary Index choice for one or both of the tables.

Non-volatile values: Look for columns where the values do not change
frequently. For example, in an Invoicing table, the outstanding balance
column for all customers probably has few duplicates, but probably
changes too frequently to make a good Primary Index. A customer ID,
statement number, or other more stable columns may be better choices.

When choosing a Primary Index, try to find the column(s) that best fit these
criteria and the business need.
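
One way to weigh the uniqueness criterion before committing to a Primary Index is to ask the system how a candidate column would distribute. The sketch below assumes a hypothetical employee table with a department_number column as the candidate NUPI. It uses the HASHROW, HASHBUCKET, and HASHAMP functions to project rows per AMP.

```sql
-- Hypothetical employee table; department_number is the candidate NUPI.
-- Counts how many rows each AMP would receive if the table were
-- distributed on that column. A few AMPs with much larger counts
-- than the rest indicate skew.
SELECT HASHAMP(HASHBUCKET(HASHROW(department_number))) AS amp_number
      ,COUNT(*)                                        AS row_count
FROM employee
GROUP BY 1
ORDER BY 2 DESC;

-- Rows per distinct value: many rows per value warns of a poor NUPI.
SELECT department_number
      ,COUNT(*) AS rows_per_value
FROM employee
GROUP BY 1
ORDER BY 2 DESC;
```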
Questions
What do you think are key considerations in choosing a Primary Index? (Choose three.)
A. Column(s) containing unique (or nearly unique) values for uniform distribution.
B. Column(s) with values in sequential order for best load and access performance.
C. Column(s) frequently used in queries to access data or to join tables.
D. Column(s) with values that are stable (do not change frequently), to minimize
redistribution of table rows.

E. Column(s) with many duplicate values for redundancy.

Partitioned Primary Index

In Teradata RDBMS V2R5 there is a new indexing mechanism called
Partitioned Primary Index (PPI). PPI is used to improve performance for large
tables when you submit queries that specify a range constraint. PPI allows you
to reduce the number of rows to be processed by using a new technique called
partition elimination. PPI will increase performance for incremental data loads,
deletes, and data access when working with large tables with range constraints.

How Does PPI Work?


Data distribution with PPI is still based on the Primary Index:
Primary Index -> Hash Value -> Determines which AMP gets the row

With PPI, the ORDER in which the rows are stored on the AMP is affected.
Using the traditional method, No Partitioned Primary Index (NPPI), the rows
are stored in row hash order.
4 AMPs with Orders Table Defined with NPPI

Using PPI, the rows are stored first by partition and then by row hash. In our
example, there are four partitions. Within the partitions, the rows are stored in
row hash order.
4 AMPs with Orders Table Defined with PPI on O_Date

Data Storage Using PPI


To store rows using PPI, specify partitioning in the CREATE TABLE statement.
The request runs through the hashing algorithm as normal and comes out with
the Base Table ID, the Partition number(s), the Row Hash, and the Primary
Index values.
Data Storage Using PPI
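
A sketch of an Orders table partitioned on O_Date, in the spirit of the four-partition example. The columns o_orderid and o_total and the 2003 date range are assumptions. The Primary Index (o_orderid) still decides which AMP receives each row. The PARTITION BY clause only affects the order in which rows are stored on that AMP.

```sql
-- Orders table with a PPI: four quarterly partitions on o_date.
-- o_orderid, o_total, and the date range are illustrative assumptions.
CREATE TABLE orders
(o_orderid INT NOT NULL
,o_date    DATE NOT NULL
,o_total   DECIMAL(10,2))
PRIMARY INDEX (o_orderid)
PARTITION BY RANGE_N(o_date BETWEEN DATE '2003-01-01'
                            AND     DATE '2003-12-31'
                            EACH INTERVAL '3' MONTH);

-- A range constraint on the partitioning column lets each AMP skip
-- the partitions that cannot contain qualifying rows. Here only the
-- second quarter's partition needs to be read.
SELECT *
FROM orders
WHERE o_date BETWEEN DATE '2003-04-01' AND DATE '2003-06-30';
```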

Access Without a PPI


Let's say you have a table with Store information by Location and did not use a
PPI. If you query on Location 3 on this NPPI table, the entire table will be
scanned to find the records for Location 3 (Full Table Scan).
Access Without a PPI
QUERY: SELECT * FROM Employee_NPPI
       WHERE Location_Number = 3;
PLAN:  ALL-AMPs - Full Table Scan

Access With a PPI


In the same example for a PPI table, you would partition the table with as many
partitions as you have Locations (or expect to have in the near future). Then if
you query on Location 3, each AMP will use partition elimination and only has to
scan partition 3 for the query. This query will run much faster than the Full
Table Scan in the previous example.
Access With a PPI
QUERY: SELECT * FROM Employee
       WHERE Location_Number = 3;
PLAN:  ALL-AMPs - Single Partition Scan
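
A hedged sketch of how the partitioned Employee table in this example might be defined. The columns other than Location_Number, the choice of Primary Index, and the figure of 10 locations are all assumptions. EXPLAIN shows the plan the Optimizer intends to use without running the query, which is one way to confirm that partition elimination is taking place.

```sql
-- Hypothetical definition of the Employee table from the example,
-- with one partition per location (10 locations assumed).
CREATE TABLE Employee
(Employee_Number INT NOT NULL
,Location_Number INT NOT NULL
,Last_Name       CHAR(20))
PRIMARY INDEX (Employee_Number)
PARTITION BY RANGE_N(Location_Number BETWEEN 1 AND 10 EACH 1);

-- With partition elimination, the plan is expected to describe an
-- all-AMPs retrieve from a single partition, not a full table scan.
EXPLAIN
SELECT * FROM Employee
WHERE Location_Number = 3;
```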

Secondary Index (SI)


A Secondary Index is an alternate data access path. It allows you to access the
data without having to do a full table scan. Secondary indexes do not affect how
rows are distributed among the AMPs.
You can drop and recreate secondary indexes dynamically, as they are needed.

Unlike Primary Indexes, Secondary Indexes are stored in separate subtables.
These subtables take extra disk space and require maintenance, which the system
handles automatically. Secondary Indexes therefore do consume some system
resources.
Question
In what instances would it be a good idea to define a secondary index for a table? (This
information will be covered in this module, but here is a preview.)
1. The Primary Index exists for even data distribution and data access, but a
Secondary Index is defined to efficiently generate monthly reports based on a
different set of columns.
2. The Product table is accessed by the retailer (who accesses data based on the
retailer's product code column), and by a vendor (who accesses the same data based
on the vendor's product code column).
3. The table already has a Unique Primary Index, but a second column must also
have unique values. The column is specified as a Unique Secondary Index (USI)
to enforce uniqueness on the second column.
4. All of the above.

Secondary Index Rules


The following rules govern how Secondary Indexes must be defined and how they
function:
Rule 1: Secondary Indexes are optional.
Rule 2: Secondary Index values can be unique or non-unique.
Rule 3: Secondary Index values can be NULL.
Rule 4: Secondary Index values can be modified.
Rule 5: Secondary Indexes can be changed.
Rule 6: A Secondary Index has a limit of 16 columns.

Rule 1: Optional SI
While a Primary Index is required, a Secondary Index is optional. If one path to
the data is sufficient, no Secondary Index need be defined.
You can define 0 to 32 Secondary Indexes on a table for multiple data access
paths. Different groups of users may want to access the data in various ways.
You can define a Secondary Index for each heavily used access path.

Rule 2: Unique or Non-Unique SI


Like Primary Indexes, Secondary Indexes can be unique or non-unique.

A Unique Secondary Index (USI) serves two possible purposes:
o Enforces uniqueness in a column or group of columns. The
database will check USIs to see if the values are unique. For
example, if you have chosen different columns for the Primary
Key and Primary Index, you can make the Primary Key a USI to
enforce uniqueness on the Primary Key.
o Speeds up access to a row. Accessing a row with a USI requires
one or two AMPs, which is less direct than a UPI (one AMP)
access, but more efficient than a full table scan.

A Non-Unique Secondary Index (NUSI) is usually specified to prevent
full table scans, in which every row of a table is read. The Optimizer
determines whether a full table scan or NUSI access will be more
efficient, then picks the best method. Accessing a row with a NUSI
requires all AMPs.

Rule 3: SI Can Be NULL

As with the Primary Index, the Secondary Index column may contain NULL
values.

Rule 4: SI Value Can Be Modified

The values in the Secondary Index column may be modified as needed.

Rule 5: SI Can Be Changed

Secondary Indexes can be changed: they can be created and dropped
dynamically as needed. When an index is dropped, the system
physically drops the subtable that contained it.

Rule 6: SI Has 16-Column Limit


You can designate a Secondary Index that is composed of 1 to 16 columns. To
use the Secondary Index below, the user would specify both Budget and
Manager Employee Number.
In Teradata RDBMS V2R5, the maximum number of columns in an index is
increased to 64.

Using Secondary Indexes


In the table below, users will be accessing data based on the Department Name
column. The values in that column are unique, so it has been made a USI for
efficient access. In addition, the company wants reports on how many
departments each manager is responsible for, so the Manager Employee Number
can also be made a secondary index. It has duplicate values, so it is a NUSI.
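
The two indexes described above could be created as sketched here. The department table definition, including its column names, data types, and Primary Index, is an assumption made for illustration.

```sql
-- Hypothetical department table matching the description above.
CREATE TABLE department
(department_number       INT NOT NULL
,department_name         CHAR(30) NOT NULL
,budget_amount           DECIMAL(10,2)
,manager_employee_number INT)
UNIQUE PRIMARY INDEX (department_number);

-- USI: values are unique, so the index also enforces uniqueness.
CREATE UNIQUE INDEX (department_name) ON department;

-- NUSI: duplicate values are allowed (one manager, several departments).
CREATE INDEX (manager_employee_number) ON department;

-- A secondary index can be dropped when it is no longer needed;
-- its subtable is dropped with it.
DROP INDEX (manager_employee_number) ON department;
```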

How Secondary Indexes Are Stored


Secondary indexes are stored in index subtables. The subtables for USIs and
NUSIs are distributed differently:

USI: The Unique Secondary Indexes are hash distributed separately
from the data rows, based on their USI value. (As you remember, the
base table rows are distributed based on the Primary Index value.) The
subtable row may be stored on the same AMP or a different AMP than
the base table row, depending on the hash value.

NUSI: The Non-Unique Secondary Indexes are stored in subtables on
the same AMPs as their data rows. This reduces activity on the BYNET
and essentially makes NUSI queries an AMP-local operation - the
processing for the subtable and base table is done on the same AMP.
However, in all NUSI access requests, all AMPs are activated because
the non-unique value may be found on multiple AMPs.

Data Access Without a Primary Index


You can submit a request without specifying a Primary Index and still access the
data. The following access methods do not use a Primary Index:

Unique Secondary Index (USI)


Non-Unique Secondary Index (NUSI)

Full Table Scan

Accessing Data with a USI

When a user submits an SQL request using the table name and a Unique
Secondary Index, the request becomes a one- or two-AMP operation, as explained
below.

USI Access
1. The SQL is submitted, specifying a USI (in this case, a customer number
of 56).
2. The hashing algorithm calculates a row hash value (in this case, 602).
3. The hash map points to the AMP containing the subtable row
corresponding to the row hash value (in this case, AMP 2).
4. The subtable indicates where the base row resides (in this case, row 778 on
AMP 4).
5. The message goes back over the BYNET to the AMP with the row and the
AMP accesses the data row (in this case, AMP 4).
6. The row is sent over the BYNET to the PE, and the PE sends the answer
set on to the client application.
As shown in the example above, accessing data with a USI is typically a
two-AMP operation. However, it is possible that the subtable row and base table row
could end up being stored on the same AMP, because both are hashed separately.
If both were on the same AMP, the USI request would be a one-AMP operation.

Accessing Data with a NUSI

When a user submits an SQL request using the table name and a Non-Unique
Secondary Index, the request becomes an all-AMP operation, as explained
below.

NUSI Access
1. The SQL is submitted, specifying a NUSI (in this case, a last name of
"Adams").
2. The hashing algorithm calculates a row hash value for the NUSI (in this
case, 567).
3. All AMPs are activated to find the hash value of the NUSI in their index
subtables. The AMPs whose subtables contain that value become the
participating AMPs in this request (in this case, AMP1 and AMP2). The
other AMPs discard the message.
4. Each participating AMP locates the row IDs (row hash value plus
uniqueness value) of the base rows corresponding to the hash value (in
this case, the base rows corresponding to hash value 567 are 640, 222,
and 115).
5. The participating AMPs access the base table rows, which are located on
the same AMP as the NUSI subtable (in this case, one row from AMP 1
and two rows from AMP 2).
6. The qualifying rows are sent over the BYNET to the PE, and the PE
sends the answer set on to the client application (in this case, three
qualifying rows are returned).
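
The requests behind the USI and NUSI walkthroughs might look like the sketch below. It assumes a hypothetical customer table that already has a USI on customer_number and a NUSI on last_name. Prefixing each request with EXPLAIN asks the Optimizer to describe the access path instead of running the query.

```sql
-- Assumes a hypothetical customer table with a USI on customer_number
-- and a NUSI on last_name, as in the two walkthroughs above.

-- USI access: an equality condition on the USI column.
-- Typically a two-AMP operation (subtable AMP, then base row AMP).
EXPLAIN
SELECT * FROM customer
WHERE customer_number = 56;

-- NUSI access: an equality condition on the NUSI column.
-- An all-AMP operation; only AMPs whose subtable holds the value
-- go on to read base table rows.
EXPLAIN
SELECT * FROM customer
WHERE last_name = 'Adams';
```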

Accessing Data Without Indexes


In Teradata, you can access data on any column, whether that column is an
index or not. You can ask any question, of any data, at any time.
If the request does not use a defined index, Teradata does a full table scan. A
full table scan is another way to access data without using Primary or Secondary
Indexes. In evaluating an SQL request, the Optimizer examines all possible
access methods and chooses the one it believes to be the most efficient.
While Secondary Indexes generally provide a more direct access path, in some
cases the Optimizer will choose a full table scan because it is more efficient. A
request could turn into a full table scan when:

An SQL request searches on a NUSI column with many duplicates. For
example, if a request using last names in a Customer database searched
on the very prevalent "Smith" in the United States, then the Optimizer
may choose a full table scan to efficiently find all the many matching
rows in the result set.

An SQL request uses a non-equality WHERE clause on an index
column. For example, if a request searched an Employee database for all
employees whose annual salary is greater than $100,000, then a full
table scan would be used, even if the Salary column is an index. In this
example, a full table scan can be avoided by using an equality WHERE
clause on a defined index column.

An SQL request uses a range WHERE clause on an index column. For
example, if a request searched an Employee database for all employees
hired between January 2001 and June 2001, then a full table scan would
be used, even if the Hire_Date column is an index.

For all requests, you must specify a value for each column in the index or
Teradata will do a full table scan. A full table scan is an all-AMP operation, and
each data row is accessed only once. As long as the choice of Primary Index has
caused the table rows to distribute evenly across all of the AMPs, the parallel
processing of the AMPs working simultaneously can accomplish the full table
scan quickly.
While full table scans are impractical and even disallowed on some commercial
database systems, Teradata routinely permits ad hoc queries with full table
scans.
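
The conditions above can be sketched against a hypothetical employee table. The table, the two-column NUSI on (department_number, job_code), and the literal values are all assumptions. The comments restate what this section predicts; running each request with EXPLAIN in front of it would show the access path actually chosen.

```sql
-- Hypothetical employee table with a two-column NUSI on
-- (department_number, job_code).

-- Equality on every column of the index: the index can be used.
SELECT * FROM employee
WHERE department_number = 401
  AND job_code = 512000;

-- Value supplied for only one of the two index columns:
-- per the rule above, this becomes a full table scan.
SELECT * FROM employee
WHERE department_number = 401;

-- Non-equality condition: full table scan, per the text above.
SELECT * FROM employee
WHERE salary > 100000;

-- Range condition on hire_date: full table scan, per the text above.
SELECT * FROM employee
WHERE hire_date BETWEEN DATE '2001-01-01' AND DATE '2001-06-30';
```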

Summary of Keys and Indexes


Some fundamental differences between Keys and Indexes are shown below:

Keys                                 Indexes
-----------------------------------  -----------------------------------
A relational modeling convention     A Teradata mechanism used in a
used in a logical data model.        physical database design.

Uniquely identify a row              Used for row distribution
(Primary Key).                       (Primary Index).

Establish relationships between      Used for row access (Primary Index
tables (Foreign Key).                and Secondary Index).

While most commercial database systems use the Primary Key as a way to
retrieve data, a Teradata system does not. In a Teradata system, you use the
Primary Key only when designing a database, as a mechanism for maintaining
referential integrity according to relational theory. The Teradata RDBMS itself
does not require keys in order to manage the data, and can function fully with no
awareness of Primary Keys.
The Teradata parallel architecture uses Primary Indexes to distribute and access
the data rows. A Primary Index is always required when creating a Teradata
table.
A Primary Index may include the same columns as the Primary Key, but does
not have to. In some cases, you may want the Primary Key and Primary Index to
be different. For example, a credit card account number may be a good Primary
Key, but customers may prefer to use a different kind of identification to access
their accounts.

Rules for Keys and Indexes


A summary of the rules for keys (in the relational model) and indexes (in the
Teradata RDBMS) is shown below.
Primary Key         Foreign Key         Primary Index       Secondary Index
------------------  ------------------  ------------------  ------------------
One PK              Multiple FKs        One PI              0 to 32 SIs

Unique values       Unique or           Unique or           Unique or
                    non-unique          non-unique          non-unique

No NULLs            NULLs allowed       NULLs allowed       NULLs allowed

Values should not   Values may be       Values may be       Values may be
change              changed             changed             changed
                                        (redistributes
                                        row)

Column should not   Column may change   Column cannot be    Index may be
change                                  changed (drop and   changed (drop and
                                        recreate table)     recreate index)

No column limit     No column limit     16-column limit     16-column limit

n/a                 FK must exist as    n/a                 n/a
                    PK in the related
                    table

Defining Primary and Foreign Keys in Teradata


Although Primary Indexes are required and Primary Keys are not, you do have
the option to define a Primary Key or Foreign Key for any table. When you
define a Primary Key in a Teradata table, the RDBMS will implement the
specified column(s) as an index. Because a Primary Key requires unique values,
a defined Primary Key is implemented as one of the following:

Unique Primary Index (If the DBA did not specify the Primary Index
in the CREATE TABLE statement.)

Unique Secondary Index (If columns other than the Primary Index are
chosen)

When a Primary Key is defined in Teradata SQL and implemented as an index,
the rules that govern that type of index now apply to the Primary Key. For
example, in relational theory, there is no limit to the number of columns in a
Primary Key. However, if you specify a Primary Key in Teradata SQL, the
16-column limit for indexes now applies to that Primary Key.
In Teradata RDBMS V2R5, the maximum number of columns in an index is
increased to 64.
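
A sketch of key definitions in CREATE TABLE. The customer and account tables and all of their columns are hypothetical. The account table follows the credit card example: the account number is the Primary Key, while a different column is chosen as the Primary Index.

```sql
-- Hypothetical parent table: customer_id is the Unique Primary Index,
-- so it can be referenced by a Foreign Key.
CREATE TABLE customer
(customer_id INT NOT NULL
,last_name   CHAR(20))
UNIQUE PRIMARY INDEX (customer_id);

-- Hypothetical account table. The Primary Key (account_number) and the
-- Primary Index (customer_id) are different columns, so per the rules
-- above the Primary Key is implemented as a Unique Secondary Index.
CREATE TABLE account
(account_number INT NOT NULL
,customer_id    INT NOT NULL
,balance        DECIMAL(12,2)
,PRIMARY KEY (account_number)
,FOREIGN KEY (customer_id) REFERENCES customer (customer_id))
PRIMARY INDEX (customer_id);
```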
Questions
What provides uniform data distribution through the hashing algorithm?
UPI
NUPI
Both UPI and NUPI
Neither UPI nor NUPI

The output from the hashing algorithm is the:


hash map
uniqueness value
row ID
row hash
Choose the appropriate answers from the drop-down boxes that complete each sentence:
Accessing a row with a Unique Secondary Index (USI) typically requires one/two/all
AMP(s).
Accessing a row with a Non-Unique Secondary Index (NUSI) requires one/two/all
AMP(s).
A full table scan accesses one/two/all row(s).
Accessing a row with a Unique Primary Index (UPI) accesses one/two/all row(s) on one
AMP.
Accessing a row with a Non-Unique Primary Index (NUPI) accesses multiple rows on
one/two/all AMP(s).
The row ID helps the system to locate a row in case of a(n):
even distribution of rows.
Unique Primary Index.
multi-AMP request.
hash synonym.
