
Bedrock Computer Technologies, LLC v. Softlayer Technologies, Inc. et al

Doc. 284 Att. 8

EXHIBIT 5
PART 3 OF 6


198 Tables and Information Retrieval CHAPTER 6

[Figure 6.9. Implementation of table. Diagram labels: abstract data type, table, index function or access table, array access, implementation.]
... have no such order. (If the index set has some natural order, then sometimes this order is reflected in the table, but this is not a necessary aspect of using tables.) Hence information retrieval from a list naturally involves a search like the ones studied in the previous chapter, but information retrieval from a table requires different methods, access methods that go directly to the desired entry. The time required for searching a list generally depends on the number n of items in the list and is at least lg n, but the time for accessing a table does not usually depend on the number of items in the table; that is, it is usually O(1). For this reason, in many applications table access is significantly faster than list searching.

On the other hand, traversal is a natural operation for a list but not for a table. It is generally easy to move through a list performing some operation with every item in the list. In general, it may not be nearly so easy to perform an operation on every item in a table, particularly if some special order for the items is specified in advance.

Finally, we should clarify the distinction between the terms table and array. In general, we shall use table as we have defined it in this section and restrict the term array to mean the programming feature available in Pascal and most high-level languages and used for implementing both tables and contiguous lists.

6.5 HASHING

6.5.1 Sparse Tables

Index Functions

We can continue to exploit table lookup even in situations where the key is no longer an index that can be used directly as in array indexing. What we can do is to set up a one-to-one correspondence between the keys by which we wish to retrieve information and indices that we can use to access an array. The index function that we produce will be somewhat more complicated than those of previous sections, since it may need to convert the key from, say, alphabetic information to an integer, but in principle it can still be done.

The only difficulty arises when the number of possible keys exceeds the amount of space available for our table. If, for example, our keys are alphabetical words of eight letters, then there are $26^8 \approx 2 \times 10^{11}$ possible keys, a number much greater than the number of positions that will be available in high-speed memory. In practice, however, only a small fraction of these keys will actually occur. That is, the table is sparse. Conceptually, we can regard it as indexed by a very large set, but with relatively few positions actually occupied. In Pascal, for example, we might think in terms of conceptual declarations such as

    type sparsetable = array [keytype] of item;

Even though it may not be possible to implement a declaration such as this directly, it is often helpful in problem solving to begin with such a picture, and only slowly tie down the details of how it is put into practice.

Hash Tables

The idea of a hash table (such as the one shown in Figure 6.10) is to allow many of the different possible keys that might occur to be mapped to the same location in an array under the action of the index function. Then there will be a possibility that two records will want to be in the same place, but if the number of records that actually occur is small relative to the size of the array, then this possibility will cause little loss of time. Even when most entries in the array are occupied, hash methods can be an effective means of information retrieval.

[Figure 6.10. A hash table.]

We begin with a hash function that takes a key and maps it to some index in the array. This function will generally map several different keys to the same index. If the desired record is in the location given by the index, then our problem is solved; otherwise we must use some method to resolve the collision that may have occurred between two records wanting to go to the same location. There are thus two questions we must answer to use hashing. First, we must find good hash functions, and, second, we must determine how to resolve collisions.

Before approaching these questions, let us pause to outline informally the steps needed to implement hashing.

Algorithm Outlines

First, an array must be declared that will hold the hash table. With ordinary arrays the keys used to locate entries are usually the indices, so there is no need to keep them within the array itself, but for a hash table, several possible keys will correspond to the same index, so one field within each record in the array must be reserved for the key itself.

Next, all locations in the array must be initialized to show that they are empty. How this is done depends on the application; often it is accomplished by setting the key fields to some value that is guaranteed never to occur as an actual key. With alphanumeric keys, for example, a key consisting of all blanks might represent an empty position.

To insert a record into the hash table, the hash function for the key is first calculated. If the corresponding location is empty, then the record can be inserted; else if the keys are equal, then insertion of the new record would not be allowed; and in the remaining case (a record with a different key is in the location), it becomes necessary to resolve the collision.

To retrieve the record with a given key is entirely similar. First, the hash function for the key is computed. If the desired record is in the corresponding location, then the retrieval has succeeded; otherwise, while the location is nonempty and not all locations have been examined, follow the same steps used for collision resolution. If an empty position is found, or all locations have been considered, then no record with the given key is in the table, and the search is unsuccessful.
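The outline above can be sketched in a few lines. The sketch below is written in Python rather than the book's Pascal, and it is only an illustration under stated assumptions: the names `make_table`, `insert`, and `retrieve` are hypothetical, `None` stands in for the all-blank key, and collisions are resolved by simply stepping to the next location, which is one choice among those discussed later.

```python
EMPTY = None  # assumption: sentinel for an empty slot (the text suggests an all-blank key)


def make_table(size):
    """Declare and initialize the table: every location starts empty."""
    return [EMPTY] * size


def insert(table, key, value, hash_fn):
    """Insert (key, value); a collision is resolved by trying the next slot."""
    size = len(table)
    pos = hash_fn(key) % size
    for _ in range(size):                 # examine each location at most once
        slot = table[pos]
        if slot is EMPTY:
            table[pos] = (key, value)     # empty location: store the record
            return pos
        if slot[0] == key:
            raise KeyError(f"duplicate key: {key!r}")  # equal keys are not allowed
        pos = (pos + 1) % size            # different key here: resolve the collision
    raise OverflowError("hash table is full")


def retrieve(table, key, hash_fn):
    """Return the value stored under key, or None if the search is unsuccessful."""
    size = len(table)
    pos = hash_fn(key) % size
    for _ in range(size):
        slot = table[pos]
        if slot is EMPTY:
            return None                   # an empty position ends the search
        if slot[0] == key:
            return slot[1]
        pos = (pos + 1) % size            # same steps as in collision resolution
    return None                           # all locations have been considered


table = make_table(7)
insert(table, 10, "ten", hash_fn=lambda k: k)
insert(table, 17, "seventeen", hash_fn=lambda k: k)   # 17 % 7 == 3 collides with 10
print(retrieve(table, 17, hash_fn=lambda k: k))
```

The key is stored alongside each record because, as the outline notes, several keys can share one index and retrieval must be able to tell them apart.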

6.5.2 Choosing a Hash Function

The two principal criteria in selecting a hash function are that it should be easy and quick to compute and that it should achieve an even distribution of the keys that actually occur across the range of indices. If we know in advance exactly what keys will occur, then it is possible to construct hash functions that will be very efficient, but generally we do not know in advance what keys will occur. Therefore, the usual way is for the hash function to take the key, chop it up, mix the pieces together in various ways, and thereby obtain an index that (like the pseudorandom numbers generated by computer) will be uniformly distributed over the range of indices.

It is from this process that the word hash comes, since the process converts the key into something that bears little resemblance to the original. At the same time, it is hoped that any patterns or regularities that may occur in the keys will be destroyed, so that the results will be randomly distributed.

BTEX0000264

Even though the term hash is very descriptive, in some books the more technical terms scatter-storage or key-transformation are used in its place.

We shall consider three methods that can be put together in various ways to build a hash function.

Truncation

Ignore part of the key, and use the remaining part directly as the index (considering non-numeric fields as their numerical codes). If the keys, for example, are eight-digit integers and the hash table has 1000 locations, then the first, second, and fifth digits from the right might make the hash function, so that 62538194 maps to 394. Truncation is a very fast method, but it often fails to distribute the keys evenly through the table.
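The truncation example can be checked with a short sketch (Python rather than the book's Pascal; the function name `truncate_hash` is hypothetical and the code assumes keys of at least five digits). The second call shows why truncation can distribute keys poorly: any key that agrees in the three selected digits lands on the same index.

```python
def truncate_hash(key):
    """Use the 1st, 2nd and 5th digits from the right of an integer key."""
    digits = str(key)[::-1]          # digits[0] is the rightmost digit
    first, second, fifth = digits[0], digits[1], digits[4]
    return int(fifth + second + first)


print(truncate_hash(62538194))   # 394, as in the text
print(truncate_hash(11138194))   # also 394: the ignored digits cannot help
```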
Folding

Partition the key into several parts and combine the parts in a convenient way (often using addition or multiplication) to obtain the index. For example, an eight-digit integer can be divided into groups of three, three, and two digits, the groups added together, and truncated if necessary to be in the proper range of indices. Hence 62538194 maps to 625 + 381 + 94 = 1100, which is truncated to 100. Since all information in the key can affect the value of the function, folding often achieves a better spread of indices than does truncation by itself.
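A minimal sketch of the folding example, in Python rather than the book's Pascal. The name `fold_hash` and the `table_size` parameter are hypothetical, and "truncated" is interpreted here as keeping the low-order three digits, which matches the 1100 to 100 step in the text.

```python
def fold_hash(key, table_size=1000):
    """Split an eight-digit key into groups of 3, 3 and 2 digits, add, truncate."""
    digits = f"{key:08d}"
    parts = [int(digits[0:3]), int(digits[3:6]), int(digits[6:8])]
    total = sum(parts)               # 625 + 381 + 94 = 1100 for 62538194
    return total % table_size        # keep the low-order digits: 1100 -> 100


print(fold_hash(62538194))   # 100
print(fold_hash(11138194))   # 586: every digit of the key affects the result
```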

Modular Arithmetic

Convert the key to an integer (using the above devices as desired), divide by the size of the index range, and take the remainder as the result. This amounts to using the Pascal operator mod. The spread achieved by taking a remainder depends very much on the modulus (in this case, the size of the hash array). If the modulus is a power of a small integer like 2 or 10, then many keys tend to map to the same index, while other indices remain unused. The best choice for modulus is a prime number, which usually has the effect of spreading the keys quite uniformly. (We shall see later that a prime modulus also improves an important method for collision resolution.) Hence, rather than choosing a hash table of size 1000, it is better to choose either 997 or 1009; 1024 would usually be a poor choice. Taking the remainder is usually the best way to conclude calculating the hash function, since it can achieve a good spread at the same time that it ensures that the result is in the proper range. About the only reservation is that, on a tiny machine with no hardware division, the calculation can be slow, so other methods should be considered.
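The effect of the modulus can be seen on a deliberately regular set of keys. This is a small Python sketch (not the book's Pascal); the key set, fifty multiples of 100, is an assumption chosen only to make the contrast visible.

```python
def mod_hash(key, modulus):
    """Hash by taking the remainder after division by the table size."""
    return key % modulus


keys = range(0, 5000, 100)                       # 50 keys: 0, 100, ..., 4900
used_1000 = len({mod_hash(k, 1000) for k in keys})
used_997 = len({mod_hash(k, 997) for k in keys})
print(used_1000, used_997)                       # 10 50
```

With modulus 1000 the fifty keys crowd into ten indices, while the prime modulus 997 gives each key its own index.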

Example

As a simple example, let us write a hash function in Pascal for transforming a key consisting of eight alphanumeric characters into an integer in the range 0 .. hashsize - 1. That is, we shall begin with the type

    type keytype = array [1..8] of char;

We can then write a simple hash function as follows:

    function Hash(x: keytype): integer;   { sample hash function }
    var i: integer;
        h: integer;
    begin
      h := 0;
      for i := 1 to 8 do
        h := h + ord(x[i]);
      Hash := h mod hashsize
    end;

We have simply added the integer codes corresponding to each of the eight characters. There is no reason to believe that this method will be better (or worse), however, than any number of others. We could, for example, subtract some of the codes, multiply them in pairs, or ignore every other character. Sometimes an application will suggest that one hash function is better than another; sometimes it requires experimentation to settle on a good one.
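For readers who want to experiment, the same character-sum function can be sketched in Python. The names `HASHSIZE` and `hash_key` are hypothetical, and 997 is assumed as the table size because the text recommends it. The last line shows one weakness of pure addition: keys made of the same characters in a different order always collide.

```python
HASHSIZE = 997  # assumption: the prime table size suggested in the text


def hash_key(key):
    """Sum the character codes of an 8-character key, then reduce mod HASHSIZE."""
    if len(key) != 8:
        raise ValueError("key must have exactly eight characters")
    return sum(ord(ch) for ch in key) % HASHSIZE


print(hash_key("ABCDEFGH"))                            # 548
print(hash_key("ABCDEFGH") == hash_key("HGFEDCBA"))    # True: same letters, same sum
```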

6.5.3 Collision Resolution with Open Addressing

Linear Probing

The simplest method to resolve a collision is to start with the hash address (the location where the collision occurred) and do a sequential search for the desired key or an empty location. Hence this method searches in a straight line, and it is therefore called linear probing. The array should be considered circular, so that when the last location is reached, the search proceeds to the first location of the array.
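The circular search order can be written as a tiny generator. This is a Python sketch with a hypothetical name, `linear_probe_sequence`; it only enumerates the locations that linear probing would examine and leaves the table itself out.

```python
def linear_probe_sequence(start, table_size):
    """Yield every table index once, beginning at start and wrapping around."""
    for offset in range(table_size):
        yield (start + offset) % table_size


print(list(linear_probe_sequence(5, 7)))   # [5, 6, 0, 1, 2, 3, 4]
```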

Clustering

The major drawback of linear probing is that, as the table becomes about half full, there is a tendency toward clustering; that is, records start to appear in long strings of adjacent positions with gaps between the strings. Thus the sequential searches needed to find an empty position become longer and longer. For consider the example in Figure 6.11, where the occupied positions are shown in color. Suppose that there are n locations in the array and that the hash function chooses any of them with equal probability 1/n. Begin with a fairly uniform spread, as shown in the top diagram. If a new insertion hashes to location b, then it will go there, but if it hashes to location a (which is full), then it will also go into b. Thus the probability that b will be filled has doubled to 2/n. At the next stage, an attempted insertion into any of locations a, b, c, or d will end up in d, so the probability of filling d is 4/n. After this, e has probability 5/n of being filled, and so as additional insertions are made the most likely effect is to make the string of full positions beginning at location a longer and longer, and hence the performance of the hash table starts to degenerate toward that of sequential search.

BTEX0000266

[Figure 6.11. Clustering in a hash table.]

The problem of clustering is essentially one of instability; if a few keys happen randomly to be near each other, then it becomes more and more likely that other keys will join them, and the distribution will become progressively more unbalanced.

Increment Functions

If we are to avoid the problem of clustering, then we must use some more sophisticated way to select the sequence of locations to check when a collision occurs. There are many ways to do so. One, called rehashing, uses a second hash function to obtain the second position to consider. If this position is filled, then some other method is needed to get the third position, and so on. But if we have a fairly good spread from the first hash function, then little is to be gained by an independent second hash function. We will do just as well to find a more sophisticated way of determining the distance to move from the first hash position and apply this method, whatever the first hash location is. Hence we wish to design an increment function that can depend on the key or on the number of probes already made and that will avoid clustering.

Quadratic Probing

If there is a collision at hash address h, this method probes the table at locations h + 1, h + 4, h + 9, ..., that is, at locations $h + i^2 \pmod{\text{hashsize}}$ for i = 1, 2, .... That is, the increment function is $i^2$.

Quadratic probing substantially reduces clustering, but it is not obvious that it will probe all locations in the table, and in fact it does not. If hashsize is a power of 2, then relatively few positions are probed. Suppose that hashsize is a prime. If we reach the same location at probe i and at probe j, then

$h + i^2 \equiv h + j^2 \pmod{\text{hashsize}}$

so that

$(i - j)(i + j) \equiv 0 \pmod{\text{hashsize}}$.

Since hashsize is a prime, it must divide one factor. It divides i - j only when j differs from i by a multiple of hashsize, so at least hashsize probes have been made. Hashsize divides i + j, however, when j = hashsize - i, so the total number of distinct positions that will be probed is exactly (hashsize + 1) div 2.

BTEX0000267

It is customary to take overflow as occurring when this number of positions has been probed, and the results are quite satisfactory.

Note that quadratic probing can be accomplished without doing multiplications: After the first probe at position x, the increment is set to 1. At each successive probe, the increment is increased by 2 after it has been added to the previous location. Since

$1 + 3 + 5 + \cdots + (2i - 1) = i^2$

for all i ≥ 1 (you can prove this fact by mathematical induction), probe i will look in position

$x + 1 + 3 + \cdots + (2i - 1) = x + i^2$,

as desired.
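Both claims about quadratic probing, the addition-only calculation and the count of distinct positions, can be checked with a short Python sketch. The function name `quadratic_probe_positions` is hypothetical; it deliberately makes more probes than a real table would, so that the repetitions become visible.

```python
def quadratic_probe_positions(start, hashsize):
    """Positions start + i*i (mod hashsize), computed with additions only."""
    pos = start % hashsize
    increment = 1
    positions = [pos]
    for _ in range(hashsize):       # more probes than needed, to expose repeats
        pos = (pos + increment) % hashsize
        increment += 2              # 1, 3, 5, ... so the running sum is i*i
        positions.append(pos)
    return positions


prime_positions = quadratic_probe_positions(0, 11)
print(prime_positions[:4])                               # [0, 1, 4, 9]
print(len(set(prime_positions)))                         # 6, that is (11 + 1) // 2
print(len(set(quadratic_probe_positions(0, 16))))        # 4: a power of 2 does poorly
```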

Key-Dependent Increments

Rather than having the increment depend on the number of probes already made, we can let it be some simple function of the key itself. For example, we could truncate the key to a single character and use its code as the increment. In Pascal, we might write

    increment := ord(k[1]);

A good approach, when the remainder after division is taken as the hash function, is to let the increment depend on the quotient of the same division. An optimizing compiler should specify the division only once, so the calculation will be fast, and the results generally satisfactory.

In this method, the increment, once determined, remains constant. If hashsize is a prime, it follows that the probes will step through all the entries of the array before any repetitions. Hence overflow will not be indicated until the array is completely full.
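The quotient-based increment can be sketched as follows (Python, not the book's Pascal; `probe_sequence` is a hypothetical name). One detail is an assumption that the text does not address: if the quotient happens to be a multiple of hashsize the increment would be zero, so the sketch falls back to 1 in that case.

```python
def probe_sequence(key, hashsize):
    """Remainder gives the hash address; the quotient gives a constant increment."""
    quotient, remainder = divmod(key, hashsize)   # one division yields both values
    increment = quotient % hashsize or 1          # assumption: never step by zero
    return [(remainder + i * increment) % hashsize for i in range(hashsize)]


print(probe_sequence(45, 7))   # [3, 2, 1, 0, 6, 5, 4]
```

Because 7 is prime and the increment is nonzero, the sequence visits every location exactly once before repeating, which is the property claimed in the last paragraph above.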

Random Probing

A final method is to use a pseudorandom number generator to obtain the increment. The generator used should be one that always generates the same sequence provided it starts with the same seed. The seed, then, can be specified as some function of the key. This method is excellent in avoiding clustering, but is likely to be slower than the others.
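A minimal Python sketch of random probing, using the standard `random.Random` class as the generator. The function name `random_probe_sequence` and the choice to seed with the key itself are assumptions made for illustration; any fixed function of the key would serve.

```python
import random


def random_probe_sequence(key, start, hashsize, probes):
    """Probe positions whose increments come from a generator seeded by the key."""
    rng = random.Random(key)        # same key, same seed, same sequence
    pos = start % hashsize
    positions = [pos]
    for _ in range(probes):
        pos = (pos + rng.randrange(1, hashsize)) % hashsize
        positions.append(pos)
    return positions


first = random_probe_sequence(62538194, 372, 997, 5)
again = random_probe_sequence(62538194, 372, 997, 5)
print(first == again)               # True: retrieval can retrace the insertion path
```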

Pascal Algorithms

To conclude the discussion of open addressing, we continue to study the Pascal example already introduced, which used alphanumeric keys of the type

    type keytype = array [1..8] of char;

We set up the hash table with the declarations

    const hashsize = 997;
          hashmax = 996;                { hashsize - 1 }
    type  hashtable = array [0..hashmax] of item;
    var   H: hashtable;

The hash table must be initialized by defining a special key called blankword that consists of eight blanks and setting the key field of each item in H to blankword. We shall use the hash function already written in Section 6.5.2, together with quadratic probing for collision resolution. We have shown that the maximum number of probes that can be made this way is (hashsize + 1) div 2, and we keep a counter c to check this upper bound.

With these conventions, let us write a procedure to insert a record r, with key r.key, into the hash table H.

    procedure Insert(var H: hashtable; r: item);
    var c: integer;        { counter to be sure that the table is not full }
        i: integer;        { increment used for quadratic probing }
        p: integer;        { position currently probed in H }
    begin
      p := Hash(r.key);
      c := 0;
      i := 1;
      while (H[p].key <> blankword)       { Is the location empty? }
        and (H[p].key <> r.key)           { Has the target key been found? }
        and (c <= hashsize div 2) do      { Has overflow occurred? }
      begin
        c := c + 1;
        p := p + i;
        i := i + 2;                       { Prepare increment for the next iteration. }
        if p > hashmax then
          p := p mod hashsize
      end;
      if H[p].key = blankword then
        H[p] := r                         { Insert the new item. }
      else if H[p].key = r.key then
        Error                             { The same key cannot appear twice. }
      else
        Overflow                          { Counter has reached its limit. }
    end;

A procedure to retrieve the record (if any) with a given key will have a similar form and is left as an exercise.

BTEX0000269

Deletions

Up to now we have said nothing about deleting items from a hash table. At first glance, it may appear to be an easy task, requiring only marking the deleted location with the special key indicating that it is empty. This method will not work. The reason is that an empty location is used as the signal to stop the search for a target key. Suppose that, before the deletion, there had been a collision or two and that some item whose hash address is the now-deleted position is actually stored elsewhere in the table. If we now try to retrieve that item, then the now-empty position will stop the search, and it is impossible to find the item, even though it is still in the table.

One method to remedy this difficulty is to invent another special key, to be placed in any deleted position. This special key would indicate that this position is free to receive an insertion when desired but that it should not be used to terminate the search for some other item in the table. Using this second special key will, however, make the algorithms somewhat more complicated and a bit slower. With the methods we have so far studied for hash tables, deletions are indeed awkward and should be avoided as much as possible.
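The second special key can be illustrated with a small Python sketch over integer keys and linear probing. Everything here is hypothetical scaffolding (`EMPTY`, `DELETED`, `find`, `insert`, `delete`), and the sketch assumes that callers never insert a key that is already present.

```python
EMPTY = object()     # never-used slot: terminates a search
DELETED = object()   # second special marker: reusable, but does not stop a search


def find(table, key):
    """Return the index holding key, or -1 if the key is not in the table."""
    size = len(table)
    pos = key % size
    for _ in range(size):
        slot = table[pos]
        if slot is EMPTY:
            return -1                     # a truly empty slot ends the search
        if slot is not DELETED and slot == key:
            return pos
        pos = (pos + 1) % size            # skip over deleted and foreign entries
    return -1


def insert(table, key):
    """Store key in the first empty or deleted slot on its probe path."""
    size = len(table)
    pos = key % size
    for _ in range(size):
        if table[pos] is EMPTY or table[pos] is DELETED:
            table[pos] = key
            return pos
        pos = (pos + 1) % size
    raise OverflowError("hash table is full")


def delete(table, key):
    """Mark the slot as deleted rather than empty, so later searches continue."""
    pos = find(table, key)
    if pos >= 0:
        table[pos] = DELETED


table = [EMPTY] * 7
insert(table, 3)
insert(table, 10)        # 10 % 7 == 3, so it is stored at index 4
delete(table, 3)
print(find(table, 10))   # 4: the deleted marker did not stop the search
```

Had `delete` stored `EMPTY` instead, the search for 10 would have stopped at index 3 and failed, which is exactly the difficulty described above.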

6.5.4 Collision Resolution by Chaining

Up to now we have implicitly assumed that we are using only contiguous storage while working with hash tables. Contiguous storage for the hash table itself is, in fact, the natural choice, since we wish to be able to refer quickly to random positions in the table, and linked storage is not suited to random access. There is, however, no reason why linked storage should not be used for the records themselves. We can take the hash table itself as an array of pointers to the records, that is, as an array of list headers. An example appears in Figure 6.12.

It is traditional to refer to the linked lists from the hash table as chains and call this method collision resolution by chaining.

Advantages of Linked Storage

There are several advantages to this point of view. The first, and the most important when the records themselves are quite large, is that considerable space may be saved. Since the hash table is a contiguous array, enough space must be set aside at compilation time to avoid overflow. If the records themselves are in the hash table, then if there are many empty positions (as is desirable to help avoid the cost of collisions), these will consume considerable space that might be needed elsewhere. If, on the other hand, the hash table contains only pointers to the records, pointers that require only one word each, then the size of the hash table may be reduced by a large factor (essentially by a factor equal to the size of the records), and will become small relative to the space available for the records, or for other uses.

The second major advantage of keeping only pointers in the hash table is that it allows simple and efficient collision handling. We need only add a link field to each record, and organize all the records with a single hash address as a linked list. With a good hash function, few keys will give the same hash address, so the linked lists will be short and can be searched quickly. Clustering is no problem at all, because keys with distinct hash addresses always go to distinct lists.

[Figure 6.12. A chained hash table.]

A third advantage is that it is no longer necessary that the size of the hash table exceed the number of records. If there are more records than entries in the table, it means only that some of the lists are now sure to contain more than one record. Even if there are several times more records than the size of the table, the average length of the linked lists will remain small, and sequential search on the appropriate list will remain efficient.

Finally, deletion becomes a quick and easy task in a chained hash table. Deletion proceeds in exactly the same way as deletion from a simple linked list.

Disadvantage of Linked Storage

These advantages of chained hash tables are indeed powerful. Lest you believe that chaining is always superior to open addressing, however, let us point out one important disadvantage: All the links require space. If the records are large, then this space is negligible in comparison with that needed for the records themselves; but if the records are small, then it is not.

Suppose, for example, that the links take one word each and that the items themselves take only one word (which is the key alone). Such applications are quite common, where we use the hash table only to answer some yes-no question about the key. Suppose that we use chaining and make the hash table itself quite small, with the same number n of entries as the number of items. Then we shall use 3n words of storage altogether: n for the hash table, n for the keys, and n for the links to find the next node (if any) on each chain. Since the hash table will be nearly full, there will be many collisions, and some of the chains will have several items. Hence searching will be a bit slow. Suppose, on the other hand, that we use open addressing. The same 3n words of storage put entirely into the hash table will mean that it will be only one third full, and therefore there will be relatively few collisions and the search for any given item will be faster.

Pascal Algorithms

A chained hash table in Pascal takes declarations like

    type pointer = ^node;
         list = record
                  head: pointer
                end;
         hashtable = array [0..hashmax] of list;

The record type called node consists of an item, called info, and an additional field, called next, that points to the next node on a linked list.

The code needed to initialize the hash table is

    for i := 0 to hashmax do
      H[i].head := nil;

We can even use previously written procedures to access the hash table. The hash function itself is no different from that used with open addressing; for data retrieval we can simply use the procedure SequentialSearch (linked version) from Section 5.2, as follows:

    procedure Retrieve(var H: hashtable; target: keytype;
                       var found: Boolean; var location: pointer);
    { finds the node with key target in the hash table H and returns with
      location pointing to that node, provided that found becomes true }
    begin
      SequentialSearch(H[Hash(target)], target, found, location)
    end;

Our procedure for inserting a new entry will assume that the key does not appear already; otherwise, only the most recent insertion with a given key will be retrievable.

    procedure Insert(var H: hashtable; p: pointer);
    { inserts node p^ into the chained hash table H, assuming that no other
      node with key p^.info.key is in the table }
    var i: integer;                  { used for index in hash table }
    begin
      i := Hash(p^.info.key);        { Find the index. }
      p^.next := H[i].head;          { Insert node at the head of the list. }
      H[i].head := p                 { Set the head of the list to the new item. }
    end;

As you can see, both of these procedures are significantly simpler than are the versions for open addressing, since collision resolution is not a problem.
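For comparison, here is a compact Python sketch of a chained hash table with insertion at the head of the chain, retrieval by sequential search of one chain, and deletion as ordinary linked-list deletion. The class and method names are hypothetical, string keys are assumed, and the character-sum hash is reused only for illustration.

```python
class Node:
    def __init__(self, key, info, next_node=None):
        self.key, self.info, self.next = key, info, next_node


class ChainedHashTable:
    def __init__(self, size=997):
        self.heads = [None] * size            # array of list headers, all nil

    def _index(self, key):
        return sum(ord(ch) for ch in key) % len(self.heads)

    def insert(self, key, info):
        """Assumes key is not already present; new node goes at the chain head."""
        i = self._index(key)
        self.heads[i] = Node(key, info, self.heads[i])

    def retrieve(self, key):
        """Sequential search of the one chain that can contain key."""
        node = self.heads[self._index(key)]
        while node is not None and node.key != key:
            node = node.next
        return None if node is None else node.info

    def delete(self, key):
        """Unlink the node with this key; return True if one was removed."""
        i = self._index(key)
        prev, node = None, self.heads[i]
        while node is not None and node.key != key:
            prev, node = node, node.next
        if node is None:
            return False
        if prev is None:
            self.heads[i] = node.next
        else:
            prev.next = node.next
        return True


table = ChainedHashTable(5)
table.insert("ab", 1)
table.insert("ba", 2)            # same character sum, so same chain as "ab"
table.delete("ab")
print(table.retrieve("ba"))      # 2: deleting one node leaves the rest of the chain intact
```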

BTEX0000272

Exercises 6.5

E1. Write a Pascal procedure to insert an item into a hash table with open addressing and linear probing.

E2. Write a Pascal procedure to retrieve an item from a hash table with open addressing and (a) linear probing; (b) quadratic probing.

E3. Devise a simple, easy-to-calculate hash function for mapping three-letter words to integers between 0 and n - 1, inclusive. Find the values of your function on the words

    PAL  LAP  PAM  MAP  PAT  PET  SET  SAT  TAT  BAT

for n = 11, 13, 17, 19. Try for as few collisions as possible.

E4. Suppose that a hash table contains hashsize = 13 entries indexed from 0 through 12 and that the following keys are to be mapped into the table:

    10  100  32  45  58  126  3  29  200  400  0

a. Determine the hash addresses and find how many collisions occur when these keys are reduced mod hashsize.
b. Determine the hash addresses and find how many collisions occur when these keys are first folded by adding their digits together (in ordinary decimal representation) and then reducing mod hashsize.
c. Find a hash function that will produce no collisions for these keys. (A hash function that has no collisions for a fixed set of keys is called perfect.)
d. Repeat the previous parts of this exercise for hashsize = 11. (A hash function that produces no collision for a fixed set of keys that completely fill the hash table is called minimal perfect.)

E5. Another method for resolving collisions with open addressing is to keep a separate array called the overflow table, into which all items that collide with an occupied location are put. They can either be inserted with another hash function or simply inserted in order, with sequential search used for retrieval. Discuss the advantages and disadvantages of this method.

E6. Write an algorithm for deleting a node from a chained hash table.

E7. Write a deletion algorithm for a hash table with open addressing, using a second special key to indicate a deleted item (see Section 6.5.3). Change the retrieval and insertion algorithms accordingly.

E8. With linear probing, it is possible to delete an item without using a second special key, as follows. Mark the deleted entry empty. Search until another empty position is found. If the search finds a key whose hash address is at or before the first empty position, then move it back there, make its previous position empty, and continue from the new empty position. Write an algorithm to implement this method. Do the retrieval and insertion algorithms need modification?

BTEX0000273

Programming Project 6.5

P1. Consider the 35 Pascal reserved words listed in Appendix C.2.1. Consider these words as strings of nine characters, where words less than nine letters long are filled with blanks on the right.

a. Devise an integer-valued function that will produce different values when applied to all 35 reserved words. (You may find it helpful to write a short program to assist. Your program could read the words from a file, apply the function you devise, and determine what collisions occur.)
b. Find the smallest integer hashsize such that, when the values of your function are reduced mod hashsize, all 35 values remain distinct.
c. Modify your function as necessary until you can achieve hashsize in the preceding part to be 35. (You will then have discovered a minimal perfect hash function for the 35 Pascal reserved words.)

6.6 ANALYSIS OF HASHING

The Birthday Surprise

The likelihood of collisions in hashing relates to the well-known mathematical diversion: How many randomly chosen people need to be in a room before it becomes likely that two people will have the same birthday (month and day)? Since (apart from leap years) there are 365 possible birthdays, most people guess that the answer will be in the hundreds, but in fact the answer is only 23 people.

We can determine the probabilities for this question by answering its opposite: With $m$ randomly chosen people in a room, what is the probability that no two have the same birthday? Start with any person, and check his birthday off on a calendar. The probability that a second person has a different birthday is 364/365. Check it off. The probability that a third person has a different birthday is now 363/365. Continuing this way, we see that if the first $m - 1$ people have different birthdays, then the probability that person $m$ has a different birthday is $(365 - m + 1)/365$. Since the birthdays of different people are independent, the probabilities multiply, and we obtain that the probability that $m$ people all have different birthdays is

$$\frac{364}{365} \times \frac{363}{365} \times \frac{362}{365} \times \cdots \times \frac{365 - m + 1}{365}.$$

This expression becomes less than 0.5 whenever $m \ge 23$.

In regard to hashing, the birthday surprise tells us that with any problem of reasonable size, we are almost certain to have some collisions. Our approach, therefore, should not be only to try to minimize the number of collisions, but also to handle those that occur as expeditiously as possible.
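The product in the birthday argument is easy to evaluate numerically. The following sketch (the function names are illustrative, and a 365-day year with equally likely birthdays is assumed) multiplies the factors one by one and looks for the first group size at which the probability of all-different birthdays falls below one half; evaluated this way, the threshold is 23 people.

```python
def prob_all_distinct(m, days=365):
    """Probability that m randomly chosen people all have different birthdays."""
    p = 1.0
    for k in range(m):
        p *= (days - k) / days
    return p


def smallest_group(threshold=0.5, days=365):
    """Smallest m for which prob_all_distinct(m) drops below threshold."""
    m = 1
    while prob_all_distinct(m, days) >= threshold:
        m += 1
    return m


print(smallest_group())                  # 23
print(round(prob_all_distinct(23), 4))   # 0.4927
```

The same function applies to hashing if `days` is read as the number of table positions and `m` as the number of keys inserted.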

Counting Probes

As with other methods of information retrieval, we would like to know how many comparisons of keys occur on average during both successful and unsuccessful attempts to locate a given target key. We shall use the word probe for looking at one item and comparing its key with the target.


The number of probes we need clearly depends on how full the table is. Therefore (as for searching methods), we let $n$ be the number of items in the table, and we let $t$ (which is the same as hashsize) be the number of positions in the array. The load factor of the table is $\lambda = n/t$. Thus $\lambda = 0$ signifies an empty table; $\lambda = 0.5$ a table that is half full. For open addressing, $\lambda$ can never exceed 1, but for chaining there is no limit on the size of $\lambda$. We consider chaining and open addressing separately.

Analysis of Chaining

With a chained hash table we go directly to one of the linked lists before doing any probes. Suppose that the chain that will contain the target (if it is present) has $k$ items.

If the search is unsuccessful, then the target will be compared with all $k$ of the corresponding keys. Since the items are distributed uniformly over all $t$ lists (equal probability of appearing on any list), the expected number of items on the one being searched is $\lambda = n/t$. Hence the average number of probes for an unsuccessful search is $\lambda$.

Now suppose that the search is successful. From the analysis of sequential search, we know that the average number of comparisons is $\frac{1}{2}(k + 1)$, where $k$ is the length of the chain containing the target. But the expected length of this chain is no longer $\lambda$, since we know in advance that it must contain at least one node (the target). The $n - 1$ nodes other than the target are distributed uniformly over all $t$ chains; hence the expected number on the chain with the target is $1 + (n - 1)/t$. Except for tables of trivially small size, we may approximate $(n - 1)/t$ by $n/t = \lambda$. Hence the average number of probes for a successful search is very nearly

$$\tfrac{1}{2}(k + 1) \approx \tfrac{1}{2}(2 + \lambda) = 1 + \tfrac{1}{2}\lambda.$$
Analysis of Open Addressing

For our analysis of the number of probes done in open addressing, let us first ignore the problem of clustering, by assuming that not only are the first probes random, but after a collision, the next probe will be random over all remaining positions of the table. In fact, let us assume that the table is so large that all the probes can be regarded as independent events.

Let us first study an unsuccessful search. The probability that the first probe hits an occupied cell is $\lambda$, the load factor. The probability that a probe hits an empty cell is $1 - \lambda$. The probability that the unsuccessful search terminates in exactly two probes is therefore $\lambda(1 - \lambda)$, and, similarly, the probability that exactly $k$ probes are made in an unsuccessful search is $\lambda^{k-1}(1 - \lambda)$. The expected number $U(\lambda)$ of probes in an unsuccessful search is therefore

$$U(\lambda) = \sum_{k=1}^{\infty} k \lambda^{k-1} (1 - \lambda).$$

This sum is evaluated in the Appendix; we obtain thereby

$$U(\lambda) = \frac{1}{(1 - \lambda)^2}\,(1 - \lambda) = \frac{1}{1 - \lambda}.$$
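For readers without the appendix at hand, the sum for an unsuccessful search with random probes can be evaluated by differentiating a geometric series. The following is a sketch under the assumptions stated in the text (independent probes, load factor $0 \le \lambda < 1$), writing $U(\lambda)$ for the expected number of probes:

```latex
\begin{align*}
U(\lambda) &= \sum_{k=1}^{\infty} k\,\lambda^{k-1}(1-\lambda)
            = (1-\lambda)\,\frac{d}{d\lambda}\sum_{k=0}^{\infty}\lambda^{k} \\
           &= (1-\lambda)\,\frac{d}{d\lambda}\!\left(\frac{1}{1-\lambda}\right)
            = \frac{1-\lambda}{(1-\lambda)^{2}}
            = \frac{1}{1-\lambda}.
\end{align*}
```

As a check, $\lambda = 0.5$ gives $U = 2$ and $\lambda = 0.9$ gives $U = 10$, in agreement with the entries for random probes in Figure 6.13.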


To count the probes needed for a successful search, we note that the number needed will be exactly one more than the number of probes in the unsuccessful search made before inserting the item. Now let us consider the table as beginning empty, with each item inserted one at a time. As these items are inserted, the load factor grows slowly from 0 to its final value, $\lambda$. It is reasonable for us to approximate this step-by-step growth by continuous growth and replace a sum with an integral. We conclude that the average number of probes in a successful search is approximately

$$S(\lambda) = \frac{1}{\lambda} \int_0^{\lambda} U(\mu)\, d\mu = \frac{1}{\lambda} \ln \frac{1}{1 - \lambda}.$$

Similar calculations may be done for open addressing with linear probing, where it is no longer reasonable to assume that successive probes are independent. The details, however, are rather more complicated, so we present only the results. For the complete derivation, consult the references at the end of the chapter. For linear probing the average number of probes for an unsuccessful search increases to

$$\frac{1}{2}\left(1 + \frac{1}{(1 - \lambda)^2}\right)$$

and for a successful search the number becomes

$$\frac{1}{2}\left(1 + \frac{1}{1 - \lambda}\right).$$

Theoretical Comparisons

Figure 6.13 gives the values of the foregoing expressions for different values of the load factor.

Load factor             0.10    0.50    0.80    0.90    0.99    2.00

Successful search
Chaining                1.05    1.25    1.40    1.45    1.50    2.00
Open, random probes     1.05    1.4     2.0     2.6     4.6     --
Open, linear probes     1.06    1.5     3.0     5.5     50.5    --

Unsuccessful search
Chaining                0.10    0.50    0.80    0.90    0.99    2.00
Open, random probes     1.1     2.0     5.0     10.0    100     --
Open, linear probes     1.12    2.5     13      50      5000    --

Figure 6.13 Theoretical comparison of hashing methods
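The entries of Figure 6.13 can be recomputed from the expressions of this section. The sketch below assumes the formulas as read from the text: $1 + \lambda/2$ and $\lambda$ for chaining, $(1/\lambda)\ln(1/(1-\lambda))$ and $1/(1-\lambda)$ for open addressing with random probes, and $\frac{1}{2}(1 + 1/(1-\lambda))$ and $\frac{1}{2}(1 + 1/(1-\lambda)^2)$ for linear probing. The function names are illustrative only.

```python
import math


def chaining_successful(lam):
    return 1 + lam / 2


def chaining_unsuccessful(lam):
    return lam


def random_successful(lam):
    # Valid for 0 < lam < 1.
    return math.log(1 / (1 - lam)) / lam


def random_unsuccessful(lam):
    return 1 / (1 - lam)


def linear_successful(lam):
    return 0.5 * (1 + 1 / (1 - lam))


def linear_unsuccessful(lam):
    return 0.5 * (1 + 1 / (1 - lam) ** 2)


if __name__ == "__main__":
    for lam in (0.10, 0.50, 0.80, 0.90, 0.99):
        print(lam,
              round(chaining_successful(lam), 2),
              round(random_successful(lam), 2),
              round(linear_successful(lam), 2),
              round(chaining_unsuccessful(lam), 2),
              round(random_unsuccessful(lam), 2),
              round(linear_unsuccessful(lam), 2))
```

Only the chaining formulas are meaningful for load factors above 1, which is why the open-addressing rows of the figure have no entry in the last column.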

We can draw several conclusions from this table. First, it is clear that chaining consistently requires fewer probes than does open addressing. On the other hand, traversal of the linked lists is usually slower than array access, which can reduce the advantage, especially if key comparisons can be done quickly. Chaining comes into its own when the records are large, and comparison of keys takes significant time. Chaining is also especially advantageous when unsuccessful searches are common, since with chaining, an empty list or very short list may be found, so that often no key comparisons at all need be done to show that the search is unsuccessful.

With open addressing and successful searches, the simpler method of linear probing is not significantly slower than more sophisticated methods, at least until the table is almost completely full. For unsuccessful searches, however, clustering quickly causes linear probing to degenerate into a long sequential search. We might conclude, therefore, that if searches are quite likely to be successful, and the load factor is moderate, then linear probing is quite satisfactory, but in other circumstances another method should be used.

Empirical Comparisons

It is important to remember that the computations giving Figure 6.13 are only approximate, and also that in practice nothing is completely random, so that we can always expect some differences between the theoretical results and actual computations. For sake of comparison, therefore, Figure 6.14 gives the results of one empirical study, using 900 keys that are pseudorandom numbers between 0 and 1.

Load factor                 0.1     0.5     0.8     0.9     0.99    2.0

Successful search
Chaining                    1.04    1.2     1.4     1.4     1.5     2.0
Open, quadratic probes      1.04    1.5     2.1     2.7     5.2     --
Open, linear probes         1.05    1.6     3.4     6.2     21.3    --

Unsuccessful search
Chaining                    0.11    0.53    0.78    0.90    0.99    2.04
Open, quadratic probes      1.13    2.2     5.2     11.9    126     --
Open, linear probes         1.13    2.7     15.4    59.3    430     --

Figure 6.14 Empirical comparison of hashing methods

In comparison with other methods of information retrieval, the important thing to note about all these numbers is that they depend only on the load factor, not on the absolute number of items in the table. Retrieval from a hash table with 20,000 items in 40,000 possible positions is no slower, on average, than is retrieval from a table with 20 items in 40 possible positions. With sequential search, a list 1000 times the size will take 1000 times as long to search. With binary search, this ratio is reduced to 10 (more precisely, to lg 1000), but still the time needed increases with the size, which it does not with hashing.

Finally, we should emphasize the importance of devising a good hash function, one that executes quickly and maximizes the spread of keys. If the hash function is poor, the performance of hashing can degenerate to that of sequential search.


Exercises 6.6

E1. Suppose that each item (record) in a hash table occupies $s$ words of storage (exclusive of the pointer field needed if chaining is used), and suppose that there are $n$ items in the hash table.

a. If the load factor is $\lambda$ and open addressing is used, determine how many words of storage will be required for the hash table.

b. If chaining is used, then each node will require $s + 1$ words, including the pointer field. How many words will be used altogether for the $n$ nodes?

c. If the load factor is $\lambda$ and chaining is used, how many words will be used for the hash table itself? (Recall that with chaining the hash table itself contains only pointers requiring one word each.)

d. Add your answers to the two previous parts to find the total storage requirement for load factor $\lambda$ and chaining.

e. If $s$ is small, then open addressing requires less total memory for a given $\lambda$, but for large $s$, chaining requires less space altogether. Find the break-even value for $s$, at which both methods use the same total storage. Your answer will depend on the load factor $\lambda$.

E2. Figures 6.13 and 6.14 are somewhat distorted in favor of chaining, because no account is taken of the space needed for the links (see Section 6.5.4). Produce tables like Figure 6.13, where the load factors are calculated for the case of chaining, and for open addressing the space required by the links is added to the hash table, thereby reducing the load factor.

a. Given $n$ nodes in linked storage connected to a chained hash table, with $s$ words per item (plus 1 more for the link), and with load factor $\lambda$, find the total amount of storage that will be used, including links.

b. If this same amount of storage is used in a hash table with open addressing and $n$ items of $s$ words each, find the resulting load factor. This is the load factor to use for open addressing in computing the revised tables.

c. Produce a table for the case $s = $ …

d. Produce another table for the case $s = $ …

e. What will the table look like when each item takes 100 words?

E3. One reason why the answer to the birthday problem is surprising is that it differs from the answers to apparently related questions. For the following, suppose that there are $n$ people in the room, and disregard leap years.

a. What is the probability that someone in the room will have a birthday on a random date drawn from a hat?

b. What is the probability that at least two people in the room will have that same random birthday?

c. If we choose one person at random and find his birthday, what is the probability that someone else in the room will share that birthday?

E4. In a chained hash table, suppose that it makes sense to speak of an order for the keys, and suppose that the nodes in each chain are kept in order by key. Then a search can be terminated as soon as it passes the place where the key should be, if present. How many fewer probes will be done, on average, in an unsuccessful search? In a successful search? How many probes are needed, on average, to insert a new node in the right place? Compare your answers with the corresponding numbers derived in the text for the case of unordered chains.

E5. In our discussion of chaining, the hash table itself contained only pointers, list headers for each of the chains. One variant method is to place the first actual item of each chain in the hash table itself. (An empty position is indicated by an impossible key, as with open addressing.) With a given load factor, calculate the effect on space of this method, as a function of the number of words (except links) in each item. (A link takes one word.)

Programming Project 6.6

P1. Produce a table like Figure 6.14 for your computer, by writing and running test programs to implement the various kinds of hash tables and load factors.
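As a starting point for an experiment of this kind, the following sketch measures average probe counts for open addressing with linear probing only. It is a minimal illustration rather than a full solution: the function name, the use of `key mod table_size` as the hash function, the key ranges, and the fixed seed are all assumptions made for the example, and the load factor must be less than 1.

```python
import random


def linear_probing_averages(table_size, load_factor, seed=1):
    """Return (successful, unsuccessful) average probes for linear probing."""
    rng = random.Random(seed)
    n = int(table_size * load_factor)
    table = [None] * table_size

    # Insert n distinct pseudorandom keys.
    keys = rng.sample(range(10**9), n)
    for key in keys:
        i = key % table_size
        while table[i] is not None:
            i = (i + 1) % table_size
        table[i] = key

    # Successful searches: look up every key that was inserted.
    total = 0
    for key in keys:
        i = key % table_size
        probes = 1
        while table[i] != key:
            i = (i + 1) % table_size
            probes += 1
        total += probes
    successful = total / n

    # Unsuccessful searches: targets drawn from a range disjoint from the
    # inserted keys, counting the final probe that finds an empty cell.
    total = 0
    misses = rng.sample(range(10**9, 2 * 10**9), n)
    for key in misses:
        i = key % table_size
        probes = 1
        while table[i] is not None:
            i = (i + 1) % table_size
            probes += 1
        total += probes
    return successful, total / n


if __name__ == "__main__":
    for lam in (0.1, 0.5, 0.8, 0.9):
        print(lam, linear_probing_averages(10007, lam))
```

Measured values will differ somewhat from run to run and from the theoretical figures, as the text's remarks on empirical comparison lead one to expect; chaining and other probing methods would need their own variants of this routine.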

6.7 CONCLUSIONS: COMPARISON OF METHODS

This chapter and the previous one have together explored four quite different methods of information retrieval: sequential search, binary search, table lookup, and hashing. If we are to ask which of these is best, we must first select the criteria by which to answer, and these criteria will include both the requirements imposed by the application and other considerations that affect our choice of data structures, since the first two methods are applicable only to lists and the second two to tables. In many applications, however, we are free to choose either lists or tables for our data structures.

In regard both to speed and convenience, ordinary lookup in contiguous tables is certainly superior, but there are many applications to which it is inapplicable, such as when a list is preferred or the set of keys is sparse. It is also inappropriate whenever insertions or deletions are frequent, since such actions in contiguous storage may require moving large amounts of information.

Which of the other three methods is best depends on other criteria, such as the form of the data.

Sequential search is certainly the most flexible of our methods. The data may be stored in any order, with either contiguous or linked representation. Binary search is much more demanding. The keys must be in order, and the data must be in random-access representation (contiguous storage). Hashing requires even more, a peculiar ordering of the keys well suited to retrieval from the hash table, but generally useless for any other purpose. If the data are to be available immediately for human inspection, then some kind of order is essential, and a hash table is inappropriate.

Finally, there is the question of the unsuccessful search. Sequential search and hashing, by themselves, say nothing except that the search was unsuccessful. Binary search can determine which data have keys closest to the target, and perhaps thereby can provide useful information.
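The remark about near misses can be illustrated with a short sketch. It uses the standard `bisect` module to perform the binary search; the function name `nearest_keys` and the sample keys are illustrative assumptions, and the keys are assumed to be held in a sorted list.

```python
import bisect


def nearest_keys(sorted_keys, target):
    """Binary search that reports the neighbours of a missing target.

    Returns (target, target) if the target is present; otherwise the pair
    (largest key below target, smallest key above target), with None where
    no such key exists.
    """
    i = bisect.bisect_left(sorted_keys, target)
    if i < len(sorted_keys) and sorted_keys[i] == target:
        return (target, target)
    below = sorted_keys[i - 1] if i > 0 else None
    above = sorted_keys[i] if i < len(sorted_keys) else None
    return (below, above)


keys = [10, 20, 30, 40]
print(nearest_keys(keys, 25))   # (20, 30)
print(nearest_keys(keys, 45))   # (40, None)
```

A hash table offers no comparable shortcut: keys that are close in value are deliberately scattered, so finding the nearest key would require inspecting every position.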

BTEX0000279

13n

tok.s/aie ivkic
to

P11151

isP wtl

nlpIfl\

of WacIswl

I98i hook he
tilt

he

\adsvorth

Inc

Ileintont in

Caliktrtiia retrieval plIll/lo ilislic of

9-in

All

rights

reseFvetl

No
AOl

pan
loint

of or

ilto

nets

repntcluced electronic

stored

svsent
tpvll Iir

or

transerilseci

ill

AIIV

nteans
Written

mecltantcal
01
tIt_

ig

re

FCiltg 03

otltcnvise-

vule

tot

prior

permission

k.sUolc Inc

iOihIislliilg

.ompanv

Ioittetts

diltirnia

939it

division

\\atlsosirtli

Prittied

in

the

ititeti

States

ol

Aiticiict

ii

Library of Congrcss
SIttistla
Ii

Cataloging
tiate

in

Puhcation

Data

tat
strticttires

ala

welt

altstrict

clar

Ivise

tic

iilstiii

tititdcS tata data kkiIcI

ititlex structures Lottiiettffr


II

Conspuier science

sciOn \\ehte .N
\\

Ahstrict

ivtcs

Neil

\X

ide
llS.i

cAo.Q.it3Ss

of

S-i-UtO2S

ISBN

O534-0319-Q
.\ci/ did-

Spi

ins

lime

Iiiit its

.ltic/tctil

\tsdll.sittt .IJcc

idtt

nat

Assistants

/1

71/i/i

001/
/llii
ill

On /1

Alat-keting 11111111111

lepteseiltalive
Ftl
ill

tail/on sA aiitcla
.siinLtii

IF IF Ir

nii/t

Manuscript
Perntissii icr Art
intl

Filet latin

Ih-ec
u/lOu

ins

ago
/ouis/i

Intctiot iFs

Sin

//neb

ii

uitlitlAti

ReAct

hi
AuiuiuOC

.IOic/.ii/i/tIi/gi /Cisai .ctas-c //IolSt/Li/t

Interior

Illustrittioti ut-up/tic

mu

lw

/ltotu

/ultill-\ueeun

ivpcscri Ing
Iriniitg
Itst

/t/kSAUifli
/1 /i

Ins
So/LI

.-Otgi/c.s

li/i/ouuuui //ic/taunt

toiling

i-Si/na/lit

.0

c.tsiitjo/.ctai/i

Apple SEC
Iiill is

isaregmsterts_l -egisiet-etl rel.icietetI

it-adentark

uI

Apple

cuuniantei

Inc uutpot-tnn
Ntaclti

trademark trademark
irailcinark

OF
til

liio_il

Oquipnieiu

i_s

ititctiiiiiuttal

Ictsiriess Inc

tea

Tic

Itiseal

Nli

is

of

Digital

kcscancli

BTEX000028O

310

/to/fliT

Sets

We
set

have
Shiecit to

nit
cat tie

mci
ii

tided Th2

the

set

Opvrtt

tti

on/nit

tPZtPrcectinpi

and
Id

nieti c/i/fert-tcc in

iii ri

ir

Ci told they
itt

he included

tfso

how wou

he
the
sjieci

iauons
the
that

thiccuglit

ttve

mi

lililicil

cli

sc

key key

as

Otie

if

7.4 Hashed Implementations

We have studied several methods (arrays, linked lists, and trees) for implementing the storage and later retrieval of records. These structures provide for several kinds of operations. In each of these structures, the find operation is necessarily implemented by some form of search. The key values of the records are compared with the target key value until either a match is found or the data structure is exhausted. The pattern of probes is dependent on the structure used and on the relationships among the records. A list implemented as an array can be probed by binary search. The same list in linked form can only be searched sequentially.

We might ask if it is possible to create a data structure that does not require a search to implement the find operation. Is it possible, for example, to compute the memory address of a record, given its key value? Is there a function that maps each distinct key value into the memory address of the record identified by that key? We shall see that the answer is a qualified yes. Such functions can be found, but they are difficult to determine and can be constructed only if all keys in the data set are known in advance. They are called perfect hashing functions and are examined further in Section 7.4.3.

An approach that is not strictly calculated is a hybrid scheme in which a calculation is followed by some searching. The function does not necessarily give the exact memory address of the target record; it only gives a home address that may contain the desired record. Functions such as these are known as hashing functions. In contrast to perfect hashing functions, these are usually easy to determine and can give excellent performance. In case the home address does not contain the record being sought, a search of other addresses is required; this is known as rehashing.

In Section 7.4.1 we introduce a number of hashing functions, and in Section 7.4.2 we examine several rehashing strategies. In Section 7.5 we summarize the performance of hashed implementations, and in Section 7.6 we compare the operation and performance of hashing with those of lists and trees for the frequency analysis of digraphs.

The fundamental idea behind hashing is the antithesis of sorting. A sort arranges the records in a regular pattern that makes the relatively efficient binary search possible. Hashing takes the diametrically opposite approach. The basic idea is to scatter the records completely randomly throughout some memory or storage space, the so-called hash table. The hash function can be thought of as a pseudo-random-number generator that uses the value of the key as a seed and that outputs the home address of the element containing that key.

One of the drawbacks of hashing is the random locations of stored elements. There is no notion of first, next, root, parent, child, or anything analogous. Thus hashing is appropriate for implementing a relationship among elements that is a set, but it is not appropriate for implementing structures that involve relationships among constituent elements. It is for that reason that hashing is discussed in this chapter on sets. There are other contexts for an appropriate discussion of hashing.
One of the virtues of hashing is that it allows us to find records with O(1) probes. The find operation has required a number of probes that depends on the data structure and implementation discussed so far: O(n) for an array or linked list implementation, and O(log2 n) for a sorted list or binary search tree implementation. Since hashing requires the fewest probes to find something, it is frequently considered to be a particularly effective search technique. Also, since hashing stores elements in a table (the hash table), it is sometimes considered to be a technique for operating on tables. All of these views of hashing are correct. We choose to view hashing as a technique for implementing sets. Its other advantages and disadvantages are not changed by this point of view.

It is convenient to consider the hash table to be an array of records and to let the hash function calculate the index value of the home address rather than to calculate the memory address directly. Once the appropriate index value is computed, the array mapping function can complete the transformation into an actual memory address. The hash table is then represented as shown in Figure 7.12.

    const tablesize = {User supplied.}
    type  position = 0..tablesize;              {Not Standard Pascal.}
    var   table: array[position] of stdelement; {The hash table.}

Figure 7.12 Array representation of hash table

Suppose that we have a hash table defined by

    var table: array[0..6] of record
                                key: integer;
                                data: array[1..10] of char
                              end;

and that the hash function is

    H1(key) = key mod 7

Notice that the value produced by this function is always an integer between 0 and 6, which is within the range of indexes of the table.

We will begin with the empty table shown in Figure 7.13. If the first record we store has key value 374, then the hash function

    H1(374) = 374 mod 7 = 3

places the record in table[3]. This is shown in Figure 7.14. If the next record has key value 1091, we get

    H1(1091) = 1091 mod 7 = 6

and the table becomes that shown in Figure 7.15. A third record with key 911 gives

    H1(911) = 911 mod 7 = 1

and the resulting table is shown in Figure 7.16.

[Figure 7.13: table addresses 0 through 6, all contents empty.]
[Figure 7.14: address 3 holds key 374 and its data; all other addresses empty.]
[Figure 7.15: address 3 holds key 374 and address 6 holds key 1091; all other addresses empty.]
[Figure 7.16: address 1 holds key 911, address 3 holds key 374, and address 6 holds key 1091; all other addresses empty.]

Retrieval of any of the records already in the table is a simple matter. The target key is presented to the hash function, which reproduces the same table position it did when the record was stored. If a target key were not in the table (740, for example) the hashing function would produce

    H1(740) = 740 mod 7 = 5

Interrogating table[5], we find that it is empty, and we conclude that a record with key 740 is not stored in the table.
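The worked example can be traced in a few lines of code. This sketch assumes a table of seven positions indexed 0 to 6 and the hash function key mod 7; the names `h1`, `store`, and `retrieve`, and the placeholder data strings, are illustrative and do not come from the text. It deliberately makes no attempt to resolve collisions.

```python
TABLE_SIZE = 7
table = [None] * TABLE_SIZE          # None marks an empty position


def h1(key):
    """Home address of a key: the remainder on division by the table size."""
    return key % TABLE_SIZE


def store(key, data):
    """Place a record at its home address; refuse if the position is taken."""
    pos = h1(key)
    if table[pos] is not None:
        raise ValueError("collision at position %d" % pos)
    table[pos] = (key, data)
    return pos


def retrieve(key):
    """Return the data stored under key, or None if the key is absent."""
    entry = table[h1(key)]
    if entry is not None and entry[0] == key:
        return entry[1]
    return None


for key in (374, 1091, 911):
    print(key, "->", store(key, "data for %d" % key))   # 3, then 6, then 1

print(retrieve(374))   # data for 374
print(retrieve(740))   # None: position 5 is empty
```

Attempting to store a second key with the same remainder raises an error here, which is exactly the situation that collision-resolution strategies are designed to handle.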

The example we have just seen was constructed to conceal a serious problem with hashing. The keys were chosen carefully so that their hashed values were all different. In the general case, different keys may hash to the same value. Suppose that insertion of a record with key 227 is attempted.

    H1(227) = 227 mod 7 = 3

but table[3] is already filled with another record. This is called a collision: two different key values hashing to the same location.

Collisions are important because they are a fact of life when hashing. Suppose that employee records are hashed based on Social Security number. If a firm has 500 employees, it will not want to reserve a hash table with one billion entries (the number of possible Social Security numbers) to guarantee that each employee's record hashes to a unique location. Even if the firm allocates 1000 slots in its hash table and uses a hash function that is a "perfect" randomizer, the probability that there will be no collisions is essentially zero. This is the birthday paradox (Feller 1950), which says that hash functions with no collisions are so rare that it is worth looking for them only in very special circumstances. These special circumstances are discussed in Section 7.4.3.

In the meantime, we need to consider what to do when a collision does occur. With careful design, strategies for handling collisions are simple. They are called rehashing or collision-resolution strategies, and we will discuss them in Section 7.4.2.


We selected the hashing function

    H1(key) = key mod 7

in the example we just completed. We will now see why that was a reasonable thing to do and will also look at a number of other hashing functions.

7.4.1 Hashing Functions

There is a large and diverse group of hashing functions that have been proposed since the advent of the hashing technique. Some are simple and straightforward; others are complex and exotic. Almost all are computationally simple, since the speed of computation of such functions is an important factor in their use. Lum (1971) is a good review of many, including some of the more exotic ones. We will confine our attention to simple but effective methods.

Good hashing functions have two desirable properties:

1. They compute rapidly.
2. They produce a nearly random distribution of index values.

We will now consider several hashing functions.

Digit selection

The first hashing function that we will discuss is digit selection. Suppose that the keys of the set of data that we are dealing with are strings of digits, such as Social Security numbers (nine-digit keys):

    key = d1 d2 d3 d4 d5 d6 d7 d8 d9

If the population comprising the data is randomly chosen, then the choice of the last three digits, d7 d8 d9, will give a good random distribution of values. A possible implementation is the following:

    var table: array[0..999] of person;

where person is the record type for the key and information that we wish to keep. Notice that the hashing function in this case is simply

    key mod 1000

which strips off the last three digits of the key.

Care must be taken in deciding which digits to select. If the population with which we are dealing is students at a university, for example, the last three digits, d7 d8 d9, are probably a good choice, whereas the first three digits, d1 d2 d3, are probably not. State universities tend to draw their student bodies from a single state or geographical region. The first three digits of the Social Security number are based on the geographical region in which the number was originally issued. Most students from California, for example, have a first digit of 5, with the second and third digits indicating various subregions of the state; 567, for example, is very common. If the data were for a California university, almost all of the students' records would map into the 500 to 599 range of the table, and a large subgroup would map into position 567. The function would not be uniform and random over the table, but would be loaded on certain positions, causing an inordinately high number of collisions. It would for that reason not be a good hashing function.

However, if the keys are known in advance, it is possible to analyze the distribution of values taken by each digit participating in the key. Such an analysis is called digit analysis.

six
tf

Instead

ii

elu
eie

tsitig

three

we would choose
most uniform
fcm

three

the
tttcl

key

wlti

digit

attalvses

showed ins

that distrihctthin
If

tlte

keys

if
tlti

gave

lie

hit test

clistribcttit

hashing
to

nctioti

might
in

strip
tile

out

kev
The ket
is St

ise

digits

from

key

and

put

them

together

form

number

range

999
fit

rf1d/ri

fsf//C44
advised
thee nat
it

tIc

tactthtti

is

sitice

although

the

digits

are

apparently

random and
For

iit

list

tinift

trio

in

value
ti

might have
ins of
is

dependencies and
mu

amotig
tend the to

thetnselves
tccct

exam
The
if

ple

certai

et

tmhi

ight

tgether
position

Then
rttitpped

rcsttlt

were
to
tltc-

alwtvs range
if

wlteti

rI38

would he
loweritig
intercligit

only

select table

hat

ott

ut

the

J3ttd39
ci

effectivelv fir

the
ctitrelati

table tns

size

and

itlereasing tleccssarv

example
Ii is

t-.j

cltattces ht-ing

tlhsion
itt

Antlvsis to light

might he

intl rigl

to

such

ing

the

tt

situtti

cotiies

only
tlttcst
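The effect of choosing the wrong digits can be sketched in a few lines. The sketch below is in Python rather than the text's Pascal, and the helper names (`last_three`, `select_digits`) and the four sample keys are hypothetical, chosen only to illustrate keys that share a regional prefix.

```python
def last_three(key):
    """The text's digit-selection function H(key) = key mod 1000."""
    return key % 1000


def select_digits(key, positions, width=9):
    """Keep the digits at the given 1-based positions of a zero-padded key."""
    digits = str(key).zfill(width)  # pad so position 1 is the leading digit
    return int("".join(digits[p - 1] for p in positions))


# Hypothetical nine-digit keys that all begin with the same regional prefix.
keys = [567123456, 567234901, 567998877, 567000314]

print([last_three(k) for k in keys])                # addresses are spread out
print([select_digits(k, (1, 2, 3)) for k in keys])  # every key collides at 567
```

Selecting positions 7, 8, and 9 with `select_digits` gives the same addresses as `last_three`; selecting positions 1, 2, and 3 sends all four sample keys to one table position.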

Division

One of the most effective hashing functions is division, which works as follows:

    H(key) = key mod m

The bit pattern of the key, regardless of its data type, is treated as an integer and divided by an integer m. The remainder of the division is used as the table address, which lies in the range 0..m - 1. Such a function is fast on computer systems that have an integer divide, since most generate the quotient in one hardware register and the remainder in another. The content of the remainder register need only be copied into the variable h. In practice, functions of this type give very good results.

The division hashing function can, however, perform poorly in a number of cases. For example, if m were 25, then all keys divisible by 5 would map into positions 0, 5, 10, 15, and 20 of the table. In general, if the key and m have a common factor, a subset of the keys maps into a subset of the table. Of course, using any hashing function introduces a bias, in that all keys that map into table[0] form one subset, all keys that map into table[1] form another, etc. That is unavoidable. What we do not want to do is to introduce further bias.

The problem underlying the choice of 25 as the table size is that it has a factor of 5. All keys with 5 as a factor will map into a position that also has that factor. The remedy is to make sure that key and m have no common factors, and the easiest way to ensure that is to choose m so that it has no factors other than 1 and itself, that is, a prime number. For this reason, every time that the division function is used the table size will be a prime number.
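The bias described above is easy to observe. The following Python sketch (the function names are illustrative, not from the text) hashes the multiples of 5 below 1000 with m = 25 and then with the prime m = 23, and lists which table addresses are ever used.

```python
def division_hash(key, m):
    """Division hashing: H(key) = key mod m."""
    return key % m


def addresses_used(keys, m):
    """Sorted list of the distinct table addresses that the keys reach."""
    return sorted({division_hash(k, m) for k in keys})


multiples_of_5 = range(0, 1000, 5)

print(addresses_used(multiples_of_5, 25))       # only 0, 5, 10, 15, 20
print(len(addresses_used(multiples_of_5, 23)))  # all 23 addresses are reached
```

With m = 25 the keys reach only five of the 25 positions; with a prime table size the same keys spread over the whole table.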

Multiplication

A simple method that is based on multiplication is sometimes used. Suppose that the keys in question are five digits in length:

    key = d1 d2 d3 d4 d5

The key is squared, and the hash function is then completed by doing digit selection on the product. For example, if key = 54321:

         54321
       x 54321
    ----------
         54321
       108642
      162963
     217284
    271605
    ----------
    2950771041

In this example the middle digits of the product are chosen. It is important to choose the middle digits. Consider, for example, choosing the rightmost two digits of the product. That value comes only from the rightmost two digits of the original key: all keys ending in 21 will produce the same location, which is the kind of bias that we want to avoid. The middle digits of the product are formed from products involving all of the digits of the key. Changing any digit in the key is likely to change the hash result.
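A minimal Python sketch of the squaring method follows. Which middle digits to keep is an assumption here: the hypothetical `mid_square` pads the product to ten digits and keeps the three digits starting at the fourth.

```python
def mid_square(key, n_digits=3, width=10):
    """Square the key and select n_digits from the middle of the product."""
    product = str(key * key).zfill(width)  # 54321 * 54321 -> "2950771041"
    start = (width - n_digits) // 2        # assumed choice of "middle"
    return int(product[start:start + n_digits])


print(mid_square(54321))  # middle digits "077" of 2950771041
print(mid_square(12321))  # a different key gives a different address

# The rightmost two digits of the square depend only on the last two
# digits of the key, so every key ending in 21 gives the same value.
print((54321 ** 2) % 100, (12321 ** 2) % 100)
```

Both keys end in 21, so the low-order digits of their squares agree, while the middle digits differ.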

Folding

The next hash function we will discuss is folding. Suppose that we have a five-digit key, as we had in the multiplication method, and that the programs are running on a simple microcomputer system that has no hardware multiply or divide but does have an arithmetic add. One hash function is simply to add the individual digits of the key:

    H(key) = d1 + d2 + d3 + d4 + d5

The result would be in the range 0..45 and could be used as the index in the hash table. If a larger table were needed because there were more records, the result could be enlarged by adding the numbers formed by pairs of digits:

    H(key) = 0d1 + d2d3 + d4d5

The result would then be in the range 0..207.

Folding is the name given to hashing methods that involve combining portions of the key to form a smaller result. The methods of combining are usually either arithmetic addition or exclusive ors. Folding is often used in conjunction with other methods. If the keys were Social Security numbers and the program were implemented on a minicomputer that has 16-bit registers, and consequently a limited maximum integer size, the key would be intractable as it stands. It must be reduced to an integer smaller than that maximum before it can be used. Folding can be used to do this. Suppose that the key in question has the value 987654321. We can break the key into four-digit groups and then add them:

    fold(key) = 0009 + 8765 + 4321 = 13095

The result would be between 0 and 20007. Now apply a second hashing function, say division, to produce an address within the range of the hash table. The composite function is

    H(key) = fold(key) mod m
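The three folding variants above differ only in the group width, so one small routine covers them. This Python sketch uses a hypothetical helper `fold_digits`; the table size 997 in the last line is an arbitrary prime chosen for illustration.

```python
def fold_digits(key, group, width):
    """Split the zero-padded decimal key into groups of `group` digits,
    working from the right, and add the groups together."""
    digits = str(key).zfill(width)
    total = 0
    end = len(digits)
    while end > 0:
        start = max(0, end - group)
        total += int(digits[start:end])
        end = start
    return total


print(fold_digits(54321, 1, 5))      # 5 + 4 + 3 + 2 + 1
print(fold_digits(54321, 2, 5))      # 05 + 43 + 21
print(fold_digits(987654321, 4, 9))  # 0009 + 8765 + 4321

# Composite function: fold first, then apply division hashing.
print(fold_digits(987654321, 4, 9) % 997)
```

The largest possible results (45, 207, and 20007) come from keys made entirely of 9s, which is where the ranges quoted in the text come from.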

Character-valued keys

All of the examples in our discussion of hashing functions assumed that the keys were integers. Often, however, the keys are character strings. Remember that all data are handled in computer memory as strings of bits. A character, for example, is simply the bit string of its ASCII code, and the Pascal function ord interprets that bit string as an integer. This provides one basis for using characters in hashing functions.

In the case in which the key is a single character, ord(key) can be hashed directly. If the key is a two-character string such as 'jy', the bit pattern for the string would be 11010101111001 (binary). The corresponding integer is

    ord('j') * 128 + ord('y') = 13689

The multiplication by 128 effectively shifts the bit pattern for 'j' to the left 7 bits. The addition effectively concatenates the two bit strings. For the three-character string 'djy' we get

    ord('d') * 16384 + ord('j') * 128 + ord('y') = 1652089

where the multiplication by 16384 provides a left shift of 14 bits. Notice that the result is beyond the capacity of the 16-bit register size available on most mini- and microcomputer systems.

Algorithm 7.1 folds a 21-character string in groups of 3 characters.

    type string21 = array [1..21] of char;

    function fold(s: string21): integer;
    {Fold a character string of 21 characters into an integer.
     At least 24 bits are required for the result.}
    var i: 1..22;
        sum: integer;       {Running total; avoids reading the function name}
    begin
      sum := 0;
      i := 1;
      repeat
        sum := sum + ord(s[i]) * 16384 + ord(s[i + 1]) * 128 + ord(s[i + 2]);
        i := i + 3
      until i > 21;
      fold := sum
    end;

    Algorithm 7.1 Folding a character string.

Algorithm 7.1 could be written more generally, but doing so would obscure the simple process. Division hashing can be applied to the result of the function fold.
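The arithmetic in the two examples and in Algorithm 7.1 can be checked with a short Python sketch. The names `pack` and `fold_string` are hypothetical. Unlike the Pascal listing, this version accepts strings of any length, and a final group shorter than three characters is packed as it stands, which is an assumption rather than something the text specifies.

```python
def pack(chars):
    """Concatenate the 7-bit ASCII codes of the characters into one integer."""
    value = 0
    for ch in chars:
        value = value * 128 + ord(ch)  # shift left 7 bits, then append the code
    return value


def fold_string(s, group=3):
    """Pack each group of characters and add the groups together."""
    return sum(pack(s[i:i + group]) for i in range(0, len(s), group))


print(pack("jy"), bin(pack("jy")))  # the two-character example
print(pack("djy"))                  # the three-character example
print(fold_string("djydjy"))        # two groups of three, added
print(fold_string("djydjy") % 997)  # division hashing applied to the fold
```

Seven groups of three characters each contribute less than 2^21, so the total for a 21-character string fits in 24 bits, as the comment in Algorithm 7.1 states.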

7.4.2 Collision-Resolution Strategies

A collision-resolution strategy, or rehashing, determines what happens when two or more elements have a collision, or hash to the same address. We begin by defining some parameters that will be used to help describe these strategies. The first is the number of different values that a key can assume. A nine-digit integer, for example a Social Security number, can assume 1000000000 different values.

BTEX0000288


The size of the hash table, tablesize, is a second important parameter. It must be large enough to hold the number of elements that we wish to store. The number of records actually stored in the table varies with time and is a third parameter. One of the most important parameters is the fraction of the table that contains records at any time. This is called the load factor:

    load factor = (number of records stored) / tablesize

In summary, our data elements and keys are chosen from a fixed number of different values, and the elements are stored in a hash table that is of size tablesize and is (load factor x 100)% full.

A more general form of hash table is obtained by allowing each hash table position to hold more than a single record. Each of these multirecord cells is called a bucket and can hold bucketsize records. A representation of such a hash table is shown in Figure 7.18, in which the table is an array of buckets.

    const bucketsize = {User supplied};
          tablesize  = {User supplied};
    type  bucket = array [1..bucketsize] of stdelement;

    Figure 7.18 Hash table of buckets.

The concept of hash tables as collections of buckets is important for tables that are stored on direct access devices such as magnetic disks. For those devices each bucket can be tied to a physical cell of the device, such as a track or sector. The hashing function produces a bucket number that results in the transfer of the physically related block into random access memory (RAM). Once there, the bucket can be searched or modified at high speed. Buckets of size greater than one are of limited use for hash tables stored in RAM. They will tend to slow the average access time when searching the table. In this chapter we discuss only hash tables with buckets of size one.
The strategies for resolving collisions will be grouped into three approaches. The first approach, open address methods, attempts to place a second and subsequent keys that hash to one table location into some other position in the table that is unoccupied (open). The second approach, external chaining, has a linked list associated with each hash table address. Each element is added to the linked list at its home address. The third approach uses pointers to link together different buckets of the hash table. We will discuss coalesced chaining, since it is one of the better strategies that uses this technique.

Open address methods

For all of the open address methods and their algorithms we will use the hash table represented in Figure 7.19. There are several open address methods, using varying degrees of sophistication and a variety of techniques. All seek to find an open table position after a collision. Let us return to Figure 7.16, which is repeated for reference as Figure 7.19, and attempt to add an element whose key value is 227. Recall that the hashing function applied in that example gives

    H(227) = 227 mod 7 = 3

so that 227 collides with 374.

    Table address    Table contents
    [0]              empty
    [1]              911 ...data...
    [2]              empty
    [3]              374 ...data...
    [4]              empty
    [5]              empty
    [6]              1091 ...data...

    Figure 7.19 Three records stored at table[1], table[3], and table[6].

BTEX000029O

Linear rehashing

A simple resolution to the collision, called linear rehashing, is to start a sequential search through the hash table at the address at which the collision occurred. The search continues until an open position is found or the table is exhausted. In our example, a probe at position 4 reveals an open address, and the new record is stored there. The result is shown in Figure 7.20. A request to find the record with key 227 generates the same search path that was used to store it.

    Table address    Table contents
    [0]              empty
    [1]              911
    [2]              empty
    [3]              374
    [4]              227
    [5]              empty
    [6]              1091

    Figure 7.20 Linear rehashing.

We are now in a position to implement the operations specified in Section 7.2. The first operation, findkey, is implemented by Algorithms 7.2 and 7.3.

    function findkey(tkey: keytype): boolean;
    var h: position;
    begin
      h := hash(tkey);                       {Apply hash function H}
      if (table[h].key <> tkey) and (table[h].key <> empty)
        then linearrehash(tkey, h);
      if table[h].key = tkey
        then findkey := true
        else findkey := false
    end;

    Algorithm 7.2 Implementation of operation findkey using the hash function.

    procedure linearrehash(tkey: keytype; var h: position);
    var start: position;
    begin
      start := h;
      repeat
        h := (h + 1) mod tablesize
      until (table[h].key = tkey)      {tkey found}
         or (table[h].key = empty)     {Open location found}
         or (h = start)                {Entire table searched}
    end;

    Algorithm 7.3 Linear rehashing.

To insert an element, we search the table, beginning at the home address, until an empty address is found or until the table is exhausted. For example, inserting an element whose key is 421 into Figure 7.20 leads to Figure 7.21. We have added a column to our illustration of hash tables: the number of probes required to find each element stored therein. In the case of linear rehashing this number is easy to determine from an element's home address.

    Table address    Table contents    Probes
    [0]              empty
    [1]              911               1
    [2]              421               2
    [3]              374               1
    [4]              227               2
    [5]              empty
    [6]              1091              1

    Figure 7.21 Hash table and the number of probes required to find an element in the table.

The insert operation is implemented as shown in Algorithm 7.4. We will assume the use of two user-supplied values for the key of an element: empty and deleted. The use of the empty value is obvious. Let us see why we need the deleted value.

BTEX000029I

320

Chapter

Sets

    procedure insert(e: stdelement);
    var h: position;
    begin                          {Insert an element using linear rehashing}
      h := hash(e.key);            {Apply hash function H}
      while (table[h].key <> empty) and (table[h].key <> deleted) do
        h := (h + 1) mod tablesize;    {Assumes the table is not full}
      table[h] := e
    end;

    Algorithm 7.4 Implementation of operation insert using linear rehashing.

Figure 7.22 shows the result of adding 624, whose home address is 1, to the hash table in Figure 7.21. The probes needed to find an empty space using linear rehashing are also shown. Any subsequent search for 624 will retrace the same path. If any of the three elements 421, 374, or 227 were deleted and replaced by the value empty, subsequent searches for 624 would not work. Upon encountering the location marked empty, the search would terminate unsuccessfully. A solution to this problem is to mark positions from which elements have been deleted with a special value, deleted. The deletion operation can then be implemented as shown in Algorithm 7.5.

    Table address    Table contents    Probes
    [0]              empty
    [1]              911               1
    [2]              421               2
    [3]              374               1
    [4]              227               2
    [5]              624               5
    [6]              1091              1

    Figure 7.22 The probe sequence when searching for 624, or any other value whose home address is 1.

    procedure delete(tkey: keytype);
    var h: position;
    begin                          {Delete an element from the hash table}
      h := hash(tkey);             {Apply hash function H}
      if (table[h].key <> tkey) and (table[h].key <> empty)
        then linearrehash(tkey, h);
      if table[h].key = tkey
        then table[h].key := deleted
    end;

    Algorithm 7.5 Implementation of operation delete using the hash function.
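The interplay of insert, find, and delete under linear rehashing can be tried out with the running example. The Python sketch below is not the text's Pascal; the class name `LinearTable` and the sentinel values are hypothetical, and `find` returns a probe count rather than a boolean so that the Probes column of the figures can be reproduced.

```python
EMPTY = None            # hypothetical sentinel for a never-used cell
DELETED = "<deleted>"   # hypothetical sentinel marking a removed element


class LinearTable:
    def __init__(self, size=7):
        self.size = size
        self.keys = [EMPTY] * size

    def insert(self, key):
        """Store key at the first empty or deleted cell from its home address."""
        h = key % self.size
        for _ in range(self.size):
            if self.keys[h] is EMPTY or self.keys[h] == DELETED:
                self.keys[h] = key
                return h
            h = (h + 1) % self.size
        raise OverflowError("table is full")

    def find(self, key):
        """Return the number of probes used to find key, or 0 if absent."""
        h = key % self.size
        for probes in range(1, self.size + 1):
            if self.keys[h] is EMPTY:   # a truly empty cell ends the search
                return 0
            if self.keys[h] == key:
                return probes
            h = (h + 1) % self.size     # deleted cells are stepped over
        return 0

    def delete(self, key):
        """Mark the cell holding key as deleted; report whether it was found."""
        h = key % self.size
        for _ in range(self.size):
            if self.keys[h] is EMPTY:
                return False
            if self.keys[h] == key:
                self.keys[h] = DELETED
                return True
            h = (h + 1) % self.size
        return False


t = LinearTable()
for k in (911, 374, 1091, 227, 421, 624):
    t.insert(k)
print(t.keys)       # the layout of Figure 7.22
print(t.find(624))  # probes needed to find 624
t.delete(374)
print(t.find(624))  # still found, because cell 3 is marked deleted, not empty
```

If `delete` stored `EMPTY` instead of `DELETED`, the final search for 624 would stop at position 3 and fail, which is the problem the deleted value is meant to solve.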

The drawback to the use of the value deleted is that it can clutter up the hash table, thereby increasing the number of probes required to find an element. A partial solution is to periodically reenter all legitimate elements and mark the remaining locations empty.

The performance of the combined hashing/rehashing strategy is measured by the number of probes made in searching for a target key. We will examine performance in more detail in Section 7.5, but we can get a feel for the fact that linear rehashing may not perform very well by looking at the probe sequence that results when a search of Figure 7.22 is undertaken for the key value 624. Since 624 mod 7 is 1, the search begins at position 1 in the table. The subsequent probes are shown in the figure. Five probes are required to find 624. There are two problems underlying the linear rehashing method.

BTEX0000292

Problem 1. Any key that hashes to a given position will follow the same rehashing pattern as all other keys that hash to that position. Any key that hashes to position 1, for example, will follow the probe sequence shown in Figure 7.22. This guarantees that any key that hashes to a position will have to collide with all of the keys that previously hashed to that position before it is found or before an empty position is found. We will call this phenomenon primary clustering.

Problem 2. Note in Figure 7.22 that the probe pattern for a rehash from position 1 merged with the probe pattern for a rehash from position 3. The two rehash patterns have merged together, a phenomenon called secondary clustering.

    Table address    Table contents    Probes
    [0]              empty
    [1]              911               1
    [2]              421               2
    [3]              374               1
    [4]              227               2
    [5]              empty
    [6]              1091              1

    Figure 7.23

Consider Figure 7.23, which is a copy of Figure 7.21. There is a substantial difference in the probabilities of positions 0 and 5 receiving the next new key. Only new keys hashing into positions 6 and 0 will be stored at position 0. Keys hashing into any other position will rehash, if necessary, and eventually arrive at position 5. The expected number of probes for any random key not yet in the table can be calculated as shown in Figure 7.24.

    Original hash position    Number of probes    Empty position found at
    0                         1                   0
    1                         5                   5
    2                         4                   5
    3                         3                   5
    4                         2                   5
    5                         1                   5
    6                         2                   0
    Total                     18

    Expected number of probes = 18/7 = 2.57

    Figure 7.24 Expected number of probes for an unsuccessful search in the hash table shown in Figure 7.23.
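The count in Figure 7.24 can be reproduced mechanically. In this Python sketch the function name `unsuccessful_probes` is hypothetical, `None` stands for an empty cell, and the table is assumed to contain at least one empty cell so that every search terminates.

```python
def unsuccessful_probes(table, home):
    """Probes made by linear rehashing, starting at `home`, until an empty
    cell is reached. Assumes the table has at least one empty cell."""
    probes = 1
    h = home
    while table[h] is not None:
        h = (h + 1) % len(table)
        probes += 1
    return probes


# The table of Figure 7.23; None marks an empty position.
table = [None, 911, 421, 374, 227, None, 1091]

counts = [unsuccessful_probes(table, h) for h in range(len(table))]
print(counts)                                # one entry per home position
print(sum(counts), sum(counts) / len(table)) # total and expected value
```

Each home position is taken to be equally likely for a random new key, which is why the expected value is the plain average of the seven counts.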

The expected number of probes for both successful searches (target key in the table) and unsuccessful searches (target key not in the table) will be our measures of performance of rehashing strategies. We will examine them in a more general way in Section 7.5. We confine our attention here simply to noting that the performance can be improved by eliminating the problems we noted: primary and secondary clustering.

You may be tempted to resolve the difficulties by introducing a step size other than 1 for the linear rehash. Stepping to a new table position in Algorithm 7.3 would become

    h := (h + c) mod tablesize

where c and tablesize are relatively prime. If tablesize is prime, or at least if c and tablesize have no common factors, then the search pattern will cover the entire table, probing at each position exactly once without repetition.

This kind of coverage, nonrepetitious complete coverage, is highly desirable. Obviously, if a position that was previously probed were probed again during the same rehashing sequence, the duplicate probe would be wasted and performance would suffer. If the probe pattern did not cover the entire table, empty spaces that are not included in the pattern would not be discovered. Although a value of c that is relatively prime to the table size does give a rehash technique that has these properties of nonrepetition and complete coverage, the fact is that it does not solve, or even improve, the problems of primary and secondary clustering. An approach that does solve one of these problems is described next.
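The coverage claim can be verified by brute force. In the Python sketch below, `probe_sequence` and `covers_table` are hypothetical helper names; the check compares full coverage against the condition that c and tablesize share no common factor.

```python
from math import gcd


def probe_sequence(home, c, tablesize):
    """The first `tablesize` positions probed with a constant step size c."""
    return [(home + i * c) % tablesize for i in range(tablesize)]


def covers_table(c, tablesize):
    """True if the probe sequence visits every position exactly once."""
    return len(set(probe_sequence(0, c, tablesize))) == tablesize


print(probe_sequence(1, 3, 7))  # prime table size: every position is visited
print(probe_sequence(1, 2, 8))  # common factor 2: half the table is never seen
```

For a table of size 8 and step 2, the sequence revisits positions after four probes, which is exactly the wasted, incomplete coverage the text warns against.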

Quadratic rehashing

One method of improving the performance of rehashing is to probe at

    (H(key) + i^2) mod tablesize

where H(key) is the home address of the key that collided and i takes on the values 1, 2, 3, ... until either the target key or an empty position is found, or until the table is completely searched. This method is called quadratic rehashing. It is better than linear rehashing because it solves the problem of secondary clustering. It does not solve the problem of primary clustering. Details of this method are given in Radke (1970), where it is shown that the rehashing visits all table locations without repetition provided tablesize is a prime number of the form 4k + 3.

Random rehashing

Envision a rehashing strategy that, when a collision occurs, simply jumps randomly to a new table position. This method is called random rehashing. The random rehash can be considered to be a second hash function, applied to the same key, that determines a distance to jump from the original hash position. If subsequent collisions occur, the process is repeated until the target or an empty position is found, or until the table is determined to be full and not to contain the target key. Since each key would have its own random pattern, there would be no fixed patterns. The random sequence of accesses would have to be determined by the key value, since subsequent accesses with the same key value must follow the same pattern as the original. Since there would be no common patterns, there would be no primary or secondary clustering. Although this approach is theoretically appealing, it appears difficult to implement. Thus we turn to schemes that are simpler and whose performances are almost as good.

Double hashing

Several methods exist that attempt to approximate the performance of random rehashing without the large overhead of calculation required. One of these, double hashing, is computationally efficient and simple to apply.

We have seen that the general pattern for linear probing is to probe at

    (h + c) mod tablesize, (h + 2c) mod tablesize, (h + 3c) mod tablesize, ...

where c is a constant. In fact, c = 1 in our original discussion of linear rehashing. The fact that c is a constant is the root of the inefficiency of linear rehashing: it causes fixed probe patterns and clustering. Ideally we would like c to be random, but subject to the constraints of complete coverage and nonrepetition. Although this is possible, such an approach leads to computational overhead that is too high.

One solution is to compute a random jump size for each key that has collided. Thus c would be a function of the key value, so that different keys hashing to the same location are given different values of c. For example, with the hashing function

    H(key) = key mod tablesize

we define a related step size function

    c(key) = key mod (tablesize - 2) + 1

Suppose that 421 is to be stored in the table shown in Figure 7.25, for which tablesize is 7. Then 421 collides with 911 at position 1. When the collision occurs, c(421) = 421 mod 5 + 1 = 2 is computed and the table is probed at

    (1 + 2) mod 7 = 3    collision
    (3 + 2) mod 7 = 5    empty

If 624 had been the key, it would also have collided with 911 at position 1. However, its rehash pattern would have been different. Since c(624) = 624 mod 5 + 1 = 5, the probes would have been at

    (1 + 5) mod 7 = 6    collision
    (6 + 5) mod 7 = 4    empty

The rehash pattern is different for the two keys, both of which originally hashed to the same position. Although we can find pairs or groups of keys that hash to the same position and produce the same step size, the probability of such an event is low for hash tables of reasonable size and a good random step size generator. In fact, the performance of double hashing, in terms of the expected number of probes for both successful and unsuccessful accesses, is quite close to that of random rehashing. Since it has essentially the same performance in numbers of probes and lower overhead in computation per probe, it has greater overall efficiency. An algorithm for double hashing is given in Algorithm 7.6. It is comparable to Algorithm 7.3.

    procedure doublerehash(tkey: keytype; var h: position);
    var start: position;
        c: integer;
    begin
      start := h;
      c := tkey mod (tablesize - 2) + 1;     {Step size for this key}
      repeat
        h := (h + c) mod tablesize
      until (table[h].key = tkey)      {tkey found}
         or (table[h].key = empty)     {Open location found}
         or (h = start)                {Entire table searched}
    end;

    Algorithm 7.6 Rehashing algorithm for double hashing.

    const tablesize = {User supplied};
    type  pointer = ^node;
          node = record
                   el: stdelement;
                   next: pointer
                 end;
    var   table: array [position] of pointer;

    Figure 7.26 Representation of a chaining hash table.

Algorithm 7.6 shows only one method for computing a random step size. Any randomizing function that produces a step size that is less than the table size and is not based on the original collision position will do. However, the division algorithm shown is efficient and simple. In order to avoid introducing biases, tablesize should be a prime number if we use the division method of computing hash addresses. In conjunction with the division method for the original hash, the choice of a prime tablesize also assures an exhaustive search of the table without repetition. If tablesize is prime and tablesize - 2 is also prime, then tablesize and tablesize - 2 are twin primes.
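The two probe paths discussed for keys 421 and 624 can be traced with a short Python sketch. The step-size function used here, key mod (tablesize - 2) + 1, is an assumption: it is a common choice for double hashing and is consistent with the twin-primes remark, but any function meeting the conditions stated above would serve. The table contents are likewise assumed to be the three records of the running example.

```python
def home(key, tablesize):
    """Primary hash: H(key) = key mod tablesize."""
    return key % tablesize


def step(key, tablesize):
    """Assumed step-size function; always in the range 1..tablesize - 2."""
    return key % (tablesize - 2) + 1


def probe_path(key, table):
    """Positions probed until the key or an empty cell (None) is reached."""
    n = len(table)
    h = home(key, n)
    c = step(key, n)
    path = [h]
    while table[h] is not None and table[h] != key and len(path) < n:
        h = (h + c) % n
        path.append(h)
    return path


# Assumed contents: 911, 374, and 1091 stored at positions 1, 3, and 6.
table = [None, 911, None, 374, None, None, 1091]

print(probe_path(421, table))  # step size 2
print(probe_path(624, table))  # step size 5
```

Both keys start at position 1, yet they part ways at the first rehash because their step sizes differ, which is how double hashing avoids the fixed probe patterns of linear rehashing.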

External chaining

A second approach to the problem of collisions, called external chaining, is to let each table position absorb all of the records whose keys hash to it. Since we do not usually know how many keys will hash to any table position, a linked list is a good data structure in which to collect the records. A representation based on an array of pointers is shown in Figure 7.26.

As an example, let tablesize be 7 and suppose that operation create has initialized the hash table as shown in Figure 7.27.

    Table address    Table contents
    [0]              nil
    [1]              nil
    [2]              nil
    [3]              nil
    [4]              nil
    [5]              nil
    [6]              nil

    Figure 7.27 Initialized hash table.

If the division hash function is chosen, say

    H(key) = key mod 7

then the insertion of the keys 374, 1091, and 911, for which

    key     home address
    374     374 mod 7 = 3
    1091    1091 mod 7 = 6
    911     911 mod 7 = 1

produces the table shown in Figure 7.28.

    Table address    Table contents
    [0]              nil
    [1]              911
    [2]              nil
    [3]              374
    [4]              nil
    [5]              nil
    [6]              1091

    Figure 7.28 Hash table after insertion of keys 374, 1091, and 911.

Insertion of 227 and 421 produces two collisions, which are not shown in the text.

BTEX0000296

Section 7.4 Hashed Implementations

    227 mod 7 = 3
    421 mod 7 = 1

and results in the table shown in Figure 7.29. Subsequent insertion of key 624 produces

    624 mod 7 = 1

and the result is shown in Figure 7.30.

[Figure 7.29 Hash table after insertion of keys 227 and 421: position 1 holds the list 911, 421 and position 3 holds the list 374, 227.]

[Figure 7.30 Hash table after insertion of key 624: position 1 holds the list 911, 421, 624.]

Each list is a linked list. The designer has all of the choices of list characteristics that he or she has for any linked list: single or double linkage, method of termination, ordering of the records, and access pointers. If the frequencies with which various records are accessed are quite different, it may be effective to make each list self-organizing.

Observe that the operations in this case are similar to those on the lists discussed earlier. The only differences are that there are many lists instead of one and that the list in which we are interested is determined by the hash function.

External chaining has three advantages over open address methods:

1. Deletions are possible with no resulting problems.
2. The number of elements in the table can be greater than the table size, so the load factor can be greater than 1.0. Storage for the elements can be dynamically allocated as the lists grow larger.
3. We shall see in Section 7.5 that the performance of external chaining is better than that of open address methods and continues to be excellent as the load factor grows beyond 1.0.
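The external chaining example above can be replayed with a short sketch. It is illustrative only: the class and function names are assumptions, not the book's Pascal declarations, and new records are appended at the end of each list, as the text describes.

```python
class Node:
    """One list element: a key and a link to the next node (None ends the list)."""

    def __init__(self, key, next=None):
        self.key = key
        self.next = next


def create(tablesize):
    """Return an initialized table: every position holds None (nil)."""
    return [None] * tablesize


def insert(table, key):
    """Append key to the end of the list at position key mod tablesize."""
    home = key % len(table)
    node = Node(key)
    if table[home] is None:
        table[home] = node
        return
    p = table[home]
    while p.next is not None:
        p = p.next
    p.next = node


def chain(table, position):
    """Return the keys on the list at `position`, in list order."""
    keys = []
    p = table[position]
    while p is not None:
        keys.append(p.key)
        p = p.next
    return keys


def findkey(table, key):
    """Search only the one list selected by the hash function."""
    return key in chain(table, key % len(table))


table = create(7)
for k in (374, 1091, 911, 227, 421, 624):
    insert(table, k)
for position in range(7):
    print(position, chain(table, position))
```

Only the list chosen by the hash function is searched, which is why the operations reduce to ordinary linked-list operations.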

Coalesced chaining

In the next technique, coalesced chaining, collisions are resolved, as they are in external chaining, by adding the element to be inserted to the end of a list. The difference is in how the list is constructed.

To illustrate coalesced chaining, consider the hash table with seven buckets shown in Figure 7.31. The table is divided into two parts: the address region and the cellar. In our example the first five addresses make up the address region, and the last two addresses make up the cellar. The hash function must map each record into the address region. The cellar is only used to store records that collided with another record at their home addresses.

[Figure 7.31 Hash table with seven buckets, initialized for coalesced chaining.]

For our example we will use the division hash function

    H(key) = key mod 5

assuming that each key is an integer. After inserting key values 27 and 29 we have Figure 7.32. If 32 is inserted next, it collides with 27 and is stored in the empty position with the largest address. In addition, it is added to the list that begins at its home address. The result is shown in Figure 7.33. To assist in visualizing the process, the empty position with the largest address is shown in the figures as epla.

If key value 34 is added, it collides with 29 and is placed in the

BTEX0000297

Chapter 7 Sets

empty position with the largest address and is added to the list beginning at address 4. The result is shown in Figure 7.34.

[Figure 7.32 Results after inserting keys 27 and 29.]

[Figure 7.33 Results after inserting key 32.]

[Figure 7.34 Results after inserting key 34.]

Up to this point coalesced chaining has behaved exactly like external chaining: each new record is added to the end of a list that begins at its home address. The next insertion illustrates how a collision is resolved after the cellar is full. If 37 is added, it collides with 27, so it is placed in location 3 and added to the end of the list that begins at address 2. The result is shown in Figure 7.35. The point is made once again that the record being inserted was placed in the empty position with the largest address, since its home address was already occupied. Adding 47 produces the result shown in Figure 7.36.

[Figure 7.35 Results after inserting key 37.]

[Figure 7.36 Results after inserting key 47.]

The term coalesced is used to describe this technique because if 53 were added to the hash table in Figure 7.36, it would cause the lists that begin at addresses 2 and 3 to coalesce. Note, however, that lists cannot coalesce until after the cellar is full.
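The coalesced chaining example can be traced with a small sketch. It assumes each bucket stores a key and a link (the index of the next bucket on its list) and follows the rule described in the text: a colliding record goes into the empty position with the largest address (epla) and is linked to the end of the list that begins at its home address. All names are illustrative assumptions.

```python
EMPTY = None


def create(tablesize):
    """Every bucket is a [key, link] pair; link is the index of the next bucket."""
    return [[EMPTY, None] for _ in range(tablesize)]


def insert(table, key, region):
    """Insert key and return the position used.

    `region` is the size of the address region; the remaining buckets
    form the cellar.
    """
    home = key % region
    if table[home][0] is EMPTY:
        table[home][0] = key
        return home
    empties = [i for i, bucket in enumerate(table) if bucket[0] is EMPTY]
    if not empties:
        raise OverflowError("hash table is full")
    epla = max(empties)  # empty position with the largest address
    tail = home
    while table[tail][1] is not None:  # walk to the end of the home list
        tail = table[tail][1]
    table[epla][0] = key
    table[tail][1] = epla
    return epla


def chain(table, start):
    """Return the keys reached by following links from `start`."""
    keys, p = [], start
    while p is not None:
        keys.append(table[p][0])
        p = table[p][1]
    return keys


table = create(7)  # addresses 0..4 are the address region, 5..6 the cellar
placed = [insert(table, k, 5) for k in (27, 29, 32, 34, 37, 47)]
print(placed)
print([bucket[0] for bucket in table])
```

Inserting 53 afterwards finds its home address 3 occupied by 37, which already belongs to the list that begins at address 2, so the two lists coalesce.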

The effectiveness of coalesced chaining depends on the choice of cellar size. Selection of cellar size is discussed in Vitter (1982, 1983), where it is shown that a cellar that contains 14% of the hash table works well under a variety of circumstances.

Because overflow records form lists, the deletion problems of open addressing schemes can be solved without resorting to marking records deleted. Any such approach is, however, more complicated than for external chaining, since lists can coalesce. Details of such a deletion scheme, which essentially relinks the elements in the list past the element to be deleted, are given in Vitter (1982).

This concludes our introduction to collision-resolution techniques. In Sections 7.5 and 7.6 we will compare the performance of these techniques. Before we do so, however, in Section 7.4.3 we introduce hashing from the point of view of hash functions that guarantee that collisions will not occur: perfect hashing functions.

BTEX0000298

7.4.3 Perfect Hashing Functions

Pascal reserved words (margin list): and, array, begin, case, const, div, do, downto, else, end, file, for, forward, function, goto, if, in, label, mod, nil, not, of, or, packed, procedure, program, record, repeat, set, then, to, type, until, var, while, with.

A perfect hashing function is one that causes no collisions. A minimal perfect hashing function is a perfect hashing function that operates on a hash table having a load factor of 1.0.

Since perfect hashing functions cause no collisions, we are assured that exactly one probe is needed to locate an element that has a given key value. This is, of course, very desirable. The problem is that such functions are not easy to construct.

Perfect hashing functions can only be found under certain conditions. One such condition is that all of the key values are known in advance. Certain applications have this quality. For example, in the programming language Pascal there are 36 reserved words (or key words). When a compiler is translating a program, as it scans the program's statements it must determine whether a reserved word has been encountered. Suppose the reserved words are stored in a hash table accessible by a perfect hashing function. Determining if a word encountered in the scan is a reserved word requires only one probe. The word is hashed, and the content of the table at the specified position is compared with the word from the scan. If they are the same, a reserved word was found. If not, we can be certain that the word is not a reserved word.

Another condition for perfect hashing functions is a practical one. It concerns the amount of computation necessary to find a perfect hashing function, which can be enormous. The total computation, and therefore time, increases exponentially with the number of keys in the data set. The number of possible functions that map the 31 most frequently occurring English words into a hash table of size 41 is approximately 10^50, whereas the number of such functions that give unique (perfect) mappings is approximately 10^43 (Knuth, 1973b). Thus only one of each 10 million functions is suitable. In practice, if the number of keys is greater than a few dozen, the amount of time to find a perfect hashing function is unacceptably long on most computers.
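The counts quoted from Knuth can be checked with exact integer arithmetic. The sketch below is an added verification, not part of the original text. It counts all functions from 31 keys into 41 table positions (41 to the power 31) and the one-to-one functions among them (41 x 40 x ... x 11).

```python
import math

keys, positions = 31, 41

# Every key may go to any of the 41 positions.
total = positions ** keys

# Perfect (collision-free) mappings: 41 * 40 * ... * 11 = 41! / 10!
perfect = math.factorial(positions) // math.factorial(positions - keys)

print(f"all functions:     about 10^{len(str(total)) - 1}")
print(f"perfect functions: about 10^{len(str(perfect)) - 1}")
print(f"one perfect function in every {total // perfect:,}")
```

The ratio comes out a little above ten million, consistent with the statement that only about one function in 10 million is suitable.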

There have been several proposals for perfect hashing functions. Sprugnoli (1977) has proposed functions that are perfect but not minimal. Cichelli (1980) has suggested some simple minimal perfect functions and has given examples and times to compute them. Jaeschke (1981) has proposed other minimal perfect functions that avoid some problems that might arise with Cichelli's method.

Let us look briefly at Cichelli's method. The functions that he proposed are for keys that are character strings. Take, for example, the 36 reserved words of Pascal (see the list in the margin). The hashing function is

    H(key) = L + g(key[1]) + g(key[L])

where L is the length of the key. The function g(x) associates an integer with each letter x; thus g(key[1]) is the integer associated with the first character of the key, and g(key[L]) is the integer associated with the last letter of the key. Figure 7.37 shows an association between letters and integers found by Cichelli for Pascal's reserved words.

[Figure 7.37 Cichelli's associated integer table for Pascal's reserved words. The letter values are not recoverable from this copy.]

BTEX0000299

As an example, suppose that the word begin were encountered. The hashing function would be computed as

    H(begin) = 5 + g(b) + g(n) = 5 + 15 + 13 = 33

There are several problems, however. The hashing function is simple, as it should be, and looking up the integers associated with the first and last letters can be done with reasonable efficiency. A second and more serious problem is that of determining which integer should be associated with each character. The integers are found by trial and error using a backtracking algorithm. Of course, the associated integer table (see Figure 7.37) need be built only once. Cichelli (1980) has a good discussion of the backtracking algorithm for this problem.

In summary, perfect hashing functions are feasible when the keys are known in advance and the number of records is small. In that case, a perfect hashing function is determined in advance of the use of the hash table. Although determination of the hashing function may be costly, it need only be done once. The resulting access of records in the table requires only one probe.

[Figure 7.38 The hash table and hash values for Pascal's reserved words; for example, begin is stored at position 33.]
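Cichelli's function can be illustrated with a short sketch. Figure 7.37 is illegible in this copy, so the letter values below are an assumption: they are the values commonly reproduced from Cichelli (1980) and agree with the worked example g(b) = 15, g(n) = 13. That word list contains otherwise rather than forward, so it differs from the margin list in one word. The code checks for itself that the mapping is perfect and minimal.

```python
# Assumed letter values (see the note above); only letters that begin or
# end one of the reserved words are needed.
G = dict(a=11, b=15, c=1, d=0, e=0, f=15, g=3, h=15, i=13, l=15, m=15,
         n=13, o=0, p=15, r=14, s=6, t=6, u=14, v=10, w=6, y=13)

WORDS = """and array begin case const div do downto else end file for function
goto if in label mod nil not of or otherwise packed procedure program record
repeat set then to type until var while with""".split()


def H(key):
    """Length of the key plus the values of its first and last letters."""
    return len(key) + G[key[0]] + G[key[-1]]


# Hash table keyed by address. A collision would silently drop a word,
# so len(TABLE) == len(WORDS) confirms that the function is perfect.
TABLE = {H(word): word for word in WORDS}


def is_reserved(word):
    """One probe: hash the word and compare it with the table entry."""
    word = word.lower()
    if not word or word[0] not in G or word[-1] not in G:
        return False
    return TABLE.get(H(word)) == word


print(H("begin"), len(TABLE), min(TABLE), max(TABLE))
```

Because the 36 words occupy the 36 consecutive addresses 2 through 37, the load factor is 1.0 and the function is minimal as well as perfect.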

Exercises 7.4

1. Explain the following terms in your own words: perfect hash function, open address hashing, collision, load factor, double hashing, linear rehashing, external chaining, coalesced chaining.

2. The division hash function H(key) = key mod m is usually a good hash function if m has no small divisors. Explain why this restriction is placed on m.

3. Develop a hash function to convert nine-digit integers (Social Security numbers) to integers in the range 0..999. Test your hash function by applying it to randomly generated keys. Determine how many of the addresses receive each number of hashed keys. Compare your experimental results with the results that would be obtained using a perfect randomizer, for which the number of addresses receiving exactly a given number of hashed values can be approximated. [The approximating expression is not recoverable from this copy.]

4. Develop a hash function to convert keys of the type

    keytype = array[1..15] of char

to integers in the range 1..999. Implement your hashing function and determine

BTEX0000300

Section 7.5 Hashing Performance

its execution time.

5. Do the same for the hash function in Exercise [number illegible] and compare their execution times.

6. Implement the perfect hashing function described in Section 7.4.3. Determine its execution time and compare it with the results obtained in Exercise [number illegible].

7. Use a division hash function H(key) = key mod [modulus illegible] to store a sequence of integers (the legible values are 27, 35, 32, 31, and 23) in a hash table declared as an array of integer.
   a. Use linear rehashing.
   b. Use double hashing.
   c. Use external chaining.
   d. Use coalesced chaining with a cellar size of four and a division hash function whose modulus is the size of the address region.
   For each of the collision-handling strategies, determine the following after all values have been placed in the table:
   a. The load factor
   b. The average number of probes needed to find a value that is in the table
   c. The average number of probes needed to find a value that is not in the table

8. Implement the collection of procedures that forms a hashing package according to Specification 7.2, using
   a. Linear rehashing
   b. Double hashing
   c. External chaining
   d. Coalesced chaining with a cellar size of 70
   Let the hash table be given by

    table: array[0..500] of integer

   and the hash function by H(key) = key mod 501. For coalesced chaining use the hash function H(key) = key mod 431. Use a random number generator to produce a sequence of integers to store in the hash table. Determine the load factor and the average number of probes needed to find an integer in the table.

7.5 Hashing Performance

In this discussion the operations in Specification 7.2 are divided into two groups. The first group includes the operations that do not involve searching the hash table: create, clear, traverse, full, and size. The effort to execute these operations does not depend on which collision-resolution strategy is used. Operations full and size require O(1) effort. Operations create and clear require O(tablesize) effort, since each position must be initialized to the value

BTEX00003OI

empty. Operation traverse requires probing O(tablesize) table positions and processing O(n) elements.

Each operation of the second group requires searching the hash table for a target key value. These searches are associative and are either successful, in which case an element with the target key value is found, or unsuccessful. The operations in this group are findkey, insert, retrieve, update, and delete. The performance of all these operations is therefore determined primarily by the number of compares required for the associated search. We will discuss successful and unsuccessful searches. We will single out the delete operation for later discussion.

7.5.1 Performance

Explicit expressions that give the expected number of compares required for successful and unsuccessful searches can be developed for three different collision-resolution policies. Results are shown in Figures 7.39 and 7.40. Figure 7.39 shows the algebraic expressions (see Knuth, 1973b, for their development), and Figure 7.40 shows the results of graphing the algebraic expressions. The results for random rehashing will be very close to those for double hashing.

Expressions for coalesced chaining are given in Vitter (1982). Note that if the cellar is not full, the result for coalesced chaining is the same as for external chaining. In general, the search effort of coalesced chaining is approximately the same as that of external chaining. See Vitter (1982), in which the performance of coalesced chaining is compared with all the hashing techniques discussed in this chapter. Coalesced chaining is shown to give the best performance for the circumstances considered.

[Figure 7.39 Algebraic expressions for the number of probes expected for successful and unsuccessful searches, by collision-resolution strategy (linear rehashing, double hashing, external chaining). The expressions are not recoverable from this copy.]

[Figure 7.40 Number of probes required for successful and unsuccessful searches, plotted against load factor for linear rehashing, double hashing, and external chaining.]

Notice in Figures 7.39 and 7.40 that the performance curves for the hashing methods are monotonically increasing functions of the load factor. The performance curves for lists and trees are both monotonically increasing functions of the number of elements in the data structure. The number of elements in the data structure may not be under the implementor's control. However, with hashing the load

BTEX00003O2

factor may be made arbitrarily small by increasing the table size. For a given value of n we can reduce the load factor and improve the performance of hashing. The price is more memory.
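The expressions of Figure 7.39 are illegible in this copy. As a stand-in, the sketch below uses the approximations commonly quoted from Knuth's analysis for the three strategies; that these are exactly the forms tabulated in the figure is an assumption. The unsuccessful-search form for external chaining counts the probe of an empty list as one probe. Each function returns a pair (successful, unsuccessful) for a load factor alpha.

```python
import math


def linear_rehashing(alpha):
    """Expected probes for linear rehashing, valid for 0 < alpha < 1."""
    successful = 0.5 * (1 + 1 / (1 - alpha))
    unsuccessful = 0.5 * (1 + 1 / (1 - alpha) ** 2)
    return successful, unsuccessful


def double_hashing(alpha):
    """Expected probes for double (or random) rehashing, 0 < alpha < 1."""
    successful = -math.log(1 - alpha) / alpha
    unsuccessful = 1 / (1 - alpha)
    return successful, unsuccessful


def external_chaining(alpha):
    """Expected probes for external chaining; alpha may exceed 1.0."""
    successful = 1 + alpha / 2
    unsuccessful = alpha + math.exp(-alpha)
    return successful, unsuccessful


for alpha in (0.25, 0.5, 0.75, 0.9):
    row = [f"{s:6.2f}/{u:6.2f}" for s, u in (linear_rehashing(alpha),
                                             double_hashing(alpha),
                                             external_chaining(alpha))]
    print(f"alpha={alpha:4.2f}  linear {row[0]}  double {row[1]}  chaining {row[2]}")
```

All three functions increase monotonically with the load factor, and only external chaining remains defined for load factors of 1.0 or more.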

7.5.2 Memory Requirements

In addition to performance, it is important to compare the memory requirements of the various hashing techniques. Let tablesize be the number of buckets in the hash table, and assume that a pointer occupies one word of memory and that an element occupies w words of memory. The memory required for a hash table containing n elements is then

    tablesize * w                for any open addressing method
    tablesize * (w + 1)          for coalesced chaining
    tablesize + n * (w + 1)      for external chaining

These expressions are based on the following assumptions. For open addressing, each position in the hash table contains room for one element. For coalesced chaining, each position in the hash table contains room for one element and one pointer. For external chaining, each position in the hash table contains one pointer, and each element in the table occupies one element and one pointer.

We now consider two cases. If we use the expressions when an element occupies the same amount of memory as a pointer (perhaps we store a pointer to the element rather than the element itself), then the memory required as a function of the load factor is that shown in Figure 7.41. Open addressing always requires the least memory. When the table is nearly full, open addressing requires only one-third as much memory as external chaining. Of course, when the table is nearly full the performance of open addressing is poor (see Figure 7.40). In this case, coalesced chaining provides good performance with a substantial saving of memory.

[Figure 7.41 Memory requirements when an element occupies the same amount of memory as a pointer.]

If an element is as large as 10 pointers, then the memory requirements are as shown in Figure 7.42.

External chaining is attractive over a wider range of load factors, and the penalty for nearly full tables is less.

[Figure 7.42 Memory requirements when an element occupies 10 times the amount of memory of a pointer.]

This analysis leads to the following rules of thumb for constructing hash tables to be stored in RAM:

For small elements and small load factors, open addressing provides competitive performance and saves memory.

For small elements and large load factors, coalesced chaining provides good performance with reasonable memory requirements.

If elements are large, external chaining provides good performance with minimum or nearly minimum memory requirements.

These rules are based on the assumption that the maximum number of elements to be stored in the table can be estimated. Often that is not the case. Take, for example, the symbol table used by a compiler to store the user-defined identifiers in programs. The compiler must be able to process both large and small programs with a wide range in the numbers of identifiers. It may be possible for the table to overfill, that is, to have a load factor greater than 1.0. The compiler should continue to operate smoothly. Such situations

BTEX00003O3

are then handled by the use of external chaining, which continues to function for load factors greater than 1.0.
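The memory assumptions of Section 7.5.2 can be turned into a small calculator. This is a sketch under those stated assumptions (a pointer occupies one word and an element occupies w words); the function and parameter names are illustrative.

```python
def memory_words(n, tablesize, w):
    """Words of memory needed to hold n elements of w words each."""
    return {
        # one element slot per table position
        "open addressing": tablesize * w,
        # one element slot and one link per table position
        "coalesced chaining": tablesize * (w + 1),
        # one pointer per position plus one node (element + link) per element
        "external chaining": tablesize + n * (w + 1),
    }


tablesize = 100
for w in (1, 10):
    for n in (25, 50, 100):
        m = memory_words(n, tablesize, w)
        print(f"w={w:2d} load factor={n / tablesize:4.2f}", m)
```

With w = 1 and a full table, open addressing needs one-third of the memory of external chaining (100 words against 300). With w = 10, external chaining needs the least memory at low load factors, because it allocates space only for the elements actually stored.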

7.5.3 Deletion

We will conclude this section with a few comments about deletion. As discussed earlier, hash tables that are constructed using open addressing techniques pose problems when subjected to frequent deletions. The space previously occupied by a deleted record cannot simply be marked empty but must be marked deleted. This clutters up the hash table and hurts performance.

No such problem arises if external chaining is used for collision resolution. Deletion is handled just as it is for any linked list.

For coalesced chaining, deletion is handled essentially as for external chaining as long as the cellar has never been full. Once the cellar has been full, since the possibility of coalesced lists then exists, deletion must be handled carefully. An algorithm is given in Vitter (1982). It is slightly more complicated and would extract a small performance penalty.

When designing hashing strategies, the predicted frequency of deletion must be considered along with performance and memory requirements.

In Section 7.6 we will apply several hashing methods to the frequency analysis of digraphs. We will see how the theoretical results apply in a specific case.
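The need for a deleted marker under open addressing can be seen in a small sketch using linear rehashing. The sentinel objects and function names are illustrative assumptions. A search stops only at a position that has never been used, so a deleted position must remain distinguishable from an empty one.

```python
EMPTY = object()    # position never used
DELETED = object()  # position once held a record that was deleted


def insert(table, key):
    """Place key in the first empty or deleted position from its home address."""
    n = len(table)
    for i in range(n):
        pos = (key % n + i) % n
        if table[pos] is EMPTY or table[pos] is DELETED:
            table[pos] = key
            return pos
    raise OverflowError("hash table is full")


def find(table, key):
    """Return the position of key, or None; DELETED does not stop the search."""
    n = len(table)
    for i in range(n):
        pos = (key % n + i) % n
        if table[pos] is EMPTY:
            return None
        if table[pos] is not DELETED and table[pos] == key:
            return pos
    return None


def delete(table, key):
    """Mark the position of key as deleted rather than empty."""
    pos = find(table, key)
    if pos is not None:
        table[pos] = DELETED


table = [EMPTY] * 7
insert(table, 374)  # home address 3
insert(table, 227)  # also hashes to 3, stored at 4
delete(table, 374)
print(find(table, 227))  # still reachable through the deleted position
```

Had position 3 been reset to empty, the search for 227 would stop there and report the key missing. The marker keeps searches correct at the cost of longer probe sequences, which is the clutter the text refers to.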

7.6 Frequency Analysis of Digraphs

We have discussed the frequency analysis of digraphs before; in earlier sections we used lists and binary search trees. In this section we will compare four hashing strategies. All use the division hashing function, but they differ in the collision-resolution strategy: linear rehashing, double hashing, coalesced chaining, and external chaining. We will conclude with a summary of results involving all of the data structures we have used.

7.6.1 The Hash Function

The hash table array is declared as shown in Figure 7.43. The hash function must map each digraph (a pair of letters) into the integers between 0 and tablesize. We accomplish this as follows. Let c1 and c2 be the first and second characters of a digraph, and let each character be converted to an integer,

[Figure 7.43 Hash table declaration for digraphs.]

BTEX00003O4

Section 7.6 Frequency Analysis of Digraphs

where i1 and i2 are integers between 0 and 25. Finally, let k be computed as

    k = 26 * i1 + i2

so that k has values between 0 and 675. Sample values of k are shown in Figure 7.44. The hash function for a digraph is

    H(digraph) = k mod tablesize

where tablesize is to be selected so that it has no small divisors.
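The digraph hash function can be sketched as follows. The exact expression for k is garbled in this copy; the form k = 26 * i1 + i2 is an assumption that fits the surrounding statements (letter values between 0 and 25, 676 possible digraphs, alphabetical order under direct addressing). The table size 101 is an arbitrary prime chosen for illustration, and lowercase letters a to z are assumed.

```python
def digraph_key(digraph):
    """Map a two-letter digraph to an integer k in 0..675."""
    i1 = ord(digraph[0]) - ord("a")  # 0..25
    i2 = ord(digraph[1]) - ord("a")  # 0..25
    return 26 * i1 + i2


def H(digraph, tablesize):
    """Division hash function applied to the digraph's integer key."""
    return digraph_key(digraph) % tablesize


letters = "abcdefghijklmnopqrstuvwxyz"
all_keys = [digraph_key(a + b) for a in letters for b in letters]

print(digraph_key("aa"), digraph_key("th"), digraph_key("zz"))
print(H("th", 101))
```

With tablesize equal to 676 the function reduces to direct addressing: every digraph receives its own address and the digraphs are stored in alphabetical order.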

The frequency analysis results reported in this section are based on the same text used in the earlier analyses of digraphs.

[Figure 7.44 Sample values of k for digraphs.]

[Figure 7.45 Hash addresses of a few digraphs.]

Figure 7.46 shows the expected search lengths for the four hashing strategies and, for comparison, the results for a binary search of a sorted array and for a binary search tree.

[Figure 7.46 Expected search lengths for the digraph data, by table size and collision-resolution strategy.]

Recall that processing a given number of digraphs causes a smaller number of distinct digraphs to be entered into the hash table. The relationship between tablesize and the number of digraphs processed is shown in Figure 7.47.

[Figure 7.47 Relationship between tablesize and the number of digraphs processed in the frequency analysis of digraphs.]

Figure 7.48 shows the average time required to process a digraph for the four hashing techniques; for comparison, a binary search tree is included. A second comparison is the time required for direct addressing. Direct addressing is implemented just like hashing with, in this case, tablesize = 676. Direct addressing is possible in this case because we can assign a distinct address to each of the 676 possible digraphs. This eliminates collisions, simplifies the algorithms, and ensures that the number of probes for each digraph is one. The price for this is the requirement for more memory.

Direct addressing should not be confused with hashing. A hash function randomizes the places at which elements are stored in the hash table. Our direct addressing scheme stores the digraphs in the table in alphabetical order.

BTEX00003O5