
Topic 6

Block Designs, Storage and Group Testing

The next topic we introduce is that of a combinatorial design. Such designs
have been used in the design of experiments in many application areas including
agriculture (which variety gives the highest yield?), engineering (which mobile
phone is most robust?) and medicine (which medicine works best for treating
high blood pressure?). More recently the same structures have proven useful
in data storage for booking systems (redundant disk arrays) and in group testing
(for identifying defective items cheaply and in mass screening campaigns for
uncommon diseases).

6.1 Block Designs


DEFINITION 6.1.1.
Consider a set of v items. From this set of v items construct b subsets, or blocks,
each with k distinct items. Suppose that any two items appear together in
exactly λ of the b blocks. Then the set of b blocks forms a balanced incomplete
block design (BIBD).
EXAMPLE 6.1.1.
Let v = 7 and consider the blocks in Table 6.1. These blocks form a BIBD on
7 items with 7 blocks each of size 3 and with each item appearing in 3 blocks.
Every pair of items appears together in exactly one block, so λ = 1.

THEOREM 6.1.1.
In a balanced incomplete block design each item appears in exactly r blocks. Thus

vr = bk.

Counting pairs gives


λ(v − 1) = r(k − 1).
Proof. Let r_i be the number of blocks of the BIBD that contain item i. We
count, in two ways, the pairs that involve item i. Item i appears in r_i blocks
and there are k − 1 other items in each of these blocks, so there are r_i(k − 1)
pairs involving item i. On the other hand there are v − 1 other items in the
BIBD and item i appears with each of these λ times in the BIBD. So there are
λ(v − 1) pairs involving item i. Equating we get that r_i(k − 1) = λ(v − 1), so
r_i does not depend on i and all of the items appear equally often in the BIBD.
We call this common value the replication number r.
Now count the number of items in all the blocks. Again we can do this two
ways: either observe that there are b blocks each with k items, giving bk items in
total or note that there are v items and each appears r times, giving vr items in
total. Since we are counting the same thing, we have vr = bk, as required.

Table 6.1: The blocks of a BIBD with v = b = 7, r = k = 3 and λ = 1

1 2 3
1 4 5
1 6 7
2 4 6
2 5 7
3 4 7
3 5 6

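The two counting identities can be checked numerically on the blocks of Table 6.1. The following is a minimal sketch only; the helper name `bibd_parameters` is a hypothetical choice made for this illustration and is not part of the notes.

```python
from collections import Counter
from itertools import combinations

# Blocks of Table 6.1.
blocks = [(1, 2, 3), (1, 4, 5), (1, 6, 7), (2, 4, 6),
          (2, 5, 7), (3, 4, 7), (3, 5, 6)]

def bibd_parameters(blocks):
    """Return (v, b, r, k, lam) if the blocks form a BIBD, else raise ValueError."""
    items = sorted({x for blk in blocks for x in blk})
    v, b = len(items), len(blocks)
    sizes = {len(set(blk)) for blk in blocks}
    if len(sizes) != 1:
        raise ValueError("blocks do not all have the same size")
    k = sizes.pop()
    # Count how often each item and each unordered pair of items occurs.
    item_count = Counter(x for blk in blocks for x in blk)
    pair_count = Counter(frozenset(p) for blk in blocks
                         for p in combinations(sorted(blk), 2))
    reps = {item_count[x] for x in items}
    lams = {pair_count[frozenset(p)] for p in combinations(items, 2)}
    if len(reps) != 1 or len(lams) != 1:
        raise ValueError("design is not balanced")
    return v, b, reps.pop(), k, lams.pop()

v, b, r, k, lam = bibd_parameters(blocks)
print(v, b, r, k, lam)                 # 7 7 3 3 1
assert v * r == b * k                  # vr = bk
assert lam * (v - 1) == r * (k - 1)    # lambda(v - 1) = r(k - 1)
```

Because the function counts items and pairs directly, it can also be pointed at any other list of blocks to see whether the balance condition holds.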
In view of this result, we can talk about a (v, k, λ) BIBD. If v = b then r = k
and the design is said to be a symmetric BIBD (SBIBD).
We say that an item and a block are incident if the item is an element of
the block. One easy way to represent a BIBD is by way of an incidence matrix.
This is a v × b matrix A = [a_ij], with rows labelled by the elements of the v-set
and columns labelled by the blocks. The entries are given by

$$a_{ij} = \begin{cases} 1 & \text{when item } i \text{ is in block } j, \\ 0 & \text{otherwise.} \end{cases}$$

The incidence matrix for the design in Table 6.1 is

$$A = \begin{pmatrix}
1 & 1 & 1 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 1 & 1 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 1 & 1 \\
0 & 1 & 0 & 1 & 0 & 1 & 0 \\
0 & 1 & 0 & 0 & 1 & 0 & 1 \\
0 & 0 & 1 & 1 & 0 & 0 & 1 \\
0 & 0 & 1 & 0 & 1 & 1 & 0
\end{pmatrix}.$$

Recall that we write I_v for the identity matrix of order v. We will let J_{v,b}
denote a v × b matrix with every entry equal to 1. We usually write J_v for J_{v,v}.
THEOREM 6.1.2.
The incidence matrix A of a BIBD satisfies

$$AA' = (r - \lambda) I_v + \lambda J_v$$

and

$$J_v A = k J_{v,b},$$

where A′ denotes the transpose of A.

© Debbie Street, 2011
Thus

$$\det(AA') = (r - \lambda)^{v-1}\,[r + (v - 1)\lambda].$$

Conversely, if A is a v × b (0,1) matrix which satisfies the above equations
then, provided k < v, A is the incidence matrix of a (v, k, λ) BIBD.
Proof. If A is the incidence matrix of a BIBD then it has k 1s per column and
all other entries are 0, so each column sum is k, establishing that J_v A = kJ_{v,b}.
The (i, j) entry of AA′ is the dot product of row i of A (corresponding to
item i) and column j of A′. But column j of A′ is just row j of A and so the
(i, j) entry of AA′ is

$$\sum_{n=1}^{b} a_{in} a_{jn}.$$

Since each row of A has r 1s and the rest of the entries are 0, this sum is r if
i = j. If i ≠ j then there are λ blocks that contain both items i and j and so
the sum is λ.
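Now

$$AA' = \begin{pmatrix}
r & \lambda & \lambda & \cdots & \lambda \\
\lambda & r & \lambda & \cdots & \lambda \\
\lambda & \lambda & r & \cdots & \lambda \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\lambda & \lambda & \lambda & \cdots & r
\end{pmatrix}$$

so it has the same determinant as

$$\begin{pmatrix}
r & \lambda - r & \lambda - r & \cdots & \lambda - r \\
\lambda & r - \lambda & 0 & \cdots & 0 \\
\lambda & 0 & r - \lambda & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\lambda & 0 & 0 & \cdots & r - \lambda
\end{pmatrix},$$

found by subtracting the first column of AA′ from every other column. Now
add rows 2 to v of the matrix above to row 1 and we have

$$\det(AA') = \det\begin{pmatrix}
r + (v-1)\lambda & 0 & 0 & \cdots & 0 \\
\lambda & r - \lambda & 0 & \cdots & 0 \\
\lambda & 0 & r - \lambda & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\lambda & 0 & 0 & \cdots & r - \lambda
\end{pmatrix} = (r - \lambda)^{v-1}\,[r + (v-1)\lambda],$$

as required.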
Now suppose that A is a (0,1) matrix that satisfies the equations given in
the statement of the theorem. Use the matrix to define the blocks. A simple
check shows that the resulting blocks satisfy the definition of a BIBD.
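As an illustration, the identities of Theorem 6.1.2 can be verified for the design of Table 6.1. This is a sketch under the assumption that exact rational arithmetic from the Python standard library is acceptable; the function name `det` is hypothetical.

```python
from fractions import Fraction

blocks = [(1, 2, 3), (1, 4, 5), (1, 6, 7), (2, 4, 6),
          (2, 5, 7), (3, 4, 7), (3, 5, 6)]
v, b, r, k, lam = 7, 7, 3, 3, 1

# Incidence matrix: rows are items 1..v, columns are blocks.
A = [[1 if item in blk else 0 for blk in blocks] for item in range(1, v + 1)]

# A times its transpose: entry (i, j) counts the blocks containing both items.
AAt = [[sum(A[i][n] * A[j][n] for n in range(b)) for j in range(v)]
       for i in range(v)]

# (r - lam) I_v + lam J_v has r on the diagonal and lam elsewhere.
assert AAt == [[r if i == j else lam for j in range(v)] for i in range(v)]
# J_v A = k J_{v,b}: every column of A sums to k.
assert all(sum(A[i][j] for i in range(v)) == k for j in range(b))

def det(matrix):
    """Determinant by Gaussian elimination over exact fractions."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n, result = len(m), Fraction(1)
    for col in range(n):
        pivot = next((row for row in range(col, n) if m[row][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for c in range(col, n):
                m[row][c] -= factor * m[col][c]
    return result

# (r - lam)^(v-1) [r + (v-1) lam] = 2**6 * 9 = 576 for this design.
assert det(AAt) == (r - lam) ** (v - 1) * (r + (v - 1) * lam) == 576
```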
Choose a block of the BIBD. We let x_i be the number of other blocks in the
BIBD which intersect this block in exactly i elements.
EXAMPLE 6.1.2.
Consider the (9,24,8,3,2) BIBD given in Table 6.2. The first block, (0,1,2), has
x_3 = 1, x_2 = 0, x_1 = 18 and x_0 = 4. The third block, (0,3,4), has x_3 = 0,
x_2 = 3, x_1 = 15 and x_0 = 5.

Table 6.2: The blocks of a (9,3,2) BIBD

0 1 2 0 1 2 0 3 4
0 3 5 0 4 6 0 5 7
0 6 8 0 7 8 1 3 4
1 3 8 1 4 5 1 5 7
1 6 7 1 6 8 2 3 6
2 3 7 2 4 7 2 4 8
2 5 6 2 5 8 3 5 6
3 7 8 4 5 8 4 6 7

THEOREM 6.1.3.
In a BIBD we have

1. $\sum_{i=0}^{k} x_i = b - 1$;
2. $\sum_{i=0}^{k} i\,x_i = k(r - 1)$;
3. $\sum_{i=0}^{k} \binom{i}{2} x_i = \binom{k}{2}(\lambda - 1)$;
4. $v \le b$.
Proof. 1. Consider any block of the design. There are b − 1 other blocks in
the design, of which x0 intersect the chosen block in 0 items, x1 intersect
it in 1 item, x2 intersect it in 2 items and so on.
2. There are k items in the chosen block and each of these items appears
r − 1 times in the BIBD other than in the chosen block. There are xi
blocks that intersect the chosen block in i items and so the result follows.
3. In a BIBD we know that each pair of treatments appears in λ blocks.
There are $\binom{k}{2}$ pairs of treatments in the chosen block, and each pair
must appear together in λ − 1 other blocks of the BIBD. If a block contains i
items from the chosen block, then it contains $\binom{i}{2}$ pairs of items from the
chosen block and the result follows.
4. We prove this result by considering the mean and variance of the
intersection size of one fixed block with each of the other blocks in the BIBD.
We know that the mean value of i, written as µ, is given by

$$\mu = \sum_{i=0}^{k} i\,x_i \Big/ \sum_{i=0}^{k} x_i = \frac{k(r - 1)}{b - 1}.$$

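We know that the variance, σ², is given (up to the constant factor 1/(b − 1),
which does not affect the argument) by

$$\begin{aligned}
\sigma^2 &= \sum_{i=0}^{k} (i - \mu)^2 x_i \\
&= \sum_{i} \left( i^2 - 2i\mu + \mu^2 \right) x_i \\
&= \sum_{i} \left( i(i - 1) + i - 2i\mu + \mu^2 \right) x_i \\
&= \sum_{i} \left( 2\binom{i}{2} + i - 2i\mu + \mu^2 \right) x_i \\
&= k(k - 1)(\lambda - 1) + k(r - 1) - \frac{k^2 (r - 1)^2}{b - 1},
\end{aligned}$$

using the identities proved above. We can re-arrange this to give

$$(b - 1)\sigma^2 = (b - 1)\,[k(k - 1)(\lambda - 1) + k(r - 1)] - k^2 (r - 1)^2.$$

Since we know that vr = bk and λ(v − 1) = r(k − 1) we see that

$$b - 1 = \frac{vr}{k} - 1 = \frac{r^2 (k - 1)}{\lambda k} + \frac{r}{k} - 1.$$

Hence we see that (b − 1)σ² is a cubic in r and that the coefficient of r³ is

$$\frac{k(k - 1)}{\lambda k} = \frac{k - 1}{\lambda}.$$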

If r = k then v = b and k(k − 1) = λ(b − 1) and hence (b − 1)σ² = 0. Hence
(r − k) is a factor of the cubic. Now suppose that r = λ. Then k = v and
b = λ whence (b − 1)σ² = 0 so that (r − λ) is a factor of the cubic. Finally
we observe that d/dr [(b − 1)σ²] = 0 when r = λ, showing that (r − λ)² is a
factor of the cubic. Putting this all together we have that

(b − 1)σ² = α(r − λ)²(r − k)

for some constant α. Equating coefficients of r³ we see that α = (k − 1)/λ and
so we have that

(b − 1)σ² = (k − 1)(r − λ)²(r − k)/λ.

From the definition of variance we know that (b − 1)σ² ≥ 0. We know
that all the blocks have at least 2 elements so k − 1 > 0. Since k < v, the
identity λ(v − 1) = r(k − 1) gives r > λ, so (r − λ)² > 0, and we also have
λ > 0. From these we see that

r − k ≥ 0.

Since vr = bk, we must have that v ≤ b, as required.
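The intersection numbers of Example 6.1.2 and the first three identities of Theorem 6.1.3 can be checked directly on Table 6.2. The sketch below reads the table row by row, three blocks per row; the function name `intersection_counts` is a hypothetical helper.

```python
from math import comb

# Blocks of Table 6.2, read row by row (three blocks per row).
blocks = [(0, 1, 2), (0, 1, 2), (0, 3, 4), (0, 3, 5), (0, 4, 6), (0, 5, 7),
          (0, 6, 8), (0, 7, 8), (1, 3, 4), (1, 3, 8), (1, 4, 5), (1, 5, 7),
          (1, 6, 7), (1, 6, 8), (2, 3, 6), (2, 3, 7), (2, 4, 7), (2, 4, 8),
          (2, 5, 6), (2, 5, 8), (3, 5, 6), (3, 7, 8), (4, 5, 8), (4, 6, 7)]
v, b, r, k, lam = 9, 24, 8, 3, 2

def intersection_counts(index):
    """x[i] = number of other blocks meeting blocks[index] in exactly i items."""
    chosen = set(blocks[index])
    x = [0] * (k + 1)
    for j, blk in enumerate(blocks):
        if j != index:
            x[len(chosen & set(blk))] += 1
    return x

print(intersection_counts(0))   # [4, 18, 0, 1]  i.e. x0, x1, x2, x3
print(intersection_counts(2))   # [5, 15, 3, 0]

for index in range(b):
    x = intersection_counts(index)
    assert sum(x) == b - 1
    assert sum(i * x[i] for i in range(k + 1)) == k * (r - 1)
    assert sum(comb(i, 2) * x[i] for i in range(k + 1)) == comb(k, 2) * (lam - 1)

# The factorisation used in part 4: both sides equal 180 for these parameters.
lhs = (b - 1) * (k * (k - 1) * (lam - 1) + k * (r - 1)) - k ** 2 * (r - 1) ** 2
rhs = (k - 1) * (r - lam) ** 2 * (r - k) // lam
assert lhs == rhs == 180
```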

6.1.1 Exercises
1. Consider the design with blocks
0 1 2 3 4 5 6 7 8
0 3 6 1 4 7 2 5 8
0 4 8 1 5 6 2 3 7
0 5 7 1 3 8 2 4 6
(a) Verify that these blocks form a BIBD.
(b) What are the parameters of the BIBD?
(c) Find the incidence matrix of the BIBD and verify the results in
Theorem 6.1.2.
(d) Evaluate the parameters x0 , x1 , x2 and x3 for each block of this
design. What do you notice?
(e) A design is said to be linked if any two blocks intersect in the same
number of items. Is this design linked?
2. A block design is a collection of blocks that need not be balanced. Show
that there are exactly three non-isomorphic block designs with v = 4,
b = 6, r = 3 and k = 2. Is any of them balanced? Are any of the designs
here linked?
3. Consider a Hadamard matrix H8 of order 8 in which every entry of the first
row and first column is equal to 1. Let M7 be the matrix that you obtain by
removing the first row and column of H8. Show that (M7 + J7)/2 is the
incidence matrix of a (7,3,1) BIBD.
4. Show that the result in the previous question holds for all Hadamard
matrices. That is, show that an Hadamard matrix of order 4t gives rise
to a BIBD with v = 4t − 1, k = 2t − 1 and λ = t − 1.
5. Consider a BIBD with parameters v = 7, b = 14, r = 6, k = 3 and
λ = 2. If repeated blocks are allowed, find all essentially different designs
with these parameters. (There is clearly a design with 7 pairs of repeated
blocks. Can you have a design with just 6 pairs of repeated blocks? with
just 5 pairs of repeated blocks? This question is asking you to count the
number of blocks that are repeated twice and either find a design with
that number of repeated pairs or show that such a design does not exist.)
6. Define a block design with parameters v = 8, b = 14, r = 7, k = 4 by using
the 7 blocks of a (7,4,2) BIBD and then using the 7 blocks of a (7,3,1)
BIBD where the symbol ∞ has been adjoined to each block. Show that
this design is balanced with λ = 3. How many times does each of the
$\binom{8}{3}$ possible 3-sets appear?

6.2 Redundant Disk Arrays and Booking Systems
A redundant array of inexpensive disks, so called by the inventors (Patterson,
Gibson and Katz), or a redundant array of independent disks, a name
retrospectively adopted within the computing industry, uses two or more hard disk
drives to improve access speeds and to provide back-up in case of a disk failure.
Either name results in the familiar acronym RAID.
Fast access is important in any on-line transaction processing system such
as airline or theatre reservations or banking. If one disk drive fails then it is
preferable that no data is permanently lost.
So how can these objectives be met?
To increase speed, subdivide each data record into several parts, each of
which is stored on a different disk. Then all parts of the data record can be
retrieved simultaneously, thereby reducing retrieval time.
To avoid having a single disk failure interrupting the operation of the system,
calculate a parity block from the different parts of the data record. Store each
of the parts, and the parity block, on different disks. Then any part can be
reconstructed from the remaining parts. Thus one disk can fail and the system
is not affected.

EXAMPLE 6.2.1.
Suppose that the data record is Di = 010100011. Partition Di into 3 equal
pieces, Di,1 , Di,2 , and Di,3 . So we have Di,1 = 010, Di,2 = 100, and Di,3 = 011.
Define the parity block by Pi = Di,1 +Di,2 +Di,3 = 010+100+011 = 101, where
addition is component-wise mod 2. Then if one of the Di,j is unavailable we can
still reconstruct it by calculating the sum of the remaining two pieces and the
parity block. For example, Di,2 = Pi + Di,1 + Di,3 = 101 + 010 + 011 = 100.
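The arithmetic of Example 6.2.1 can be reproduced in a few lines. This is only a sketch on bit strings; the helper name `xor_bits` is hypothetical, and a real disk controller would operate on bytes rather than characters.

```python
def xor_bits(*pieces):
    """Component-wise sum mod 2 of equal-length bit strings."""
    length = len(pieces[0])
    assert all(len(p) == length for p in pieces)
    return "".join(str(sum(int(p[pos]) for p in pieces) % 2)
                   for pos in range(length))

record = "010100011"
# Split the record into three pieces of three bits each.
pieces = [record[i:i + 3] for i in range(0, len(record), 3)]
parity = xor_bits(*pieces)
print(pieces, parity)            # ['010', '100', '011'] 101

# Simulate losing the second piece and rebuild it from the survivors.
rebuilt = xor_bits(parity, pieces[0], pieces[2])
assert rebuilt == pieces[1] == "100"
```

The same call rebuilds any single missing piece, including the parity block itself, because adding a bit string to itself mod 2 gives zero.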

So the problem that we want to solve is the following: We have v disks in
a collection, and no two pieces associated with data record Di are to be on the
same disk. For load-balancing reasons we require that
1. Every disk has the same number of data pieces;
2. Every disk has the same number of parity pieces (parity pieces are accessed
more frequently since every time any data piece changes the parity piece
has to change as well);
3. The number of data records with one data or parity piece on disk i and
another on disk j is constant, independent of i and j (this ensures that if one
disk does fail, all of the remaining disks contribute equally to rebuilding
the contents of the failed disk).
We can equate this disk mapping scheme to a BIBD in the following way.
Each data record Di, 1 ≤ i ≤ b, is subdivided into k pieces, Di,1, Di,2, . . . , Di,k,
from which a parity block Pi is calculated. Find a BIBD with blocks of size
k + 1 such that there is a distinguished element in each block and such that each
of {1, 2, . . . , v} is the distinguished element equally often. Then block i of
the design indicates the disks on which the pieces of the corresponding data
record Di are to be held. The distinguished element indicates the disk that is
to hold the parity block.
The next example illustrates these ideas in a small and unrealistic situation.
EXAMPLE 6.2.2.
Suppose that there are v = 11 disk drives and b = 11 data records. Suppose
that we subdivide each data record into 4 pieces and that we calculate a parity
block from these pieces. Thus we have 5 pieces to store for each data record. One
possible allocation is indicated in Table 6.3. The distinguished element appears
in the final position in each block.

Table 6.3: The Blocks of an (11,5,2) BIBD

0 2 3 4 8
1 3 4 5 9
2 4 5 6 10
3 5 6 7 0
4 6 7 8 1
5 7 8 9 2
6 8 9 10 3
7 9 10 0 4
8 10 0 1 5
9 0 1 2 6
10 1 2 3 7
Di,1 Di,2 Di,3 Di,4 Pi
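The rows of Table 6.3 are obtained from the first row by repeatedly adding 1 modulo 11, which makes the three load-balancing requirements easy to check by machine. The following is a minimal sketch; the variable names are choices made for this illustration.

```python
from collections import Counter
from itertools import combinations

v = 11
base = (0, 2, 3, 4, 8)     # first row of Table 6.3; the last entry holds the parity piece
blocks = [tuple((x + shift) % v for x in base) for shift in range(v)]

# Requirement 3: every pair of disks shares the same number of data records.
pair_count = Counter(frozenset(p) for blk in blocks for p in combinations(blk, 2))
assert len(pair_count) == v * (v - 1) // 2     # every pair of disks occurs
assert set(pair_count.values()) == {2}         # lambda = 2

# Requirements 1 and 2: equal numbers of pieces and of parity pieces per disk.
pieces_per_disk = Counter(x for blk in blocks for x in blk)
parity_per_disk = Counter(blk[-1] for blk in blocks)
assert set(pieces_per_disk.values()) == {5}    # r = 5 pieces on every disk
assert len(parity_per_disk) == v
assert set(parity_per_disk.values()) == {1}    # one parity piece on every disk
```

Since every disk holds 5 pieces of which exactly 1 is a parity piece, every disk also holds the same number (4) of data pieces.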

EXAMPLE 6.2.3.
Suppose that there are v = 7 disks and b = 21 data records. Suppose that
each data record is divided into 2 parts. Then the blocks need to contain 3
elements and each block needs to have one distinguished element in it; again
these are the final element of each block and we see that each element appears
as a distinguished element 3 times.

Table 6.4: The Blocks of a (7,3,3) BIBD

0 1 2 1 3 0 4 0 1
0 2 3 2 5 0 3 0 6
0 4 5 6 0 4 5 6 0
4 2 1 1 6 2 5 3 1
6 1 3 1 5 4 6 1 5
4 3 2 3 2 5 2 4 6
2 5 6 4 5 3 3 6 4
Di,1 Di,2 Pi Di,1 Di,2 Pi Di,1 Di,2 Pi

6.2.1 Exercises
1. Show that each of the (7,14,2) BIBDs that you constructed in Question
5 of the previous exercises can have its blocks rearranged to have each
element appearing as the distinguished element twice.
2. Give a simple necessary condition that relates v and b for a BIBD to
have one distinguished element in each block with equal replication of
the elements in the v-set. How might you prove that this condition is
sufficient?

6.3 Group Testing


Often blood tests are carried out to test for a disease. If the chance of any given
person having the disease is small, and the cost or time involved in testing is
relatively high, it may make sense to pool the blood from a number of individuals
and test that. If the result is negative then one test has revealed that none of
the individuals who contributed to the pooled sample have the disease. If it is
positive then further testing, perhaps on smaller groups, or perhaps on single
individuals, will need to be carried out. This is called sequential group testing.
A related problem is that of conducting a number of group tests at the same
time with the goal of identifying all of the individuals who have the disease.
This is called non-adaptive group testing. Non-adaptive group testing is the
appropriate choice when each test takes a relatively long time to conduct and
so waiting for the results of an earlier test becomes too time-consuming.
The other subdivision of the group testing problem is by the model being
used. In the probabilistic model, items are assumed to be independent and there
is a fixed probability that any item will be defective. In the combinatorial model
the total number of defective items is known, or at least an upper bound on the
total number of defectives is known. The goal of the testing in this case is to
identify the defective items.
In this section we discuss how to design non-adaptive group testing strategies
for the combinatorial model. We will assume that the testing is 100% reliable
in this section.
Formally we will say that a population of b items has d defective items and
we assume that the remaining b−d items are good. Given a group test conducted
on a subset of the b items, the test is positive if at least one of the items in the
subset is defective.

EXAMPLE 6.3.1.
Suppose that we have b = 10 items to test and that we carry out 6 tests. Table
6.5 shows one possible testing strategy (where a “1” means that the item is
included in the test and “0” means that it is not included in the test) and the
results that arise if only the first item is defective. We see that we can identify
that only item 1 is defective.
Table 6.6 shows the same testing situation but now both items 1 and 2 are
defective. Again we see that we can establish this from the results of the group
tests.
Table 6.7 shows the same testing situation but now items 1 and 4 are
defective. We see that we can establish that items 3, 5, 7, 8 and 9 are all good
(from test 5). If we then remove the known good items from the remaining
tests, we can see that there is at least one defective item in each of the sets
{1,2,4}, {1,2,6}, {1,6,10}, {2,4,10}, and {4,6,10}. No single item lies in all
five sets, so there must be at least 2 defectives. Even if we know that there are
exactly 2 defectives we cannot identify them: the pairs {1,4}, {1,10}, {2,6},
{2,10} and {4,6} each meet all five sets and so are all consistent with the test
results. Likewise any set of 3 or more items from {1,2,4,6,10} could be
defective and that would be consistent with the test results that we have.

Table 6.5: The 6 tests for 10 items
Test
Item Actual 1 2 3 4 5 6
1 + 1 1 1 0 0 0
2 - 1 1 0 1 0 0
3 - 1 0 1 0 1 0
4 - 1 0 0 1 0 1
5 - 1 0 0 0 1 1
6 - 0 1 1 0 0 1
7 - 0 1 0 1 1 0
8 - 0 1 0 0 1 1
9 - 0 0 1 1 1 0
10 - 0 0 1 1 0 1
Results + + + - - -

Table 6.6: The 6 tests for 10 items


Test
Item Actual 1 2 3 4 5 6
1 + 1 1 1 0 0 0
2 + 1 1 0 1 0 0
3 - 1 0 1 0 1 0
4 - 1 0 0 1 0 1
5 - 1 0 0 0 1 1
6 - 0 1 1 0 0 1
7 - 0 1 0 1 1 0
8 - 0 1 0 0 1 1
9 - 0 0 1 1 1 0
10 - 0 0 1 1 0 1
Results + + + + - -

Table 6.7: The 6 tests for 10 items
Test
Item Actual 1 2 3 4 5 6
1 + 1 1 1 0 0 0
2 - 1 1 0 1 0 0
3 - 1 0 1 0 1 0
4 + 1 0 0 1 0 1
5 - 1 0 0 0 1 1
6 - 0 1 1 0 0 1
7 - 0 1 0 1 1 0
8 - 0 1 0 0 1 1
9 - 0 0 1 1 1 0
10 - 0 0 1 1 0 1
Results + + + + - +
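The decoding argument of Example 6.3.1 can be automated by listing every set of items of a given size that would produce the observed pattern of positive tests. The sketch below transcribes the test memberships from Table 6.5; the names `tests_for`, `outcome` and `consistent_sets` are hypothetical.

```python
from itertools import combinations

# tests_for[item] = the tests (numbered 1..6) that include the item, from Table 6.5.
tests_for = {
    1: {1, 2, 3}, 2: {1, 2, 4}, 3: {1, 3, 5}, 4: {1, 4, 6}, 5: {1, 5, 6},
    6: {2, 3, 6}, 7: {2, 4, 5}, 8: {2, 5, 6}, 9: {3, 4, 5}, 10: {3, 4, 6},
}

def outcome(defectives):
    """Set of positive tests when exactly these items are defective."""
    positive = set()
    for item in defectives:
        positive |= tests_for[item]
    return positive

def consistent_sets(positive, size):
    """All item sets of the given size that produce exactly these positive tests."""
    return [c for c in combinations(sorted(tests_for), size)
            if outcome(c) == positive]

print(consistent_sets(outcome({1}), 1))      # [(1,)]
print(consistent_sets(outcome({1, 2}), 2))   # [(1, 2)]
print(consistent_sets(outcome({1, 4}), 2))
# [(1, 4), (1, 10), (2, 6), (2, 10), (4, 6)]
```

For the results of Table 6.7 the last call returns five pairs, so with this particular set of tests the positive pattern alone does not single out one pair of defectives.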

For a set of b items with exactly d defectives, we would like to be able to find
a set of t tests that would uniquely identify the d defectives with t as small as
possible. We will denote the minimum number of tests required for b and d by
M (b, d). Clearly M (b, d) ≤ b since we can test each item by itself and uniquely
identify the defectives.
We can think of a (v, b, r, k, λ) BIBD as being a set of group tests: block i
lists the group tests that include item i. The tests corresponding to a (9,12,4,3,1)
BIBD are given in Table 6.8.

Table 6.8: The 9 tests for 12 items


Test Items
1 1 4 7 10
2 1 5 8 11
3 1 6 9 12
4 2 6 8 10
5 2 4 9 11
6 2 5 7 12
7 3 5 9 10
8 3 6 7 11
9 3 4 8 12

If no items are defective then all the tests are negative and that’s fine. If
exactly one item is defective then 3 tests will be positive and the three positive
tests are uniquely determined by the item that is defective. (You can check this
by listing the tests that will be positive for each item; see the exercises.)
Suppose that two items are defective. Then there are two situations that
can arise (the two items either share no test or share exactly one test). If
items 1 and 2 are defective, say, then tests 1, 2, 3, 4, 5 and 6
are all positive and so items 1 and 2 are identified as defective. If items 1
and 4 are defective, then tests 1, 2, 3, 5, and 9 are all positive. Items 1 and 4
are the only items all of whose tests are positive, so we know that these are the
only two items that can be defective with these positive tests.
It is straightforward to check that any two defective items can be correctly
determined.
This illustrates the following result. (Note that we are assuming that there
are no false positives (an item declared positive when it is not) but a similar
result can be established if this assumption is relaxed.)
THEOREM 6.3.1.
If we know that the number of defective items is at most d then a (v, b, r, k, λ)
BIBD is a solution to the combinatorial group testing problem if and only if for
every union of d or fewer blocks, every other block contains at least 1 point not
in this union.
Since block i lists the tests that include item i, the union of d blocks lists
the tests that will be positive if the items corresponding to those blocks are
defective. If every other block contains at least 1 point not in this union, then
each non-defective item appears in at least one negative test and so the
defective items will be correctly identified.
Table 6.9 lists the (9,12,4,3,1) BIBD which corresponds to the tests in Table
6.8.

Table 6.9: The (9,12,4,3,1) BIBD

1 2 3 4 5 6 7 8 9
1 5 9 2 6 7 3 4 8
1 6 8 2 4 9 3 5 7
1 4 7 2 5 8 3 6 9

In fact it turns out that (v, b, r, 3, 1) BIBDs are a solution to the
combinatorial group testing problem when d = 2 with b as large as possible.
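The condition in Theorem 6.3.1 can be tested exhaustively for the design of Table 6.9. The following is a sketch only: the function names `separates` and `decode` are hypothetical, and the brute-force search is practical only for small designs.

```python
from itertools import combinations

# Block i of Table 6.9 = the tests that include item i (items 1..12, tests 1..9).
blocks = [
    {1, 2, 3}, {4, 5, 6}, {7, 8, 9},
    {1, 5, 9}, {2, 6, 7}, {3, 4, 8},
    {1, 6, 8}, {2, 4, 9}, {3, 5, 7},
    {1, 4, 7}, {2, 5, 8}, {3, 6, 9},
]

def separates(blocks, d):
    """True if, for every union of d or fewer blocks, every other block
    has at least one point outside the union."""
    for size in range(1, d + 1):
        for chosen in combinations(range(len(blocks)), size):
            union = set().union(*(blocks[i] for i in chosen))
            for j, blk in enumerate(blocks):
                if j not in chosen and blk <= union:
                    return False
    return True

def decode(positive_tests):
    """Items all of whose tests are positive (the candidate defectives)."""
    return [i + 1 for i, blk in enumerate(blocks) if blk <= positive_tests]

assert separates(blocks, 2)          # up to 2 defectives are always identified
assert not separates(blocks, 3)      # the condition fails for 3 defectives
assert decode({1, 2, 3, 5, 9}) == [1, 4]
assert decode({1, 2, 3, 4, 5, 6}) == [1, 2]
```

The check fails for d = 3 because, for example, the three tests of item 1 are covered by the tests of items 4, 5 and 6.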
Group testing is used in other areas besides the screening of samples. For
example, in a communication network many users share a multiaccess channel.
At any given time, relatively few users are active. A bit reservation scheme
identifies the active users and then schedules them to transmit in turn, to avoid
having more than one transmission at a time. At each step of such a scheme, a
set of users is queried and the active users respond. Thus the test is the query
and the receipt of a response indicates defectives (active users) in the set.
A further extension is to the screening of clone libraries to find clones
containing a particular DNA sequence. Such libraries typically have a large number
of items (between 1,000 and 100,000) and tests are conducted on randomly
selected sets of perhaps 100 clones with some requirements being placed on the
sets, such as no pair of clones appearing together more than twice. The
properties of these designs are assessed by calculating the expected number of
unresolved clones or the number of tests that must be conducted to expect to
resolve, say, 90% of the clones; we will not discuss these further here.

6.3.1 Exercises
1. For the design in Table 6.8, list the three tests that are positive for a single
defective item for each of the 12 possible defective items. Note that these
lists of 3 tests are all different.

2. Consider the design in Table 6.8.
(a) List the tests that will be positive if items 10 and 12 are both defec-
tive.
(b) List the tests that will be positive if items 8 and 12 are both defective.
(c) Suppose that tests 1, 3, 4, 6 and 8 are all positive. Which items are
defective?
3. Give the values of all of the parameters of the 5 smallest (v, b, r, 3, 1)
BIBDs. What is the ratio of the number of tests to the number of items
in each case? Comment.

6.4 References and Comments


Tables of balanced incomplete block designs may be found in Mathon and Rosa
(2007) and Abel and Greig (2007), both of which appear in Colbourn and Dinitz,
Handbook of Combinatorial Designs, CRC Press (2007).
For more on computer organisation you could look at The Essentials of
Computer Organization and Architecture by Linda Null and Julia Lobur. In
particular they write: “Because the term inexpensive is relative and can be
misleading, the proper meaning of the acronym is now generally accepted as
Redundant Array of Independent Disks” (Chapter 7, page 302).
There is a large literature on group testing in its various guises. An accessible
discussion of its use in clone testing may be found in Balding, D. J., and Torney,
D. C., The design of pooling experiments for screening a clone map, Fungal
Genetics and Biology 21 (1997), 302–307. A number of interesting results may
be found in Du, D. Z., and Hwang, F. K., Combinatorial Group Testing and its
Applications, 2nd ed. (2000), World Scientific, Singapore. For a recent survey of
various relevant results you could look at Chen, Hong-Bin and Hwang, Frank
K., A survey on nonadaptive group testing algorithms through the angle of
decoding, J Comb Optim 15 (2008), 49–59.

Topic 7

Latin Squares, Designed Experiments and Powerline Communication

Latin squares are a natural extension of block designs: they have symbols
arranged in a square array with restrictions on the elements that may appear
in both the rows and the columns of the array. In this topic we will look at old
and new applications of Latin squares.

7.1 Latin Squares


DEFINITION 7.1.1.
A Latin square of order n is an n×n array based on a set of n symbols such that
each symbol appears exactly once in each row of the square, and exactly once in
each column of the square.
EXAMPLE 7.1.1.
The squares in Table 7.1 are each of order 4.

Table 7.1: Two Latin squares of order 4

1 2 3 4 1 2 3 4
2 1 4 3 4 1 2 3
3 4 1 2 3 4 1 2
4 3 2 1 2 3 4 1

Probably the most familiar current examples of Latin squares are (completed)
Sudoku puzzles; these are Latin squares of order 9 which have further restrictions
placed on 9 non-overlapping 3 × 3 subarrays. The Sudoku in Table 7.2 has 17
entries given yet completes uniquely. (At the moment it is believed that it is not
possible to specify only 16 squares and obtain a unique solution to the Sudoku.
There are at least 450 essentially different Sudoku that can be uniquely defined
with only 17 squares.) The square in Table 7.3 has 77 squares completed but
cannot be uniquely completed. If more than 77 squares are completed in a
“partial” Sudoku then the completion will be unique, so this is an example of a
maximal starting square without a unique completion.

Table 7.2: Example of a (currently believed to be) minimal Sudoku

3 1
6 2
7
5 1 8
2 6
3 7
4 2
3 5
7

Table 7.3: Example of a maximal almost-Sudoku

9 2 6 5 7 1 4 8 3
3 5 1 4 8 6 2 7 9
8 7 4 9 2 3 5 1 6
5 8 2 3 6 7 1 9 4
1 4 9 2 5 8 3 6 7
7 6 3 1 8 2 5
2 3 8 7 6 5 1
6 1 7 8 3 5 9 4 2
4 9 5 6 1 2 7 3 8

Sometimes problems involve placing two sets of n items into a Latin square
in such a way that when the two squares are superimposed all of the possible
ordered pairs of symbols appear exactly once in the square. We say that the
two Latin squares form a pair of mutually orthogonal Latin squares (MOLS).
Each square is said to be an orthogonal mate of the other square.
Questions like this were asked almost 400 years ago. In 1624 Bachet asked
whether it was possible to place the 16 court cards of a pack of playing cards
in a square such that each row and column and diagonal of the square had one
card of each rank and one card of each suit. An enumeration of the possible
answers appeared in 1723.
An extension of the above problem, to a situation with 36 officers, 6 of each of
6 ranks and 6 from each of 6 regiments, arranged so that each row and column
of the square had one officer from each regiment and one officer of each rank,
was investigated by Leonhard Euler in about 1780.
EXAMPLE 7.1.2.
The first square in Table 7.1 has an orthogonal mate and the pair of squares
is given in Table 7.4. The second Latin square in Table 7.1 has no orthogonal
mate.

Table 7.4: A pair of mutually orthogonal Latin squares of order 4

1a 2c 3d 4b
2d 1b 4a 3c
3b 4d 1c 2a
4c 3a 2b 1d

Euler was unable to find a pair of mutually orthogonal Latin squares of order
6 and he conjectured that no pair of mutually orthogonal Latin squares would
exist when n ≡ 2 mod 4. This was shown to be true for n = 6 by Tarry in
1900, by exhaustive enumeration, but was shown to be false for every larger
n ≡ 2 mod 4 in 1960 by Euler’s spoilers (so called; Bose, Parker, and
Shrikhande by name).
The closest pair of nearly mutually orthogonal squares of order 6 yield 34 of
the 36 distinct ordered pairs; the squares are given in Table 7.5.

Table 7.5: A pair of almost mutually orthogonal Latin squares of order 6

5a 6b 3e 4f 1c 2d
2f 1e 6a 5b 3d 4c
6d 5c 1f 2e 4a 3b
4e 3f 5d 6c 2b 1a
1b 4d 2c 3a 5e 6f
3c 2a 4b 1d 6f 5e
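Orthogonality is a finite check: superimpose the two squares and count the distinct ordered pairs. The sketch below applies this to Tables 7.4 and 7.5, reading each cell as a digit followed by a letter; the helper names are hypothetical.

```python
def split_squares(rows):
    """Split rows of cells such as '1a' into a digit square and a letter square."""
    first = [[cell[0] for cell in row.split()] for row in rows]
    second = [[cell[1] for cell in row.split()] for row in rows]
    return first, second

def is_latin(square):
    """Each symbol occurs exactly once in every row and every column."""
    n = len(square)
    symbols = set(square[0])
    return (len(symbols) == n
            and all(set(row) == symbols for row in square)
            and all({row[c] for row in square} == symbols for c in range(n)))

def distinct_pairs(a, b):
    """Number of distinct ordered pairs when the squares are superimposed."""
    n = len(a)
    return len({(a[i][j], b[i][j]) for i in range(n) for j in range(n)})

table_7_4 = ["1a 2c 3d 4b", "2d 1b 4a 3c", "3b 4d 1c 2a", "4c 3a 2b 1d"]
table_7_5 = ["5a 6b 3e 4f 1c 2d", "2f 1e 6a 5b 3d 4c", "6d 5c 1f 2e 4a 3b",
             "4e 3f 5d 6c 2b 1a", "1b 4d 2c 3a 5e 6f", "3c 2a 4b 1d 6f 5e"]

a4, b4 = split_squares(table_7_4)
assert is_latin(a4) and is_latin(b4)
assert distinct_pairs(a4, b4) == 16      # all 16 pairs occur: orthogonal

a6, b6 = split_squares(table_7_5)
assert is_latin(a6) and is_latin(b6)
assert distinct_pairs(a6, b6) == 34      # two pairs are repeated, so not orthogonal
```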

A set of Latin squares in which every pair of squares is orthogonal is called
a set of pairwise orthogonal Latin squares. The next result shows that there
cannot be more than n − 1 pairwise orthogonal squares of order n in a set of such
squares.
THEOREM 7.1.1.
There are at most n − 1 pairwise orthogonal Latin squares of order n.
Proof. Suppose that L_1, L_2, . . . , L_n are a set of n mutually orthogonal Latin
squares of order n. We can change the names of all the entries in each of the
squares if we want without changing the orthogonality properties. Thus we can
re-label as necessary so that the first row of each square contains 1, 2, . . . , n
in order. Thus the ordered pair (j, j) appears in the cell (1, j) when any two
squares are superimposed. Now let a_i be the entry of cell (2,1) in square L_i.
We know that a_i ≠ 1 as the (1,1) cell contains 1 in all of the squares. We also
know that all of the a_i are distinct: if two squares had the same entry c in cell
(2,1) then the pair (c, c) would appear in cell (2,1) as well as in cell (1, c) of
the first row. Thus we must choose n distinct entries from the set {2, 3, . . . , n}
which is clearly impossible.
If a set of n − 1 pairwise orthogonal squares of order n exists then the set is
said to be complete. The next result shows that it is always possible to construct
a complete set of mutually orthogonal Latin squares of order p^k over GF[p^k].

THEOREM 7.1.2.
Let n = p^k for some prime p and let α be a primitive element of the field GF[n].
Let the field elements be f_0 = 0, f_i = α^(i−1), i = 1, 2, . . . , n − 1. Define the
n × n arrays A_m by

$$A_m = [a_{ij}^{(m)}] = [f_m f_i + f_j],$$

where i, j = 0, 1, . . . , n − 1, and m = 1, 2, . . . , n − 1. Then A_1, A_2, . . . , A_{n−1} are
a set of n − 1 MOLS of order n.
Proof. We need to check that the A_m are indeed Latin squares and then we
need to check that any two distinct squares are orthogonal.
Suppose that two elements in the same row of A_m are equal. Then
$a_{ij}^{(m)} = a_{ik}^{(m)}$ when j ≠ k. This means that f_m f_i + f_j = f_m f_i + f_k and
so f_j = f_k when j ≠ k and this is a contradiction. Similarly if two elements in
the same column of A_m are equal then $a_{ij}^{(m)} = a_{kj}^{(m)}$ when i ≠ k. Thus
f_m f_i + f_j = f_m f_k + f_j and so, since f_m ≠ 0, f_i = f_k when i ≠ k. Again this is
a contradiction. So we have a set of n − 1 Latin squares.
Now suppose that squares A_ℓ and A_m, with ℓ ≠ m, are not orthogonal. That
means that for some pair of positions we must get the same ordered pairs when we
superimpose the two squares. Thus for some i, j, u, v with (i, j) ≠ (u, v) we
have

$$(a_{ij}^{(\ell)}, a_{ij}^{(m)}) = (a_{uv}^{(\ell)}, a_{uv}^{(m)}).$$

This means that we know that

$$f_\ell f_i + f_j = f_\ell f_u + f_v$$

and

$$f_m f_i + f_j = f_m f_u + f_v.$$

Subtracting the second equation from the first gives (f_ℓ − f_m)(f_i − f_u) = 0.
Since f_ℓ ≠ f_m, we have that f_i = f_u and so i = u. This then means that f_j = f_v
and hence j = v. This contradicts (i, j) ≠ (u, v), so the squares are orthogonal.
EXAMPLE 7.1.3.
Let n = p = 5 and take α = 3. Then f_0 = 0, f_1 = 1, f_2 = 3, f_3 = 4, and
f_4 = 2. Thus we have that

$$A_m = [a_{ij}^{(m)}] = [f_m f_i + f_j],$$

where i, j = 0, 1, 2, 3, 4 and m = 1, 2, 3, 4. Thus A_1 = (f_i + f_j), A_2 = (3f_i + f_j),
A_3 = (4f_i + f_j) and A_4 = (2f_i + f_j). The resulting complete set of mutually
orthogonal Latin squares of order 5 appears in Table 7.6.

Table 7.6: A complete set of mutually orthogonal Latin squares of order 5

0 1 3 4 2 0 1 3 4 2 0 1 3 4 2 0 1 3 4 2
1 2 4 0 3 3 4 1 2 0 4 0 2 3 1 2 3 0 1 4
3 4 1 2 0 4 0 2 3 1 2 3 0 1 4 1 2 4 0 3
4 0 2 3 1 2 3 0 1 4 1 2 4 0 3 3 4 1 2 0
2 3 0 1 4 1 2 4 0 3 3 4 1 2 0 4 0 2 3 1
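The construction of Theorem 7.1.2 is easy to carry out by machine when n is prime, since arithmetic in GF[p] is then ordinary arithmetic modulo p. The following is a sketch for n = 5 with α = 3, the values behind Example 7.1.3; it does not handle prime powers p^k with k > 1, which would need polynomial arithmetic. The function names are hypothetical.

```python
from itertools import combinations

p = 5
alpha = 3                                   # primitive element mod 5
f = [0] + [pow(alpha, i - 1, p) for i in range(1, p)]
print(f)                                    # [0, 1, 3, 4, 2]

def square(m):
    """A_m with (i, j) entry f_m * f_i + f_j, computed mod p."""
    return [[(f[m] * f[i] + f[j]) % p for j in range(p)] for i in range(p)]

def is_latin(sq):
    symbols = set(range(p))
    return (all(set(row) == symbols for row in sq)
            and all({row[c] for row in sq} == symbols for c in range(p)))

def orthogonal(a, b):
    pairs = {(a[i][j], b[i][j]) for i in range(p) for j in range(p)}
    return len(pairs) == p * p

squares = [square(m) for m in range(1, p)]
assert all(is_latin(sq) for sq in squares)
assert all(orthogonal(a, b) for a, b in combinations(squares, 2))

# Second row of each square, matching the second line of Table 7.6.
print([sq[1] for sq in squares])
# [[1, 2, 4, 0, 3], [3, 4, 1, 2, 0], [4, 0, 2, 3, 1], [2, 3, 0, 1, 4]]
```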

7.1.1 Exercises
1. Show that the second square in Table 7.1 does not have an orthogonal
mate.
2. Give the two possible Latin squares of order 2. Show that these are not
orthogonal.
3. Give the 12 possible Latin squares of order 3. Choose one of these. Write
down all of the squares that are orthogonal to your square.
4. Use Theorem 7.1.2 to find a pair of orthogonal Latin squares of order 3.
Are these squares unique?
5. Find a third square that is orthogonal to each of the squares in Table 7.4.
These three squares form a complete set of MOLS. Now use Theorem 7.1.2
to find a complete set of MOLS of order 4.

7.2 Designing Experiments


As we have seen, Latin squares originally arose as the solution to questions
about the arrangements of sets of cards to have certain properties. But in 1788
a Latin square was used to design an experiment to investigate whether sheep
could be fattened just as well on root vegetables as on grain (since at the time
there was a shortage of grain for bread-making). The experimenter, Crette de
Palluel, used four diets (potatoes, turnips, beet and corn) for the columns of
the square, on four breeds of sheep (Isle de France, Beauce, Champagne and
Picardy) for the rows of the square. He chose 4 sheep to slaughter on each
of four dates so that one sheep of each breed and one sheep on each diet was
slaughtered on each of 4 dates. The layout is given in Table 7.7. The results
of the experiment showed that sheep could indeed be fattened on a diet of root
vegetables.

Table 7.7: The Latin square used by de Palluel

Diet
Breed Potatoes Turnips Beet Corn
Isle de France A B C D
Beauce D A B C
Champagne C D A B
Picardy B C D A

Date killed: A=20 Feb; B=20 Mar; C=20 Apr; D=20 May

It is clear from looking at the layout in Table 7.7 that the advantage of
conducting the experiment in this way is that only 16 sheep, rather than the
4 × 4 × 4 = 64 sheep needed to try every combination of breed, diet and date,
need to be involved in the experiment, yet information about the effects
of diet, breed and duration of feeding can still be obtained effectively.
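
As a small illustration of the counting above, the following sketch checks that the layout in Table 7.7 is a Latin square and compares the number of sheep with a full factorial layout. The variable names are hypothetical; the data are copied from Table 7.7.

```python
# Rows are breeds, columns are diets (potatoes, turnips, beet, corn),
# entries are the slaughter dates A-D from Table 7.7.
layout = {
    "Isle de France": "ABCD",
    "Beauce": "DABC",
    "Champagne": "CDAB",
    "Picardy": "BCDA",
}

rows = list(layout.values())
dates = set("ABCD")

# Latin square property: each date once per breed and once per diet.
rows_ok = all(set(row) == dates for row in rows)
cols_ok = all({row[j] for row in rows} == dates for j in range(4))

sheep_latin_square = len(rows) * 4  # one sheep per cell of the square
sheep_full_factorial = 4 * 4 * 4    # every breed x diet x date combination

print(rows_ok, cols_ok, sheep_latin_square, sheep_full_factorial)
```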

A systematic investigation of the use of Latin squares in the design of ex-
periments did not take place until the 1920s, however, when Sir Ronald Fisher
showed that Latin squares were a useful way to get information about the effects
of three factors on some response (in the case of the sheep feeding experiment,
the weight gain) using the smallest possible number of experimental units (in
the sheep feeding experiment, the sheep). He also showed how pairs of mutually
orthogonal Latin squares could be used to investigate the effects of four factors
simultaneously. Sometimes Latin squares are used when some of the factors may
have a bearing on the results of the experiment but they are not themselves of
immediate interest, as the next example illustrates.
EXAMPLE 7.2.1.
“Unused red light time” is defined to be time at an intersection when there are
cars stopped but no car is travelling through the intersection. The amount of
unused red light is partly a function of the traffic light signal sequence, partly
a function of time of day and partly a function of the intersection. The total
unused red light time for five different traffic signal sequences was compared
using a Latin square of order 5. The rows of the square corresponded to “time
of day” and the columns of the square corresponded to “intersection”. Thus the
rows and columns of the square have been assigned to “blocking factors” - we
expect them to affect the results of the experiment but they are not of intrinsic
interest in themselves. One possible layout for this experiment is given in Table
7.8.

Table 7.8: Possible layout of traffic light signal sequences, represented by A,B,C,D,E

Time of Day
Intersection 0600-0800 0800-1000 1400-1600 1600-1800 1800-2000
1 A B C D E
2 E A B C D
3 D E A B C
4 C D E A B
5 B C D E A

7.2.1 Exercises
1. An experiment was conducted to compare the time efficiency of four con-
struction methods for an electronic component. Four technicians were
available to implement the methods. The methods took about 90-120
minutes each and all four methods were to be tested in one day. To al-
low for possible fatigue, and so increasing times, during the day, a Latin
square design was used for this experiment. Give one possible layout and
show how it allows for possible differences both between technicians and
between times of day.

2. Experiments that are conducted on slow-growing organisms, such as fruit
trees, frequently have to be designed to take account of possible residual
treatment effects from earlier experiments. Suppose that 25 trees are
available for an experiment. The trees have been planted in a 5 × 5 grid,
to allow for possible fertility differences in the orchard, and the treatments
from an earlier experiment were applied as given below.

A B C D E
E A B C D
D E A B C
C D E A B
B C D E A

Show how to allocate a new set of treatments so that each treatment
appears once in each row, once in each column and once on a tree that
has received each of the earlier treatments.

7.3 Power-line Communication


The idea behind this concept is simple. Power-lines are already laid to every
home in the developed world and they are used to distribute electrical power. If
communication could be carried out using this infrastructure then there would
be no disruption for installation of a new system, local area networks could be
devised using existing cabling and more users would have access to the services
(think of the rural/urban “digital divide”).
When we were talking about codes earlier in this subject, we were (mainly)
trying to find codes that corrected errors that might affect individual digits in
each codeword independently. We saw that codes that dealt with that problem,
such as the Hamming and Reed-Solomon codes, could be interleaved to deal with
burst errors, sometimes called impulse noise, as well. But when power-lines are
used to transmit information then, as well as background noise, which can
affect random bits, and impulse noise, which can affect a number of adjacent
bits, there is also permanent narrow-band noise. This affects one particular
frequency all (or most of) the time. Typical examples are interference from
television sets or computer terminals.
In power-line communication it is the permanent narrow-band noise and the
impulse noise which are the most important sources of error.
Recall that with radio waves there are two ways that signals can be transmit-
ted. One is amplitude modulation (AM) where the signal has a fixed frequency
and the sound is represented by variation in the amplitude. The other is fre-
quency modulation (FM) where the reverse occurs. So in FM the amplitude is
fixed and the frequency is varied.
Usually signals are sent using a small bandwidth but this will be particularly
susceptible to narrow-band noise. Narrow-band noise can be countered by using
a wider range of frequencies to transmit the message. Impulse noise can be
countered by having each message transmitted for a longer period of time, in
the hopes that at least some of it will get through correctly and thus the whole
message will be successfully decoded.

The wider range of frequencies is not achieved just by varying the carrier
wave, as in FM radio, but by having the carrier wave jump between a set of n
discrete values. This is known as n-ary frequency shift keying.
Put the ideas of a number of discrete frequencies, transmitted over a number
of time slots, together to give a code suitable for transmitting information over
power-lines. In fact there is also the requirement that there be a constant
power envelope, which means that it is best if each of the discrete frequencies
appears the same number of times in each codeword. A code in which every
symbol occurs the same number of times in each codeword is called a constant
composition code. If we have n frequencies, and each frequency appears once in
each codeword, we say that the code is a permutation code.
If we want a permutation code of length n, that suggests using the rows of
a Latin square to define the codewords. Any two rows of a Latin square differ
in every position, so such a code would have distance n and so be able to
correct up to n − 1 errors.
EXAMPLE 7.3.1.
Consider the Latin square

A B C D E
E A B C D
D E A B C
C D E A B
B C D E A
of order 5, where we have used A, B, C, D and E to represent the 5 frequencies.
Thus the third message corresponds to having frequency D transmitted in time
slot 1, frequency E transmitted in time slot 2, frequency A transmitted in time
slot 3, frequency B transmitted in time slot 4, and frequency C transmitted in
time slot 5.
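
The distance claim for this code can be checked directly. The sketch below rebuilds the square of Example 7.3.1 as cyclic shifts of ABCDE (an observation about this particular square, not a general requirement) and computes the minimum Hamming distance between its rows.

```python
from itertools import combinations

symbols = "ABCDE"
n = len(symbols)

# Row r of the square in Example 7.3.1 is ABCDE shifted r places to the right.
codewords = ["".join(symbols[(j - r) % n] for j in range(n)) for r in range(n)]


def hamming(u, v):
    """Number of time slots in which two codewords use different frequencies."""
    return sum(a != b for a, b in zip(u, v))


min_distance = min(hamming(u, v) for u, v in combinations(codewords, 2))
print(codewords, min_distance)
```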
How do we go about decoding a received codeword? The simplest approach
is, for each time slot, to determine the frequency that is most likely to have been
transmitted, by saying which one has the largest envelope. But for power-line
channels, the broad-band nature of the noise can lead to several large envelopes
being detected. So a different decoding strategy, called threshold decoding, is
used. In this strategy we establish a threshold and for each time slot all fre-
quencies that are above the threshold are output. Thus there are potentially
a number of possible frequencies output at each time slot. We then consider
each of the messages and see whether all of its frequencies are in the potential
frequencies at each time slot. If they are then that message is consistent with
the received codeword. Once again, though, the distance of the code means that
there will only be one consistent message if at most d − 1 time slots are affected
by noise, and sometimes even when more than d − 1 time slots are affected.
EXAMPLE 7.3.2.
Consider the code of Example 7.3.1. Suppose that we transmit the third message
but that there is narrow-band noise on frequencies B, C and D that affects all
time slots. Then the output from the first stage of the decoding is
{(B, C, D),(B, C, D, E), (A, B, C, D), (B, C, D), (B, C, D)}.
Looking at this, we see that the first message is consistent with this in time
slots 2, 3 and 4. The second message is consistent in time slots 3, 4 and 5. The

third message is consistent in all 5 time slots, the fourth message in time slots
1, 2 and 5 and the fifth message in time slots 1, 2 and 3. So we decode to the
third message.
If there is also an impulse noise in the final time slot then the output from
the first stage of the decoding becomes
{(B, C, D),(B, C, D, E), (A, B, C, D), (B, C, D), (A, B, C, D, E)}.
The first message is consistent in 4 places, the second in 3, the third in 5, the
fourth in 3 and the fifth in 4. So we still decode to the correct message.
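
The two decoding runs in Example 7.3.2 can be reproduced with a short script. This is a minimal sketch under a simplified noise model: narrow-band noise adds the affected frequencies to every time slot, and impulse noise makes every frequency appear in the affected slot. The function names are illustrative only.

```python
codewords = ["ABCDE", "EABCD", "DEABC", "CDEAB", "BCDEA"]
ALL_FREQS = set("ABCDE")


def channel_output(sent, narrow_band=(), impulse_slots=()):
    """Set of frequencies above the threshold in each time slot (slots are 1-based)."""
    output = []
    for slot, freq in enumerate(sent, start=1):
        detected = {freq} | set(narrow_band)
        if slot in impulse_slots:
            detected = set(ALL_FREQS)
        output.append(detected)
    return output


def consistent_slots(word, output):
    """Time slots in which the codeword's frequency is among those detected."""
    return [slot for slot, (freq, detected) in enumerate(zip(word, output), start=1)
            if freq in detected]


def decode(output):
    """1-based indices of the messages that are consistent in every time slot."""
    return [i for i, word in enumerate(codewords, start=1)
            if len(consistent_slots(word, output)) == len(word)]


# Third message sent, narrow-band noise on B, C and D.
out1 = channel_output("DEABC", narrow_band="BCD")
# The same, plus impulse noise in the final time slot.
out2 = channel_output("DEABC", narrow_band="BCD", impulse_slots={5})

print([consistent_slots(w, out1) for w in codewords], decode(out1))
print([len(consistent_slots(w, out2)) for w in codewords], decode(out2))
```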

Permutation codes with n = d and Latin squares are equivalent. But not all
permutation codes have n = d. The code in the following table has n = 7 but
d = 5, for example.

0125643 0143256 0263541 0246315 0324516
0362154 0412635 0456123 0531462 0654231
0615324
In fact this array yields much more, since each row can be developed mod 7 to
give a total of 77 rows with distance 5. Thus it can be used to find a permutation
code with 77 codewords with n = 7 and d = 5.
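
The development step can be sketched as follows, assuming that "developed mod 7" means adding each of 0, 1, ..., 6 to every symbol of a row, mod 7 (the usual meaning in design theory, but an assumption here). The script builds the 77 rows from the 11 rows printed above and reports the minimum distance it finds, which can be compared with the stated value of 5.

```python
from itertools import combinations

base_rows = [
    "0125643", "0143256", "0263541", "0246315", "0324516",
    "0362154", "0412635", "0456123", "0531462", "0654231",
    "0615324",
]


def develop(word, shift, modulus=7):
    """Add `shift` to every symbol of the word, working mod `modulus`."""
    return "".join(str((int(c) + shift) % modulus) for c in word)


def hamming(u, v):
    """Number of positions in which two words differ."""
    return sum(a != b for a, b in zip(u, v))


code = [develop(word, shift) for word in base_rows for shift in range(7)]
min_distance = min(hamming(u, v) for u, v in combinations(code, 2))
print(len(code), min_distance)
```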

7.3.1 Exercises
1. Consider the code of Example 7.3.1. Suppose that there is narrow-band
noise on frequencies A, B and C across all time slots and that there is
impulse noise in the fifth time slot so that all frequencies appear in that
time slot. Suppose that the second message was sent. What is the output
from the first stage of the decoding? For each codeword give the time
slots in which it is consistent with the output from the first stage of the
decoding process.
2. Consider a permutation code with n = 4 obtained from the 12 rows of a
set of 3 mutually orthogonal Latin squares of order 4. What is the distance
of this code?
3. The rows of the following table give the 35 possible codewords for a per-
mutation code with n = 10 and with d = 9. Verify the distance claim for
some of the pairs of codewords.

0251467938 0387125649 0732894561 0976531824 1450923867
1569847302 2063759841 2139065784 2591384670 2648173905
3126974058 3295806147 3701245896 3967082415 4512076839
4603592178 4835720916 5017634982 5249781063 5674028391
5783916204 6182507493 6329410875 6418795320 6540239718
7230618459 7802961345 7914350268 8046312597 8175649230
8794563012 9072483156 9364271580 9406158732 9857346021

7.4 References and Comments
The minimal Sudoku puzzle in Table 7.2 is one of 450 to be found at
http://rotor.di.unipi.it/cisterni/Shared%20Documents/minsudoku.html.
The maximal not-Sudoku is from Herzberg and Murty (Notices of the American
Mathematical Society, 54, 2007).
Table 7.5 A pair of almost mutually orthogonal Latin squares of order 6 is
from Horton (Journal of Combinatorial Theory, A, 16, 1974).
The Latin square used by de Palluel (1788) is reproduced in Street and Street
(Combinatorics of Experimental Design, 1987).
Possible layout of traffic light signal sequences is an example in Mason,
Gunst and Hess (Statistical Design and Analysis of Experiments, 1989). The
first exercise in Exercises 7.2.1 is a modification of one in Kuehl (Design of
Experiments: Statistical Principles of Research Designs and Analysis, 2000).
The discussion about power-line communication closely parallels that given
in Huczynska (Philosophical Transactions of the Royal Society, A, 2006) and
Stewart (New Scientist, March 24, 2007). The small permutation codes are
from Chu, Colbourn and Dukes (Designs, Codes and Cryptography, 32, 2004).

Topic 8

Orthogonal Arrays, Secret Sharing and Software Testing

Some orthogonal arrays are closely related to Latin squares but the new format
employed to represent them in this topic makes it easier to consider general-
isations of the basic idea. Originally used in the design of experiments, they
have since been shown to be useful in many other areas, two of which will be
discussed in this topic.

8.1 Orthogonal Arrays


An orthogonal array OA[$N, k, \ell, t$] is an $N \times k$ array with elements from a set
of $\ell$ symbols such that any $N \times t$ subarray has each $t$-tuple appearing as a row
$N/\ell^t$ times. Often $N/\ell^t$ is called the index of the array, $t$ the strength of the
array, $k$ the number of constraints and $\ell$ the number of levels.
EXAMPLE 8.1.1.
Table 8.1 gives an example of an OA with $N = 8$, $k = 4$ and $\ell = 2$. We see
that in each column there are 4 0s and 4 1s. In any pair of columns, there are
2 copies of each of the pairs (0,0), (0,1), (1,0) and (1,1); thus the array has
strength 2.
To establish that an array is of strength 3, check that any set of three
columns has each of the possible ordered triples appearing as rows equally often.
Similarly, a design has strength 4 if any set of four columns has each of the
possible ordered quadruples appearing as rows equally often.
EXAMPLE 8.1.2.
The array in Table 8.2 has $N = 8$, $k = 4$ and $\ell = 2$ and is of strength 3.
An asymmetric orthogonal array OA[$N; \ell_1, \ell_2, \ldots, \ell_k; t$] is an $N \times k$ array
with elements from a set of $\ell_q$ symbols in column $q$ such that any $N \times t$ subarray
has each $t$-tuple appearing as a row an equal number of times. Such an array is
said to have strength $t$.

Table 8.1: An OA[8, 4, 2, 2]

0 0 0 0
0 0 0 1
0 1 1 0
0 1 1 1
1 0 1 0
1 0 1 1
1 1 0 0
1 1 0 1

Table 8.2: An OA[8, 4, 2, 3]

0 0 0 0
0 0 1 1
0 1 0 1
0 1 1 0
1 0 0 1
1 0 1 0
1 1 0 0
1 1 1 1
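
The strength checks described for Tables 8.1 and 8.2 can be automated. The sketch below is one possible approach: it assumes that the levels of each column are exactly the symbols that occur in that column, and the function names are illustrative only.

```python
from itertools import combinations, product
from collections import Counter


def has_strength(array, t):
    """True if, in every set of t columns, each possible t-tuple occurs equally often."""
    k = len(array[0])
    levels = [sorted({row[c] for row in array}) for c in range(k)]
    for cols in combinations(range(k), t):
        counts = Counter(tuple(row[c] for c in cols) for row in array)
        expected = list(product(*(levels[c] for c in cols)))
        if len(array) % len(expected) != 0:
            return False
        index = len(array) // len(expected)
        if any(counts[tup] != index for tup in expected):
            return False
    return True


def strength(array):
    """Largest t for which the array has strength t (0 if none)."""
    t = 0
    while t < len(array[0]) and has_strength(array, t + 1):
        t += 1
    return t


def parse(text):
    return [[int(c) for c in word] for word in text.split()]


table_8_1 = parse("0000 0001 0110 0111 1010 1011 1100 1101")
table_8_2 = parse("0000 0011 0101 0110 1001 1010 1100 1111")

print(strength(table_8_1), strength(table_8_2))
```

Because the levels are read off column by column, the same function can also be applied to an asymmetric array.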

EXAMPLE 8.1.3.
The array in Table 8.3 has $N = 8$, $k = 5$, $\ell_1 = \ell_2 = \ell_3 = \ell_4 = 2$ and $\ell_5 = 4$. To
check that it is of strength 2, any pair of columns from the first four columns
must have each of the ordered pairs (0,0), (0,1), (1,0) and (1,1) appearing twice
in the rows. When checking each of the first 4 columns with the last, each of
the 8 possible ordered pairs must appear once. Since these conditions are met,
the array is an OA[8;2,2,2,2,4;2].

Table 8.3: An OA[8;2,2,2,2,4;2]

0 0 0 0 0
0 0 1 1 1
0 1 0 1 2
0 1 1 0 3
1 0 0 1 3
1 0 1 0 2
1 1 0 0 1
1 1 1 1 0

We will usually use “orthogonal array” for either an asymmetric or a symmetric
array.
If we represent the two levels of each binary factor in an OA by −1 and 1

then the inner product of any two distinct columns of the OA is 0, provided that
the OA has strength at least 2.
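
A quick numerical check of this statement, using the array of Table 8.1 with 0 recoded as −1 and 1 kept as 1 (the recoding 2x − 1 is just one convenient choice):

```python
from itertools import combinations

table_8_1 = [[int(c) for c in word]
             for word in "0000 0001 0110 0111 1010 1011 1100 1101".split()]

# Recode the levels 0 and 1 as -1 and +1.
signed = [[2 * x - 1 for x in row] for row in table_8_1]

# Inner product of every pair of distinct columns.
inner_products = {
    (a, b): sum(row[a] * row[b] for row in signed)
    for a, b in combinations(range(len(signed[0])), 2)
}
print(inner_products)
```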
Sometimes several factors will have the same number of levels and this is
often indicated by powers. So an OA[32;2,2,2,4,4;4] is written as OA[32;$2^3$,$4^2$;4].
Another common notation for an OA is to use $\ell_1 \times \ell_2 \times \ldots \times \ell_k // N$ for an
OA[$N; \ell_1, \ell_2, \ldots, \ell_k; t$], most often when $t = 2$ or the fact that $t > 2$ is not
relevant.

8.1.1 The link with Latin squares


Consider a Latin square of order n. For each cell in the square write down
a triple of elements, the row position, the column position and the entry in
the cell. The resulting $n^2 \times 3$ array is an OA[$n^2, 3, n, 2$]. In fact a set of mutually
orthogonal Latin squares can be used in the same way, with each square defining
one column in the associated orthogonal array.

EXAMPLE 8.1.4.
The array in Table 8.4 is an orthogonal array of strength 2 with 4 columns, each
with 3 levels.

Table 8.4: An OA[9,4,3,2]

0 0 0 0
0 1 1 1
0 2 2 2
1 0 1 2
1 1 2 0
1 2 0 1
2 0 2 1
2 1 0 2
2 2 1 0
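
The link can be made concrete for Table 8.4. In the sketch below the two Latin squares of order 3 are taken to have entries r + c and 2r + c (mod 3); this particular pair is an assumption, chosen because it reproduces the rows of Table 8.4.

```python
from itertools import combinations

n = 3
# Two orthogonal Latin squares of order 3 (assumed choice, see the lead-in).
L1 = [[(r + c) % n for c in range(n)] for r in range(n)]
L2 = [[(2 * r + c) % n for c in range(n)] for r in range(n)]

# One row of the array per cell: (row position, column position, entry in L1, entry in L2).
oa = [(r, c, L1[r][c], L2[r][c]) for r in range(n) for c in range(n)]

# Strength 2: every pair of columns contains each of the n^2 ordered pairs exactly once.
strength_two = all(
    len({(row[a], row[b]) for row in oa}) == n * n
    for a, b in combinations(range(4), 2)
)

for row in oa:
    print(*row)
print(strength_two)
```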

8.1.2 The link with codes


The codewords of a linear code can be used as the rows of an orthogonal array.
There are as many rows in the orthogonal array as there are codewords in the
code, as many columns as the length of the codewords (so k = n) and the
symbols in the array are the alphabet used in the code. The only question is
how to determine the strength of the OA. It turns out that the strength of the
OA is equal to t where any set of t columns of G are linearly independent and
at least one set of t + 1 columns is linearly dependent.
EXAMPLE 8.1.5.
Consider the linear code in Example 3.1.1. The code is C = {0000, 0011, 1100, 1111}.
We see that 0011 and 1100 form a basis for this code. So we have

$$G = \begin{bmatrix} 0 & 0 & 1 & 1 \\ 1 & 1 & 0 & 0 \end{bmatrix}.$$

We see that any column of G is non-zero but that there are pairs of columns of
G that are linearly dependent (since G has repeated columns). So the OA from
this code has strength 1.
EXAMPLE 8.1.6.
Consider the linear code with generator matrix

$$G = \begin{bmatrix} 1 & 0 & 0 & 1 & 1 & 0 & 1 \\ 0 & 1 & 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 1 & 0 & 1 & 1 & 1 \end{bmatrix}.$$

The 4th, 5th and 6th columns of G are linearly dependent, but all sets of pairs
of columns are linearly independent. Thus the array formed from the codewords
is an OA[8,7,2,2] and is given in Table 8.5.

Table 8.5: An OA[8,7,2,2]

0 0 0 0 0 0 0
0 0 1 0 1 1 1
0 1 0 1 0 1 1
0 1 1 1 1 0 0
1 0 0 1 1 0 1
1 0 1 1 0 1 0
1 1 0 0 1 1 0
1 1 1 0 0 0 1
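
Examples 8.1.5 and 8.1.6 can be reproduced with a short script. This is a sketch for binary codes only: it uses the fact that, over GF(2), a set of columns is linearly dependent exactly when some non-empty subset of them sums to the zero vector. The function names are illustrative.

```python
from itertools import combinations, product
from collections import Counter

G = [
    [1, 0, 0, 1, 1, 0, 1],
    [0, 1, 0, 1, 0, 1, 1],
    [0, 0, 1, 0, 1, 1, 1],
]


def codewords(gen):
    """All GF(2) linear combinations of the rows of the generator matrix."""
    n = len(gen[0])
    return [
        tuple(sum(c * row[j] for c, row in zip(coeffs, gen)) % 2 for j in range(n))
        for coeffs in product([0, 1], repeat=len(gen))
    ]


def strength_from_generator(gen):
    """Largest t such that every set of t columns of gen is linearly independent."""
    n = len(gen[0])
    cols = [tuple(row[j] for row in gen) for j in range(n)]
    for size in range(1, n + 1):
        for subset in combinations(cols, size):
            if all(sum(col[i] for col in subset) % 2 == 0 for i in range(len(gen))):
                return size - 1  # smallest dependent set has `size` columns
    return n


words = codewords(G)

# Direct check of strength 2: each pair of columns shows each ordered pair twice.
target = Counter({pair: 2 for pair in product([0, 1], repeat=2)})
balanced = all(
    Counter((w[a], w[b]) for w in words) == target
    for a, b in combinations(range(7), 2)
)

print(strength_from_generator(G), balanced)
print(strength_from_generator([[0, 0, 1, 1], [1, 1, 0, 0]]))
```

With the coefficient order used here, `words` lists the codewords in the same order as the rows of Table 8.5.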

8.1.3 The link with Hadamard matrices


Consider a Hadamard matrix in which all the entries in the first row are equal
to 1. Since $H_h H_h' = hI_h$, we know that every other row of $H_h$ has $h/2$ $-1$s
and $h/2$ $1$s and that any two rows of $H_h$ will give rise to the same number of
$(1, 1)$, $(1, -1)$, $(-1, 1)$ and $(-1, -1)$ pairs in corresponding positions. So if we
write $M_h$ for the $(h - 1) \times h$ matrix obtained by removing the first row from
$H_h$ then $M_h'$ is an OA[$h, h - 1, 2, 2$] on the symbols $-1$ and $1$.
EXAMPLE 8.1.7.
Table 8.6 gives an OA[8,7,2,2] constructed from an H8 .

8.1.4 The parameters of orthogonal arrays


Two questions arise naturally when thinking about orthogonal arrays. Both
are motivated by the desire to construct small but informative experiments, the
original motivation for the construction of orthogonal arrays.
The first asks, “given that we want an array with $k$ columns, each with $\ell$
levels, and of strength $t$, what is the smallest number of rows that we can have
in the OA?” Thus we are asking what is the smallest experiment that we can
run given the other information. We will denote this number by $F(k, \ell, t)$.

Table 8.6: An OA[8,7,2,2]

1 1 1 1 1 1 1
−1 1 −1 1 −1 1 −1
1 −1 −1 1 1 −1 −1
−1 −1 1 1 −1 −1 1
1 1 1 −1 −1 −1 −1
−1 1 −1 −1 1 −1 1
1 −1 −1 −1 −1 1 1
−1 −1 1 −1 1 1 −1

Clearly if we have an OA we can ignore columns of it and still have an OA.
But how easy is it to include additional columns in an OA? The second asks,
“given that there are $N$ rows in an array on $\ell$ symbols and of strength $t$, what
is the largest value that $k$ can take?” We will denote this number by $f(N, \ell, t)$.
These numbers are related in the following way.

$$F(k, \ell, t) = \min\{N \mid f(N, \ell, t) \geq k\},$$

$$f(N, \ell, t) \leq \max\{k \mid F(k, \ell, t) \leq N\}.$$
From the definition of an orthogonal array we know that $N \equiv 0 \pmod{\ell^t}$. We
also know that $f(N, \ell, t) \geq t + 1$ by the simple expedient of writing down all
possible $\ell^t$ $t$-sequences as a row $N/\ell^t$ times and adjoining one further column
which is the sum, mod $\ell$, of the other entries in the row.
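
The construction just described is easy to sketch. The code below builds the array for 3 levels and strength 2 (so 9 rows and 3 columns) and checks the strength; the function names and the choice of parameters are illustrative only.

```python
from itertools import combinations, product
from collections import Counter


def sum_column_array(levels, t, index=1):
    """All t-tuples over `levels` symbols, each written `index` times,
    with one extra column holding the sum of the tuple mod `levels`."""
    rows = []
    for tup in product(range(levels), repeat=t):
        rows.extend([tup + (sum(tup) % levels,)] * index)
    return rows


def has_strength(array, levels, t):
    """Every set of t columns shows each of the levels^t tuples equally often."""
    n_tuples = levels ** t
    if len(array) % n_tuples != 0:
        return False
    index = len(array) // n_tuples
    for cols in combinations(range(len(array[0])), t):
        counts = Counter(tuple(row[c] for c in cols) for row in array)
        if len(counts) != n_tuples or set(counts.values()) != {index}:
            return False
    return True


array = sum_column_array(levels=3, t=2)
print(len(array), len(array[0]), has_strength(array, 3, 2))
```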
We know that when $p$ is a prime there is a set of $p - 1$ mutually orthogonal
Latin squares and that it is not possible to have a larger set of MOLS of order
$p$. We have seen that there is a correspondence between an OA[$p^2, k, p, 2$] and
a set of $k - 2$ MOLS of order $p$. So we know that $f(p^2, p, 2) = p + 1$. Thus we
also know that $F(p + 1, p, 2) = p^2$ and that $F(p + 2, p, 2) \geq 2p^2$.
We know that an OA[$4s, 4s - 1, 2, 2$] is equivalent to an $H_{4s}$ and so
$f(4s, 2, 2) = 4s - 1$. Again we know that $F(4s - 1, 2, 2) = 4s$ and that
$F(4s, 2, 2) \geq 4(s + 1)$.

8.1.5 Exercises
1. Construct an OA[16,5,4,2]. Hence construct an OA[16,4,4,2] and an OA[16,3,4,2].
2. Construct an OA[16,15,2,2].
3. One easy way to get an asymmetric orthogonal array is to replace each
symbol in one (or more) columns with the runs of an OA with $N = \ell$
runs. Thus a constraint with $\ell = 4$ could have those symbols replaced by
the 4 runs of an OA[4,3,2,2]. Use this technique and the OA[16,5,4,2] to
construct an OA[16;4,4,4,2,2,2;2].
4. Suppose that $\ell = 5$ and that $t = 2$. What is the largest number of columns
that can be accommodated in an OA with $N = 25$?
5. Above we have written “We also know that $f(N, \ell, t) \geq t + 1$ by the simple
expedient of writing down all possible $\ell^t$ $t$-sequences as a row $N/\ell^t$ times
and adjoining one further column which is the sum, mod $\ell$, of the other
entries in the row.” Illustrate this construction when $t = 4$ for $\ell = 2$.

8.2 Secret Sharing


Suppose that a bank has a vault that has to be opened every day but manage-
ment does not want any one employee to be able to open the vault. The bank
has three senior tellers and any two of them are deemed acceptable to jointly
open the vault. Thus a secret key is shared between three participants (the
tellers) in such a way that any two of them can determine the key but no one
of them has any useful information about the key.
The key is determined by the dealer, who is not one of the participants. Each
share comes from a share set, S, and the key comes from a set of possible keys,
K. The dealer gives the participants their share and makes publicly available
an array from which a group of participants can determine the key. The next
example looks at one possible implementation that is far too small to be used
in practice but illustrates the relevant ideas.
EXAMPLE 8.2.1.
Consider the orthogonal array in Table 8.4. Let the final column be the key
column. So there are three possible keys in this example: 0, 1 and 2. The three
participants are numbered 1, 2 and 3 and they are each given their share by
the dealer. Suppose that the dealer uses the fifth row of the array. Then the
key is 0 and the first participant gets the share 1, the second gets the share 1
and the third gets the share 2. Since the orthogonal array has strength 2, any
two participants can determine the key but participant 3, say, only knows that
rows 3 or 5 or 7 have been used to determine the key. Thus that participant
has no additional information about the key since the whole set of possible keys
appears in the final column considering all the rows in that set.
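
Example 8.2.1 can be played through in code. The sketch below uses the rows of Table 8.4 with the final column as the key; `deal` and `possible_keys` are hypothetical helper names.

```python
table_8_4 = [
    (0, 0, 0, 0), (0, 1, 1, 1), (0, 2, 2, 2),
    (1, 0, 1, 2), (1, 1, 2, 0), (1, 2, 0, 1),
    (2, 0, 2, 1), (2, 1, 0, 2), (2, 2, 1, 0),
]


def deal(row_number):
    """Dealer picks a row (1-based): the first three entries are shares, the last is the key."""
    *shares, key = table_8_4[row_number - 1]
    return shares, key


def possible_keys(known):
    """Keys still possible given `known`, a dict {participant number: share}."""
    return sorted({
        row[3] for row in table_8_4
        if all(row[p - 1] == share for p, share in known.items())
    })


shares, key = deal(5)
print(shares, key)
print(possible_keys({3: 2}))        # participant 3 alone
print(possible_keys({1: 1, 3: 2}))  # participants 1 and 3 together
```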
DEFINITION 8.2.1.
A perfect (t, w)-threshold scheme is a method of sharing a secret key k ∈ K
among a finite set P of w participants, so that any t participants can compute
the value of k, but no smaller group of participants can compute any information
about the value of k from the shares they hold.
Thus the scheme from the previous example is a perfect (2, 3)-threshold
scheme.
If |K| = |S| then the threshold scheme is said to be ideal.
The next result follows immediately from the definition of an OA.
THEOREM 8.2.1.
An ideal $(t, w)$-threshold scheme with $|K| = \ell$ is equivalent to an
OA[$\ell^t, w + 1, \ell, t$].
Of course an OA with $N/\ell^t > 1$ will not give rise to a unique key for a set of
$t$ participants and so only OAs with $N/\ell^t = 1$ are useful as threshold schemes.
Sometimes sets of participants of different sizes are authorised to determine
the key. The next example illustrates this idea.
EXAMPLE 8.2.2.
Suppose that any triple of participants should be able to determine the key but

that the only pair that can determine the key should be the second and third
participants. Straight-forward checking shows that the arrays in Table 8.7 have
this property.

Table 8.7: A secret-sharing scheme

Key of 0 Key of 1
0000 0011
0113 0101
0231 0220
0322 0333
1040 1051
1153 1142
1271 1260
1362 1373
2400 2411
2513 2500
2631 2622
2722 2733
3440 3451
3553 3542
3671 3660
3762 3773

Another extension to secret-sharing schemes is to have the scheme anonymous,
so that the secret can be calculated if t shares are known and it is not
necessary to know which participants had which shares. One way of getting such
a scheme is to partition all t-sets of a v-set into a number, n, say, of disjoint
(v, t, λ)-BIBDs. Then t = w and |K| = n. The next example illustrates this
idea.
EXAMPLE 8.2.3.
Let $t = w = 3$ and let $v = 6$. So we partition the $\binom{6}{3} = 20$ triples into 2 disjoint
(6,3,2)-BIBDs. Thus we have two keys and we can distribute 3 shares and all 3
participants are required to determine the key. The relevant BIBDs are shown
in Table 8.8.
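
The properties used in Example 8.2.3 can be verified from the blocks listed in Table 8.8. In this sketch `recover_key` is a hypothetical helper that ignores who holds which share, which is the point of an anonymous scheme.

```python
from itertools import combinations
from collections import Counter

key_blocks = {
    0: "123 124 135 146 156 236 245 256 345 346".split(),
    1: "125 126 134 136 145 234 235 246 356 456".split(),
}


def pair_counts(blocks):
    """How often each pair of items appears together in a block."""
    return Counter(pair for b in blocks for pair in combinations(sorted(b), 2))


# Each design is a (6,3,2)-BIBD: all 15 pairs appear, each exactly twice.
is_bibd = {
    key: len(pair_counts(blocks)) == 15 and set(pair_counts(blocks).values()) == {2}
    for key, blocks in key_blocks.items()
}

# Together the two designs partition all 20 triples of {1,...,6}.
all_triples = {"".join(c) for c in combinations("123456", 3)}
is_partition = (
    set(key_blocks[0]) | set(key_blocks[1]) == all_triples
    and not set(key_blocks[0]) & set(key_blocks[1])
)


def recover_key(shares):
    """Recover the key from three shares, in any order."""
    block = "".join(sorted(shares))
    for key, blocks in key_blocks.items():
        if block in blocks:
            return key
    raise ValueError("shares do not form a block")


print(is_bibd, is_partition, recover_key(["5", "1", "3"]))
```

Because every pair of shares lies in two blocks of each design, two participants on their own learn nothing about the key.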

8.2.1 Exercises
1. Consider Example 8.2.1. What does participant 1 know about which row
has been used to determine the key? What does participant 2 know about
which row has been used to determine the key?
2. Consider the OA[18;$3^6$,6;2] given in Table 8.9. Suppose that the first
participant has a share of 1 and the second a share of 2. What can be said
about the key? Why does this happen? (Assume that the key is found in
the final column of each row.)
3. Consider the following (15,3,1) BIBD.

Table 8.8: A secret-sharing scheme based on disjoint BIBDs

Key of 0 Key of 1
123 125
124 126
135 134
146 136
156 145
236 234
245 235
256 246
345 356
346 456

Table 8.9: An OA[18;$3^6$,6;2]

0 0 0 0 0 0 0
0 0 1 1 2 2 1
0 1 0 2 2 1 2
0 1 2 0 1 2 3
0 2 1 2 1 0 4
0 2 2 1 0 1 5
1 0 0 2 1 2 5
1 0 2 0 2 1 4
1 1 1 1 1 1 0
1 1 2 2 0 0 1
1 2 0 1 2 0 3
1 2 1 0 0 2 2
2 0 1 2 0 1 3
2 0 2 1 1 0 2
2 1 0 1 0 2 4
2 1 1 0 2 0 5
2 2 0 0 1 1 1
2 2 2 2 2 2 0

(a) Show that it is a BIBD.
(b) What are the parameters of the BIBD?
(c) Show how to use this BIBD to get a perfect (2,3) threshold scheme.
(d) What are the values of |S| and |K|?

abc ahi ajk ade afg alm ano
djn beg bmo bln bhj bik bdf
ehm cmn cef cij clo cdg chk
fio dko dhl fkm dim ejo eil
gkl fjl gin gho ekn fhn gjm

4. Consider the array below. Show that the first and second, or second
and third, or second and fourth, or third and fourth participants can
determine the key but that just the first and third or just the first and
fourth participants can not determine the key.

Key of 0 Key of 1
0000 0423
0011 0432
0102 0521
0113 0530
0220 0603
0231 0612
0322 0701
0333 0710
1400 1023
1411 1032
1502 1121
1513 1130
1620 1203
1631 1212
1722 1301
1733 1310

8.3 Software Testing


Sometimes arrays that are like orthogonal arrays, but which have each ordered
t-tuple appearing at least once, rather than equally often, are useful. Such
arrays are called covering arrays.
An orthogonal array with $N = \ell^t$ is a smallest possible covering array as
well. For larger values of $N/\ell^t$ reductions in the number of rows in the array
are often possible.
EXAMPLE 8.3.1.
The array in Table 8.10 has 33 rows, 6 columns and has strength 3.
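
The covering property is straightforward to test by machine. The sketch below assumes the symbols are 0, 1, ..., levels − 1. The 5-row binary array used to try it out is a small illustrative example that is not taken from these notes; it covers every pair but cannot be an orthogonal array because 5 is not a multiple of 4.

```python
from itertools import combinations


def is_covering_array(array, levels, t):
    """True if every set of t columns contains each of the levels^t tuples at least once."""
    k = len(array[0])
    needed = levels ** t
    return all(
        len({tuple(row[c] for c in cols) for row in array}) == needed
        for cols in combinations(range(k), t)
    )


# Illustrative covering array with N = 5, k = 4, 2 levels and t = 2.
small = [
    (0, 0, 0, 0),
    (0, 1, 1, 1),
    (1, 0, 1, 1),
    (1, 1, 0, 1),
    (1, 1, 1, 0),
]

print(is_covering_array(small, levels=2, t=2))
print(is_covering_array(small[:4], levels=2, t=2))
```

The same function could be applied to the 33 rows of Table 8.10 with `levels=3` and `t=3`.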
These arrays are typically used to test software where there are many choices
possible for each of a number of options. The reduction in size of the array

Table 8.10: A covering array with $N = 33$, $k = 6$, $\ell = 3$ and $t = 3$

0 1 2 2 1 0
1 2 2 1 0 0
2 2 1 0 1 0
2 1 0 1 2 0
1 0 1 2 2 0
1 2 0 0 2 1
2 0 0 2 1 1
0 0 2 1 2 1
0 2 1 2 0 1
2 1 2 0 0 1
2 0 1 1 0 2
0 1 1 0 2 2
1 1 0 2 0 2
1 0 2 0 1 2
0 2 0 1 1 2
0 2 1 1 2 0
2 1 1 2 0 0
1 1 2 0 2 0
1 2 0 2 1 0
2 0 2 1 1 0
1 0 2 2 0 1
0 2 2 0 1 1
2 2 0 1 0 1
2 0 1 0 2 1
0 1 0 2 2 1
2 1 0 0 1 2
1 0 0 1 2 2
0 0 1 2 1 2
0 1 2 1 0 2
1 2 1 0 0 2
0 0 0 0 0 0
1 1 1 1 1 1
2 2 2 2 2 2

which is occasioned by not requiring balance of the occurrence of t-tuples does
not appear to reduce the efficacy of the arrays for fault identification but can
considerably reduce costs.

8.3.1 Exercises
1. Check the properties of the covering array in Table 8.10.
2. There is a covering array with 9 3-level attributes and 13 runs. Spend a
few minutes trying to find it.

8.4 References and Comments


An extensive treatment of results pertaining to orthogonal arrays can be found
in Hedayat, Sloane and Stufken, Orthogonal Arrays: Theory and Applications
(1999). Many orthogonal arrays can be found on the web. The two most exten-
sive websites are located at http://www.research.att.com/∼njas/oadir/ and at
http://support.sas.com/techsup/technote/ts723.html.
A discussion of the use of orthogonal arrays and other structures in secret-
sharing can be found in Stinson, Cryptography: Theory and Practice (2006),
where Table 8.7 appears, and Handbook of Combinatorial Designs (1996). Schneier
(Applied Cryptography, 1996) discusses computer implementations of secret-
sharing schemes.
Table 8.10 is from Chateauneuf, Colbourn and Kreher, in a paper which
appeared in the journal Designs, Codes and Cryptography in 1999.

Topic 9

Graphs: Paths and Circuits

Graphs are used in many areas. They are used to represent transport or com-
munication networks, electrical circuits, draws for tournaments, block designs
and Sudoku puzzles, amongst other things.
In the next three topics we will give a brief introduction to some of these
areas.

9.1 Graphs
We are going to use the terms network and graph to refer to a diagram showing
the relationships between the items in one, or more, sets. Formal definitions are
given below.
For example the items in the set might be cities and the relationship might
be “connected by a road”. So we could get a graph like the one in Figure 9.1.
We see that there are two roads between B and C (perhaps a new freeway and
the old highway) but no road between A and D, for example.

Figure 9.1: Roads between 5 cities

Sometimes we want to represent the relationships between two sets of items

in a graph. So we might have the people who have applied for a set of jobs and
the jobs. We want to indicate which people can do which jobs in the graph. So
we could get a graph like the one in Figure 9.2. We see that A can do jobs a or
b, D can do a, b or c and E and F can each only do d.

Figure 9.2: Applicants and the jobs they can do

Graphs can be used to represent networks (of roads, webpages, people) and
then to provide information about traversing these networks (the travelling
salesman problem) or provide information about sets of edges whose removal
will cause the network to become disconnected. They are also used to design
sporting fixtures.
We will now develop some formal notation and results that will facilitate
solving some of these problems.
A network consists of a finite set of items, called vertices, and a finite set of
edges. A pair of vertices, called the endpoints, are associated with each edge.
We say that an edge is incident with its endpoints and that the two endpoints
are joined by the edge.
If two different edges join the same pair of vertices then we say that the
network has multiple edges.
A loop is an edge joining a vertex to itself.
A network with no multiple edges is a graph. A graph with no loops is a
simple graph.
We can represent a network or a graph by a diagram but it is the vertices
and edges, rather than the actual “picture”, which is important. For instance,
the two diagrams in Figure 9.3 both represent the graph with vertices {1,2,3,4}
and edges {(1,2),(1,3),(1,4),(2,3),(2,4),(3,4)}.
We say that the graphs in Figure 9.3 are labelled. We can represent a labelled
graph by an adjacency matrix. The rows and columns of the adjacency matrix
are labelled by the vertices. Position (i, j) is 1 if and only if there is an edge
joining i to j.
The adjacency matrix for the graphs in Figure 9.3 is given by

© Debbie Street, 2011
Figure 9.3: Two ways of representing the same graph

 
\[
\begin{pmatrix}
0 & 1 & 1 & 1 \\
1 & 0 & 1 & 1 \\
1 & 1 & 0 & 1 \\
1 & 1 & 1 & 0
\end{pmatrix}.
\]
The degree of vertex i, d(i), is the number of edges that are incident with i.
We see that all of the vertices in the graphs in Figure 9.3 have degree 3 and that
the sum of the entries in row i of the adjacency matrix is the degree of vertex i.

THEOREM 9.1.1.
The sum of the degrees of the vertices in a graph is twice the number of edges.

Proof. Each edge contributes one to the degree of each of its endpoints.
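As a quick check of Theorem 9.1.1, the following Python sketch builds the adjacency matrix of the graph with vertices {1,2,3,4} and edges {(1,2),(1,3),(1,4),(2,3),(2,4),(3,4)} and confirms that the row sums are the degrees and that they add up to twice the number of edges. The function name `adjacency_matrix` is our own choice, and the sketch assumes a simple graph (no loops, no multiple edges).

```python
def adjacency_matrix(vertices, edges):
    """Return the 0/1 adjacency matrix of a simple graph as a list of rows."""
    index = {v: i for i, v in enumerate(vertices)}
    matrix = [[0] * len(vertices) for _ in vertices]
    for u, v in edges:
        matrix[index[u]][index[v]] = 1
        matrix[index[v]][index[u]] = 1  # edges are undirected
    return matrix


vertices = [1, 2, 3, 4]
edges = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
matrix = adjacency_matrix(vertices, edges)
degrees = [sum(row) for row in matrix]  # row sum = degree of that vertex

print(matrix)
print(degrees, sum(degrees) == 2 * len(edges))
```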

Two graphs G and H are said to be isomorphic if the number of vertices
in G is the same as the number of vertices in H and there is a one-to-one
correspondence, represented by f, between the two vertex sets such that
vertices A and B are connected by an edge in G if and only if vertices f(A)
and f(B) are connected by an edge in H.

EXAMPLE 9.1.1.
The graphs G and H in Figure 9.4 are isomorphic. One isomorphism is given
by 1 ↦ A, 2 ↦ B, 3 ↦ C and 4 ↦ D.
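For small graphs the definition of isomorphism can be tested directly by trying every one-to-one correspondence between the vertex sets. The sketch below does this in Python; the function name `find_isomorphism` and the two small test graphs are our own, chosen for illustration, and the method assumes simple graphs. It is only practical for a handful of vertices, since there are n! correspondences to try.

```python
from itertools import permutations


def find_isomorphism(vertices_g, edges_g, vertices_h, edges_h):
    """Return a dict f from G's vertices to H's that preserves edges, or None."""
    if len(vertices_g) != len(vertices_h) or len(edges_g) != len(edges_h):
        return None
    target = {frozenset(e) for e in edges_h}
    for image in permutations(vertices_h):
        f = dict(zip(vertices_g, image))
        # f is an isomorphism when the images of G's edges are exactly H's edges
        if {frozenset((f[u], f[v])) for u, v in edges_g} == target:
            return f
    return None


# A 4-cycle, the same cycle with its vertices renamed, and a graph that
# has a vertex of degree 3 (so it cannot be a cycle).
cycle = [(1, 2), (2, 3), (3, 4), (4, 1)]
relabelled = [("A", "C"), ("C", "B"), ("B", "D"), ("D", "A")]
other = [(1, 2), (1, 3), (1, 4), (2, 3)]

f = find_isomorphism([1, 2, 3, 4], cycle, ["A", "B", "C", "D"], relabelled)
g = find_isomorphism([1, 2, 3, 4], other, ["A", "B", "C", "D"], relabelled)
print(f)
print(g)
```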

9.1.1 Exercises
1. Draw a diagram to represent a graph with vertex set {1,2,3,4,5,6} and
edge set {(1,2),(1,4),(2,5),(3,5),(5,6)}.
2. Draw a diagram to represent a graph with vertex set {1,2,3,4,5} and edge
set {(1,2),(1,3),(1,4),(1,5),(2,5),(3,5)}.

Figure 9.4: Isomorphic graphs

3. Consider the block design given in Table 6.1. Have a vertex for each
treatment and a vertex for each block. An edge joins a treatment to a block if
the treatment is in the block. Draw a diagram to represent this graph. Give
the adjacency matrix for this graph. What do you notice?

4. Draw a graph where there are 6 vertices, each with degree 1; 6 vertices
each with degree 2.

5. Suppose that a graph has 10 edges. How many vertices are there in this
graph if each vertex has degree 2?

6. Find the adjacency matrix for the graphs in Figure 9.4.

7. Can this matrix be the adjacency matrix for a graph? If so, draw the
corresponding diagram. If not explain why not.
 
\[
\begin{pmatrix}
1 & 1 & 1 & 1 \\
1 & 1 & 1 & 1 \\
1 & 1 & 1 & 1 \\
1 & 1 & 1 & 1
\end{pmatrix}.
\]

8. Can this matrix be the adjacency matrix for a graph? If so, draw the
corresponding diagram. If not explain why not.
 
\[
\begin{pmatrix}
0 & 0 & 0 & 1 \\
1 & 0 & 0 & 1 \\
1 & 0 & 0 & 1 \\
0 & 0 & 0 & 1
\end{pmatrix}.
\]

9. Are these graphs isomorphic? Either give an isomorphism or explain why
they are not.

10. Are these graphs isomorphic? Either give an isomorphism or explain why
they are not.

11. Draw all the non-isomorphic graphs with 3 vertices.


12. Draw all the non-isomorphic graphs with 4 vertices.

9.2 Paths
A path from vertex A to vertex B in a network is an alternating sequence of
vertices and edges, $V_1, e_1, V_2, e_2, \ldots, V_n, e_n, V_{n+1}$, where $e_i$ is an
edge joining $V_i$ and $V_{i+1}$, and $A = V_1$ and $B = V_{n+1}$. The length of the
path is n, the number of edges in the sequence.
A path may have edges listed more than once and may re-visit a vertex. A
simple path is one in which no edge and no vertex is repeated.
EXAMPLE 9.2.1.
Consider the network below. Then AcHeDiFkEhDfCdB is a path from A to
B. We can make a simple path from A to B by omitting iFkEhD leaving the
simple path AcHeDfCdB.

[Figure not reproduced: a network with vertices A, B, C, D, E, F, G, H and edges
labelled a, b, c, d, e, f, g, h, i, j, k, m, n, p.]

The previous example illustrates a general result which we give below.

THEOREM 9.2.1.
Every path between two vertices A and B contains a simple path between the
vertices.

Proof. If A = B then the simple path is just the vertex A. So we assume that
A ≠ B. If all of the vertices in the path from A to B are distinct then we have a
simple path already and we are done. If not, then suppose that $V_i = V_j$, where
i < j. Then in the original path we had $V_i, e_i, V_{i+1}, e_{i+1}, \ldots, e_{j-1}, V_j$ and since
$V_i = V_j$, we can delete $e_i, V_{i+1}, e_{i+1}, \ldots, e_{j-1}, V_j$ and we still have a path from
A to B. If there are no repeated vertices in this new path then we are done.
Otherwise we repeat the process above until we have a simple path. (We know
that we will eventually get a simple path because each deletion shortens the
path, and the path is finite.)
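The deletion step in this proof is easy to carry out mechanically. The Python sketch below stores a path as an alternating sequence of vertices and edges, as in the definition, and repeatedly cuts out the stretch between two occurrences of the same vertex. The function name `simple_path` is our own, and the sketch assumes that vertices sit at the even positions of the sequence.

```python
def simple_path(path):
    """Shorten an alternating vertex/edge sequence to a simple path with the same ends."""
    path = list(path)
    if path[0] == path[-1]:
        return [path[0]]  # A = B: the single vertex is the simple path
    while True:
        first_seen = {}
        for position in range(0, len(path), 2):  # vertices are at even positions
            vertex = path[position]
            if vertex in first_seen:
                # V_i = V_j with i < j: delete e_i, V_{i+1}, ..., e_{j-1}, V_j
                start = first_seen[vertex]
                path = path[:start + 1] + path[position + 1:]
                break
            first_seen[vertex] = position
        else:
            return path  # no vertex is repeated, so the path is simple


# The path of Example 9.2.1, one character per vertex or edge.
shortened = "".join(simple_path("AcHeDiFkEhDfCdB"))
print(shortened)
```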
A network is connected if there is a path between any two vertices.

EXAMPLE 9.2.2.
Consider the graph below. We see that the vertices A and C are connected and
that the vertices B, D and E are connected but that these two sets of vertices
are not and so the graph is disconnected.

A cycle in a graph is a path that starts and ends at the same vertex and in
which no edge and no other vertex is repeated.
Thus the path AbBdCfDjHcA is a cycle in the graph of Example 9.2.1.

9.2.1 Euler Paths and Circuits
Königsberg was a town in Prussia, located near the mouth of the river Pregel.
There was also an island, Kneiphof, in the middle of the river and the land
masses were connected by bridges. The Königsberg bridge problem asked, “Is
it possible to take a walk and cross each bridge exactly once?”
Leonhard Euler was interested in solving a more general problem: given
any configuration of river, islands and bridges, find a rule for deciding whether
it is possible to take a walk which crosses each bridge exactly once.
We can represent the situation in Königsberg by a graph, where the vertices
are land masses and the edges are bridges. This graph is shown in Figure 9.5.

Figure 9.5: Graph of bridges in Königsberg in Euler’s time

These ideas are used to devise routes which need to traverse each road (edge)
once, say to pick up garbage (assuming that garbage on both sides of the street
can be picked up), or to test each link in a communication network.
A path in a network which includes each edge exactly once but has different
first and last vertices is called an Euler path. A path in a network which includes
every edge exactly once and has the same first and last vertex is called an Euler
circuit.
What can we say about networks in which an Euler path or an Euler circuit
exists?
Suppose that we are interested in constructing an Euler circuit in a network.
Then every time that we reach a vertex along an edge, we must have an edge
to leave that vertex, or we must have completed the circuit. So if an Euler
circuit exists then every vertex must have even degree. For an Euler path to
exist every vertex must have even degree, except for the starting and finishing
vertices, which must each have odd degree.
Suppose that we have a network in which each vertex has even degree. Can
we always find an Euler circuit in such a network? The short answer is yes; the
next example illustrates the idea.
EXAMPLE 9.2.3.
Consider the network in the first graph of Figure 9.6. All the vertices have even
degree. Consider the circuit (where we do not list the edges since there are no
multiple edges) ABCDA. Then remove this circuit to get the second graph in
Figure 9.6. Now remove the circuit BFCEB. This leaves the final graph of
Figure 9.6. We write the final circuit as DEFGD. We then take the original
circuit and include the later circuits to get the final circuit in the original graph.
So we have ABFCEBCDEFGDA as the Euler circuit in the original graph.

Figure 9.6: The graph, and then with circuits removed

THEOREM 9.2.2.
If a connected network has no vertices of odd degree then it has an Euler circuit
starting and finishing at any vertex. If a connected network has exactly two
vertices of odd degree then it has an Euler path starting at one of these vertices
and finishing at the other.
Proof. Suppose we take a circuit, that is, a path that starts and ends at the
same vertex and uses no edge more than once. Suppose that we delete the edges
in this circuit. Then we have deleted an even number of edges at each vertex
that was on the circuit.
Suppose that we have a connected network in which all vertices have even degree.
Choose a vertex A. Choose an edge from A and follow it to its other endpoint.
Find an unused edge from that vertex and continue in this way. Since every
vertex has even degree the only time when there can be no edge to leave a vertex
by is when we have arrived back at A. Delete all the edges in this circuit. The new
network still has all vertices of even degree. If every vertex now has degree 0
then we have found an Euler circuit. If not, repeat the process. That is, choose a
vertex B on the original circuit that still has edges at it (there is one because
the network is connected) and find a circuit through the new network that
starts and finishes at B. Then join the two circuits by replacing B in the first
circuit by the circuit that starts and finishes at B in the new network. Continue
in this way until all of the edges have been used.
If the original network has exactly two odd vertices then join them by a new
edge. Make this the first edge in an Euler circuit constructed as above. Remove
that edge and the result is the required Euler path.
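The construction in this proof can be turned into a short program. The Python sketch below is not a line-by-line transcription of the proof: it uses the usual stack-based form of the same idea (often called Hierholzer's algorithm), in which the splicing of circuits happens implicitly as the stack unwinds. The function name `euler_circuit` is our own. The test network is read off from the three circuits ABCDA, BFCEB and DEFGD of Example 9.2.3.

```python
def euler_circuit(edges, start):
    """Return an Euler circuit as a list of vertices, or None if there is none."""
    # Adjacency lists hold (neighbour, edge number) so multiple edges are allowed.
    neighbours = {}
    for number, (u, v) in enumerate(edges):
        neighbours.setdefault(u, []).append((v, number))
        neighbours.setdefault(v, []).append((u, number))
    if start not in neighbours or any(len(adj) % 2 for adj in neighbours.values()):
        return None  # some vertex has odd degree
    used = [False] * len(edges)
    stack, circuit = [start], []
    while stack:
        vertex = stack[-1]
        while neighbours[vertex] and used[neighbours[vertex][-1][1]]:
            neighbours[vertex].pop()  # discard edges already walked
        if neighbours[vertex]:
            next_vertex, number = neighbours[vertex].pop()
            used[number] = True
            stack.append(next_vertex)
        else:
            circuit.append(stack.pop())  # no edges left here: vertex is finished
    # All edges are used exactly when the network is connected.
    return circuit[::-1] if len(circuit) == len(edges) + 1 else None


edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A"),
         ("B", "F"), ("F", "C"), ("C", "E"), ("E", "B"),
         ("D", "E"), ("E", "F"), ("F", "G"), ("G", "D")]
circuit = euler_circuit(edges, "A")
print("".join(circuit))
```

The circuit printed need not be the one found by hand in Example 9.2.3, since a network usually has many Euler circuits.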

9.2.2 Hamilton Paths and Cycles


In 1857 Sir William Rowan Hamilton invented a game called “Around the
World”. The object of the game was to find a path through all the vertices
of a dodecahedron (which represented the cities) using only the edges of the
dodecahedron in such a way that each city was only visited once.
We see that this is equivalent to finding a path through a graph such that
each vertex appears exactly once in the path. Such a path is called a Hamilton
walk. If it is closed it is a Hamilton cycle; otherwise it is a Hamilton path.

EXAMPLE 9.2.4.
Consider the graph below. Then it contains a Hamilton path - EBDAC, for
example.

Unfortunately there are not many results about the properties of graphs that
have Hamilton paths. We give one sufficient condition in the next theorem.
THEOREM 9.2.3.
Suppose that G is a simple graph with n ≥ 3 vertices. Suppose that for every
pair of non-adjacent vertices in G, A and B, say,

d(A) + d(B) ≥ n.

Then G has a Hamilton cycle.


Proof. We give a proof by contradiction. Thus we start by assuming that there
are graphs with n vertices that satisfy the condition of the statement of the
theorem but which have no Hamilton cycle. Amongst this set of graphs, choose
one which has a maximal number of edges, say G. Since the complete graph has
a Hamilton cycle, we know that G is not the complete graph and so we can find
a pair of non-adjacent vertices, A and B, say. Join these vertices and get a new
graph, H, say.
Adding an edge cannot reduce any degree, so H also satisfies the condition of
the theorem. Since G had the most edges possible without having a Hamilton
cycle we know that H has a Hamilton cycle, and we know that any such cycle
must contain the new edge (since if it did not there would be a Hamilton cycle
that only involved edges in G). So when we remove that edge we have a
Hamilton path in G.
One such path could be $A = V_1, V_2, V_3, \ldots, V_{n-1}, V_n = B$. (Remember that
a Hamilton path has to pass through all the vertices in the graph exactly once
each.)
If $V_j$ is joined to A then $V_{j-1}$ cannot be joined to B, since then
$A = V_1, V_2, \ldots, V_{j-1}, V_n, V_{n-1}, \ldots, V_j, A$ would be a Hamilton cycle in G.
Now there are d(A) edges from A to vertices $V_j$ with 2 ≤ j ≤ n.
So there are d(A) vertices $V_{j-1}$, 1 ≤ j − 1 ≤ n − 1, not adjacent to B. Since B
can only be adjacent to vertices among $V_1, \ldots, V_{n-1}$, we have
d(B) ≤ (n − 1) − d(A) and so d(A) + d(B) ≤ n − 1, and this contradicts the
hypothesis of the theorem.
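The hypothesis of Theorem 9.2.3 is simple to check by machine. The Python sketch below tests whether d(A) + d(B) ≥ n for every pair of non-adjacent vertices. The function name `satisfies_degree_condition` and the three small test graphs are our own illustrations. Remember that the condition is only sufficient: a graph that fails it may still have a Hamilton cycle, as the 5-cycle below shows.

```python
from itertools import combinations


def satisfies_degree_condition(vertices, edges):
    """True if n >= 3 and d(A) + d(B) >= n for all non-adjacent pairs A, B."""
    n = len(vertices)
    adjacent = {frozenset(e) for e in edges}
    degree = {v: 0 for v in vertices}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return n >= 3 and all(
        degree[a] + degree[b] >= n
        for a, b in combinations(vertices, 2)
        if frozenset((a, b)) not in adjacent
    )


# Complete graph on 4 vertices with the edge (1, 2) removed.
almost_complete = [(1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
# A path on 4 vertices: it has no Hamilton cycle.
path = [(1, 2), (2, 3), (3, 4)]
# A cycle on 5 vertices: it is its own Hamilton cycle but fails the condition.
five_cycle = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)]

print(satisfies_degree_condition([1, 2, 3, 4], almost_complete))
print(satisfies_degree_condition([1, 2, 3, 4], path))
print(satisfies_degree_condition([1, 2, 3, 4, 5], five_cycle))
```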

EXAMPLE 9.2.5.
Consider the 3 graphs in Figure 9.7. The first graph satisfies the conditions
of the theorem and has a Hamilton cycle. The second graph does not satisfy
the conditions of the theorem and does not have a Hamilton cycle. The third
graph has a Hamilton cycle, even though it does not satisfy the conditions of
the theorem.

Figure 9.7: Three graphs

Gray Codes
The name Gray code is used to describe a listing of all the binary n-sequences
where two adjacent sequences have a Hamming distance of 1. Such a listing arises
from finding a Hamilton path on an n-cube. Gray codes are used to minimize the
effect of error in the conversion of analog signals to digital, and in designed
experiments where there is a large cost in changing more than one attribute at a
time. If they are used in computers to address program memory, the computer
uses less power because fewer address lines change as the program counter
advances.

EXAMPLE 9.2.6.
A Gray code of length 2 is 00, 01, 11, 10.
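One standard way to produce a Gray code of any length is the reflected construction: write the code of length n − 1 with a 0 in front of each sequence, then write it again in reverse order with a 1 in front. The Python sketch below uses this construction (the function name `gray_code` is our own); it is one of many possible Gray codes, each corresponding to a different Hamilton path on the n-cube.

```python
def gray_code(n):
    """Return the reflected binary Gray code of length n as a list of strings."""
    codes = [""]
    for _ in range(n):
        # current list prefixed by 0, then the reversed list prefixed by 1
        codes = ["0" + c for c in codes] + ["1" + c for c in reversed(codes)]
    return codes


def hamming_distance(x, y):
    """Number of positions in which two equal-length sequences differ."""
    return sum(a != b for a, b in zip(x, y))


print(gray_code(2))
print(gray_code(3))
```

For n = 2 this gives the listing of Example 9.2.6. The last sequence also differs from the first in one position, so the listing in fact comes from a Hamilton cycle.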

9.2.3 Exercises
1. Find a Hamilton path in the graph below. Is there also a Hamilton cycle?
Is there an Euler path or circuit?

2. Does the graph below have a Hamilton path? Is there an Euler path or
circuit?

3. Give examples of connected graphs that satisfy each of the following
conditions.
(a) There is both an Euler circuit and a Hamilton cycle.
(b) There is an Euler circuit but no Hamilton cycle.
(c) There is a Hamilton cycle but no Euler circuit.
(d) There is a Hamilton path but not a Hamilton cycle. Can you ever
have a Hamilton cycle but not a Hamilton path?
4. (a) Find the Gray code of length 2 by finding a Hamilton cycle on a
square.
(b) Find a Gray code of length 3 by finding a Hamilton cycle on a cube.
(c) Find a Gray code of length 4. (You may want to work directly with
the binary 4-sequences or with a hypercube of dimension 4.)

9.3 References and Comments


There are a number of books about graph theory and a number of books with
sections on graph theory. For this topic I have used a mix of the presentations
in Dossey, Otto, Spence and Vanden Eynden Discrete Mathematics, (2006) and
A.P. Street and Wallis, Combinatorics: A First Course, (1982).

Topic 10

Trees, Hydrocarbons and Bracket-free Arithmetic

In this topic we are going to consider a special type of graph called a tree. Trees
were used by Cayley in the 1870s to predict the existence of then-unknown
isomers of hydrocarbons. In graphs with costs or distances associated with each
edge, trees can be used to find a connected graph of lowest cost that joins all
the vertices in the graph. Binary trees are used in computer science in various
applications some of which we will discuss.

10.1 Trees
A tree is a connected network in which there are no cycles.
There are a number of immediate consequences of this definition and we
state a number of these in the following theorem.
THEOREM 10.1.1.
The following statements are equivalent.
1. T is a tree.

2. T is connected, and the number of vertices is one more than the number
of edges.
3. T has no cycles, and the number of vertices is one more than the number
of edges.
4. There is exactly one simple path between each pair of vertices in T .

5. T is connected and removing any edge of T results in a disconnected graph.


6. T has no cycles and adding an edge between any two non-adjacent vertices
results in a cycle.
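Statement 2 of Theorem 10.1.1 gives a convenient test for whether a graph is a tree: count the vertices and edges, then check that every vertex can be reached from one starting vertex. The Python sketch below does this; the function name `is_tree` and the three small test graphs are our own.

```python
def is_tree(vertices, edges):
    """Check statement 2: connected, with one more vertex than edges."""
    if len(vertices) != len(edges) + 1:
        return False
    neighbours = {v: [] for v in vertices}
    for u, v in edges:
        neighbours[u].append(v)
        neighbours[v].append(u)
    start = vertices[0]
    reached, frontier = {start}, [start]
    while frontier:  # visit everything that can be reached from start
        for w in neighbours[frontier.pop()]:
            if w not in reached:
                reached.add(w)
                frontier.append(w)
    return len(reached) == len(vertices)


path = [(1, 2), (2, 3), (3, 4)]                  # a tree
triangle_plus_point = [(1, 2), (2, 3), (3, 1)]   # right edge count, not connected
square = [(1, 2), (2, 3), (3, 4), (4, 1)]        # connected, but too many edges

print(is_tree([1, 2, 3, 4], path))
print(is_tree([1, 2, 3, 4], triangle_plus_point))
print(is_tree([1, 2, 3, 4], square))
```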

10.1.1 Exercises
1. Are the following graphs trees?


2. Suppose that a tree has 8 vertices. How many edges does it have?
3. Suppose that a tree has 8 edges. How many vertices does it have?

4. Draw all the non-isomorphic trees with 4 vertices.
5. Show that there are exactly 6 non-isomorphic trees on 6 vertices. Give
the corresponding adjacency matrices.
6. Give the adjacency matrices for the graphs in Question 1 above.
7. Draw a tree with 7 vertices that has one vertex of degree 4. What is the
longest path in your tree?
8. Draw a tree with 10 vertices with one vertex of degree 5. What is the
longest path in your tree?
9. Consider a tree with v vertices in which one vertex has degree k. Prove
that the longest path in the tree has at most v − k + 1 edges.

10.2 Spanning Trees


A spanning tree of a graph G contains all the vertices of G and a subset of the
edges of G and is a tree.
Spanning trees are often used to find a low-cost connected sub-graph of a
graph. In this case each edge has a “cost” associated with it, which may be the
cost of building a pipeline or it may be the distance of a road. The object is to
find one, or more, spanning trees whose edges have the lowest possible sum of
weights. Such a spanning tree is said to be a minimal spanning tree.
If the weights associated with the edges of a graph are the profits that are
made by using that edge, or the rate of flow along that edge, then the object
would be to find a spanning tree whose edges have the largest possible sum of
weights. Such a spanning tree is said to be a maximal spanning tree.
Algorithms for finding minimal (or maximal) spanning trees exist but will
not be covered in this subject. We will only try to determine such spanning
trees for small graphs where trial and error methods suffice.
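For a small graph, trial and error can be made systematic: list every set of n − 1 edges, keep those sets that form a tree, and compare their total weights. The Python sketch below does exactly that. It is an exhaustive search, not one of the efficient algorithms mentioned above, and the function names and the weighted graph used to try it out are our own illustrations.

```python
from itertools import combinations


def spanning_trees(vertices, weighted_edges):
    """Yield every spanning tree as a tuple of (u, v, weight) edges."""
    for subset in combinations(weighted_edges, len(vertices) - 1):
        # n - 1 edges form a spanning tree exactly when none of them closes a cycle
        parent = {v: v for v in vertices}

        def root(v):
            while parent[v] != v:
                v = parent[v]
            return v

        for u, v, _ in subset:
            root_u, root_v = root(u), root(v)
            if root_u == root_v:
                break  # u and v are already joined, so this edge closes a cycle
            parent[root_u] = root_v
        else:
            yield subset


def total(tree):
    """Sum of the weights of the edges in a tree."""
    return sum(weight for _, _, weight in tree)


vertices = ["A", "B", "C", "D"]
weighted_edges = [("A", "B", 1), ("B", "C", 2), ("C", "D", 3),
                  ("D", "A", 4), ("A", "C", 5)]
trees = list(spanning_trees(vertices, weighted_edges))
minimal = min(trees, key=total)
maximal = max(trees, key=total)
print(len(trees), total(minimal), total(maximal))
```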
EXAMPLE 10.2.1.
Find a spanning tree for the following graph.


10.2.1 Exercises
1. Find a minimal spanning tree in the graph below. Find a maximal spanning
tree in the same graph.

[Figure not reproduced: a weighted graph on the vertices A, B, C, D, E, F, G, H.]

2. Suppose that the weight of the edge FH can be changed. Suppose that
it has weight 1. Does that change the answers to the previous question?
What if it has weight 10? Comment.

10.3 Hydrocarbons
Arthur Cayley was a mathematician who was also interested in chemistry and
he spent time investigating the possible structure of hydrocarbons, formed from
carbon and hydrogen atoms, and alcohols. Alcohols are derived from hydrocarbons
by replacing a hydrogen atom by a hydroxyl OH.
In particular he was interested in answering a question posed by Schorlemmer
in the 1870s: Can there be different compounds with the same number of
hydrogen and carbon atoms but just differently arranged? Such structures are
called isomers. Mathematically the problem is “In how many different ways can
a certain number of vertices with degree 4 and vertices of degree 1 be arranged?”
Cayley produced a paper on the “mathematical theory of isomers” and
investigated isomers of the alcohol with 5 carbon atoms, isobutyl carbinol. So the
mathematical problem is asking how many different trees there are with 5 carbon
atoms and one hydroxyl. At that time two were known, although Cayley’s work
predicted that there were in fact 8. At the time he noted, “the number of known
alcohols is two instead of the foregoing theoretic number eight. It is of course
no objection to the theory that the number of theoretic forms should exceed
the number of known compounds; the missing ones may be simply unknown.”
Thus we have an instance of “mathematics predicting the future” (Crilly) since
all 8 of these amyl alcohols are now known to exist and are used in the
manufacture of synthetic flavourings, for example.

10.3.1 Exercises
1. Find a graphical representation of the 8 amyl alcohols.
2. An alkane is a tree in which each vertex has degree 4 or 1. Show that
there is one alkane with each of one, two and three vertices of degree 4,
two alkanes with 4 vertices of degree 4 and three with 5 vertices of degree
4.

10.4 Rooted Trees


If the edges in a graph each have a direction associated with them then we speak
of the graph being directed. Each vertex of a directed graph has an in-degree,
which is the number of edges entering the vertex, and an out-degree, which is
the number of edges leaving the vertex.
A rooted tree is a directed graph that is a tree when the directions of the edges
are ignored and that has a unique vertex, called the root, with in-degree 0. All
other vertices have in-degree 1. The graph in Figure 10.1 is an example of a
rooted tree. This figure illustrates the conventional way of drawing rooted
trees: the root is at the top and edges are directed down the page.

Figure 10.1: A rooted tree with 10 vertices

Because of the obvious similarities between rooted trees and family trees,
many of the same terms apply. If there is a directed edge from A to B then
B is said to be the child of A and A is said to be the parent of B. The terms
ancestor and descendant have the expected meanings. A vertex with no children
is a terminal vertex.

10.4.1 Weighing Designs


Suppose that we have a set of coins which are all supposed to be of equal weight
but in fact one coin is heavier. We want to be able to identify the heavier coin,
using a balance scale, in as few weighings as possible. (Remember that with a
balance scale either the left pan will go down, the two sides will balance or the
right pan will go down.)
We can do this by using a rooted tree, with each non-terminal vertex telling us
which coins to place in which pan and each of its children corresponding to one
outcome of that weighing. The maximum number of weighings that will be
required is the length of the longest path from the root to a terminal vertex.
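One way to build such a tree, sketched below in Python, is to split the coins still under suspicion into three groups that are as equal as possible, weigh the first two against each other, and continue with whichever group the outcome points to. Each pass through the loop corresponds to moving from a vertex of the tree to one of its children. The function name `find_heavy_coin` is our own, the balance is simulated by comparing total weights, and the sketch assumes that exactly one coin is heavier than the rest.

```python
def find_heavy_coin(weights):
    """Return (position of the heavier coin, number of weighings used)."""
    candidates = list(range(len(weights)))
    weighings = 0
    while len(candidates) > 1:
        size = (len(candidates) + 2) // 3  # number of coins placed in each pan
        left = candidates[:size]
        right = candidates[size:2 * size]
        aside = candidates[2 * size:]
        weighings += 1
        left_total = sum(weights[i] for i in left)
        right_total = sum(weights[i] for i in right)
        if left_total > right_total:
            candidates = left    # left pan goes down
        elif right_total > left_total:
            candidates = right   # right pan goes down
        else:
            candidates = aside   # pans balance
    return candidates[0], weighings


def worst_case(n):
    """Largest number of weighings over all positions of the heavy coin."""
    return max(
        find_heavy_coin([2 if i == heavy else 1 for i in range(n)])[1]
        for heavy in range(n)
    )


for n in (2, 3, 4, 5, 9, 10):
    print(n, worst_case(n))
```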

EXAMPLE 10.4.1.
Draw possible weighings to find one heavier coin from a set of 2, 3, 4 and 5 coins
(in turn).

10.4.2 Bracket-free Arithmetic


We now want to specialise the concept of a rooted tree to a binary tree, which is
one in which each vertex has at most two children. These children are called
the left child and the right child. The diagram below shows the expression
A − B represented by a rooted binary tree.

      −
     / \
    A   B

We can represent any arithmetic expression by a rooted binary tree. The
operands are always terminal vertices and the operations are the internal
vertices. We can construct the appropriate tree for a given expression recursively.

EXAMPLE 10.4.2.
The expression a × b + c means ab + c and so we start with a left child of ab,
a right child of c and a root of +. Then we expand the left child to have an
internal vertex of × and a left child of a and a right child of b.
Suppose instead that the expression was a × (b + c). Then the left child is
a, the right child is b + c and the root is ×. The right child is then expanded as
above.

EXAMPLE 10.4.3.
Find the tree corresponding to the expression (a + b × c) − f − de .

Having obtained a representation of an arithmetic expression by a binary
tree, we need to use this representation to evaluate the expression. A traversal
of a graph visits each vertex of a graph exactly once.
Three traversals are in common use. These are

• preorder traversal;
• postorder traversal;
• inorder traversal.
In a preorder traversal visit the parent before the children and visit the left
child before the right child.
EXAMPLE 10.4.4.
Evaluate the preorder traversals for the trees in Examples 10.4.2 and 10.4.3.

In a postorder traversal visit the children before the parent and visit the
left child before the right child. This order of writing down expressions is also
known as reverse Polish notation and is used by some hand-held calculators.
EXAMPLE 10.4.5.
Evaluate the postorder traversals for the trees in Examples 10.4.2 and 10.4.3.

In an inorder traversal the left child is visited and then the parent and then
the right child. For an expression to be evaluated correctly it may also be
necessary for some brackets to be inserted.
EXAMPLE 10.4.6.
Evaluate the inorder traversals for the trees in Examples 10.4.2 and 10.4.3.
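The three traversals can be written as short recursive functions. In the Python sketch below a terminal vertex is a string and an internal vertex is a tuple (operation, left child, right child); this representation and the function names are our own choices. The two trees are those of Example 10.4.2, and the inorder traversal puts brackets around every operation so that the expression is always read correctly.

```python
def preorder(tree):
    """Parent, then left subtree, then right subtree."""
    if isinstance(tree, str):
        return [tree]
    operation, left, right = tree
    return [operation] + preorder(left) + preorder(right)


def postorder(tree):
    """Left subtree, then right subtree, then parent (reverse Polish notation)."""
    if isinstance(tree, str):
        return [tree]
    operation, left, right = tree
    return postorder(left) + postorder(right) + [operation]


def inorder(tree):
    """Left subtree, parent, right subtree, fully bracketed."""
    if isinstance(tree, str):
        return tree
    operation, left, right = tree
    return "(" + inorder(left) + " " + operation + " " + inorder(right) + ")"


first = ("+", ("×", "a", "b"), "c")    # a × b + c
second = ("×", "a", ("+", "b", "c"))   # a × (b + c)

for tree in (first, second):
    print(" ".join(preorder(tree)), "|", " ".join(postorder(tree)), "|", inorder(tree))
```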

10.4.3 Exercises
1. Consider the graph below. By directing the edges appropriately, draw the
graph as a rooted tree with each of the vertices C, E and F as the root
in turn.


2. Draw a rooted tree with 8 vertices with as many terminal vertices as
possible; with as many non-terminal vertices as possible.

3. Draw a rooted tree to find one counterfeit coin in a set of 10 coins. Can
you say whether the coin is heavier or lighter than the real coins? If not,
find a tree which allows you to answer this question as well.

4. Draw trees for each of the expressions a + b; a ∗ b + c ∗ d; a ∗ (b + c)/d − f .

5. Give the preorder, postorder and inorder traversals for each of the trees
in the previous question.

10.5 References and Comments


For this topic I have used a mix of the presentations in Dossey, Otto, Spence
and Vanden Eynden Discrete Mathematics, (2006) and A.P. Street and Wallis,
Combinatorics: A First Course, (1982). In particular, Theorem 10.1.1 appears
in Dossey, Otto, Spence and Vanden Eynden (2006). The information about
Cayley comes from Crilly, Arthur Cayley: Mathematician Laureate of the
Victorian Age, (2005).

Topic 11

Graph Colouring and Sudoku

Consider the following problem. A company manufactures a number of
chemicals. It needs to have stores of these chemicals on-site but some chemicals
react dangerously in the event that they are mixed. Thus the company requires
that incompatible chemicals be stored in different warehouses. Given a set of
chemicals with known incompatibilities, what is the smallest number of
warehouses that are required?
The set of chemicals can represent the vertices of a graph. Two vertices are
joined when they can not be stored in the same warehouse. If we use a different
colour to represent each warehouse then we want to colour the vertices of the
graph so that no vertices that are adjacent are coloured with the same colour.
When the International Biometric Society meets, it arranges to have meet-
ings of its various sub-committees. Some individuals are on several sub-committees
and so it is necessary to schedule meetings so that all the appropriate people
can attend all the relevant meetings.
In this case the vertices are the sub-committees and edges are drawn between
sub-committees that have at least one person in common. Here we use a colour
to represent a meeting time. Vertices that are joined must meet at different
times.

11.1 Graph Colouring


A colouring of a graph is an allocation of a finite set of colours to the vertices of
a graph such that no two vertices that are joined by an edge receive the same
colour.

EXAMPLE 11.1.1.
Any path requires only two colours in a colouring. Any odd cycle requires at
least three colours in any colouring.

The chromatic number of a graph is the smallest number of colours that are
required to colour the graph. The chromatic number of the graph G is usually
denoted by χ(G).
Determining χ(G) in general is difficult. We can get an upper bound on
χ(G) easily however.

THEOREM 11.1.1.
Consider a graph G. Suppose that it has a vertex of degree d and that no other
vertex has larger degree. Then χ(G) ≤ d + 1.
Proof. Colour the vertices one at a time, using d + 1 colours. When we come to
colour a vertex V, at most d of the colours have already been used on its
neighbours, since there are at most d vertices adjacent to V. So at least one
colour is still available for V. Proceeding in this way completes the colouring.
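The bound of Theorem 11.1.1 can be realised by colouring the vertices one at a time and always choosing the first colour not already used on a neighbour. The Python sketch below does this, with colours numbered from 0; the function name `greedy_colouring` and the two test graphs are our own. The number of colours it uses is an upper bound on χ(G), not necessarily χ(G) itself, and it can depend on the order in which the vertices are listed.

```python
def greedy_colouring(vertices, edges):
    """Give each vertex the smallest colour not already used on a neighbour."""
    neighbours = {v: set() for v in vertices}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    colour = {}
    for v in vertices:
        taken = {colour[w] for w in neighbours[v] if w in colour}
        colour[v] = next(c for c in range(len(vertices)) if c not in taken)
    return colour


def colours_used(colour):
    """Number of different colours appearing in a colouring."""
    return len(set(colour.values()))


complete4 = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
five_cycle = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)]

k4_colouring = greedy_colouring([1, 2, 3, 4], complete4)
cycle_colouring = greedy_colouring([1, 2, 3, 4, 5], five_cycle)
print(colours_used(k4_colouring), colours_used(cycle_colouring))
```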

EXAMPLE 11.1.2.
Consider the complete graph on 4 vertices. Every vertex has degree 3 and so
χ(K4 ) ≤ 4. We see that in fact 4 colours are needed.

EXAMPLE 11.1.3.
The graph below has vertices of degree 5 yet can be coloured with 2 colours.

11.1.1 Exercises
1. Find the chromatic numbers of each of the graphs below. Comment.

2. Find the chromatic numbers of each of the graphs below. Comment.

3. Find the chromatic numbers of each of the graphs below. Comment.

4. Find the chromatic numbers of each of the graphs below. Comment.

5. What does it mean if a graph has a chromatic number of 1?


6. Give a graph with a chromatic number of 3; of 4; of 5.
7. Show that the chromatic number of the following graph is 3.

11.2 The 4-colour Problem


We will use map to describe what we normally think of as a map showing
countries or states with the exception that we will assume that all the regions
of a country or state are contiguous (so that we don’t have to cross any borders
to get from one part of a country or state to another part of the same country
or state).
Robin Thomas (“An Update on the Four-Color Theorem”, Notices of the
American Mathematical Society, 45 (1998), 848-859) writes: “The Four-Color
Problem dates back to 1852 when Francis Guthrie, while trying to color the map
of the counties of England, noticed that four colors sufficed. He asked his brother
Frederick if it was true that any map can be colored using four colors in such
a way that adjacent regions (i.e., those sharing a common boundary segment,
not just a point) receive different colors. Frederick Guthrie then communicated
the conjecture to DeMorgan. The first printed reference is by Cayley in 1878.”
(Other authors give Cayley’s paper as Arthur Cayley, “On the colourings of
maps”, Proc. Royal Geographical Society 1 (1879), 259-261.) The conjecture
was not proved until 1976, when Appel and Haken gave a computer-based proof
(Appel, Haken, and Koch, “Every Planar Map is Four Colorable”,
Illinois Journal of Mathematics, 21 (1977), 429-567).
We represent a map as a graph by placing a vertex in each region (state or
country) and joining two vertices that correspond to regions that share a border.
Then a map colouring is a vertex colouring of the corresponding graph.

11.2.1 Exercises
1. Represent mainland Australia by a graph. What is the chromatic number
of the graph?

11.3 Sudoku
Recall that we observed that Sudoku puzzles are examples of Latin squares of
order 9 with additional restrictions on various subsquares of order 3.
If we represent the 81 cells of the Sudoku as the 81 vertices of a graph, and
join two vertices that represent cells that are in the same row or in the same
column or in the same subsquare then solving a Sudoku puzzle is the same as
finding a graph colouring with 9 colours.
The next example shows this idea for a Sudoku-type puzzle with four 2 × 2
subsquares in an array of order 4.
EXAMPLE 11.3.1.
Consider the array
1 2 3 4
3 4 1 2
2 1 4 3
4 3 2 1
Then it can be represented by the graph below where the cells in the array are
numbered along each row in turn (so-called row-major order).

[Figure not reproduced: the 16 cells of the array drawn as the vertices
1, ..., 9, A, ..., G of a graph.]

This graph came from the following Mathematica code (included for interest
only).

    sudo4 =
      ShowGraph[
        DeleteEdges[AddEdges[Cycle[16],
          {{1, 3}, {1, 4}, {2, 4}, {5, 7}, {5, 8}, {6, 8}, {9, 11}, {9, 12},
           {10, 12}, {13, 15}, {13, 16}, {14, 16}, {1, 5}, {1, 9}, {1, 13},
           {5, 9}, {5, 13}, {9, 13}, {2, 6}, {2, 10}, {2, 14}, {6, 10}, {6, 14},
           {10, 14}, {3, 7}, {3, 11}, {3, 15}, {7, 11}, {7, 15}, {11, 15},
           {4, 8}, {4, 12}, {4, 16}, {8, 12}, {8, 16}, {12, 16}}],
          {{4, 5}, {8, 9}, {12, 13}, {16, 1}}],
        {{1, 7, 10, 16, VertexColor -> Green}, {2, 8, 9, 15, VertexColor -> Blue},
         {3, 5, 12, 14, VertexColor -> Red}},
        VertexLabel -> {1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F, G}];
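The graph described in this section can also be built directly from its definition. The Python sketch below is our own illustration (the function name `sudoku_graph` is hypothetical): it numbers the cells in row-major order, joins two cells whenever they share a row, a column or a subsquare, and then checks that the array of Example 11.3.1 is a proper colouring with 4 colours.

```python
def sudoku_graph(box):
    """Edges joining cells of a (box*box) by (box*box) array that share a row, column or subsquare."""
    order = box * box
    cells = [(r, c) for r in range(order) for c in range(order)]
    number = {cell: i + 1 for i, cell in enumerate(cells)}  # row-major, from 1
    edges = set()
    for (r1, c1) in cells:
        for (r2, c2) in cells:
            same_box = (r1 // box, c1 // box) == (r2 // box, c2 // box)
            if (r1, c1) < (r2, c2) and (r1 == r2 or c1 == c2 or same_box):
                edges.add((number[(r1, c1)], number[(r2, c2)]))
    return edges


array = [[1, 2, 3, 4],
         [3, 4, 1, 2],
         [2, 1, 4, 3],
         [4, 3, 2, 1]]
colour = {4 * r + c + 1: array[r][c] for r in range(4) for c in range(4)}
edges = sudoku_graph(2)
proper = all(colour[u] != colour[v] for u, v in edges)
print(len(edges), proper)
```

With `box = 3` the same function gives the graph on 81 vertices for an ordinary Sudoku.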

11.3.1 Exercises
1. Given the partial Sudoku of order 4
1
2
4
3
Draw this partial graph colouring. Complete the colouring so that all
vertices are coloured.

11.4 References and Comments


Graph colouring is a standard topic in most books on graph theory. Dossey,
Otto, Spence and Vanden Eynden Discrete Mathematics, (2006), A.P. Street
and Wallis, Combinatorics: A First Course, (1982) and Wallis, A Beginner’s
Guide to Graph Theory, (2007) all have discussions of graph colouring and the
colouring of maps. The relation between Sudoku and graph colouring may be
found in Herzberg and Murty (“Sudoku Squares and Chromatic Polynomials”,
Notices of the AMS, 54 (2007), 708-717).
