
COMPUTING PRACTICES

A History and Evaluation of System R
Donald D. Chamberlin
Morton M. Astrahan
Michael W. Blasgen
James N. Gray
W. Frank King
Bruce G. Lindsay
Raymond Lorie
James W. Mehl
Thomas G. Price
Franco Putzolu
Patricia Griffiths Selinger
Mario Schkolnick
Donald R. Slutz
Irving L. Traiger
Bradford W. Wade
Robert A. Yost
IBM Research Laboratory
San Jose, California
SUMMARY: System R, an experimental database system, was constructed to demonstrate that the usability advantages of the relational data model can be realized in a system with the complete function and high performance required for everyday production use. This paper describes the three principal phases of the System R project and discusses some of the lessons learned from System R about the design of relational systems and database systems in general.
Key words and phrases: database management systems, relational model, compilation, locking, recovery, access path selection, authorization
CR Categories: 3.50, 3.70, 3.72, 4.33, 4.6
Authors' address: D. D. Chamberlin et al., IBM Research Laboratory, 5600 Cottle Road, San Jose, California 95193.
Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.
1981 ACM 0001-0782/81/1000-0632 75.
Communications of the ACM, October 1981, Volume 24, Number 10
1. Introduction
Throughout the history of information storage in computers, one of the most readily observable trends has been the focus on data independence. C.J. Date [27] defined data independence as "immunity of applications to change in storage structure and access strategy." Modern database systems offer data independence by providing a high-level user interface through which users deal with the information content of their data, rather than the various bits, pointers, arrays, lists, etc. which are used to represent that information. The system assumes responsibility for choosing an appropriate internal representation for the information; indeed, the representation of a given fact may change over time without users being aware of the change.
The relational data model was proposed by E. F. Codd [22] in 1970 as the next logical step in the trend toward data independence. Codd observed that conventional database systems store information in two ways: (1) by the contents of records stored in the database, and (2) by the ways in which these records are connected together. Different systems use various names for the connections among records, such as links, sets, chains, parents, etc. For example, in Figure 1(a), the fact that supplier Acme supplies bolts is represented by connections between the relevant part and supplier records. In such a system, a user frames a question, such as "What is the lowest price for bolts?", by writing a program which "navigates" through the maze of connections until it arrives at the answer to the question. The user of a "navigational" system has the burden (or opportunity) to specify exactly how the query is to be processed; the user's algorithm is then embodied in a program which is dependent on the data structure that existed at the time the program was written.
Relational database systems, as proposed by Codd, have two important properties: (1) all information is represented by data values, never by any sort of "connections" which are visible to the user; (2) the system supports a very high-level language in which users can frame requests for data without specifying algorithms for processing the requests. The relational representation of the data in Figure 1(a) is shown in Figure 1(b). Information about parts is kept in a PARTS relation in which each record has a "key" (unique identifier) called PARTNO. Information about suppliers is kept in a SUPPLIERS relation keyed by SUPPNO. The information which was formerly represented by connections between records is now contained in a third relation, PRICES, in which parts and suppliers are represented by their respective keys. The question "What is the lowest price for bolts?" can be framed in a high-level language like SQL [16] as follows:
    SELECT MIN(PRICE)
    FROM PRICES
    WHERE PARTNO IN
        (SELECT PARTNO
         FROM PARTS
         WHERE NAME = 'BOLT');
A relational system can maintain whatever pointers, indices, or other access aids it finds appropriate for processing user requests, but the user's request is not framed in terms of these access aids and is therefore not dependent on them. Therefore, the system may change its data representation and access aids periodically to adapt to changing requirements without disturbing users' existing applications.
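As a rough illustration of this independence, the sketch below loads the three relations of Figure 1(b) into an in-memory SQLite database and runs the lowest-price query twice, once before and once after an index is added. SQLite is only a convenient modern stand-in here, not System R; the index name PRICES_BY_PARTNO is invented for the example, and part names are stored in upper case on the assumption that the literal 'BOLT' must match exactly.

```python
import sqlite3

# In-memory stand-in database holding the relations of Figure 1(b).
# Assumption: names are stored in upper case so the literal 'BOLT' matches,
# because SQLite compares strings case-sensitively by default.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE PARTS (PARTNO TEXT PRIMARY KEY, NAME TEXT);
    CREATE TABLE SUPPLIERS (SUPPNO TEXT PRIMARY KEY, NAME TEXT);
    CREATE TABLE PRICES (PARTNO TEXT, SUPPNO TEXT, PRICE REAL);
    """
)
conn.executemany(
    "INSERT INTO PARTS VALUES (?, ?)",
    [("P107", "BOLT"), ("P113", "NUT"), ("P125", "SCREW"), ("P132", "GEAR")],
)
conn.executemany(
    "INSERT INTO SUPPLIERS VALUES (?, ?)",
    [("S51", "ACME"), ("S57", "AJAX"), ("S63", "AMCO")],
)
conn.executemany(
    "INSERT INTO PRICES VALUES (?, ?, ?)",
    [
        ("P107", "S51", 0.59), ("P107", "S57", 0.65),
        ("P113", "S51", 0.25), ("P113", "S63", 0.21),
        ("P125", "S63", 0.15),
        ("P132", "S57", 5.25), ("P132", "S63", 10.00),
    ],
)

# The query text never mentions how the data is to be reached.
LOWEST_BOLT_PRICE = """
    SELECT MIN(PRICE)
    FROM PRICES
    WHERE PARTNO IN
        (SELECT PARTNO
         FROM PARTS
         WHERE NAME = 'BOLT')
"""


def lowest_bolt_price(connection):
    """Run the unchanged query and return its single result value."""
    return connection.execute(LOWEST_BOLT_PRICE).fetchone()[0]


price_without_index = lowest_bolt_price(conn)

# Change the access aids (hypothetical index name); the query is untouched.
conn.execute("CREATE INDEX PRICES_BY_PARTNO ON PRICES (PARTNO)")
price_with_index = lowest_bolt_price(conn)

print(price_without_index, price_with_index)
```

The same query text yields the same answer in both runs; only the access aids available to the system have changed.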
Since Codd's original paper, the advantages of the relational data model in terms of user productivity and data independence have become widely recognized. However, as in the early days of high-level programming languages, questions are sometimes raised about whether or not an automatic system can choose as efficient an algorithm for processing a complex query as a trained programmer would. System R is an experimental system constructed at the San Jose IBM Research Laboratory to demonstrate that a relational database system can incorporate the high performance and complete function required for everyday production use.
[Fig. 1(a). A "Navigational" Database.]
The key goals established for System R were:
(1) To provide a high-level, nonnavigational user interface for maximum user productivity and data independence.
(2) To support different types of database use including programmed transactions, ad hoc queries, and report generation.
(3) To support a rapidly changing database environment, in which tables, indexes, views, transactions, and other objects could easily be added to and removed from the database without stopping the system.
(4) To support a population of many concurrent users, with mechanisms to protect the integrity of the database in a concurrent-update environment.
(5) To provide a means of recovering the contents of the database to a consistent state after a failure of hardware or software.
(6) To provide a flexible mechanism whereby different views of stored data can be defined and various users can be authorized to query and update these views.
(7) To support all of the above functions with a level of performance comparable to existing lower-function database systems.
Throughout the System R project, there has been a strong commitment to carry the system through to an operationally complete prototype which could be installed and evaluated in actual user sites.
Fig. 1(b). A Relational Database.
PARTS
    PARTNO  NAME
    P107    Bolt
    P113    Nut
    P125    Screw
    P132    Gear
SUPPLIERS
    SUPPNO  NAME
    S51     Acme
    S57     Ajax
    S63     Amco
PRICES
    PARTNO  SUPPNO  PRICE
    P107    S51       .59
    P107    S57       .65
    P113    S51       .25
    P113    S63       .21
    P125    S63       .15
    P132    S57      5.25
    P132    S63     10.00
The history of System R can be divided into three phases. "Phase Zero" of the project, which occurred during 1974 and most of 1975, involved the development of the SQL user interface [14] and a quick implementation of a subset of SQL for one user at a time. The Phase Zero prototype, described in [2], provided valuable insight in several areas, but its code was eventually abandoned. "Phase One" of the project, which took place throughout most of 1976 and 1977, involved the design and construction of the full-function, multiuser version of System R. An initial system architecture was presented in [4] and subsequent updates to the design were described in [10]. "Phase Two" was the evaluation of System R in actual use. This occurred during 1978 and 1979 and involved experiments at the San Jose Research Laboratory and several other user sites. The results of some of these experiments and user experiences are described in [19-21]. At each user site, System R was installed for experimental purposes only, and not as a supported commercial product.¹
This paper will describe the decisions which were made and the lessons learned during each of the three phases of the System R project.
2. Phase Zero: An Initial Prototype
Phase Zero of the System R project involved the quick implementation of a subset of system functions. From the beginning, it was our intention to learn what we could from this initial prototype, and then scrap the Phase Zero code before construction of the more complete version of System R. We decided to use the relational access method called XRM, which had been developed by R. Lorie at IBM's Cambridge Scientific Center [40]. (XRM was influenced, to some extent, by the "Gamma Zero" interface defined by E. F. Codd and others at San Jose [11].) Since XRM is a single-user access method without locking or recovery capabilities, issues relating to concurrency and recovery were excluded from consideration in Phase Zero.
¹ The System R research prototype later evolved into SQL/Data System, a relational database management product offered by IBM in the DOS/VSE operating system environment.
An interpreter program was written in PL/I to execute statements in the high-level SQL (formerly SEQUEL) language [14, 16] on top of XRM. The implemented subset of the SQL language included queries and updates of the database, as well as the dynamic creation of new database relations. The Phase Zero implementation supported the "subquery" construct of SQL, but not its "join" construct. In effect, this meant that a query could search through several relations in computing its result, but the final result would be taken from a single relation.
The Phase Zero implementation was primarily intended for use as a standalone query interface by end users at interactive terminals. At the time, little emphasis was placed on issues of interfacing to host-language programs (although Phase Zero could be called from a PL/I program). However, considerable thought was given to the human factors aspects of the SQL language, and an experimental study was conducted on the learnability and usability of SQL [44].
One of the basic design decisions in the Phase Zero prototype was that the system catalog, i.e., the description of the content and structure of the database, should be stored as a set of regular relations in the database itself. This approach permits the system to keep the catalog up to date automatically as changes are made to the database, and also makes the catalog information available to the system optimizer for use in access path selection.
The structure of the Phase Zero interpreter was strongly influenced by the facilities of XRM. XRM stores relations in the form of "tuples," each of which has a unique 32-bit "tuple identifier" (TID). Since a TID contains a page number, it is possible, given a TID, to fetch the associated tuple in one page reference. However, rather than actual data values, the tuple contains pointers to the "domains" where the actual data is stored, as shown in Figure 2. Optionally, each domain may have an "inversion," which associates domain values (e.g., "Programmer") with the TIDs of tuples in which the values appear. Using the inversions, XRM makes it easy to find a list of TIDs of tuples which contain a given value. For example, in Figure 2, if inversions exist on both the JOB and LOCATION domains, XRM provides commands to create a list of TIDs of employees who are programmers, and another list of TIDs of employees who work in Evanston. If the SQL query calls for programmers who work in Evanston, these TID lists can be intersected to obtain the list of TIDs of tuples which satisfy the query, before any tuples are actually fetched.
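The TID-list intersection just described can be sketched in a few lines. This is only an illustration of the idea, not XRM's interface: the employee data other than John Smith, the function name build_inversion, and the use of Python sets for TID lists are all assumptions, and for brevity the tuples hold their values inline rather than pointers into separate domains.

```python
from collections import defaultdict

# Hypothetical employee tuples keyed by TID (tuple identifier).
EMPLOYEES = {
    1: {"NAME": "John Smith", "JOB": "Programmer", "LOCATION": "Evanston"},
    2: {"NAME": "Mary Jones", "JOB": "Programmer", "LOCATION": "San Jose"},
    3: {"NAME": "Ann Brown", "JOB": "Clerk", "LOCATION": "Evanston"},
    4: {"NAME": "Tom Lee", "JOB": "Programmer", "LOCATION": "Evanston"},
}


def build_inversion(tuples, domain):
    """Map each value of one domain to the set of TIDs in which it appears."""
    inversion = defaultdict(set)
    for tid, row in tuples.items():
        inversion[row[domain]].add(tid)
    return inversion


job_inversion = build_inversion(EMPLOYEES, "JOB")
location_inversion = build_inversion(EMPLOYEES, "LOCATION")

# Build one TID list per predicate, then intersect them.
programmer_tids = job_inversion.get("Programmer", set())
evanston_tids = location_inversion.get("Evanston", set())
qualifying_tids = sorted(programmer_tids & evanston_tids)

# Only now are the qualifying tuples themselves fetched.
names = [EMPLOYEES[tid]["NAME"] for tid in qualifying_tids]
print(qualifying_tids, names)
```

No employee tuple is touched until the final step, which is the property the Phase Zero optimizer relied on.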
The most challenging task in constructing the Phase Zero prototype was the design of optimizer algorithms for efficient execution of SQL statements on top of XRM. The design of the Phase Zero optimizer is given in [2]. The objective of the optimizer was to minimize the number of tuples fetched from the database in processing a query. Therefore, the optimizer made extensive use of inversions and often manipulated TID lists before beginning to fetch tuples. Since the TID lists were potentially large, they were stored as temporary objects in the database during query processing.
The results of the Phase Zero implementation were mixed. One strongly felt conclusion was that it is a very good idea, in a project the size of System R, to plan to throw away the initial implementation. On the positive side, Phase Zero demonstrated the usability of the SQL language, the feasibility of creating new tables and inversions "on the fly" and relying on an automatic optimizer for access path selection, and the convenience of storing the system catalog in the database itself. At the same time, Phase Zero taught us a number of valuable lessons which greatly influenced the design of our later implementation. Some of these lessons are summarized below.
(1) The optimizer should take into account not just the cost of fetching tuples, but the costs of creating and manipulating TID lists, then fetching tuples, then fetching the data pointed to by the tuples. When these "hidden costs" are taken into account, it will be seen that the manipulation of TID lists is quite expensive, especially if the TID lists are managed in the database rather than in main storage.
(2) Rather than "number of tuples fetched," a better measure of cost would have been "number of I/Os." This improved cost measure would have revealed the great importance of clustering together related tuples on physical pages so that several related tuples could be fetched by a single I/O. Also, an I/O measure would have revealed a serious drawback of XRM: Storing the domains separately from the tuples causes many extra I/Os to be done in retrieving data values. Because of this, our later implementation stored data values in the actual tuples rather than in separate domains. (In defense of XRM, it should be noted that the separation of data values from tuples has some advantages if data values are relatively large and if many tuples are processed internally compared to the number of tuples which are materialized for output.)
(3) Because the Phase Zero implementation was observed to be CPU-bound during the processing of a typical query, it was decided the optimizer cost measure should be a weighted sum of CPU time and I/O count, with weights adjustable according to the system configuration.
(4) Observation of some of the applications of Phase Zero convinced us of the importance of the "join" formulation of SQL. In our subsequent implementation, both "joins" and "subqueries" were supported.
(5) The Phase Zero optimizer was quite complex and was oriented toward complex queries. In our later implementation, greater emphasis was placed on relatively simple interactions, and care was taken to minimize the "path length" for simple SQL statements.
[Fig. 2. XRM Storage Structure: a tuple, identified by its TID, holds pointers into Domain #1: Names ("John Smith"), Domain #2: Jobs ("Programmer"), and Domain #3: Locations ("Evanston").]
3. Phase One: Construction of a Multiuser Prototype
After the completion and evaluation of the Phase Zero prototype, work began on the construction of the full-function, multiuser version of System R. Like Phase Zero, System R consisted of an access method (called RSS, the Research Storage System) and an optimizing SQL processor (called RDS, the Relational Data System) which runs on top of the RSS. Separation of the RSS and RDS provided a beneficial degree of modularity; e.g., all locking and logging functions were isolated in the RSS, while all authorization and access path selection functions were isolated in the RDS. Construction of the RSS was under way in 1975 and construction of the RDS began in 1976. Unlike XRM, the RSS was originally designed to support multiple concurrent users.
The multiuser prototype of System R contained several important subsystems which were not present in the earlier Phase Zero prototype. In order to prevent conflicts which might arise when two concurrent users attempt to update the same data value, a locking subsystem was provided. The locking subsystem ensures that each data value is accessed by only one user at a time, that all the updates made by a given transaction become effective simultaneously, and that deadlocks between users are detected and resolved. The security of the system was enhanced by view and authorization subsystems. The view subsystem permits users to define alternative views of the database (e.g., a view of the employee file in which salaries are deleted or aggregated by department). The authorization subsystem ensures that each user has access only to those views for which he has been specifically authorized by their creators. Finally, a recovery subsystem was provided which allows the database to be restored to a consistent state in the event of a hardware or software failure.
In order to provide a useful host-language capability, it was decided that System R should support both PL/I and Cobol application programs as well as a standalone query interface, and that the system should run under either the VM/CMS or MVS/TSO operating system environment. A key goal of the SQL language was to present the same capabilities, and a consistent syntax, to users of the PL/I and Cobol host languages and to ad hoc query users. The imbedding of SQL into PL/I is described in [16]. Installation of a multiuser database system under VM/CMS required certain modifications to the operating system in support of communicating virtual machines and writable shared virtual memory. These modifications are described in [32].
The standalone query interface of System R (called UFI, the User-Friendly Interface) is supported by a dialog manager program, written in PL/I, which runs on top of System R like any other application program. Therefore, the UFI support program is a cleanly separated component and can be modified independently of the rest of the system. In fact, several users improved on our UFI by writing interactive dialog managers of their own.
The Compilation Approach
Perhaps the most important decision in the design of the RDS was inspired by R. Lorie's observation, in early 1976, that it is possible to compile very high-level SQL statements into compact, efficient routines in System/370 machine language [42]. Lorie was able to demonstrate that SQL statements of arbitrary complexity could be decomposed into a relatively small collection of machine-language "fragments," and that an optimizing compiler could assemble these code fragments from a library to form a specially tailored routine for processing a given SQL statement. This technique had a very dramatic effect on our ability to support application programs for transaction processing. In System R, a PL/I or Cobol program is run through a preprocessor in which its SQL statements are examined, optimized, and compiled into small, efficient machine-language routines which are packaged into an "access module" for the application program. Then, when the program goes into execution, the access module is invoked to perform all interactions with the database by means of calls to the RSS. The process of creating and invoking an access module is illustrated in Figures 3 and 4. All the overhead of parsing, validity checking, and access path selection is removed from the path of the executing program and placed in a separate preprocessor step which need not be repeated. Perhaps even more important is the fact that the running program interacts only with its small, special-purpose access module rather than with a much larger and less efficient general-purpose SQL interpreter. Thus, the power and ease of use of the high-level SQL language are combined with the execution-time efficiency of the much lower level RSS interface.
Since all access path selection decisions are made during the preprocessor step in System R, there is the possibility that subsequent changes in the database may invalidate the decisions which are embodied in an access module. For example, an index selected by the optimizer may later be dropped from the database. Therefore, System R records with each access module a list of its "dependencies" on database objects such as tables and indexes. The dependency list is stored in the form of a regular relation in the system catalog. When the structure of the database changes (e.g., an index is dropped), all affected access modules are marked "invalid." The next time an invalid access module is invoked, it is regenerated from its original SQL statements, with newly optimized access paths. This process is completely transparent to the System R user.
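The compile-once, invalidate-on-change cycle described above can be sketched as follows. This is a toy model under stated assumptions, not System R's catalog format: the class names Catalog and AccessModule, the index name EMP_JOB_IDX, and the trivial "use the first available candidate index" rule standing in for the optimizer are all invented for illustration.

```python
class Catalog:
    """Toy system catalog: tracks existing indexes and compiled modules."""

    def __init__(self):
        self.indexes = set()
        self.modules = {}

    def create_index(self, name):
        self.indexes.add(name)

    def drop_index(self, name):
        self.indexes.discard(name)
        # Mark every access module that depends on the dropped index invalid.
        for module in self.modules.values():
            if name in module.dependencies:
                module.valid = False


class AccessModule:
    """Compiled form of one SQL statement plus its recorded dependencies."""

    def __init__(self, catalog, name, sql, candidate_indexes):
        self.catalog = catalog
        self.sql = sql  # original statement, kept for recompilation
        self.candidate_indexes = candidate_indexes
        self.compile_count = 0
        self._compile()
        catalog.modules[name] = self

    def _compile(self):
        # Stand-in for the optimizer: use the first candidate index that
        # currently exists, otherwise fall back to a relation scan.
        usable = [i for i in self.candidate_indexes if i in self.catalog.indexes]
        if usable:
            self.plan = ("index scan", usable[0])
            self.dependencies = {usable[0]}
        else:
            self.plan = ("relation scan", None)
            self.dependencies = set()
        self.valid = True
        self.compile_count += 1

    def invoke(self):
        # Regenerate transparently if a dependency has disappeared.
        if not self.valid:
            self._compile()
        return self.plan


catalog = Catalog()
catalog.create_index("EMP_JOB_IDX")
module = AccessModule(
    catalog, "FIND_PROGRAMMERS",
    "SELECT NAME FROM EMP WHERE JOB = 'PROGRAMMER'",
    ["EMP_JOB_IDX"],
)
first_plan = module.invoke()
catalog.drop_index("EMP_JOB_IDX")
second_plan = module.invoke()
print(first_plan, second_plan, module.compile_count)
```

The caller of invoke never sees the recompilation; it only observes that the statement keeps working after the index is gone.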
SQL statements submitted to the interactive UFI dialog manager are processed by the same optimizing compiler as preprocessed SQL statements. The UFI program passes the ad hoc SQL statement to System R with a special "EXECUTE" call. In response to the EXECUTE call, System R parses and optimizes the SQL statement and translates it into a machine-language routine. The routine is indistinguishable from an access module and is executed immediately. This process is described in more detail in [20].
RSS Access Paths
Rather than storing data values in separate "domains" in the manner of XRM, the RSS chose to store data values in the individual records of the database. This resulted in records becoming variable in length and longer, on the average, than the equivalent XRM records. Also, commonly used values are represented many times rather than only once as in XRM. It was felt, however, that these disadvantages were more than offset by the following advantage: All the data values of a record could be fetched by a single I/O.
In place of XRM "inversions," the RSS provides "indexes," which are associative access aids implemented in the form of B-Trees [26]. Each table in the database may have anywhere from zero indexes up to an index on each column (it is also possible to create an index on a combination of columns). Indexes make it possible to scan the table in order by the indexed values, or to directly access the records which match a particular value. Indexes are maintained automatically by the RSS in the event of updates to the database.
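The two uses of an index named above, ordered scanning and direct access by value, together with automatic maintenance on update, can be sketched as follows. A sorted Python list searched with the standard bisect module stands in for the B-Tree; the class names SortedIndex and Table and the sample records are assumptions made for the example and do not reflect the RSS interface.

```python
import bisect


class SortedIndex:
    """Stand-in for a B-Tree: (key, record id) entries kept in key order."""

    def __init__(self):
        self._entries = []

    def insert(self, key, rid):
        bisect.insort(self._entries, (key, rid))

    def delete(self, key, rid):
        pos = bisect.bisect_left(self._entries, (key, rid))
        if pos < len(self._entries) and self._entries[pos] == (key, rid):
            del self._entries[pos]

    def lookup(self, key):
        """Return the record ids whose key equals the given value."""
        pos = bisect.bisect_left(self._entries, (key,))
        rids = []
        while pos < len(self._entries) and self._entries[pos][0] == key:
            rids.append(self._entries[pos][1])
            pos += 1
        return rids

    def ordered_rids(self):
        return [rid for _, rid in self._entries]


class Table:
    """Records stored by record id, with one index kept up to date."""

    def __init__(self, indexed_column):
        self.records = {}
        self.indexed_column = indexed_column
        self.index = SortedIndex()
        self._next_rid = 1

    def insert(self, record):
        rid = self._next_rid
        self._next_rid += 1
        self.records[rid] = dict(record)
        self.index.insert(record[self.indexed_column], rid)
        return rid

    def update(self, rid, column, value):
        record = self.records[rid]
        if column == self.indexed_column:
            # The index is maintained as a side effect of the update.
            self.index.delete(record[column], rid)
            self.index.insert(value, rid)
        record[column] = value

    def scan_in_order(self):
        return [self.records[rid] for rid in self.index.ordered_rids()]

    def fetch_equal(self, value):
        return [self.records[rid] for rid in self.index.lookup(value)]


emp = Table("JOB")
smith = emp.insert({"NAME": "Smith", "JOB": "Programmer"})
jones = emp.insert({"NAME": "Jones", "JOB": "Clerk"})
lee = emp.insert({"NAME": "Lee", "JOB": "Programmer"})
print([r["NAME"] for r in emp.scan_in_order()])
print([r["NAME"] for r in emp.fetch_equal("Programmer")])
```

A real B-Tree keeps its entries in pages on disk so that both operations cost a small number of I/Os; the sorted list here only reproduces the logical behavior.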
The RSS also implements "links," which are pointers stored with a record which connect it to other related records. The connection of records on links is not performed automatically by the RSS, but must be done by a higher level system.
[Fig. 3. Precompilation Step: the System R precompiler (XPREP) takes a PL/I source program containing an SQL statement such as SELECT NAME INTO $X FROM EMP WHERE EMPNO=$Y and produces a modified PL/I program, in which the SQL statement is replaced by a CALL, together with an access module of machine code ready to run on the RSS.]
[Fig. 4. Execution Step: the user's object program calls the execution-time system (XRDI), which loads and then calls the access module, which in turn calls the RSS.]
The access paths made available by the RSS include (1) index scans, which access a table associatively and scan it in value order using an index; (2) relation scans, which scan over a table as it is laid out in physical storage; (3) link scans, which traverse from one record to another using links. On any of these types of scan, "search arguments" may be specified which limit the records returned to those satisfying a certain predicate. Also, the RSS provides a built-in sorting mechanism which can take records from any of the scan methods and sort them into some value order, storing the result in a temporary list in the database. In System R, the RDS makes extensive use of index and relation scans and sorting. The RDS also utilizes links for internal purposes but not as an access path to user data.
The Optimizer
Building on our Phase Zero experience, we designed the System R optimizer to minimize the weighted sum of the predicted number of I/Os and RSS calls in processing an SQL statement (the relative weights of these two terms are adjustable according to system configuration). Rather than manipulating TID lists, the optimizer chooses to scan each table in the SQL query by means of only one index (or, if no suitable index exists, by means of a relation scan). For example, if the query calls for programmers who work in Evanston, the optimizer might choose to use the job index to find programmers and then examine their locations; it might use the location index to find Evanston employees and examine their jobs; or it might simply scan the relation and examine the job and location of all employees. The choice would be based on the optimizer's estimate of both the clustering and selectivity properties of each index, based on statistics stored in the system catalog. An index is considered highly selective if it has a large ratio of distinct key values to total entries. An index is considered to have the clustering property if the key order of the index corresponds closely to the ordering of records in physical storage. The clustering property is important because when a record is fetched via a clustering index, it is likely that other records with the same key will be found on the same page, thus minimizing the number of page fetches. Because of the importance of clustering, mechanisms were provided for loading data in value order and preserving the value ordering when new records are inserted into the database.
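The three-way choice in the programmers-in-Evanston example can be sketched with a toy cost model. Only the overall shape, a weighted sum of predicted I/Os and RSS calls, comes from the text; the specific formulas below (matching records estimated as rows divided by distinct keys, one page fetch per record for an unclustered index, one RSS call per record examined), the weight value, the statistics, and all names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class TableStats:
    rows: int   # total records in the table
    pages: int  # pages the table occupies on disk


@dataclass
class IndexStats:
    name: str
    column: str
    distinct_keys: int  # more distinct keys means higher selectivity
    clustered: bool     # key order matches physical record order


def relation_scan_cost(table, w):
    # Assumed: read every page once and examine every record.
    return table.pages + w * table.rows


def index_scan_cost(table, index, w):
    # Assumed: an equality predicate matches rows / distinct_keys records.
    matching = table.rows / index.distinct_keys
    if index.clustered:
        # Matching records sit together, so few pages are touched.
        page_fetches = max(1.0, matching * table.pages / table.rows)
    else:
        # Pessimistic assumption: each matching record is on its own page.
        page_fetches = matching
    return page_fetches + w * matching


def choose_access_path(table, indexes, predicate_columns, w=0.1):
    """Return (method, index name, cost) with the lowest predicted cost."""
    candidates = [("relation scan", None, relation_scan_cost(table, w))]
    for index in indexes:
        if index.column in predicate_columns:
            cost = index_scan_cost(table, index, w)
            candidates.append(("index scan", index.name, cost))
    return min(candidates, key=lambda candidate: candidate[2])


emp_stats = TableStats(rows=10000, pages=500)
emp_indexes = [
    IndexStats("EMP_JOB", "JOB", distinct_keys=20, clustered=False),
    IndexStats("EMP_LOCATION", "LOCATION", distinct_keys=50, clustered=True),
]
choice = choose_access_path(emp_stats, emp_indexes, {"JOB", "LOCATION"})
print(choice)
```

With these made-up statistics the clustered, more selective location index wins; changing the statistics or the weight can tip the choice toward the job index or the relation scan, which is the reason the optimizer consults the catalog rather than following a fixed rule.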
The techniques of the System R optimizer for performing joins of two or more tables have their origin in a study conducted by M. Blasgen and K. Eswaran [7]. Using APL models, Blasgen and Eswaran studied ten methods of joining together tables, based on the use of indexes, sorting, physical pointers, and TID lists. The number of disk accesses required to perform a join was predicted on the basis of various assumptions for the ten join methods. Two join methods were identified such that one or the other was optimal or nearly optimal under most circumstances. The two methods are as follows:
Join Method 1: Scan over the qualifying rows of table A. For each row, fetch the matching rows of table B (usually, but not always, an index on table B is used).
Join Method 2: (Often used when no suitable index exists.) Sort the qualifying rows of tables A and B in order by their respective join fields. Then scan over the sorted lists and merge them by matching values.
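The two join methods can be sketched in memory as follows, using the PARTS and PRICES data of Figure 1(b). This is an illustration of the logic only: a Python dictionary stands in for the index on table B, in-memory sorting stands in for the RSS sort, and the function names are invented for the example.

```python
def join_method_1(table_a, table_b, field_a, field_b):
    """Scan A; for each row, fetch matching rows of B through an index."""
    # A dictionary plays the role of an index on B's join field.
    index_b = {}
    for row_b in table_b:
        index_b.setdefault(row_b[field_b], []).append(row_b)
    result = []
    for row_a in table_a:
        for row_b in index_b.get(row_a[field_a], []):
            result.append((row_a, row_b))
    return result


def join_method_2(table_a, table_b, field_a, field_b):
    """Sort both inputs on their join fields, then merge matching values."""
    a = sorted(table_a, key=lambda row: row[field_a])
    b = sorted(table_b, key=lambda row: row[field_b])
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        key_a, key_b = a[i][field_a], b[j][field_b]
        if key_a < key_b:
            i += 1
        elif key_a > key_b:
            j += 1
        else:
            # Find the run of equal keys on each side and pair them up.
            i_end = i
            while i_end < len(a) and a[i_end][field_a] == key_a:
                i_end += 1
            j_end = j
            while j_end < len(b) and b[j_end][field_b] == key_a:
                j_end += 1
            for row_a in a[i:i_end]:
                for row_b in b[j:j_end]:
                    result.append((row_a, row_b))
            i, j = i_end, j_end
    return result


PARTS = [
    {"PARTNO": "P107", "NAME": "Bolt"}, {"PARTNO": "P113", "NAME": "Nut"},
    {"PARTNO": "P125", "NAME": "Screw"}, {"PARTNO": "P132", "NAME": "Gear"},
]
PRICES = [
    {"PARTNO": "P107", "SUPPNO": "S51", "PRICE": 0.59},
    {"PARTNO": "P107", "SUPPNO": "S57", "PRICE": 0.65},
    {"PARTNO": "P113", "SUPPNO": "S51", "PRICE": 0.25},
    {"PARTNO": "P113", "SUPPNO": "S63", "PRICE": 0.21},
    {"PARTNO": "P125", "SUPPNO": "S63", "PRICE": 0.15},
    {"PARTNO": "P132", "SUPPNO": "S57", "PRICE": 5.25},
    {"PARTNO": "P132", "SUPPNO": "S63", "PRICE": 10.00},
]


def summarize(pairs):
    """Reduce joined row pairs to sortable (name, supplier, price) triples."""
    return sorted((a["NAME"], b["SUPPNO"], b["PRICE"]) for a, b in pairs)


by_method_1 = summarize(join_method_1(PARTS, PRICES, "PARTNO", "PARTNO"))
by_method_2 = summarize(join_method_2(PARTS, PRICES, "PARTNO", "PARTNO"))
print(by_method_1 == by_method_2, len(by_method_1))
```

Both methods produce the same joined rows; they differ only in how much work is needed to find them, which is what the optimizer's cost estimates compare.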
When selecting an access path for a join of several tables, the System R optimizer considers the problem to be a sequence of binary joins. It then performs a tree search in which each level of the tree consists of one of the binary joins. The choices to be made at each level of the tree include which join method to use and which index, if any, to select for scanning. Comparisons are applied at each level of the tree to prune away paths which achieve the same results as other, less costly paths. When all paths have been examined, the optimizer selects the one of minimum predicted cost. The System R optimizer algorithms are described more fully in [47].
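A much simplified sketch of such a level-by-level search with pruning is given below. Everything quantitative in it is assumed for illustration: the row counts, the uniform join selectivity, the cost formulas for the two join methods, and the rule that two partial plans "achieve the same results" when they have joined the same set of tables. It should not be read as the algorithm of [47].

```python
import math

# Hypothetical statistics; none of these numbers come from the paper.
TABLE_ROWS = {"PARTS": 4, "SUPPLIERS": 3, "PRICES": 7}
INDEXED = {"PARTS", "SUPPLIERS"}  # tables with an index on the join field
SELECTIVITY = 0.25                # assumed fraction of row pairs that match


def method1_cost(outer_rows, inner_rows, inner_indexed):
    # Per outer row: an index probe (assumed cost 2) or a scan of the inner.
    per_row = 2 if inner_indexed else inner_rows
    return outer_rows * per_row


def method2_cost(outer_rows, inner_rows):
    def sort_cost(n):
        return n * math.log2(n) if n > 1 else 0.0
    return sort_cost(outer_rows) + sort_cost(inner_rows) + outer_rows + inner_rows


def best_join_order(rows, indexed, selectivity):
    """Search join sequences level by level, pruning costlier equivalents."""
    # Level 0: each single table, at the cost of scanning it.
    best = {
        frozenset([t]): (float(n), float(n), [t]) for t, n in rows.items()
    }
    for _ in range(len(rows) - 1):
        next_best = {}
        for joined, (cost, out_rows, steps) in best.items():
            for t in rows:
                if t in joined:
                    continue
                options = (
                    ("method 1", method1_cost(out_rows, rows[t], t in indexed)),
                    ("method 2", method2_cost(out_rows, rows[t])),
                )
                method, step_cost = min(options, key=lambda o: o[1])
                new_cost = cost + step_cost
                new_rows = out_rows * rows[t] * selectivity
                key = joined | {t}
                # Prune: keep only the cheapest plan joining this table set.
                if key not in next_best or new_cost < next_best[key][0]:
                    next_best[key] = (new_cost, new_rows, steps + [(method, t)])
        best = next_best
    cost, _, steps = best[frozenset(rows)]
    return cost, steps


best_cost, best_steps = best_join_order(TABLE_ROWS, INDEXED, SELECTIVITY)
print(best_cost, best_steps)
```

Pruning keeps the number of surviving plans at each level bounded by the number of distinct table sets rather than the number of orderings, while still finding the cheapest sequence under this cost model.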
Views and Authorization
The major objectives of the view and authorization subsystems of System R were power and flexibility. We wanted to allow any SQL query to be used as the definition of a view. This was accomplished by storing each view definition in the form of an SQL parse tree. When an SQL operation is to be executed against a view, the parse tree which defines the operation is merged with the parse tree which defines the view, producing a composite parse tree which is then sent to the optimizer for access path selection. This approach is similar to the "query modification" technique proposed by Stonebraker [48]. The algorithms developed for merging parse trees were sufficiently general so that nearly any SQL statement could be executed against any view definition, with the restriction that a view can be updated only if it is derived from a single table in the database. The reason for this restriction is that some updates to views which are derived from more than one table are not meaningful (an example of such an update is given in [24]).
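The merging of an operation with a view definition can be illustrated with a deliberately small model. Here a "parse tree" is just a dictionary with a column list, a source name, and a list of predicates, and views are restricted to selections and projections of one table; the view PROGRAMMERS, the dictionary layout, and the function merge_with_view are all assumptions made for the sketch, far narrower than the general algorithms described above.

```python
# Hypothetical view: programmers only, with the salary column hidden.
VIEWS = {
    "PROGRAMMERS": {
        "select": ["EMPNO", "NAME", "LOCATION"],
        "from": "EMP",
        "where": [("JOB", "=", "Programmer")],
    }
}


def merge_with_view(query, views):
    """Rewrite a query on a view into a query on the underlying table."""
    view = views.get(query["from"])
    if view is None:
        return query  # already expressed against a base table
    # Every column the query mentions must be visible through the view.
    mentioned = list(query["select"]) + [col for col, _, _ in query["where"]]
    hidden = [col for col in mentioned if col not in view["select"]]
    if hidden:
        raise ValueError("columns not visible through view: %s" % hidden)
    merged = {
        "select": list(query["select"]),
        "from": view["from"],
        # Composite predicate: the view's restriction AND the query's own.
        "where": list(view["where"]) + list(query["where"]),
    }
    # A view may itself be defined on another view.
    return merge_with_view(merged, views)


user_query = {
    "select": ["NAME"],
    "from": "PROGRAMMERS",
    "where": [("LOCATION", "=", "Evanston")],
}
composite = merge_with_view(user_query, VIEWS)
print(composite)
```

The composite query refers only to the base table, so the optimizer can choose access paths for it exactly as it would for a query written directly against EMP.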
The authorization subsystem of System R is based on privileges which are
controlled by the SQL statements GRANT and REVOKE. Each user of System R may
optionally be given a privilege called RESOURCE which enables him/her to
create new tables in the database. When a user creates a table, he/she
receives all privileges to access, update, and destroy that table. The
creator of a table can then grant these privileges to other individual
users, and subsequently can revoke these grants if desired. Each granted
privilege may optionally carry with it the "GRANT option," which enables a
recipient to grant the privilege to yet other users. A REVOKE destroys the
whole chain of granted privileges derived from the original grant. The
authorization subsystem is described in detail in [37] and discussed further
in [31].
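The chain-destroying behavior of REVOKE can be illustrated with a small
sketch. This is not the System R implementation; the class Grants and its
methods are hypothetical names, the model covers one privilege on one table,
and the full semantics described in [37] are more subtle than this
reachability rule.

```python
from collections import defaultdict


class Grants:
    """Toy model of GRANT/REVOKE for a single privilege on a single table."""

    def __init__(self, creator):
        self.creator = creator
        # edges[grantor] = set of users who received the privilege from grantor
        self.edges = defaultdict(set)
        # users who hold the "GRANT option" (the creator always does)
        self.grant_option = {creator}

    def holders(self):
        """Users reachable from the creator through a chain of grants."""
        seen, stack = {self.creator}, [self.creator]
        while stack:
            for user in self.edges.get(stack.pop(), ()):
                if user not in seen:
                    seen.add(user)
                    stack.append(user)
        return seen

    def grant(self, grantor, grantee, with_grant_option=False):
        if grantor not in self.holders() or grantor not in self.grant_option:
            raise PermissionError(f"{grantor} may not grant this privilege")
        self.edges[grantor].add(grantee)
        if with_grant_option:
            self.grant_option.add(grantee)

    def revoke(self, grantor, grantee):
        """Remove one grant; everything derived only from it disappears too."""
        self.edges[grantor].discard(grantee)
        live = self.holders()
        for user in list(self.edges):
            if user not in live:
                del self.edges[user]      # grants made by a revoked user
        self.grant_option &= live


table_privs = Grants("creator")
table_privs.grant("creator", "smith", with_grant_option=True)
table_privs.grant("smith", "jones")
table_privs.revoke("creator", "smith")    # jones loses the privilege as well
remaining = table_privs.holders()
```

After the final call, only the creator still holds the privilege, because the
grant to jones was derived from the revoked grant to smith.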
The Recovery Subsystem
The key objective of the recovery subsystem is provision of a means whereby
the database may be recovered to a consistent state in the event of a
failure. A consistent state is defined as one in which the database does not
reflect any updates made by transactions which did not complete
successfully. There are three basic types of failure: the disk media may
fail, the system may fail, or an individual transaction may fail. Although
both the scope of the failure and the time to effect recovery may be
different, all three types of recovery require that an alternate copy of
data be available when the primary copy is not.
When a media failure occurs, database information on disk is lost. When this
happens, an image dump of the database plus a log of "before" and "after"
changes provide the alternate copy which makes recovery possible. System R's
use of "dual logs" even permits recovery from media failures on the log
itself. To recover from a media failure, the database is restored using the
latest image dump and the recovery process reapplies all database changes as
specified on the log for completed transactions.
When a system failure occurs, the information in main memory is lost. Thus,
enough information must always be on disk to make recovery possible. For
recovery from system failures, System R uses the change log mentioned above
plus something called "shadow pages." As each page in the database is
updated, the page is written out in a new place on disk, and the original
page is retained. A directory of the "old" and "new" locations of each page
is maintained. Periodically during normal operation, a "checkpoint" occurs
in which all updates are forced out to disk, the "old" pages are discarded,
and the "new" pages become "old." In the event of a system crash, the "new"
pages on disk may be in an inconsistent state because some updated pages may
still be in the system buffers and not yet reflected on disk. To bring the
database back to a consistent state, the system reverts to the "old" pages,
and then uses the log to redo all committed transactions and to undo all
updates made by incomplete transactions. This aspect of the System R
recovery subsystem is described in more detail in [36].
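The interplay of "old" pages, "new" pages, and checkpoints can be sketched as
follows. This is a minimal illustration under simplifying assumptions, not
the RSS design: ShadowPagedStore and its method names are hypothetical, disk
slots are modeled as dictionary entries, and the log-based redo/undo step
that follows a restart is omitted.

```python
class ShadowPagedStore:
    """Toy shadow-page directory: every update goes to a fresh disk slot."""

    def __init__(self):
        self.disk = {}       # slot number -> page contents
        self.old = {}        # page id -> slot of the checkpointed ("old") version
        self.new = {}        # page id -> slot of the updated ("new") version
        self.next_slot = 0

    def write(self, page_id, contents):
        """Write the updated page in a new place; the old page is retained."""
        slot = self.next_slot
        self.next_slot += 1
        self.disk[slot] = contents
        previous_new = self.new.get(page_id)
        if previous_new is not None:
            del self.disk[previous_new]   # only one "new" version is kept
        self.new[page_id] = slot

    def read(self, page_id):
        slot = self.new.get(page_id, self.old.get(page_id))
        return None if slot is None else self.disk[slot]

    def checkpoint(self):
        """Discard superseded "old" pages; "new" pages become "old"."""
        for page_id, slot in self.new.items():
            stale = self.old.get(page_id)
            if stale is not None:
                del self.disk[stale]
            self.old[page_id] = slot
        self.new = {}

    def crash_restart(self):
        """Revert to the "old" pages (log redo/undo would follow here)."""
        for slot in self.new.values():
            del self.disk[slot]
        self.new = {}


store = ShadowPagedStore()
store.write("page-7", "balance=100")
store.checkpoint()
store.write("page-7", "balance=40")       # not yet checkpointed
seen_before_crash = store.read("page-7")
store.crash_restart()
seen_after_crash = store.read("page-7")
```

In this sketch the update made after the checkpoint is visible until the
simulated crash, after which the store falls back to the checkpointed version.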
When a transaction failure occurs, all database changes which have been made
by the failing transaction must be undone. To accomplish this, System R
simply processes the change log backwards, removing all changes made by the
transaction. Unlike media and system recovery, which both require that
System R be reinitialized, transaction recovery takes place on-line.
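Backing out a single transaction by reading the log backwards can be
sketched as below. The log record layout (a dictionary with "txn", "key",
"before", and "after" fields) and the helper names are assumptions made for
this illustration; the actual System R log format is not described here.

```python
def update(database, log, txn_id, key, value):
    """Apply an update and append a before/after record to the change log."""
    log.append({"txn": txn_id, "key": key,
                "before": database.get(key), "after": value})
    database[key] = value


def undo_transaction(database, log, txn_id):
    """Process the log from newest to oldest, restoring "before" values."""
    for entry in reversed(log):
        if entry["txn"] != txn_id:
            continue
        if entry["before"] is None:
            # In this toy model None means the record did not exist before.
            database.pop(entry["key"], None)
        else:
            database[entry["key"]] = entry["before"]


database, log = {"a": 1}, []
update(database, log, "T1", "a", 2)
update(database, log, "T2", "b", 5)
update(database, log, "T1", "a", 3)
update(database, log, "T1", "c", 9)
undo_transaction(database, log, "T1")     # T2's change to "b" is untouched
```

Reading the log in reverse matters when a transaction updates the same record
more than once: the oldest "before" value is the last one restored.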
The Locking Subsystem
A great deal of thought was given to the design of a locking subsystem which
would prevent interference among concurrent users of System R. The original
design involved the concept of "predicate locks," in which the lockable unit
was a database property such as "employees whose location is Evanston." Note
that, in this scheme, a lock might be held on the predicate
LOC = 'EVANSTON', even if no employees currently satisfy that predicate. By
comparing the predicates being processed by different users, the locking
subsystem could prevent interference. The "predicate lock" design was
ultimately abandoned because: (1) determining whether two predicates are
mutually satisfiable is difficult and time-consuming; (2) two predicates may
appear to conflict when, in fact, the semantics of the data prevent any
conflict, as in "PRODUCT = AIRCRAFT" and "MANUFACTURER = ACME STATIONERY
CO."; and (3) we desired to contain the locking subsystem entirely within
the RSS, and therefore to make it independent of any understanding of the
predicates being processed by various users. The original predicate locking
scheme is described in [29].
The locking scheme eventually chosen for System R is described in [34]. This
scheme involves a hierarchy of locks, with several different sizes of
lockable units, ranging from individual records to several tables. The
locking subsystem is transparent to end users, but acquires locks on
physical objects in the database as they are processed by each user. When a
user accumulates many small locks, they may be "traded" for a larger
lockable unit (e.g., locks on many records in a table might be traded for a
lock on the table). When locks are acquired on small objects, "intention"
locks are simultaneously acquired on the larger objects which contain them.
For example, user A and user B may both be updating employee records. Each
user holds an "intention" lock on the employee table, and "exclusive" locks
on the particular records being updated. If user A attempts to trade her
individual record locks for an "exclusive" lock at the table level, she must
wait until user B ends his transaction and releases his "intention" lock on
the table.
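The user A / user B example can be traced with a small sketch. Only the two
lock modes named in the text are modeled, and LockTable, lock_record, and the
compatibility table are hypothetical names for this illustration; the scheme
in [34] distinguishes more modes and levels than are shown here.

```python
# Whether a requested mode is compatible with a mode held by another user.
COMPATIBLE = {
    ("intention", "intention"): True,
    ("intention", "exclusive"): False,
    ("exclusive", "intention"): False,
    ("exclusive", "exclusive"): False,
}


class LockTable:
    def __init__(self):
        self.held = {}   # lockable object -> {user: mode}

    def can_grant(self, obj, user, mode):
        return all(COMPATIBLE[(held_mode, mode)]
                   for holder, held_mode in self.held.get(obj, {}).items()
                   if holder != user)

    def acquire(self, obj, user, mode):
        """Return True if granted; False means the caller would have to wait."""
        if not self.can_grant(obj, user, mode):
            return False
        holders = self.held.setdefault(obj, {})
        if holders.get(user) != "exclusive":   # never downgrade a lock
            holders[user] = mode
        return True

    def release_all(self, user):
        """End of transaction: drop every lock held by this user."""
        for holders in self.held.values():
            holders.pop(user, None)


def lock_record(locks, user, table, record):
    """Intention lock on the table first, then exclusive lock on the record."""
    return (locks.acquire(table, user, "intention")
            and locks.acquire((table, record), user, "exclusive"))


locks = LockTable()
a_ok = lock_record(locks, "A", "EMPLOYEE", 101)
b_ok = lock_record(locks, "B", "EMPLOYEE", 202)
a_trade_blocked = not locks.acquire("EMPLOYEE", "A", "exclusive")
locks.release_all("B")                     # user B ends his transaction
a_trade_granted = locks.acquire("EMPLOYEE", "A", "exclusive")
```

User A's table-level request is refused while user B's intention lock is
outstanding and succeeds once B's locks are released.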
4. Phase Two: Evaluation
The evaluation phase of the System R project lasted approximately 2½ years
and consisted of two parts: (1) experiments performed on the system at the
San Jose Research Laboratory, and (2) actual use of the system at a number
of internal IBM sites and at three selected customer sites. At all user
sites, System R was installed on an experimental basis for study purposes
only, and not as a supported commercial product. The first installations of
System R took place in June 1977.
General User Comments
In general, user response to System R has been enthusiastic. The system was
mostly used in applications for which ease of installation, a high-level
user language, and an ability to rapidly reconfigure the database were
important requirements. Several user sites reported that they were able to
install the system, design and load a database, and put into use some
application programs within a matter of days. User sites also reported that
it was possible to tune the system performance after data was loaded by
creating and dropping indexes without impacting end users or application
programs. Even changes in the database tables could be made transparent to
users if the tables were read-only, and also in some cases for updated
tables.
Users found the performance characteristics and resource consumption of
System R to be generally satisfactory for their experimental applications,
although no specific performance comparisons were drawn. In general, the
experimental databases used with System R were smaller than one 3330 disk
pack (200 Megabytes) and were typically accessed by fewer than ten
concurrent users. As might be expected, interactive response slowed down
during the execution of very complex SQL statements involving joins of
several tables. This performance degradation must be traded off against the
advantages of normalization [23, 30], in which large database tables are
broken into smaller parts to avoid redundancy, and then joined back together
by the view mechanism or user applications.
The SQL Language
The SQL user interface of System R was generally felt to be successful in
achieving its goals of simplicity, power, and data independence. The
language was simple enough in its basic structure so that users without
prior experience were able to learn a usable subset on their first sitting.
At the same time, when taken as a whole, the language provided the query
power of the first-order predicate calculus combined with operators for
grouping, arithmetic, and built-in functions such as SUM and AVERAGE.
Users consistently praised the uniformity of the SQL syntax across the
environments of application programs, ad hoc query, and data definition
(i.e., definition of views). Users who were formerly required to learn
inconsistent languages for these purposes found it easier to deal with the
single syntax (e.g., when debugging an application program by querying the
database to observe its effects). The single syntax also enhanced
communication among different functional organizations (e.g., between
database administrators and application programmers).
While developing applications using SQL, our experimental users made a
number of suggestions for extensions and improvements to the language, most
of which were implemented during the course of the project. Some of these
suggestions are summarized below:
(1) Users requested an easy-to-use syntax when testing for the existence or
nonexistence of a data item, such as an employee record whose department
number matches a given department record. This facility was implemented in
the form of a special "EXISTS" predicate.
(2) Users requested a means of searching for character strings whose
contents are only partially known, such as "all license plates beginning
with NVK." This facility was implemented in the form of a special "LIKE"
predicate which searches for "patterns" that are allowed to contain "don't
care" characters.
(3) A requirement arose for an application program to compute an SQL
statement dynamically, submit the statement to the System R optimizer for
access path selection, and then execute the statement repeatedly for
different data values without reinvoking the optimizer. This facility was
implemented in the form of PREPARE and EXECUTE statements which were made
available in the host-language version of SQL.
(4) In some user applications the need arose for an operator which Codd has
called an "outer join" [25]. Suppose that two tables (e.g., SUPPLIERS and
PROJECTS) are related by a common data field (e.g., PARTNO). In a
conventional join of these tables, supplier records which have no matching
project record (and vice versa) would not appear. In an "outer join" of
these tables, supplier records with no matching project record would appear
together with a "synthetic" project record containing only null values (and
similarly for projects with no matching supplier). An "outer-join" facility
for SQL is currently under study.
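The four facilities listed above can be tried out in a present-day SQL
engine. The sketch below uses Python's sqlite3 module with SQLite as a
stand-in, not System R; the table contents are invented for illustration, a
parameterized statement stands in for PREPARE/EXECUTE, and LEFT OUTER JOIN
preserves only unmatched suppliers, whereas the outer join described in item
(4) also preserves unmatched projects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DEPT (DNO INTEGER, DNAME TEXT);
    CREATE TABLE EMP (NAME TEXT, DNO INTEGER);
    CREATE TABLE CARS (PLATE TEXT);
    CREATE TABLE SUPPLIERS (SUPPNO TEXT, PARTNO TEXT);
    CREATE TABLE PROJECTS (PROJNO TEXT, PARTNO TEXT);
    INSERT INTO DEPT VALUES (1, 'TOYS'), (2, 'SHOES');
    INSERT INTO EMP VALUES ('JONES', 1);
    INSERT INTO CARS VALUES ('NVK123'), ('ABC999');
    INSERT INTO SUPPLIERS VALUES ('S1', 'P1'), ('S2', 'P2');
    INSERT INTO PROJECTS VALUES ('J1', 'P1');
""")

# (1) EXISTS: departments that have at least one matching employee record.
depts_with_staff = [row[0] for row in conn.execute(
    "SELECT DNAME FROM DEPT D "
    "WHERE EXISTS (SELECT * FROM EMP E WHERE E.DNO = D.DNO)")]

# (2) LIKE: license plates beginning with NVK ('%' is the wildcard here).
nvk_plates = [row[0] for row in conn.execute(
    "SELECT PLATE FROM CARS WHERE PLATE LIKE 'NVK%'")]

# (3) One statement text executed repeatedly with different data values.
stmt = "SELECT NAME FROM EMP WHERE DNO = ?"
by_dept = {dno: conn.execute(stmt, (dno,)).fetchall() for dno in (1, 2)}

# (4) Suppliers with no matching project appear with a null project.
outer = conn.execute(
    "SELECT S.SUPPNO, P.PROJNO FROM SUPPLIERS S "
    "LEFT OUTER JOIN PROJECTS P ON S.PARTNO = P.PARTNO "
    "ORDER BY S.SUPPNO").fetchall()
```

In the last query, supplier S2 has no matching project and is returned with
None (SQL null) in the PROJNO position.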
A more complete discussion of user experience with SQL and the resulting
language improvements is presented in [19].
The Compilation Approach
The approach of compiling SQL statements into machine code was one of the
most successful parts of the System R project. We were able to generate a
machine-language routine to execute any SQL statement of arbitrary
complexity by selecting code fragments from a library of approximately 100
fragments. The result was a beneficial effect on transaction programs, ad
hoc query, and system simplicity.
In an environment of short, repetitive transactions, the benefits of
compilation are obvious. All the overhead of parsing, validity checking, and
access path selection is removed from the path of the running transaction,
and the application program interacts with a small, specially tailored
access module rather than with a larger and less efficient general-purpose
interpreter program. Experiments [38] showed that for a typical short
transaction, about 80 percent of the instructions were executed by the RSS,
with the remaining 20 percent executed by the access module and application
program. Thus, the user pays only a small cost for the power, flexibility,
and data independence of the SQL language, compared with writing the same
transaction directly on the lower level RSS interface.
Example 1:
SELECT SUPPNO, PRICE
FROM QUOTES
WHERE PARTNO = '010002'
AND MINQ <= 1000 AND MAXQ >= 1000;
Operation                        CPU time (msec on 168)   Number of I/Os
Parsing                          13.3                     0
Access Path Selection            40.0                     9
Code Generation                  10.1                     0
Fetch answer set (per record)    1.5                      0.7
Example 2:
SELECT ORDERNO, ORDERS.PARTNO, DESCRIP, DATE, QTY
FROM ORDERS, PARTS
WHERE ORDERS.PARTNO = PARTS.PARTNO
AND DATE BETWEEN '750000' AND '751231'
AND SUPPNO = '797';
Operation                        CPU time (msec on 168)   Number of I/Os
Parsing                          20.7                     0
Access Path Selection            73.2                     9
Code Generation                  19.3                     0
Fetch answer set (per record)    8.7                      10.7
Fig. 5. Measurements of Cost of Compilation.
In an ad hoc query environment the advantages of compilation are less
obvious since the compilation must take place on-line and the query is
executed only once. In this environment, the cost of generating a
machine-language routine for a given query must be balanced against the
increased efficiency of this routine as compared with a more conventional
query interpreter. Figure 5 shows some measurements of the cost of compiling
two typical SQL statements (details of the experiments are given in [20]).
From this data we may draw the following conclusions:
(1) The code generation step adds a small amount of CPU time and no I/Os to
the overhead of parsing and access path selection. Parsing and access path
selection must be done in any query system, including interpretive ones. The
additional instructions spent on code generation are not likely to be
perceptible to an end user.
(2) If code generation results in a routine which runs more efficiently than
an interpreter, the cost of the code generation step is paid back after
fetching only a few records. (In Example 1, if the CPU time per record of
the compiled module is half that of an interpretive system, the cost of
generating the access module is repaid after seven records have been
fetched.)
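The seven-record figure in conclusion (2) follows from the Example 1
measurements in Figure 5: code generation costs 10.1 msec, and if the
compiled module needs 1.5 msec per record while an interpreter needs twice
that, each record saves 1.5 msec. The sketch below reproduces the
arithmetic; the function name is hypothetical, and the interpreter's
per-record cost is the text's stated supposition rather than a measurement.

```python
import math


def break_even_records(codegen_ms, compiled_ms_per_record,
                       interpreted_ms_per_record):
    """Records that must be fetched before code generation pays for itself."""
    saving = interpreted_ms_per_record - compiled_ms_per_record
    if saving <= 0:
        return None          # compilation never pays off in CPU time
    return math.ceil(codegen_ms / saving)


# Example 1: 10.1 msec of code generation, 1.5 msec per record compiled,
# and an assumed 3.0 msec per record interpreted.
example1 = break_even_records(10.1, 1.5, 2 * 1.5)
```

Under that supposition the result is seven records, since 10.1 / 1.5 is
about 6.7.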
A final advantage of compilation is its simplifying effect on the system
architecture. With both ad hoc queries and precanned transactions being
treated in the same way, most of the code in the system can be made to serve
a dual purpose. This ties in very well with our objective of supporting a
uniform syntax between query users and transaction programs.
Available Access Paths
As described earlier, the principal access path used in System R for
retrieving data associatively by its value is the B-tree index. A typical
index is illustrated in Figure 6. If we assume a fan-out of approximately
200 at each level of the tree, we can index up to 40,000 records by a
two-level index, and up to 8,000,000 records by a three-level index. If we
wish to begin an associative scan through a large table, three I/Os will
typically be required (assuming the root page is referenced frequently
enough to remain in the system buffers, we need an I/O for the
intermediate-level index page, the "leaf" index page, and the data page). If
several records are to be fetched using the index scan, the three start-up
I/Os are relatively insignificant. However, if only one record is to be
fetched, other access techniques might have provided a quicker path to the
stored data.
Fig. 6. A B-Tree Index. (Diagram showing the root page, intermediate pages,
leaf pages, and data pages.)
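The capacity and start-up figures quoted above follow directly from the
assumed fan-out of 200. The sketch below restates the arithmetic; the
function names are hypothetical, and the I/O count assumes, as the text
does, that the root page stays in the system buffers.

```python
def index_capacity(fan_out, levels):
    """Records addressable by an index with the given number of levels."""
    return fan_out ** levels


def startup_ios(levels, root_buffered=True):
    """I/Os to reach the first record: index pages read plus one data page."""
    index_pages_read = levels - 1 if root_buffered else levels
    return index_pages_read + 1


two_level = index_capacity(200, 2)      # 40,000 records
three_level = index_capacity(200, 3)    # 8,000,000 records
ios_three_level = startup_ios(3)        # intermediate + leaf + data page
```

With the root buffered, a three-level index costs three I/Os to reach the
first record and a two-level index costs two.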
Two common access techniques which were not utilized for user data in System
R are hashing and direct links (physical pointers from one record to
another). Hashing was not used because it does not have the convenient
ordering property of a B-tree index (e.g., a B-tree index on SALARY enables
a list of employees ordered by SALARY to be retrieved very easily). Direct
links, although they were implemented at the RSS level, were not used as an
access path for user data by the RDS for a twofold reason. Essential links
(links whose semantics are not known to the system but which are connected
directly by users) were rejected because they were inconsistent with the
nonnavigational user interface of a relational system, since they could not
be used as access paths by an automatic optimizer. Nonessential links (links
which connect records to other records with matching data values) were not
implemented because of the difficulties in automatically maintaining their
connections. When a record is updated, its connections on many links may
need to be updated as well, and this may involve many "subsidiary queries"
to find the other records which are involved in these connections. Problems
also arise relating to records which have no matching partner record on the
link, and records whose link-controlling data value is null.
In general, our experience showed that indexes could be used very
efficiently in queries and transactions which access many records, but that
hashing and links would have enhanced the performance of "canned
transactions" which access only a few records. As an illustration of this
problem, consider an inventory application which has two tables: a PRODUCTS
table, and a much larger PARTS table which contains data on the individual
parts used for each product. Suppose a given transaction needs to find the
price of the heating element in a particular toaster. To execute this
transaction, System R might require two I/Os to traverse a two-level index
to find the toaster record, and three more I/Os to traverse another
three-level index to find the heating element record. If access paths based
on hashing and direct links were available, it might be possible to find the
toaster record in one I/O via hashing, and the heating element record in one
more I/O via a link. (Additional I/Os would be required in the event of hash
collisions or if the toaster parts records occupied more than one page.)
Thus, for this very simple transaction hashing and links might reduce the
number of I/Os from five to three, or even two. For transactions which
retrieve a large set of records, the additional I/Os caused by indexes
compared to hashing and links are less important.
The Optimizer
A series of experiments was conducted at the San Jose IBM Research
Laboratory to evaluate the success of the System R optimizer in choosing
among the available access paths for typical SQL statements. The results of
these experiments are reported in [6]. For the purpose of the experiments,
the optimizer was modified in order to observe its behavior. Ordinarily, the
optimizer searches through a tree of path choices, computing estimated costs
and pruning the tree until it arrives at a single preferred access path. The
optimizer was modified in such a way that it could be made to generate the
complete tree of access paths, without pruning, and to estimate the cost of
each path (cost is defined as a weighted sum of page fetches and RSS calls).
Mechanisms were also added to the system whereby it could be forced to
execute an SQL statement by a particular access path and to measure the
actual number of page fetches and RSS calls incurred. In this way, a
comparison can be made between the optimizer's predicted cost and the actual
measured cost for various alternative paths.
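The comparison described above can be sketched as follows. The text states
only that cost is a weighted sum of page fetches and RSS calls; the
particular form used here (page fetches plus a weight times RSS calls), the
weight of 0.5, the path names, and all of the counts are assumptions chosen
purely for illustration and are not measurements from [6].

```python
def path_cost(page_fetches, rss_calls, weight):
    """Weighted sum of page fetches and RSS calls (assumed form)."""
    return page_fetches + weight * rss_calls


def same_ordering(predicted, measured):
    """True if both cost tables rank the access paths identically."""
    by_predicted = sorted(predicted, key=predicted.get)
    by_measured = sorted(measured, key=measured.get)
    return by_predicted == by_measured


W = 0.5  # hypothetical weight of an RSS call relative to a page fetch

# Hypothetical (page fetches, RSS calls) for three alternative paths.
predicted_counts = {"path A": (12, 40), "path B": (150, 2000), "path C": (60, 500)}
measured_counts = {"path A": (20, 40), "path B": (140, 2100), "path C": (95, 480)}

predicted = {p: path_cost(f, c, W) for p, (f, c) in predicted_counts.items()}
measured = {p: path_cost(f, c, W) for p, (f, c) in measured_counts.items()}
ranking_agrees = same_ordering(predicted, measured)
```

In this invented data the magnitudes of predicted and measured costs differ,
yet both rank the paths in the same order, which is the property an optimizer
needs in order to pick the cheapest path.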
In [6], an experiment is described in which ten SQL statements, including
some single-table queries and some joins, are run against a test database.
The database is artificially generated to conform to the two basic
assumptions of the System R optimizer: (1) the values in each column are
uniformly distributed from some minimum to some maximum value; and (2) the
distributions of values of the various columns are independent of each
other. For each of the ten SQL statements, the ordering of the predicted
costs of the various access paths was the same as the ordering of the actual
measured costs (in a few cases the optimizer predicted two paths to have the
same cost when their actual costs were unequal but adjacent in the
ordering).
Although the optimizer was able to correctly order the access paths in the
experiment we have just described, the magnitudes of the predicted costs
differed from the measured costs in several cases. These discrepancies were
due to a variety of causes, such as the optimizer's inability to predict how
much data would remain in the system buffers during sorting.
The above experiment does not address the issue of whether or not a very
good access path for a given SQL statement might be overlooked because it is
not part of the optimizer's repertoire. One such example is known. Suppose
that the database contains a table T in which each row has a unique value
for the field SEQNO, and suppose that an index exists on SEQNO. Consider the
following SQL query:
SELECT * FROM T WHERE SEQNO IN
(15, 17, 19, 21);
This query has an answer set of (at most) four rows, and an obvious method
of processing it is to use the SEQNO index repeatedly: first to find the row
with SEQNO = 15, then SEQNO = 17, etc. However, this access path would not
be chosen by System R, because the optimizer is not presently structured to
consider multiple uses of an index within a single query block. As we gain
more experience with access path selection, the optimizer may grow to
encompass this and other access paths which have so far been omitted from
consideration.
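The "obvious method" of using the index once per value can be sketched as
below. This illustrates the access path that the text says was not
considered; it is not a System R feature. A sorted key list searched with
the standard bisect module stands in for the B-tree index on SEQNO, and the
table contents and function names are invented for the example.

```python
import bisect


def probe(index_keys, index_rows, value):
    """One index lookup; binary search stands in for a B-tree descent."""
    i = bisect.bisect_left(index_keys, value)
    if i < len(index_keys) and index_keys[i] == value:
        return index_rows[i]
    return None


def select_in(index_keys, index_rows, values):
    """Answer WHERE SEQNO IN (...) by probing the index once per value."""
    hits = (probe(index_keys, index_rows, v) for v in sorted(set(values)))
    return [row for row in hits if row is not None]


# Hypothetical table T with unique SEQNO values 1..20 (so 21 is absent).
table_t = [{"SEQNO": n, "PAYLOAD": f"row{n}"} for n in range(1, 21)]
index_keys = [row["SEQNO"] for row in table_t]   # already in sorted order
answer = select_in(index_keys, table_t, [15, 17, 19, 21])
found = [row["SEQNO"] for row in answer]
```

Four probes return three rows here, since no row has SEQNO = 21; the answer
set has at most four rows however large the table is.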
Views and Authorization
Users generally found the System R mechanisms for defining views and
controlling authorization to be powerful, flexible, and convenient. The
following features were considered to be particularly beneficial:
(1) The full query power of SQL is made available for defining new views of
data (i.e., any query may be defined as a view). This makes it possible to
define a rich variety of views, containing joins, subqueries, aggregation,
etc., without having to learn a separate "data definition language."
However, the view mechanism is not completely transparent to the end user,
because of the restrictions described earlier (e.g., views involving joins
of more than one table are not updateable).
(2) The authorization subsystem allows each installation of System R to
choose a "fully centralized policy" in which all tables are created and
privileges controlled by a central administrator; or a "fully decentralized
policy" in which each user may create tables and control access to them; or
some intermediate policy.
During the two-year evaluation of System R, the following suggestions were
made by users for improvement of the view and authorization subsystems:
(1) The authorization subsystem could be augmented by the concept of a
"group" of users. Each group would have a "group administrator" who controls
enrollment of new members in the group. Privileges could then be granted to
the group as a whole rather than to each member of the group individually.
(2) A new command could be added to the SQL language to change the ownership
of a table from one user to another. This suggestion is more difficult to
implement than it seems at first glance, because the owner's name is part of
the fully qualified name of a table (i.e., two tables owned by Smith and
Jones could be named SMITH.PARTS and JONES.PARTS). References to the table
SMITH.PARTS might exist in many places, such as view definitions and
compiled programs. Finding and changing all these references would be
difficult (perhaps impossible, as in the case of users' source programs
which are not stored under System R control).
(3) Occasionally it is necessary to reload an existing table in the database
(e.g., to change its physical clustering properties). In System R this is
accomplished by dropping the old table definition, creating a new table with
the same definition, and reloading the data into the new table.
Unfortunately, views and authorizations defined on the table are lost from
the system when the old definition is dropped, and therefore they both must
be redefined on the new table. It has been suggested that views and
authorizations defined on a dropped table might optionally be held "in
abeyance" pending reactivation of the table.
The Recovery Subsystem
The combined "shadow page" and log mechanism used in System R proved to be
quite successful in safeguarding the database against media, system, and
transaction failures. The part of the recovery subsystem which was observed
to have the greatest impact on system performance was the keeping of a
shadow page for each updated page. This performance impact is due primarily
to the following factors:
(1) Since each updated page is written out to a new location on disk, data
tends to move about. This limits the ability of the system to cluster
related pages in secondary storage to minimize disk arm movement for
sequential applications.
(2) Since each page can potentially have both an "old" and "new" version, a
directory must be maintained to locate both versions of each page. For large
databases, the directory may be large enough to require a paging mechanism
of its own.
(3) The periodic checkpoints which exchange the "old" and "new" page
pointers generate I/O activity and consume a certain amount of CPU time.
A possible alternative technique for recovering from system failures would
dispense with the concept of shadow pages, and simply keep a log of all
database updates. This design would require that all updates be written out
to the log before the updated page migrates to disk from the system buffers.
Mechanisms could be developed to minimize I/Os by retaining updated pages in
the buffers until several pages are written out at once, sharing an I/O to
the log.
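The rule stated above (log first, page second) and the idea of sharing one
log I/O among several page writes can be sketched as follows. This is an
illustration of the proposed alternative only, not of anything implemented
in System R; the class, its fields, and the use of a log sequence number
(LSN) per page are assumptions made for the example.

```python
class WriteAheadStore:
    """Toy buffer manager that forces the log before migrating pages."""

    def __init__(self):
        self.log_buffer = []    # log records not yet on disk
        self.log_disk = []      # log records safely on disk
        self.buffers = {}       # page id -> (contents, LSN of last update)
        self.disk_pages = {}    # page id -> contents on disk
        self.log_ios = 0

    def update(self, page_id, contents):
        """Record the update in the log buffer, then change the buffered page."""
        lsn = len(self.log_disk) + len(self.log_buffer)
        self.log_buffer.append((lsn, page_id, contents))
        self.buffers[page_id] = (contents, lsn)

    def flush_log(self):
        """Write all pending log records with a single log I/O."""
        if self.log_buffer:
            self.log_disk.extend(self.log_buffer)
            self.log_buffer.clear()
            self.log_ios += 1

    def write_pages(self, page_ids):
        """Force the log once, then let the listed pages migrate to disk."""
        self.flush_log()
        for page_id in page_ids:
            contents, lsn = self.buffers[page_id]
            # The log record for this page's last update is already on disk.
            assert lsn < len(self.log_disk)
            self.disk_pages[page_id] = contents


wal = WriteAheadStore()
wal.update("p1", "a")
wal.update("p2", "b")
wal.update("p1", "c")
wal.write_pages(["p1", "p2"])   # three log records, one shared log I/O
```

Holding the two updated pages in the buffers until they are written together
means the three log records reach disk in a single log I/O.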
The Locking Subsystem
The locking subsystem of System R provides each user with a choice of three
levels of isolation from other users. In order to explain the three levels,
we define "uncommitted data" as those records which have been updated by a
transaction that is still in progress (and therefore still subject to being
backed out). Under no circumstances can a transaction, at any isolation
level, perform updates on the uncommitted data of another transaction, since
this might lead to lost updates in the event of transaction backout.
The three levels of isolation in System R are defined as follows:
Level 1: A transaction running at Level 1 may read (but not update)
uncommitted data. Therefore, successive reads of the same record by a
Level-1 transaction may not give consistent values. A Level-1 transaction
does not attempt to acquire any locks on records while reading.
Level 2: A transaction running at Level 2 is protected against reading
uncommitted data. However, successive reads at Level 2 may still yield
inconsistent values if a second transaction updates a given record and then
terminates between the first and second reads by the Level-2 transaction. A
Level-2 transaction locks each record before reading it to make sure it is
committed at the time of the read, but then releases the lock immediately
after reading.
Level 3: A transaction running at Level 3 is guaranteed that successive
reads of the same record will yield the same value. This guarantee is
enforced by acquiring a lock on each record read by a Level-3 transaction
and holding the lock until the end of the transaction. (The lock acquired by
a Level-3 reader is a "share" lock which permits other users to read but not
update the locked record.)
It was our intention that Isolation Level 1 provide a means for very quick
scans through the database when approximate values were acceptable, since
Level-1 readers acquire no locks and should never need to wait for other
users. In practice, however, it was found that Level-1 readers did have to
wait under certain circumstances while the physical consistency of the data
was suspended (e.g., while indexes or pointers were being adjusted).
Therefore, the potential of Level 1 for increasing system concurrency was
not fully realized.
It was our expectation that a tradeoff would exist between Isolation Levels
2 and 3 in which Level 2 would be "cheaper" and Level 3 "safer." In
practice, however, it was observed that Level 3 actually involved less CPU
overhead than Level 2, since it was simpler to acquire locks and keep them
than to acquire locks and immediately release them. It is true that
Isolation Level 2 permits a greater degree of access to the database by
concurrent readers and updaters than does Level 3. However, this increase in
concurrency was not observed to have an important effect in most practical
applications.
As a result of the observations described above, most System R users ran
their queries and application programs at Level 3, which was the system
default.
The Convoy Phenomenon
Experiments with the locking subsystem of System R identified a problem
which came to be known as the "convoy phenomenon" [9]. There are certain
high-traffic locks in System R which every process requests frequently and
holds for a short time. Examples of these are the locks which control access
to the buffer pool and the system log. In a "convoy" condition, interaction
between a high-traffic lock and the operating system dispatcher tends to
serialize all processes in the system, allowing each process to acquire the
lock only once each time it is dispatched.
In the VM/370 operating system, each process in the multiprogramming set
receives a series of small "quanta" of CPU time. Each quantum terminates
after a preset amount of CPU time, or when the process goes into page, I/O,
or lock wait. At the end of the series of quanta, the process drops out of
the multiprogramming set and must undergo a longer "time slice wait" before
it once again becomes dispatchable.
Most quanta end when a process waits for a page, an I/O operation, or a low-traffic lock. The System R design ensures that no process will ever hold a high-traffic lock during any of these types of wait. There is a slight probability, however, that a process might go into a long "time slice wait" while it is holding a high-traffic lock. In this event, all other dispatchable processes will soon request the same lock and become enqueued behind the sleeping process. This phenomenon is called a "convoy."
In the original System R design, convoys are stable because of the protocol for releasing locks. When a process P releases a lock, the locking subsystem grants the lock to the first waiting process in the queue (thereby making it unavailable to be reacquired by P). After a short time, P once again requests the lock, and is forced to go to the end of the convoy. If the mean time between requests for the high-traffic lock is 1,000 instructions, each process may execute only 1,000 instructions before it drops to the end of the convoy. Since more than 1,000 instructions are typically used to dispatch a process, the system goes into a "thrashing" condition in which most of the cycles are spent on dispatching overhead.
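The arithmetic behind the "thrashing" claim can be made explicit. In the sketch below, the 1,000-instruction interval between lock requests comes from the text. The dispatch costs are hypothetical values chosen only to satisfy "more than 1,000 instructions", and the function name is an assumption.

```python
def dispatch_overhead_fraction(work_instructions: int,
                               dispatch_instructions: int) -> float:
    """Fraction of CPU cycles spent dispatching while a convoy persists.

    In a convoy, each dispatch buys only `work_instructions` of useful
    work before the process blocks on the high-traffic lock again.
    """
    if work_instructions < 0 or dispatch_instructions < 0:
        raise ValueError("instruction counts must be non-negative")
    total = work_instructions + dispatch_instructions
    if total == 0:
        raise ValueError("at least one count must be positive")
    return dispatch_instructions / total


if __name__ == "__main__":
    # 1,000 useful instructions per dispatch (figure from the text);
    # the dispatch costs are illustrative guesses.
    for cost in (1000, 1500, 3000):
        share = dispatch_overhead_fraction(1000, cost)
        print(f"dispatch cost {cost}: {share:.0%} of cycles are overhead")
```

Whenever the dispatch cost exceeds the useful work done per dispatch, the fraction is above one half, which matches the statement that most cycles go to dispatching overhead.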
The solution to the convoy problem involved a change to the lock release protocol of System R. After the change, when a process P releases a lock, all processes which are enqueued for the lock are made dispatchable, but the lock is not granted to any particular process. Therefore, the lock may be regranted to process P if it makes a subsequent request. Process P may acquire and release the lock many times before its time slice is exhausted. It is highly probable that process P will not be holding the lock when it goes into a long wait. Therefore, if a convoy should ever form, it will most likely evaporate as soon as all the members of the convoy have been dispatched.
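The two release protocols can be contrasted with a toy model. This is a minimal sketch, not System R's lock manager: processes are plain strings, there is no real scheduler, and the class name `HighTrafficLock`, the `handoff` flag, and the `dispatchable` list are all hypothetical names introduced for illustration.

```python
from collections import deque
from typing import Deque, List, Optional


class HighTrafficLock:
    """Toy lock that supports both release protocols described in the text."""

    def __init__(self, handoff: bool) -> None:
        self.handoff = handoff               # True models the original protocol
        self.owner: Optional[str] = None
        self.waiters: Deque[str] = deque()   # FIFO queue of blocked processes
        self.dispatchable: List[str] = []    # processes woken by release()

    def request(self, process: str) -> bool:
        """Return True if the lock is granted, False if the caller must wait."""
        if self.owner is None:
            self.owner = process
            return True
        self.waiters.append(process)
        return False

    def release(self, process: str) -> None:
        if self.owner != process:
            raise RuntimeError(f"{process} does not hold the lock")
        if self.handoff and self.waiters:
            # Original protocol: hand the lock directly to the first waiter,
            # so the releasing process cannot reacquire it without queueing.
            self.owner = self.waiters.popleft()
            self.dispatchable.append(self.owner)
        else:
            # Revised protocol: free the lock and wake every waiter; whoever
            # runs next, possibly the releasing process, may take it.
            self.owner = None
            self.dispatchable.extend(self.waiters)
            self.waiters.clear()


if __name__ == "__main__":
    for handoff in (True, False):
        lock = HighTrafficLock(handoff)
        lock.request("P")
        lock.request("Q")
        lock.request("R")
        lock.release("P")
        regranted = lock.request("P")
        print(f"handoff={handoff}: P regranted={regranted}, "
              f"owner={lock.owner}, waiters={list(lock.waiters)}")
```

With `handoff=True`, P's second request places it behind R, which is the behavior that keeps a convoy stable. With `handoff=False`, P gets the lock back immediately, and the woken processes simply request it again when they are next dispatched.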
Additional Observations
Other observations were made during the evaluation of System R and are listed below:
(1) When running in a "canned transaction" environment, it would be helpful for the system to include a data communications front end to handle terminal interactions, priority scheduling, and logging and restart at the message level. This facility was not included in the System R design. Also, space would be saved and the working set reduced if several users executing the same "canned transaction" could share a common access module. This would require the System R code generator to produce reentrant code. Approximately half the space occupied by the multiple copies of the access module could be saved by this method, since the other half consists of working storage which must be duplicated for each user.
(2) When the recovery subsystem attempts to take an automatic checkpoint, it inhibits the processing of new RSS commands until all users have completed their current RSS command; then the checkpoint is taken and all users are allowed to proceed. However, certain RSS commands potentially involve long operations, such as sorting a file. If these "long" RSS operations were made interruptible, it would avoid any delay in performing checkpoints.
(3) The System R design of automatically maintaining a system catalog as part of the on-line database was very well liked by users, since it permitted them to access the information in the catalog with exactly the same query language they use for accessing other data.
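The space estimate in observation (1) can be worked through numerically. The one-half split between code and working storage comes from the text. The function name, the module size, and the user counts below are hypothetical values chosen for illustration.

```python
from typing import Tuple


def access_module_space(users: int, module_size: float,
                        working_fraction: float = 0.5) -> Tuple[float, float]:
    """Return (private_total, shared_total) storage for `users` users.

    private_total: every user has a full private copy of the access module.
    shared_total:  one reentrant copy of the code is shared, and each user
                   keeps only a private copy of the working storage.
    """
    if users < 1:
        raise ValueError("users must be at least 1")
    if not 0.0 <= working_fraction <= 1.0:
        raise ValueError("working_fraction must lie between 0 and 1")
    private_total = users * module_size
    shared_code = module_size * (1.0 - working_fraction)
    per_user_working = module_size * working_fraction
    shared_total = shared_code + users * per_user_working
    return private_total, shared_total


if __name__ == "__main__":
    for n in (2, 10, 100):
        private, shared = access_module_space(n, 100.0)
        saved = (private - shared) / private
        print(f"{n} users: {private:.0f} -> {shared:.0f} ({saved:.1%} saved)")
```

The saving is (n - 1)/(2n) of the original space for n users, so it approaches one half as the number of users sharing the module grows.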
5. Conclusions
We feel that our experience with System R has clearly demonstrated the feasibility of applying a relational database system to a real production environment in which many concurrent users are performing a mixture of ad hoc queries and repetitive transactions. We believe that the high-level user interface made possible by the relational data model can have a dramatic positive effect on user productivity in developing new applications, and on the data independence of queries and programs. System R has also demonstrated the ability to support a highly dynamic database environment in which application requirements are rapidly changing.
In particular, System R has illustrated the feasibility of compiling a very high-level data sublanguage, SQL, into machine-level code. The
result of this compilation technique is that most of the overhead cost for implementing the high-level language is pushed into a "precompilation" step, and performance for canned transactions is comparable to that of a much lower level system. The compilation approach has also proved to be applicable to the ad hoc query environment, with the result that a unified mechanism can be used to support both queries and transactions.
The evaluation of System R has led to a number of suggested improvements. Some of these improvements have already been implemented and others are still under study. Two major foci of our continuing research program at the San Jose laboratory are adaptation of System R to a distributed database environment, and extension of our optimizer algorithms to encompass a broader set of access paths.
Sometimes questions are asked about how the performance of a relational database system might compare to that of a "navigational" system in which a programmer carefully hand-codes an application to take advantage of explicit access paths. Our experiments with the System R optimizer and compiler suggest that the relational system will probably approach but not quite equal the performance of the navigational system for a particular, highly tuned application, but that the relational system is more likely to be able to adapt to a broad spectrum of unanticipated applications with adequate performance. We believe that the benefits of relational systems in the areas of user productivity, data independence, and adaptability to changing circumstances will take on increasing importance in the years ahead.
Acknowledgments
From the beginning, System R was a group effort. Credit for any success of the project properly belongs to the team as a whole rather than to specific individuals.
The inspiration for constructing a relational system came primarily
from E. F. Codd, whose landmark paper [22] introduced the relational model of data. The manager of the project through most of its existence was W. F. King.
In addition to the authors of this paper, the following people were associated with System R and made important contributions to its development:
M. Adiba
R. F. Boyce
A. Chan
D. M. Choy
K. Eswaran
R. Fagin
P. Fehder
T. Haerder
R. H. Katz
W. Kim
H. Korth
P. McJones
D. McLeod
M. Mresse
J. F. Nilsson
R. L. Obermarck
D. Stott Parker
D. Portal
N. Ramsperger
P. Reisner
P. R. Roever
R. Selinger
H. R. Strong
P. Tiberio
V. Watson
R. Williams
References
1. Adiba, M.E., and Lindsay, B.G. Database snapshots. IBM Res. Rep. RJ2772, San Jose, Calif., March 1980.
2. Astrahan, M.M., and Chamberlin, D.D. Implementation of a structured English query language. Comm. ACM 18, 10 (Oct. 1975), 580-588.
3. Astrahan, M.M., and Lorie, R.A. SEQUEL-XRM: A Relational System. Proc. ACM Pacific Regional Conf., San Francisco, Calif., April 1975, p. 34.
4. Astrahan, M.M., et al. System R: A relational approach to database management. ACM Trans. Database Syst. 1, 2 (June 1976), 97-137.
5. Astrahan, M.M., et al. System R: A relational data base management system. IEEE Comptr. 12, 5 (May 1979), 43-48.
6. Astrahan, M.M., Kim, W., and Schkolnick, M. Evaluation of the System R access path selection mechanism. Proc. IFIP Congress, Melbourne, Australia, Sept. 1980, pp. 487-491.
7. Blasgen, M.W., and Eswaran, K.P. Storage and access in relational databases. IBM Syst. J. 16, 4 (1977), 363-377.
8. Blasgen, M.W., Casey, R.G., and Eswaran, K.P. An encoding method for multifield sorting and indexing. Comm. ACM 20, 11 (Nov. 1977), 874-878.
9. Blasgen, M., Gray, J., Mitoma, M., and Price, T. The convoy phenomenon. Operating Syst. Rev. 13, 2 (April 1979), 20-25.
10. Blasgen, M.W., et al. System R: An architectural overview. IBM Syst. J. 20, 1 (Feb. 1981), 41-62.
11. Bjorner, D., Codd, E.F., Deckert, K.L., and Traiger, I.L. The Gamma Zero N-ary relational data base interface. IBM Res. Rep. RJ1200, San Jose, Calif., April 1973.
12. Boyce, R.F., and Chamberlin, D.D. Using a structured English query language as a data definition facility. IBM Res. Rep. RJ1318, San Jose, Calif., Dec. 1973.
13. Boyce, R.F., Chamberlin, D.D., King, W.F., and Hammer, M.M. Specifying queries as relational expressions: The SQUARE data sublanguage. Comm. ACM 18, 11 (Nov. 1975), 621-628.
14. Chamberlin, D.D., and Boyce, R.F. SEQUEL: A structured English query language. Proc. ACM-SIGMOD Workshop on Data Description, Access, and Control, Ann Arbor, Mich., May 1974, pp. 249-264.
15. Chamberlin, D.D., Gray, J.N., and Traiger, I.L. Views, authorization, and locking in a relational database system. Proc. 1975 Nat. Comptr. Conf., Anaheim, Calif., pp. 425-430.
16. Chamberlin, D.D., et al. SEQUEL 2: A unified approach to data definition, manipulation, and control. IBM J. Res. and Develop. 20, 6 (Nov. 1976), 560-575 (also see errata in Jan. 1977 issue).
17. Chamberlin, D.D. Relational database management systems. Comptng. Surv. 8, 1 (March 1976), 43-66.
18. Chamberlin, D.D., et al. Data base system authorization. In Foundations of Secure Computation, R. Demillo, D. Dobkin, A. Jones, and R. Lipton, Eds., Academic Press, New York, 1978, pp. 39-56.
19. Chamberlin, D.D. A summary of user experience with the SQL data sublanguage. Proc. Internat. Conf. Data Bases, Aberdeen, Scotland, July 1980, pp. 181-203 (also IBM Res. Rep. RJ2767, San Jose, Calif., April 1980).
20. Chamberlin, D.D., et al. Support for repetitive transactions and ad-hoc queries in System R. ACM Trans. Database Syst. 6, 1 (March 1981), 70-94.
21. Chamberlin, D.D., Gilbert, A.M., and Yost, R.A. A history of System R and SQL/data system (presented at the Internat. Conf. Very Large Data Bases, Cannes, France, Sept. 1981).
22. Codd, E.F. A relational model of data for large shared data banks. Comm. ACM 13, 6 (June 1970), 377-387.
23. Codd, E.F. Further normalization of the data base relational model. In Courant Computer Science Symposia, Vol. 6: Data Base Systems, Prentice-Hall, Englewood Cliffs, N.J., 1971, pp. 33-64.
24. Codd, E.F. Recent investigations in relational data base systems. Proc. IFIP Congress, Stockholm, Sweden, Aug. 1974.
25. Codd, E.F. Extending the database relational model to capture more meaning. ACM Trans. Database Syst. 4, 4 (Dec. 1979), 397-434.
26. Comer, D. The ubiquitous B-Tree. Comptng. Surv. 11, 2 (June 1979), 121-137.
27. Date, C.J. An Introduction to Database Systems. 2nd Ed., Addison-Wesley, New York, 1977.
28. Eswaran, K.P., and Chamberlin, D.D. Functional specifications of a subsystem for database integrity. Proc. Conf. Very Large Data Bases, Framingham, Mass., Sept. 1975, pp. 48-68.
29. Eswaran, K.P., Gray, J.N., Lorie, R.A., and Traiger, I.L. On the notions of consistency and predicate locks in a database system. Comm. ACM 19, 11 (Nov. 1976), 624-633.
30. Fagin, R. Multivalued dependencies and a new normal form for relational databases. ACM Trans. Database Syst. 2, 3 (Sept. 1977), 262-278.
31. Fagin, R. On an authorization mechanism. ACM Trans. Database Syst. 3, 3 (Sept. 1978), 310-319.
32. Gray, J.N., and Watson, V. A shared segment and inter-process communication facility for VM/370. IBM Res. Rep. RJ1579, San Jose, Calif., Feb. 1975.
33. Gray, J.N., Lorie, R.A., and Putzolu, G.F. Granularity of locks in a large shared database. Proc. Conf. Very Large Data Bases, Framingham, Mass., Sept. 1975, pp. 428-451.
34. Gray, J.N., Lorie, R.A., Putzolu, G.R., and Traiger, I.L. Granularity of locks and degrees of consistency in a shared data base. Proc. IFIP Working Conf. Modelling of Database Management Systems, Freudenstadt, Germany, Jan. 1976, pp. 695-723 (also IBM Res. Rep. RJ1654, San Jose, Calif.).
35. Gray, J.N. Notes on database operating systems. In Operating Systems: An Advanced Course, Goos and Hartmanis, Eds., Springer-Verlag, New York, 1978, pp. 393-481 (also IBM Res. Rep. RJ2188, San Jose, Calif.).
36. Gray, J.N., et al. The recovery manager of a data management system. IBM Res. Rep. RJ2623, San Jose, Calif., June 1979.
37. Griffiths, P.P., and Wade, B.W. An authorization mechanism for a relational database system. ACM Trans. Database Syst. 1, 3 (Sept. 1976), 242-255.
38. Katz, R.H., and Selinger, R.D. Internal comm., IBM Res. Lab., San Jose, Calif., Sept. 1978.
39. Kwan, S.C., and Strong, H.R. Index path length evaluation for the research storage system of System R. IBM Res. Rep. RJ2736, San Jose, Calif., Jan. 1980.
40. Lorie, R.A. XRM--An extended (N-ary) relational memory. IBM Tech. Rep. G320-2096, Cambridge Scientific Ctr., Cambridge, Mass., Jan. 1974.
41. Lorie, R.A. Physical integrity in a large segmented database. ACM Trans. Database Syst. 2, 1 (March 1977), 91-104.
42. Lorie, R.A., and Wade, B.W. The compilation of a high level data language. IBM Res. Rep. RJ2598, San Jose, Calif., Aug. 1979.
43. Lorie, R.A., and Nilsson, J.F. An access specification language for a relational data base system. IBM J. Res. and Develop. 23, 3 (May 1979), 286-298.
44. Reisner, P., Boyce, R.F., and Chamberlin, D.D. Human factors evaluation of two data base query languages: SQUARE and SEQUEL. Proc. AFIPS Nat. Comptr. Conf., Anaheim, Calif., May 1975, pp. 447-452.
45. Reisner, P. Use of psychological experimentation as an aid to development of a query language. IEEE Trans. Software Eng. SE-3, 3 (May 1977), 218-229.
46. Schkolnick, M., and Tiberio, P. Considerations in developing a design tool for a relational DBMS. Proc. IEEE COMPSAC 79, Nov. 1979, pp. 228-235.
47. Selinger, P.G., et al. Access path selection in a relational database management system. Proc. ACM SIGMOD Conf., Boston, Mass., June 1979, pp. 23-34.
48. Stonebraker, M. Implementation of integrity constraints and views by query modification. Tech. Memo ERL-M514, College of Eng., Univ. of Calif. at Berkeley, March 1975.
49. Strong, H.R., Traiger, I.L., and Markowsky, G. Slide Search. IBM Res. Rep. RJ2274, San Jose, Calif., June 1978.
50. Traiger, I.L., Gray, J.N., Galtieri, C.A., and Lindsay, B.G. Transactions and consistency in distributed database systems. IBM Res. Rep. RJ2555, San Jose, Calif., June 1979.