
Relational Data-Base Management Systems

DONALD D. CHAMBERLIN
IBM Research Laboratory, San Jose, California 95193

The essential concepts of the relational data model are defined, and normalization,
relational languages based on the model, as well as advantages and
implementations of relational systems are discussed.
Keywords and Phrases: Data base, data-base management, data independence,
data model, relational systems
CR Categories: 8.5I, ~.3~,~.~

Copyright © 1976, Association for Computing Machinery, Inc. General permission to republish, but not for profit, all or part of this material is granted provided that ACM's copyright notice is given and that reference is made to the publication, to its date of issue, and to the fact that reprinting privileges were granted by permission of the Association for Computing Machinery.

CONTENTS

INTRODUCTION
DEFINITIONS
NORMALIZATION
LANGUAGES
    Query Facilities
        Relational Calculus
        Relational Algebra
        Mapping-Oriented Languages
        Graphics-Oriented Languages
    Data Manipulation
    Data Definition and Control
    Language Evaluation
    Natural Languages
ADVANTAGES
IMPLEMENTATIONS
SUMMARY
ACKNOWLEDGMENTS
REFERENCES

INTRODUCTION

Before describing the relational model of data, we will briefly discuss some trends in data-base management which give motivation to the development of the relational model. The first large-scale, machine-readable collections of data were stored on external media such as cards or tape. Beginning in the late fifties and early sixties, data banks were being stored on-line using direct-access devices such as disks. Generalized software packages such as BDAM and ISAM IT21 were developed to aid programmers in accessing the data. During the late sixties and early seventies, the idea of an integrated data-base management system was developed. This concept allowed several applications to share a common bank of data, maintained and protected by a central system. In an integrated data-base environment, the data-base management system (DBMS) provides each application program with its own view of the common data, implements various operators for retrieval and update of data, and resolves interference between concurrent users.

The overall trend which is visible in data-base management today is the following: users are becoming increasingly oriented toward the information content of their data, and decreasingly concerned with its representation details. Increasingly, the user interface of a modern DBMS deals with abstract information rather than with the various bits, pointers, arrays, lists, etc., which may be used to represent information. Responsibility for choosing an appropriate representation for the information is being assumed by the system and is not exposed to the end user; indeed, the representation of a given fact may change over time without the user being aware of the change. The general term for this trend away from representation details is data independence.

If we attempt to extrapolate the trend toward data independence, we observe that most current DBMS present the user with a view of records connected in some sort of structure, such as a network or hierarchy. In such a view, information may be represented in at least three ways:

1) by the contents of records (e.g., Smith's employee record has DEPTNO = 50);
2) by the connections between records (e.g., Smith's employee record occurs in the hierarchy below the department record for Dept. 50); and
3) by the ordering of records (e.g., all the sales records are stored in chronological order).

User requests made to the DBMS are then framed in terms which depend on user knowledge of the representation chosen (e.g., "FIND NEXT RECORD OF ABC SET").

The relational data model makes it possible to eliminate this last representation-dependence from the user interface. In the relational model, information is represented in only one way at the user interface: by data values. User requests become free of any dependence on internal representation, and hence may be framed in a high-level, nonprocedural language. At the same time, the system becomes free to choose any physical structure for storage of data, and to optimize the execution of a given request.

A very early proposal for a representation-independent approach to file processing was made by the Language Structure Group of the CODASYL Development Committee in 1962 [Y1]. The CODASYL proposal, called "An Information Algebra," was never implemented. An early worker in a related area was David Childs, of the University of Michigan, who proposed a "set-theoretic data structure," based on a "reconstituted definition of relation" [Y3, Y4]. In Childs' system, which was implemented on an IBM 7090, the user was given sets of data-tuples (not constrained to be of common type) and set-theoretic operators, such as union and intersection, to manipulate the data.

In the late sixties, several artificial intelligence-oriented systems were implemented based on binary relations, which are simply collections of ordered pairs of objects related in a certain way. An example of a binary relation is:

    FATHER-OF: {(Mary, George), (George, Bill), (John, Bill)}

Systems incorporating binary relations for data storage included the Relational Data File of Levien and Maron [Y2], the TRAMP system of Ash and Sibley [Y5], and the LEAP language of Feldman and Rovner [Y6].

The considerable attention paid to n-ary relations as a tool for general data-base management dates from a 1970 paper by E. F. Codd, of IBM [M2]. Codd was the first to give a rigorous definition for n-ary relations in the data-base context, and to emphasize their advantages for data independence and symmetry of access.

Codd's paper introduced concepts which set the direction for research in relational data-base management for several years to come. The paper defined a data sublanguage as a set of facilities, suitable for embedding in a host programming language, which permits the retrieval of various subsets of data from a data bank. The paper noted that a standard logical notation, the first-order predicate calculus, is appropriate as a data sublanguage for n-ary relations. The paper also introduced a set of operators ("join," "projection," etc.) which were later developed into the well-known relational algebra. Finally, the paper explored the properties of "redundancy" and "consistency" of relations, which laid the groundwork for Codd's later theory of normalization.

DEFINITIONS

We will now discuss the basic concepts and definitions which underlie the relational data model. Many of these concepts were first introduced by Codd's original paper [M2].

Computing Surveys, Vol. 8, No. 1, March 1976



ELECTIONS   YEAR   WINNER-NAME   LOSER-NAME

            1952   Eisenhower    Stevenson
            1956   Eisenhower    Stevenson
            1960   Kennedy       Nixon
            1964   Johnson       Goldwater
            1968   Nixon         {Humphrey, Wallace}
            1972   Nixon         McGovern

FIGURE 1. The ELECTIONS relation.

An excellent introduction to relational concepts can also be found in Date's recent textbook [Z11].

In mathematics, the term relation may be defined as follows: Given sets D1, D2, ..., Dn (not necessarily distinct), a relation R is a set of n-tuples each of which has its first element from D1, second element from D2, etc. The sets Di are called domains. The number n is called the degree of R, and the number of tuples in R is called its cardinality.

It is customary (though not essential) when discussing relations to represent a relation as a table in which each row represents a tuple. An example of this representation is shown in Figure 1, which illustrates a relation describing Presidential elections. In the tabular representation of a relation, the following properties, which derive from the definition of a relation, should be observed:

1) no two rows are identical;
2) the ordering of rows is not significant; and
3) the ordering of columns is significant (i.e., the meanings of the tuples (1972, Nixon, McGovern) and (1972, McGovern, Nixon) are quite different).

When a relation is represented as a table, its degree is the number of columns and its cardinality is the number of rows.

In the tabular representation of a relation, it is customary to name the table and to name each column, as shown in Figure 1. The columns of the table are called attributes. (Sometimes the name of a column is referred to as a role name.) It is important to distinguish between attributes and domains. For example, in the ELECTIONS relation of Figure 1, the second and third columns are both based on the same domain: the set of names of Presidential candidates. However, each column has a different role-name to describe its meaning in this particular relation: WINNER-NAME and LOSER-NAME.

The individual entries in each tuple are called its components. Thus, we may say that in the tuple whose YEAR-component is "1952," the LOSER-NAME-component is "Stevenson."

A column or set of columns whose values uniquely identify a row of a relation is called a candidate key (often shortened to simply key) of the relation. In Figure 1, YEAR is a key for ELECTIONS since no two rows have the same YEAR. It is possible for a relation to have more than one key. For example, if the ELECTIONS relation had an additional column ADMINISTRATION-NUMBER, it would also be a key. When a relation has more than one key, it is customary to designate one as the primary key.

Often a column or set of columns in one relation will correspond to a key of another relation. For example, consider the PRESIDENTS relation of Figure 2, whose key is NAME. The values of WINNER-NAME in the ELECTIONS relation correspond to values of the key-column NAME in PRESIDENTS. Consequently, WINNER-NAME in ELECTIONS is called a foreign key. Two facts should be noted: 1) a foreign key need not be (and often is not) a key of its own relation; and 2) the foreign key need not have the same role-name (e.g., WINNER-NAME) as the corresponding key in the other relation (e.g., NAME).

PRESIDENTS   NAME         PARTY        HOME-STATE

             Eisenhower   Republican   Texas
             Kennedy      Democrat     Massachusetts
             Johnson      Democrat     Texas
             Nixon        Republican   California

FIGURE 2. The PRESIDENTS relation.

In an integrated data-base management system, different users may have a need to see different subsets of the universe of data. The term data model denotes the universe of data--the complete set of relations stored in the system. A schema is a set of declarations which describe the data model. The term data submodel denotes the set of relations which is available to a particular user, and a subschema is a set of declarations for the data submodel. A complete data-management system must provide a means for defining the schema and a subschema for each distinct class of users of the system.

NORMALIZATION

The issue of designing a schema and subschemas for a data base leads us to a discussion of normalization. The concept of normalization was introduced by Codd in [M2] and dealt with more rigorously in his later papers [N1] and [N2]. A number of other authors have also made contributions to the theory of normalization (see bibliography).

Normalization theory begins with the observation that certain collections of relations have better properties in an updating environment than do other collections of relations containing the same data. The theory then provides a rigorous discipline for the design of relations which have favorable update properties. The theory is based on a series of normal forms--first, second, and third normal form--which provide successive improvements in the update properties of a data base. We will discuss these normal forms on an intuitive basis; for a thorough treatment, see [N1], [N8], or [Z11].

Almost all references to relations implicitly deal with relations in first normal form. A relation in first normal form is a relation in which each component of each tuple is nondecomposable; i.e., the component is not a list or a relation. Relations in first normal form are sometimes called "flat tables". If we look carefully at the relation in Figure 1, we see that it is not in first normal form. This is because an election, while it has only one winner, may have several losing candidates. Thus, for example, the tuple for the election of 1968 contains the component {"Humphrey", "Wallace"}. In fact, the LOSER-NAME component of each election tuple is a list whose length depends on the number of votes a candidate must receive to merit inclusion in the data base.

We can convert the ELECTIONS relation into first normal form by breaking it up into two relations, one containing information on winning candidates and the other on losing candidates. This also gives us a good opportunity to record other attributes of interest about the candidates, such as their party and number of votes received. This leads us to the data base shown in Figure 3, which is in first normal form. The key of ELECTIONS-WON is YEAR; the key of ELECTIONS-LOST is (YEAR, LOSER-NAME).

To illustrate the advantages of the higher normal forms, we need to make updates to the data base by inserting new tuples, deleting existing tuples, and making changes to existing tuples. These updates are not particularly well motivated for our example data base, in which data is mostly static and unchanging. Of course, in an operational data base describing, for example, the inventory of a store, updates would be very frequent. For the sake of consistency, we will continue with our Presidential example. (You may imagine that some data was found to be in error and is being updated to correct the data base.)

Relations in first normal form may be used with any of the relational languages which are described in the next section.
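The definitions of degree, cardinality, and key, and the conversion of ELECTIONS to first normal form, can be made concrete with a small sketch. The Python below is an illustration only and is not part of the original paper: the names `degree`, `cardinality`, `is_key`, and `to_first_normal_form` are hypothetical, the data is copied from Figure 1, and the extra PARTY and VOTES attributes of Figure 3 are left out. Note that `is_key` tests uniqueness only in the tuples currently stored; whether a column is a key in general is a statement about the meaning of the data.

```python
# ELECTIONS as in Figure 1. Each tuple is (YEAR, WINNER-NAME, LOSER-NAME);
# LOSER-NAME is a set-valued component, so the relation is not in
# first normal form.
ATTRS = ("YEAR", "WINNER-NAME", "LOSER-NAME")
ELECTIONS = {
    (1952, "Eisenhower", frozenset({"Stevenson"})),
    (1956, "Eisenhower", frozenset({"Stevenson"})),
    (1960, "Kennedy", frozenset({"Nixon"})),
    (1964, "Johnson", frozenset({"Goldwater"})),
    (1968, "Nixon", frozenset({"Humphrey", "Wallace"})),
    (1972, "Nixon", frozenset({"McGovern"})),
}


def degree(attrs):
    """Degree: the number of columns (attributes)."""
    return len(attrs)


def cardinality(relation):
    """Cardinality: the number of tuples (rows)."""
    return len(relation)


def is_key(relation, attrs, columns):
    """True if the named columns are unique across the stored tuples."""
    idx = [attrs.index(c) for c in columns]
    seen = {tuple(t[i] for i in idx) for t in relation}
    return len(seen) == len(relation)


def to_first_normal_form(elections):
    """Split ELECTIONS into two flat relations, one row per losing candidate."""
    won = {(year, winner) for year, winner, _ in elections}
    lost = {(year, loser) for year, _, losers in elections for loser in losers}
    return won, lost


ELECTIONS_WON, ELECTIONS_LOST = to_first_normal_form(ELECTIONS)
LOST_ATTRS = ("YEAR", "LOSER-NAME")
```

With this data, `is_key(ELECTIONS, ATTRS, ["YEAR"])` is true, while in the flattened ELECTIONS-LOST relation YEAR alone is no longer unique (1968 appears twice) and the pair (YEAR, LOSER-NAME) is needed, as stated above.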


However, a relation in first normal form may exhibit three kinds of misbehavior, which are called update anomalies, insertion anomalies, and deletion anomalies. All these anomalies arise because more than one "concept" may be mixed together in the same tuple. Consider the ELECTIONS-WON relation of Figure 3. Mixed together in one tuple of this relation are facts about candidates (e.g., "Eisenhower came from Texas") and facts about elections (e.g., "In 1952 Eisenhower received 442 electoral votes"). In some applications it may be important that each of these facts be independently updated, inserted, and deleted. This gives rise to the three anomalies, which we can now illustrate by the following examples.

ELECTIONS-WON   YEAR   WINNER-NAME   WINNER-VOTES   PARTY        HOME-STATE

                1952   Eisenhower    442            Republican   Texas
                1956   Eisenhower    447            Republican   Texas
                1960   Kennedy       303            Democrat     Mass.
                1964   Johnson       486            Democrat     Texas
                1968   Nixon         301            Republican   Calif.
                1972   Nixon         520            Republican   Calif.

ELECTIONS-LOST  YEAR   LOSER-NAME    LOSER-VOTES    PARTY

                1952   Stevenson     89             Democrat
                1956   Stevenson     73             Democrat
                1960   Nixon         219            Republican
                1964   Goldwater     52             Republican
                1968   Humphrey      191            Democrat
                1968   Wallace       46             Am. Indep.
                1972   McGovern      17             Democrat

FIGURE 3. Elections data base in first normal form.

Update anomalies: Suppose the fact that "Eisenhower's home state is Texas" is found to be in error, and his home state must be changed to Nebraska. Since Eisenhower appears in more than one tuple of ELECTIONS-WON, this erroneous fact may be represented many times (in general, a time-varying number of times). This makes it difficult to update this particular fact, since all tuples where it is represented must be searched out and updated. Even worse, it leads to the possibility that different tuples may contain inconsistent values of HOME-STATE for the same President.

Insertion anomalies: Suppose we wish to insert a fact about a candidate which is independent of any election, e.g., "Dewey was a Republican." This is difficult in our example data base because there is no relation for candidates. We are forced to invent a tuple in ELECTIONS-LOST (or ELECTIONS-WON?) having null values for YEAR and the other irrelevant attributes. In many systems we would be unable to store this fact because null values are not permitted in the primary key.

Deletion anomalies: Suppose we wish to delete the information about elections as they fall beyond a certain number of years in the past. When we delete the 1952-tuple from ELECTIONS-WON, we still retain the fact that Eisenhower was a Republican. But when we delete the 1956-tuple, all facts about Eisenhower are lost. In some applications, this might have very serious consequences. For example, consider a relation describing orders for various items, shown in Figure 4. As orders are filled we delete their tuples from the relation. When we have deleted the last order for toasters, we find we no longer have any information about the price of toasters--possibly an unintended result. This kind of relation burdens the user with the responsibility of making sure that the tuple he deletes is not the last tuple of some "category" (e.g., toasters), and therefore the sole bearer of information about that category (e.g., price).

ORDERS   ITEM      PRICE   DATE      QUANTITY-ORDERED

         Toaster   20.00   1/10/75
         Toaster   20.00   2/15/75
         Mixer     28.00   4/6/75

FIGURE 4. The ORDERS relation.

An important objective of normalization is the elimination of the update, insertion, and deletion anomalies. The most widely-known result of normalization theory is third normal form. Since second normal form is of little significance except as a stopping-off place on the way to third, we will proceed directly to the definition of third normal form.

In order to understand how third normal form avoids the three anomalies, we must discuss the concept of functional dependence among the attributes of a relation. We say that an attribute B of relation R is functionally dependent on attribute A if, at every instant of time, each A-value in R is associated with only one B-value. We express this relationship by the notation A → B, and say "A determines B" or "B depends on A." Similarly, a set of attributes in R may be functionally dependent on another attribute or set of attributes. The attribute (or set of attributes) on the left side of the arrow (A in our example) is called the determinant.

Clearly, from our definition of key in the previous section, every relation contains at least one functional dependence: all attributes of the relation are dependent on the key. (The dependence may be trivial if the relation contains only a key.) If a relation has more than one key, then all its attributes are dependent on each key.

Third normal form has been defined in a variety of ways. The original definition was given by Codd in [N1]. Later writers, including Kent [N8], Boyce and Codd [M14], and Sharman [N15], proposed alternate definitions which framed the same concept in simpler terminology. We present two of these equivalent definitions:

    Definition, Boyce and Codd [M14]:
    A relation R is in third normal form if it is in first normal form and, for every attribute collection C of R, if any attribute not in C is functionally dependent on C, then all attributes in R are functionally dependent on C.

    Definition, Sharman [N15]:
    A relation is in third normal form if every determinant is a key.

Both definitions are formal ways of expressing a very simple idea: that each relation should describe a single "concept," and if more than one "concept" is found in a relation, the relation should be split into smaller relations. The result of applying this "splitting" process to the sample data base of Figure 3 is shown in Figure 5. A moment's examination will show that the update, insertion, and deletion anomalies we discussed are not present in the data base of Figure 5.

The design of a data base in third normal form depends on knowledge of the functional dependencies among the attributes of the data. This knowledge cannot be discovered automatically by a system (unless the data base is completely static), but must be furnished by a data-base designer who understands the semantics of the information. In fact, there is not a unique third normal form representation for a given data base. In [N1] Codd briefly addressed the problem of choosing an "Optimal Third Normal Form" from among the various alternatives.
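Sharman's definition lends itself to a short mechanical check once the designer has declared the functional dependencies. The Python sketch below is an illustration added here, not an algorithm from the paper. The helper names (`project`, `holds`, `is_key`, `non_key_determinants`) and the declared dependencies in `FDS` are assumptions made for the example, and the rows are those of ELECTIONS-WON in Figure 3. As the text notes, inspecting stored tuples can only refute a dependence, never establish it, so `holds` and `is_key` describe the current instance only.

```python
COLUMNS = ("YEAR", "WINNER-NAME", "WINNER-VOTES", "PARTY", "HOME-STATE")
ELECTIONS_WON = [dict(zip(COLUMNS, row)) for row in [
    (1952, "Eisenhower", 442, "Republican", "Texas"),
    (1956, "Eisenhower", 447, "Republican", "Texas"),
    (1960, "Kennedy", 303, "Democrat", "Mass."),
    (1964, "Johnson", 486, "Democrat", "Texas"),
    (1968, "Nixon", 301, "Republican", "Calif."),
    (1972, "Nixon", 520, "Republican", "Calif."),
]]


def project(rows, attrs):
    """Distinct combinations of values for the named attributes."""
    return {tuple(row[a] for a in attrs) for row in rows}


def holds(rows, lhs, rhs):
    """lhs -> rhs is not contradicted if each lhs-value has one rhs-value."""
    return len(project(rows, lhs)) == len(project(rows, lhs + rhs))


def is_key(rows, attrs):
    """The attributes are unique across the stored rows."""
    return len(project(rows, attrs)) == len(rows)


def non_key_determinants(rows, fds):
    """Determinants of the declared dependencies that are not keys."""
    return [lhs for lhs, _ in fds if not is_key(rows, lhs)]


# Dependencies assumed to be supplied by the data-base designer.
FDS = [
    (("YEAR",), ("WINNER-NAME", "WINNER-VOTES")),
    (("WINNER-NAME",), ("PARTY", "HOME-STATE")),
]

violations = non_key_determinants(ELECTIONS_WON, FDS)

# Splitting out the second "concept" gives a relation keyed on NAME.
PRESIDENTS = [
    dict(zip(("NAME", "PARTY", "HOME-STATE"), t))
    for t in project(ELECTIONS_WON, ("WINNER-NAME", "PARTY", "HOME-STATE"))
]
after_split = non_key_determinants(
    PRESIDENTS, [(("NAME",), ("PARTY", "HOME-STATE"))]
)
```

Here `violations` contains only the determinant WINNER-NAME, which is not a key of ELECTIONS-WON, and `after_split` is empty, matching the observation that the split relations of Figure 5 avoid the anomalies.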


ELECTIONS-WON    YEAR   WINNER-NAME   WINNER-VOTES

                 1952   Eisenhower    442
                 1956   Eisenhower    447
                 1960   Kennedy       303
                 1964   Johnson       486
                 1968   Nixon         301
                 1972   Nixon         520

PRESIDENTS       NAME         PARTY        HOME-STATE

                 Eisenhower   Republican   Texas
                 Kennedy      Democrat     Mass.
                 Johnson      Democrat     Texas
                 Nixon        Republican   Calif.

ELECTIONS-LOST   YEAR   LOSER-NAME   LOSER-VOTES

                 1952   Stevenson    89
                 1956   Stevenson    73
                 1960   Nixon        219
                 1964   Goldwater    52
                 1968   Humphrey     191
                 1968   Wallace      46
                 1972   McGovern     17

LOSERS           NAME        PARTY

                 Stevenson   Democrat
                 Nixon       Republican
                 Goldwater   Republican
                 Humphrey    Democrat
                 Wallace     Am. Indep.
                 McGovern    Democrat

FIGURE 5. Data base in third normal form.

LANGUAGES

Such a great variety of relational languages is available that it would be impossible to treat them all here. We will describe a representative example of several important categories of relational languages; references to other languages can be found in the bibliography.

The term data sublanguage, introduced earlier, denotes a set of data-base operators intended to be embedded in a host programming language. The term query language usually refers to a stand-alone language in which an end user interacts directly with the data-base management system. Most query languages provide a variety of facilities (e.g., update, creation, and deletion of relations) in addition to a query capability. As compared with a typical data sublanguage, a query language is usually at a higher level, less procedural, and intended for a more casual user. Sometimes, however, the same basic set of operators can serve both as a data sublanguage and as a query language.

This section will explore the approach taken by various relational languages to providing facilities for query, data manipulation (e.g., insertion, deletion, and update of tuples), data definition (e.g., creation of new relations and other structures), and data control (e.g., authorization and control of data integrity). We will then briefly consider some ways in which languages can be evaluated and compared, and discuss the role of natural language as a data-base interface.

Query Facilities

Query, or retrieval of information from the data base, is perhaps the aspect of relational languages which has received the most attention. We will illustrate the variety of approaches to query by presenting examples of four classes of languages: relational calculus, relational algebra, mapping-oriented languages, and graphics-oriented languages. Although we deal only with query facilities in this section, all the languages discussed have facilities for update and other operations in addition to query.

Relational Calculus

Codd's 1970 paper [M2] laid the groundwork for two families of relational languages which came to be called the relational calculus and the relational algebra. The relational calculus family grew from the observation that a first-order applied predicate calculus can be used as a data sublanguage for normalized relations. In [L1] Codd presented the details of such a calculus-based sublanguage, called ALPHA.

A typical query in ALPHA has two parts: a target, which specifies the particular attributes of the particular relation which are to be returned, and a qualification, which selects particular tuples from the target relation by giving a condition which they must satisfy. We will illustrate ALPHA (and other languages) by some sample queries based on the data base of Figure 5.

In Q1) below, the RANGE statement declares P to be a variable ranging over the rows of the PRESIDENTS relation. The next statement retrieves into workspace W the HOME-STATE of row P whenever the NAME of row P is "KENNEDY."

The qualification part of an ALPHA query may be quite complex and may use the universal and existential quantifiers: "for all" (∀), and "there exists" (∃). For example, see display Q2) below.

Q1) What was the home state of President Kennedy?

    RANGE PRESIDENTS P
    GET W P.HOME-STATE: (P.NAME = 'KENNEDY').

Q2) List the election years in which a Republican from Illinois was elected.

    RANGE PRESIDENTS P
    RANGE ELECTIONS-WON E
    GET W E.YEAR: ∃P (P.NAME = E.WINNER-NAME &
        P.PARTY = 'REPUBLICAN' & P.HOME-STATE = 'ILLINOIS').

Various other languages based, like ALPHA, on the relational calculus, have been proposed. This class of languages includes QUEL [S15], COLARD [L3], and RIL [L7].

Relational Algebra

A second major class of languages is based on the relational algebra, which was introduced by Codd in [M2] and refined in [M3]. The relational algebra is a collection of operators that deal with whole relations, yielding new relations as a result. The major operators of relational algebra include the following:

• Projection: The projection operator returns only the specified columns of the given relation, and eliminates duplicates from the result. For example, to find all the unique (party, home-state) pairs in the PRESIDENTS relation, we might write the following projection:

    PRESIDENTS [PARTY, HOME-STATE]

  (Note that some algebra-based languages use column numbers rather than column names. In such languages we would write PRESIDENTS [2, 3] in place of the given expression. This notation, although less mnemonic, has the advantage of avoiding ambiguity if some intermediate result has two columns with the same name.)

• Restriction: The restriction operator selects only those tuples of a relation which satisfy a given condition. As originally proposed, the condition only allowed comparison of one component of a tuple with another component. Some implementations of the algebra permit other condition-types as well, e.g., comparison of a tuple-component with a constant. For example, to select those tuples from the ELECTIONS-WON relation where YEAR is greater than 1945, we might write:

    ELECTIONS-WON [YEAR > 1945]

• Join: The join operator takes two relations as arguments, which we will refer to as relations A and B. A new relation is formed by concatenating a tuple of A with a tuple of B wherever a given condition holds between them. For example:

    ELECTIONS-WON [WINNER-NAME = NAME] PRESIDENTS

  This expression concatenates a tuple of ELECTIONS-WON with a tuple of PRESIDENTS whenever WINNER-NAME in ELECTIONS-WON matches NAME in the PRESIDENTS tuple. If a given PRESIDENTS tuple matches more than one ELECTIONS-WON tuple, it is concatenated with each of them, forming multiple output rows. If a given tuple matches no tuple in the other relation, it does not participate in the output at all.

• Set-theoretic Operators: In relational algebra, the set-theoretic operators--union, intersection, and set-difference--take two relations as operands, treating each as a set of tuples, and produce a single relation as a result. The operand relations must have compatible sets of attributes.

• Division: Some algebraic languages include an operator, called "division," which operates on two input relations to produce a third relation. This operator is sometimes useful in expressing queries which contain the word "all." However, since it can be expressed in terms of the other algebraic operators, the division operator does not extend the logical power of the language. The reader is referred to [M3] for a complete treatment of division.

• Nesting: The algebra has the convenient property that its operators can be nested to form expressions of arbitrary complexity, with parentheses used as needed to remove ambiguities. To illustrate nesting of operators, we will repeat examples Q1) and Q2) (displayed below) in the relational algebra:

Q1) What was the home state of President Kennedy?

    PRESIDENTS [NAME = 'KENNEDY'] [HOME-STATE]

Q2) List the election years in which a Republican from Illinois was elected.

    (ELECTIONS-WON [WINNER-NAME = NAME] PRESIDENTS)
        [PARTY = 'REPUBLICAN'] [HOME-STATE = 'ILLINOIS'] [YEAR]

Languages based on the relational algebra have been implemented at MIT [S1] and [S12], the IBM Scientific Centre in England [S4] and [S16], and General Motors Research Laboratory [S5]. In addition, studies of optimization algorithms for the relational algebra have been published by Smith and Chang [T15], Pecherer [T16], Gotlieb [T17], and others.
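The behavior of projection, restriction, and join, and the way they nest, can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than any of the implementations cited above: relations are modeled as lists of dictionaries, the function names `project`, `restrict`, and `join` are chosen for the example, and names are spelled as in Figure 5 ("Kennedy", "Republican") rather than in upper case as in the query displays. The last expression restates Q2) in the style of the calculus, with Python's `any` standing in for the existential quantifier.

```python
ELECTIONS_WON = [
    {"YEAR": 1952, "WINNER-NAME": "Eisenhower", "WINNER-VOTES": 442},
    {"YEAR": 1956, "WINNER-NAME": "Eisenhower", "WINNER-VOTES": 447},
    {"YEAR": 1960, "WINNER-NAME": "Kennedy", "WINNER-VOTES": 303},
    {"YEAR": 1964, "WINNER-NAME": "Johnson", "WINNER-VOTES": 486},
    {"YEAR": 1968, "WINNER-NAME": "Nixon", "WINNER-VOTES": 301},
    {"YEAR": 1972, "WINNER-NAME": "Nixon", "WINNER-VOTES": 520},
]
PRESIDENTS = [
    {"NAME": "Eisenhower", "PARTY": "Republican", "HOME-STATE": "Texas"},
    {"NAME": "Kennedy", "PARTY": "Democrat", "HOME-STATE": "Mass."},
    {"NAME": "Johnson", "PARTY": "Democrat", "HOME-STATE": "Texas"},
    {"NAME": "Nixon", "PARTY": "Republican", "HOME-STATE": "Calif."},
]


def project(rel, attrs):
    """Projection: keep the named columns and eliminate duplicate rows."""
    seen, out = set(), []
    for row in rel:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(attrs, key)))
    return out


def restrict(rel, predicate):
    """Restriction: keep only the rows that satisfy the condition."""
    return [row for row in rel if predicate(row)]


def join(a, b, condition):
    """Join: concatenate a row of a with a row of b wherever condition holds.

    Assumes the two relations share no column names; with a shared name the
    value from b would silently replace the value from a.
    """
    return [{**ra, **rb} for ra in a for rb in b if condition(ra, rb)]


# Q1) PRESIDENTS [NAME = 'Kennedy'] [HOME-STATE]
q1 = project(restrict(PRESIDENTS, lambda r: r["NAME"] == "Kennedy"),
             ["HOME-STATE"])

# Q2) (ELECTIONS-WON [WINNER-NAME = NAME] PRESIDENTS)
#         [PARTY = 'Republican'] [HOME-STATE = 'Illinois'] [YEAR]
joined = join(ELECTIONS_WON, PRESIDENTS,
              lambda e, p: e["WINNER-NAME"] == p["NAME"])
q2 = project(
    restrict(restrict(joined, lambda r: r["PARTY"] == "Republican"),
             lambda r: r["HOME-STATE"] == "Illinois"),
    ["YEAR"])

# Q2) again, calculus style: "there exists P such that ..."
q2_calculus = {
    e["YEAR"] for e in ELECTIONS_WON
    if any(p["NAME"] == e["WINNER-NAME"] and p["PARTY"] == "Republican"
           and p["HOME-STATE"] == "Illinois" for p in PRESIDENTS)
}
```

On the data of Figure 5, Q1) yields the single row with HOME-STATE "Mass." and Q2) yields an empty relation in both formulations, since none of the listed Presidents is recorded as coming from Illinois.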


Mapping-Oriented Languages

A third class of relational languages, called "mapping-oriented" languages, has been proposed by R. F. Boyce and others [L9]. These languages, directed at the nonprogramming professional, offer power equivalent to that of the relational calculus or algebra while avoiding mathematical concepts such as quantifiers. Mapping-oriented languages include: SQUARE, a terse, APL-like notation [L25]; SEQUEL, a structured language based on English keywords [L10, L8]; and SLICK, a language intended for implementation on associative hardware [L12]. We will illustrate this class of languages by presenting examples of SEQUEL.

The basic building block of mapping-oriented languages is the "mapping," which maps a known attribute or set of attributes into a desired attribute or set of attributes by means of some relation. Q1) is an example of a simple mapping:

Q1) What was the home state of President Kennedy?

SELECT HOME-STATE
FROM PRESIDENTS
WHERE NAME = 'KENNEDY'

In general, the result of a mapping may be used in the specification of another mapping, as shown in Q2) below. This process of "nesting" mappings inside each other makes it possible to express queries of great complexity.

Q2) List the election years in which a Republican from Illinois was elected.

SELECT YEAR
FROM ELECTIONS-WON
WHERE WINNER-NAME =
    SELECT NAME
    FROM PRESIDENTS
    WHERE PARTY = 'REPUBLICAN'
    AND HOME-STATE = 'ILLINOIS'

Graphics-Oriented Languages

Recently, another important class of relational languages has been proposed: the class of "graphics-oriented" languages. In these languages, the user states his query not by a conventional linear syntax, but by making choices or filling in blanks on a graphic display. Examples of this class of languages are Query By Example [L21, L24] and CUPID [L17]. We will illustrate this type of language by presenting examples using Query By Example.

In Query By Example, the user is presented with a blank relation on his display. He fills in one or more rows of the relation with an example of the desired result. Known values are filled in directly. Unknown values are represented by arbitrarily chosen example values, which are underscored to show that they are examples. The attributes to be printed are identified by a "P." A query may be confined to a single relation, or span more than one relation, as illustrated by Q1) and Q2) below.

Q1) What was the home state of President Kennedy?

PRESIDENTS | NAME    | PARTY | HOME-STATE
           | KENNEDY |       | P. _NEVADA_

Q2) List the election years in which a Republican from Illinois was elected.

ELECTIONS-WON | YEAR      | WINNER-NAME | WINNER-VOTES
              | P. _1948_ | _WILSON_    |

PRESIDENTS | NAME     | PARTY      | HOME-STATE
           | _WILSON_ | REPUBLICAN | ILLINOIS

Data Manipulation

Most relational languages provide facilities for data manipulation, which includes insertion, deletion, and update of tuples. Since update is not well motivated for the Presidential data base, we introduce the following relation to illustrate data manipulation:

EMP (EMPNO, NAME, JOB, SALARY)

This relation describes a set of employees, giving, in each instance, his or her employee number, name, job, and salary.

Many languages with set-oriented query features also allow set-oriented data manipulation. For example, the following statement in SEQUEL [L10] has the effect of giving a 10% raise to all programmers:

UPDATE EMP
SET SALARY = SALARY * 1.1
WHERE JOB = 'PROGRAMMER'

All the languages we have discussed so far have been high level and nonprocedural in nature. Indeed, one of the advantages of the relational model is that it is readily compatible with high-level languages. But it should not be concluded that the relational model is incompatible with a lower-level, more procedural programming interface. In fact, several low-level, host-language relational interfaces have been proposed, including GAMMA-0 [L4], XRM [S6], and MINIZ [S8]. These interfaces are well suited for writing programs that are to be called repeatedly and which update the data base according to parameters furnished with the call.

We will illustrate how one low-level relational language, GAMMA-0, might be used to write a transaction which finds the employee-tuple having a given employee number and updates its salary component according to some computation. GAMMA-0 consists of a set of operators which may be called from a host language such as PL/I. GAMMA-0 is based on the concept of a "scan," which is like a cursor that moves through a relation testing tuples for some condition. Our first call to GAMMA-0 uses the operator CREATE-SCAN, which creates a scan on the EMP relation to search for tuples according to their EMPNO attribute. The system returns an identifier, called a SCANID, by which we may refer to the newly created scan in future calls. Next we call the operator SET-SCAN and furnish the value which is to be searched for (in this case the EMPNO, which is the parameter of our transaction). Our next call is to the operator NEXT-SUBTUPLE, which returns an actual tuple satisfying the criterion we established by the previous calls. (NEXT-SUBTUPLE could be called repeatedly if we expected many tuples to satisfy the criterion.) Having obtained the desired employee-tuple, we can compute a new salary-value in our host program and then call UPDATE-SUBTUPLE, which puts the new salary-value into the data base. GAMMA-0 allows a program to have as many active scans as it wishes, and to control the position of each by explicit calls. When a program has no further use for a scan, it may drop it by calling the operator DROP-SCAN.

Although it is a low-level, procedural language, GAMMA-0 is considered a relational language because the means of access to tuples is not predetermined. A relation may be accessed associatively through any of its attributes; the attribute to be matched is declared when a scan is opened.
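
The sequence of calls in the GAMMA-0 transaction can be imitated in a few lines of Python. This is a minimal sketch under stated assumptions, not the GAMMA-0 interface itself: the text gives only the operator names, so the class `ScanInterface`, its method signatures, the in-memory storage, the sample EMP tuples, and the flat 500 increase are all hypothetical.

```python
# Hypothetical in-memory imitation of a scan-style interface. Only the
# operator names come from the text; signatures and storage are assumptions.

class ScanInterface:
    def __init__(self, relations):
        self._relations = relations  # relation name -> list of dict tuples
        self._scans = {}             # scanid -> state of one active scan
        self._next_id = 1

    def create_scan(self, relation, attribute):
        """Open a scan that will match tuples on the given attribute."""
        scanid = self._next_id
        self._next_id += 1
        self._scans[scanid] = {"relation": relation, "attribute": attribute,
                               "value": None, "position": -1}
        return scanid

    def set_scan(self, scanid, value):
        """Furnish the value to search for and restart the scan."""
        scan = self._scans[scanid]
        scan["value"] = value
        scan["position"] = -1

    def next_subtuple(self, scanid):
        """Return a copy of the next matching tuple, or None when exhausted."""
        scan = self._scans[scanid]
        tuples = self._relations[scan["relation"]]
        for i in range(scan["position"] + 1, len(tuples)):
            if tuples[i][scan["attribute"]] == scan["value"]:
                scan["position"] = i
                return dict(tuples[i])
        scan["position"] = len(tuples)
        return None

    def update_subtuple(self, scanid, **new_values):
        """Store new attribute values into the tuple the scan is positioned on."""
        scan = self._scans[scanid]
        tuples = self._relations[scan["relation"]]
        if not 0 <= scan["position"] < len(tuples):
            raise RuntimeError("scan is not positioned on a tuple")
        tuples[scan["position"]].update(new_values)

    def drop_scan(self, scanid):
        """Release a scan that is no longer needed."""
        del self._scans[scanid]


def raise_salary(db, empno, increase):
    """The transaction from the text: find one employee and update the salary."""
    scanid = db.create_scan("EMP", "EMPNO")
    try:
        db.set_scan(scanid, empno)
        emp = db.next_subtuple(scanid)
        if emp is None:
            return False
        db.update_subtuple(scanid, SALARY=emp["SALARY"] + increase)
        return True
    finally:
        db.drop_scan(scanid)


EMP = [
    {"EMPNO": 1, "NAME": "SMITH", "JOB": "PROGRAMMER", "SALARY": 10000},
    {"EMPNO": 2, "NAME": "JONES", "JOB": "CLERK", "SALARY": 8000},
]
db = ScanInterface({"EMP": EMP})
found = raise_salary(db, 2, 500)
missing = raise_salary(db, 99, 500)
```

Because the attribute to be matched is an argument of `create_scan`, the same relation could equally be scanned on JOB or NAME, which is the associative property the text attributes to GAMMA-0.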

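
By contrast with the tuple-at-a-time scan interface, the set-oriented SEQUEL statements of this section (Q1, the nested Q2, and the UPDATE on EMP) can be approximated in a present-day SQL engine. The sketch below uses Python's standard `sqlite3` module. It is an approximation under stated assumptions, not SEQUEL as defined in [L10]: SQL as accepted by SQLite needs underscores in place of hyphens in names, and the nested mapping is written with `IN` and parentheses. The sample rows are illustrative, and the 10% raise is written in integer arithmetic to keep the result exact.

```python
import sqlite3

# In-memory data base holding small illustrative versions of the relations.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    CREATE TABLE PRESIDENTS (NAME TEXT, PARTY TEXT, HOME_STATE TEXT);
    CREATE TABLE ELECTIONS_WON (YEAR INTEGER, WINNER_NAME TEXT,
                                WINNER_VOTES INTEGER);
    CREATE TABLE EMP (EMPNO INTEGER, NAME TEXT, JOB TEXT, SALARY INTEGER);
""")
cur.executemany("INSERT INTO PRESIDENTS VALUES (?, ?, ?)", [
    ("KENNEDY", "DEMOCRAT", "MASSACHUSETTS"),
    ("LINCOLN", "REPUBLICAN", "ILLINOIS"),
    ("COOLIDGE", "REPUBLICAN", "MASSACHUSETTS"),
])
cur.executemany("INSERT INTO ELECTIONS_WON VALUES (?, ?, ?)", [
    (1860, "LINCOLN", 180),
    (1864, "LINCOLN", 212),
    (1924, "COOLIDGE", 382),
    (1960, "KENNEDY", 303),
])
cur.executemany("INSERT INTO EMP VALUES (?, ?, ?, ?)", [
    (1, "SMITH", "PROGRAMMER", 10000),
    (2, "JONES", "CLERK", 8000),
])

# Q1) A simple mapping from NAME to HOME-STATE.
q1 = cur.execute(
    "SELECT HOME_STATE FROM PRESIDENTS WHERE NAME = 'KENNEDY'"
).fetchall()

# Q2) A nested mapping: the inner block yields a set of names that the
# outer block uses to select election years.
q2 = cur.execute("""
    SELECT YEAR
    FROM ELECTIONS_WON
    WHERE WINNER_NAME IN
        (SELECT NAME
         FROM PRESIDENTS
         WHERE PARTY = 'REPUBLICAN'
         AND HOME_STATE = 'ILLINOIS')
    ORDER BY YEAR
""").fetchall()

# Set-oriented update: a 10% raise for every programmer in one statement.
cur.execute("UPDATE EMP SET SALARY = SALARY * 11 / 10 WHERE JOB = 'PROGRAMMER'")
salaries = cur.execute("SELECT EMPNO, SALARY FROM EMP ORDER BY EMPNO").fetchall()

print(q1, q2, salaries)
```

The single UPDATE statement does the work that would take one scan and one loop per employee in a procedural interface, which is the contrast between the two levels of language drawn in the text.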


Data Definition and Control

In addition to query and data manipulation facilities, a complete data sublanguage needs facilities for data definition and data control. Data definition has two main aspects:

• Specification of the characteristics of data to be stored, e.g., the column-names and data-types for each relation; and
• definition of alternative "views" which are derived from the stored data. In relational terminology, a view is a dynamic "window" on the data base. Updates made to stored relations are visible through the various views which are defined on these relations.

Data control also has two main aspects:

• control over authorization of various users to perform various operations on the data base; and
• ability to make integrity assertions that protect the validity of data and define the set of permitted transitions in the data base.

The relational model permits a language to take a consistent, unified approach to query, data manipulation, data definition, and data control. Several relational languages have gone to great lengths to provide such a unified approach; these languages include SEQUEL [L10, L8, C6, I5], QUEL [S15, C3, I4], and Query By Example [L21, L24].

An important observation to be made in data definition is that the definition of a view is simply a process of deriving a relation from the set of stored relations, and that this is similar to the process of stating a query. Therefore, the full power of a query language may be applied to the definition of views. This is possible because all the relational query languages we have discussed have the property of closure, i.e., they operate on relations to construct or define new relations. A view may be a selected subset of a stored relation, or it may span over more than one stored relation, as in the case of a join. Once the definition of a view has been made, queries may be directed to the view as though it were a stored relation. The supportability of updates to the data base made by means of derived views is a complicated question, one which requires more research [M14].

The issue of authorization is closely related to the issue of derived views. In fact, one approach to authorization is to grant to each user a particular restricted view [C6]. Another approach is to automatically add certain predicates to the queries and updates issued by a user in order to restrict their scope to the set of authorized tuples [C3].

This unified approach to language design can be extended into the area of assertions concerning data integrity. An assertion is a statement about the data base which the system automatically enforces by refusing any update which fails to satisfy the assertion. In language terms, an assertion is simply a predicate, which is syntactically a fragment of a query, and which may contain other queries nested inside it. For example, suppose we wish to assert that for any given election the number of votes received by the winner is greater than the number of votes received by any loser. This assertion may be made as follows in SEQUEL (the variable X represents a tuple of the ELECTIONS-WON relation):

ASSERT ON ELECTIONS-WON X:
WINNER-VOTES >
    (SELECT MAX (LOSER-VOTES)
    FROM ELECTIONS-LOST
    WHERE YEAR = X.YEAR)

Language Evaluation

The great variety of proposed relational languages leads us to the question: How can languages be evaluated and compared? There are at least three criteria involved in any objective attempt to evaluate a language: completeness, level, and learnability. Space constraints permit us to touch only briefly on each of these.

Codd [M3] was the first to establish a careful definition of completeness for data-base sublanguages. He defined a language to be relationally complete if it permits expression of any query expressible in the



relational calculus. He then proceeded to prove that the relational algebra was relationally complete and hence could serve as the standard of comparison for completeness of algebra-oriented languages. Since the appearance of this early work, proofs of relational completeness have been published for the SQUARE and Query By Example languages [L25], [L13].

The first attempt at a quantitative definition of language "level" was made by Halstead in his investigation of "software physics" [L23]. According to Halstead's definition, "level" is a property of a particular expression of an algorithm. The "simplest conceivable" expression of a given algorithm is assigned a level of 1, and more complicated expressions of the same algorithm are given level-values ranging from 0 to 1, computed on the basis of parameters such as the number of operators and operands used in expressing the algorithm. In [L14], Halstead applies the formulas of software physics to a comparison of Codd's ALPHA language [L1] and DBTG-COBOL [Z1].

The last method of language evaluation we will discuss is that of psychological tests in which the language is taught to a group of subjects under controlled conditions and their learning progress is measured. The emphasis of the experiment may be placed on measuring speed of learning or degree of comprehension, or on identifying particular language features which seem to cause learning difficulties. Studies of this type have been published on the languages SQUARE and SEQUEL [L20], and Query By Example [L22].

Natural Languages

Recently there has been considerable interest in the use of a natural language such as English as a query language. The relational data model is well adapted to such an attempt because it contains no implementation-oriented concepts. Most natural-language-oriented query systems, such as REL [E4] and CONVERSE [E5], attempt to translate, without feedback to the user, from natural language into a computer-oriented language. E. F. Codd and J. M. Cadiou are presently developing a system called RENDEZVOUS [E0] which engages in an English dialog with the user to help him develop an unambiguous formulation of his query.

ADVANTAGES

We can now review and summarize the advantages of the relational model for data-base management. Relations have four primary advantages:

1) Simplicity: This term should need no further explanation. The relational user is presented with a single, consistent data structure. He formulates his requests strictly in terms of information content, without reference to system-oriented complexities.
2) Data independence: C. J. Date [Z11] has defined data independence as "immunity of applications to change in storage structure and access strategy." As we have seen, the relational model makes it possible to eliminate the details of storage structure and access strategy from the user interface.
3) Symmetry: Data-base systems which are based on connections between records make some questions easier to ask than others, namely questions whose structure matches that of the data base. For example, in a hierarchic data base, the easiest question to ask is a question that begins at the root of the tree and moves toward the leaves, applying successive qualifications at each step. Questions not reflecting this preferred structure can be asked awkwardly if at all. Since all information is represented by data values in relations, there is no preferred format for a question at the user interface. It should be noted here that symmetry of the data model does not necessarily imply symmetry of the underlying physical data structures maintained by the system. A data-base designer may choose to optimize the performance of some frequently posed query-type (for example, by providing an index for a certain attribute). The
important thing to note is that such optimizations do not appear in the user interface.
4) Strong theoretical foundation: The relational data model rests on the well-developed mathematical theory of relations and on the first-order predicate calculus. This theoretical background makes possible the definition of relational completeness and the rigorous study of good data-base design (normalization).

The relational model also has a variety of secondary advantages which derive from the fundamental advantages just outlined. Perhaps the most important of these is the ease with which high-level, nonprocedural relational languages may be defined. Because they are easy to learn and use, high-level languages make data bases available to a new class of casual users who lack the training required by conventional programming languages. High-level languages also give the system maximum flexibility to optimize the execution of a given request, and to adapt the stored data structures to the changing needs of the user population. The nonprocedural approach to language design permits a unified treatment of data definition, manipulation, and control, as discussed in the section on "Languages." Finally, high-level languages make it easy to define and manipulate views of data which are not directly supported by physical structures. (Of course, many of these advantages may also be obtained by the use of high-level languages not based on the relational data model.)

One additional advantage of relations will be mentioned here. The relational model makes it possible to draw a clear distinction between data semantics and data structure. For example, the semantics of a data base may be such that when a department record is deleted, all employee records for that department should also be deleted. In the relational model such semantic rules can be stated independently of data-base structure. In some other data models (e.g., if employee records occur hierarchically under department records) this type of semantic rule is closely related to (and often constrained by) the data structure.

IMPLEMENTATIONS

The greatest open research question of the relational data model is whether it can be implemented to form an efficient and operationally complete data-base management system. Many individuals and groups have made contributions to this area of research; unfortunately, space limitations only permit mention of some of the major landmarks and large, ongoing projects in the implementation of relations. For references to many systems not discussed here, see the Implementations sections of the bibliography; special attention should be given to [S19], which is a transcript of a panel discussion among implementors of relational systems.

Since the earliest n-ary relational languages proposed were Codd's relational algebra and relational calculus, it is natural that much of the earliest implementation work was directed toward these languages. In [M3], Codd observed that the relational calculus has many advantages over the relational algebra from the end-user's point of view, but that the relational algebra provides a sequence of operations which can be more directly implemented on a machine. In [M3], Codd also provides an algorithm, called a "reduction algorithm," for translating a relational calculus expression into a sequence of operations in relational algebra. This approach was extended by Palermo [T6], who made certain improvements in the efficiency of the reduction algorithm and implemented the operators of the relational algebra using APL/360.

A number of early projects in relational data-base management adopted the approach of implementing the relational algebra directly. Perhaps the earliest of these was the MACAIMS system, developed by Goldstein and Strnad at MIT [S1]. The MACAIMS system, implemented on MULTICS, introduced the important concept of encoding each data item by a fixed-length identifier, and using these identifiers rather than the actual data items in stored relations. MACAIMS also made a contribution to the field of data independence by enabling different relations to be stored in different forms and converted to a canonical form, when necessary, for comparison. More



recent developments in the use of relational algebra at MIT are presented in [S12].

Another early algebra-oriented system is the Relational Data Management System (RDMS) of General Motors [S5]. RDMS is a display-oriented query system which implements not only the operators of the relational algebra, but also a number of other set-oriented operators such as SORT, GRAPH, and HISTOGRAM.

An ongoing project in the implementation of relational algebra is located at the IBM Scientific Centre in Peterlee, England. The Peterlee system was first called IS/1 and later renamed the Peterlee Relational Test Vehicle (PRTV) [S4, S16]. The system has been used in an environmental research study with a data base of ten million characters [A1], as well as by the Greater London Council in an urban planning application having a data base of 50 million characters [S19]. In addition to the usual algebraic operators (join, projection, etc.), PRTV provides an easy means to extend the system by adding new relational operators. A user may construct temporary relations by applying various operators either to stored relations or to existing temporary relations. The definition of a temporary relation is kept in the form of a tree of operators, and the actual tuples are not materialized until they are needed for output. An optimizer may rearrange the operators in the definition of a temporary relation, e.g., choosing to do restriction as early as possible and reordering as late as possible. PRTV allows different visible subsets of the data base, but does not permit simultaneous use of the system by more than one user.

The problem of optimizing the execution of relational systems has recently attracted a great deal of interest. Smith and Chang, based at the University of Utah [T18], have applied techniques of automatic programming to transform relational algebra expressions into equivalent but more efficient expressions. Gotlieb, working at the University of Toronto [T17], has published a study of various algorithms for implementing the join operator. Rothnie, of MIT and the Defense Department Computer Institute [T8, T20], has developed an algorithm for limiting the search space for queries in relational calculus which span more than one relation.

In addition to the work on high-level languages, such as the algebra and calculus, efforts have been made to develop a lower-level, procedural, relational interface for host-language systems, or to serve as an intermediate interface in implementing some other relational language. The first such interface to be implemented was the Relational Memory (RM) developed by IBM in Cambridge, Massachusetts [S2, S3]. RM permits variable-length byte strings (entities) to be stored and referenced by numeric identifiers. Binary relations whose data-elements are integers or entity-identifiers may then be constructed. RM provides efficient associative access to the binary relations: a hashing technique is used to locate a given "left-side" value, and all its associated "right-side" values are then accessed by means of a linked list. RM also provides a recovery capability for restoring the data base to an earlier state in the event of a failure.

In 1973 the RM system was extended to support n-ary relations; the resulting system was named XRM (Extended Relational Memory) [S6]. XRM uses the "entities" of RM to store n-ary tuples; it also uses RM binary relations as "inversions" which provide efficient associative access to these n-tuples. XRM maintains a "master relation" which describes the various relations and inversions in the system. A user may access a tuple associatively by its key-value (or the data-value in some inverted column), or may scan over a relation, retrieving all tuples which satisfy a given condition.

XRM was used as the underlying access method in a prototype system developed at IBM Research in San Jose, which implements the SEQUEL data sublanguage [S10, S11]. The SEQUEL system, which became operational in 1974, provides set-oriented facilities for: query, insertion, deletion, and update; dynamic creation and dropping of relations; and automatic enforcement of assertions about data integrity. These features are made available either as a stand-alone, display-oriented interface for casual users, or as a host-language interface that can be called from PL/I programs. The system contains an optimizer (described in



[S11]) which uses XRM inversions to limit the search space for a given query. The SEQUEL prototype has been extended by IBM at Cambridge and by the MIT Sloan School of Management to accommodate a multiple-user environment. The resulting system, called GMIS, is being used at MIT as an information system for modeling New England energy resources [A12, S19].

Another prototype system based on XRM is being developed at IBM Research in Yorktown Heights, to implement Query By Example. The system contains an optimizer which interprets Query By Example queries in terms of operations similar to those of the relational algebra (join, restriction, etc.). At present, the system supports only a single user and does not provide update facilities.

A large-scale prototype data-base management system, called System R, is presently under construction at IBM Research in San Jose [S20]. System R is the first attempt to apply the relational data model to an environment of many concurrent users and a high volume of requests. It will provide an operationally complete data-management capability, with facilities for authorization, logging and recovery, definition of alternative views, and enforcement of data consistency and integrity. System R will support the SEQUEL language as an external interface, as well as a set of procedural operators for host-language programming. Requests to the system will be executed by an optimizer which chooses among various physical access methods, including inversions maintained in the form of B-trees [T1], physical pointer-chains, and a sort-merge facility. A user is not constrained to protect himself against the updates of other concurrent users by explicit locking statements; the system automatically generates locks as needed at the level of individual tuples. Deadlocks are automatically detected and resolved. Some of the locking techniques developed as part of the System R project have been described in [C1, C4, C8]. System R is being implemented on an IBM 370, using a VM/370 operating system modified for the data-base environment [T13].

Another large-scale attempt at constructing a relational prototype is the INGRES (Interactive Graphics and Retrieval System), of the University of California at Berkeley [S7, S9, S15]. INGRES, which runs on a PDP-11/40 under the UNIX operating system, implements QUEL, a relationally complete query language based on the relational calculus. The INGRES system implements a variety of features by automatic modification of the QUEL statement submitted by the user. Alternative views are supported by substituting the view-definition into the user's statement [I4]. Authorization and integrity control are provided by adding extra predicates to the user's statement which limit its scope [C3]. Concurrent update requests are kept from interfering with each other by analyzing their respective scopes and allowing an update to proceed only when it is "safe" [I2]. Finally, the QUEL statement, which may contain many variables, is broken up by a "decomposition" algorithm into a series of one-variable statements which are executed one at a time. The physical data structures used by INGRES include hashed tables (including "order-preserving" hash functions which permit sequential scanning in key-value order) and "generalized directories," which employ a tree-structure to map a key into an address interval, and then use an order-preserving function to compute an address within the interval [S9].

Implementation of another relational system, called ZETA, is presently under way at the University of Toronto [S8, S14]. The ZETA system is constructed in three levels. The lowest level is a language called MINIZ, which provides such basic operations as scanning a relation and accumulating a list of identifiers of tuples which satisfy a given condition. The middle level implements views ("derived relations") and has an optimizer/interpreter which accepts queries spanning multiple relations. Three types of end-user interfaces are supported by ZETA:

• a host-language facility which provides features similar to SEQUEL;
• a query language generator system whereby a user may create his own self-contained query language using



a syntax-driven compiler/compiler; and
• a natural-language recognition system based on semantic networks. This system, called TORUS, is presently being tested on a data base of student records.

A second relational system, called OMEGA [T23], is also being implemented at Toronto on a PDP-11/45. Like ZETA, OMEGA has a multilevel architecture. One of the internal levels of OMEGA is the Link and Selector Language (LSL) [T12], an expression-oriented language which provides subsetting operations on a relation ("selectors") and connections from one relation to another ("links").

A recent, and very promising, development is the emergence of several designs for associative hardware to support relational data bases. One such proposal, called CASSM (Context Addressed Segment Sequential Memory), was made by Su, et al., at the University of Florida [H1, H2, H5]. CASSM is an array of processors, each having access to a circular memory space (e.g., a disk track or circulating magnetic bubble register). As data circulates in the memory, the processors search in parallel for data which satisfies a given condition. In [L12], Copeland and Su discuss implementation of a high-level, mapping-oriented relational language called SLICK, superimposed on CASSM.

A similar design, called RAP, which is also based on a cellular array of processors with circulating memories, was recently reported by Ozkarahan, et al., of the University of Toronto [H3].

A third associative hardware system for relational applications, called RARES (Rotating Associative Relational Store), has been proposed by Lin and Smith at the University of Utah [H4]. Like CASSM and RAP, RARES contains multiple rotating memory tracks with a read/write head per track. However, unlike CASSM and RAP, which store tuples in a linear fashion along a track, RARES lays each tuple across many tracks so that the entire tuple is read in one character-read-time. Each tuple, read from memory, is held in a pipeline while a decision is made as to whether the tuple satisfies the output criterion.

SUMMARY

This paper has discussed the terminology of the relational data model and traced its development in terms of normalization theory, language design, and implementation techniques. We have discussed the advantages of n-ary relations for data-base management, including simplicity, data independence, symmetry, and a strong theoretical foundation.

In order to be considered a true relational system, a data-base system must possess at least the following attributes:

1) All information is represented by data values. No essential information is contained in invisible connections among records.
2) At the user interface, no particular access path is "preferred" over any other.
3) The user interface is independent of the means by which data is physically stored.

In [M14], E. F. Codd summarized the areas in which further research is most needed in relational data-base management. The following areas were included:

1) Development of concurrency control techniques specifically geared to the relational model.
2) Measuring the performance attainable when the relational approach is applied to a large-scale data base.
3) Development of the theory whereby multiple alternative views of shared data may be supported for retrieval and update.
4) Demonstration of the viability of natural language query formulation subsystems.

ACKNOWLEDGMENTS

The author is indebted to E. F. Codd of IBM for his helpful comments during the preparation of this paper. The bibliography which follows is based on a bibliography compiled by E. F. Codd.

The author is also grateful to his colleagues at the IBM Research Laboratory in San Jose for their support and discussions.

CLASSIFICATION OF REFERENCES

Models and Theory
M  1) General
N  2) Normalization, Decomposition, and Synthesis
Z  3) Relationships between CODASYL DDL/DBTG and the Relational Model
L  Languages and Human Factors
Implementations
S  1) Software
H  2) Hardware
T  Implementation Technology
C  Authorization, Views, and Concurrency
I  Integrity Control
A  Applications
D  Deductive Inference and Approximate Reasoning
E  Natural Language Support
Y  Sets and Relations (prior to 1969)

Certain references include asterisks with the following meaning:
*  Proceedings of ACM-SIGFIDET and ACM-SIGMOD Workshops are obtainable from ACM Headquarters, 1133 Avenue of the Americas, New York, N.Y. 10036
** Proceedings of the 1975 ACM Pacific Conference, San Francisco, April 17-18, 1975 are obtainable from: Mail Room, Boole & Babbage, 850 Stewart Drive, Sunnyvale, California 94086

Models and Theory
1) General
[M1] CODD, E. F. "Derivability, redundancy and consistency of relations stored in large data banks," IBM Research Report RJ599, August 1969.
[M2] CODD, E. F. "A relational model of data for large shared data banks," Comm. ACM 13, 6 (June 1970), pp. 377-387.
[M3] CODD, E. F. "Relational completeness of data-base sublanguages," Courant Computer Science Symposia 6, "Data Base Systems," New York, May 1971, Prentice-Hall, Englewood Cliffs, N.J., 1971, pp. 65-98.
[M4] STRNAD, A. L. "The relational approach to the management of data bases," Proc. IFIP Congress, August 1971, Vol. 2, North-Holland Publ. Co., Amsterdam, The Netherlands, 1971, pp. 901-904.
[M5] DURCHHOLZ, R. "Das Datenmodell bei Codd," Technical Report No. 69, Gesellschaft für Mathematik und Datenverarbeitung, Bonn, W. Germany, July 1972.
[M6] HAWRYSZKIEWYCZ, I. T.; AND DENNIS, J. B. "An approach to proving the correctness of data-base operations," Proc. ACM-SIGFIDET Workshop on Data Description, Access, and Control, Nov.-Dec. 1972,* ACM, New York, 1972, pp. 323-348.
systems: a tutorial," Proc. Fourth Internatl. Symposium on Computer and Information Sciences, Dec. 1972, Plenum Press, New York, 1972.
[M8] CODD, E. F. "Understanding relations," continuing series of articles published in FDT, the quarterly bulletin of ACM-SIGMOD, beginning with Vol. 5, 1 (June 1973),* ACM, New York, 1973.
[M9] HAWRYSZKIEWYCZ, I. T. "Semantics of data base systems," MIT Project MAC Report MAC TR-112, Cambridge, Mass., Dec. 1973.
[M10] BRACCHI, G.; FEDELI, A.; AND PAOLINI, P. "A multi-level relational model for data-base management systems," Data Base Management, Proc. IFIP TC-2 Working Conf. on Data-Base Management Systems, April 1974, North-Holland Publ. Co., Amsterdam, The Netherlands, 1974.
[M11] STONEBRAKER, M. "A functional view of data independence," Proc. ACM-SIGFIDET Workshop on Data Description, Access, and Control, May 1974,* ACM, New York, 1974, pp. 63-81.
[M12] MELTZER, H. S. "Relations and relational operations," IBM Report to GUIDE 38 Information Systems Division, Dallas, Texas, May 1974.
[M13] HITCHCOCK, P. "Fundamental operations on relations in a relational data base," IBM Scientific Centre Report UKSC 0051, Peterlee, England, May 1974.
[M14] CODD, E. F. "Recent investigations in relational data base systems," Information Processing 74, Proc. IFIP Congress, August 1974, Vol. 5, North-Holland Publ. Co., Amsterdam, The Netherlands, 1974, pp. 1017-1021.
[M15] WEDEKIND, H. "Datenbanksysteme 1," Reihe Informatik/16 (1974), Bibliographisches Institut, Mannheim, W. Germany.
[M16] HALL, P. A. V.; TODD, S. J. P.; AND HITCHCOCK, P. "An algebra of relations for machine computation," IBM Scientific Centre Report UKSC 0066, Peterlee, England, Jan. 1975.
[M17] SCHMID, H. A.; AND SWENSON, J. R. "On the semantics of the relational data model," Proc. ACM-SIGMOD Conf., May 1975,* ACM, New York, 1975, pp. 211-223.

Models and Theory
2) Normalization, Decomposition, and Synthesis
[N1] CODD, E. F. "Further normalization of the data base relational model," Courant Computer Science Symposia 6, "Data Base Systems," New York, May 1971, Prentice-Hall, New York, 1971, pp. 33-64.
[N2] CODD, E. F. "Normalized data base structure: a brief tutorial," Proc. 1971 ACM-SIGFIDET Workshop on Data Description, Access, and Control, Nov. 1971,* ACM, New York, 1971, pp. 1-17.
[N3] HEATH, I. J. "Unacceptable file operations in a relational data base," Proc. 1971 ACM-SIGFIDET Workshop on Data Description, Access, and Control, Nov.
[M7] DATE, C. J. "Relational data base 1971, ACM, New York, 1971, pp. 19-33.

Computing Surveys, Vol. 8, No. 1, March 1976



[N4] DELOBEL, C. "Aspects theoretiques sur la structure de l'information dans une base de données," Revue Francaise d'Informatique et de Recherche Operationelle, B-3 (Sept. 1971).
[N5] DELOBEL, C. "A theory about data in an information system," IBM Research Report RJ964, San Jose, Calif., Jan. 1972.
[N6] RISSANEN, J.; AND DELOBEL, C. "Decomposition of files, a basis for data storage and retrieval," IBM Research Report RJ1220, San Jose, Calif., May 1973.
[N7] DELOBEL, C.; AND CASEY, R. G. "Decomposition of a data base and the theory of Boolean switching functions," IBM J. R. & D. 17, 5 (Sept. 1973), pp. 374-387.
[N8] KENT, W. "A primer of normal forms," IBM Technical Report TR 02.600, San Jose, Calif., Dec. 1973.
[N9] ARMSTRONG, W. W. "Dependency structures of data base relationships," Information Processing 74, Proc. IFIP Congress, August 1974, Vol. 3, North-Holland Publ. Co., Amsterdam, The Netherlands, 1974, pp. 580-584.
[N10] DELOBEL, C.; AND LEONARD, M. "The decomposition process in a relational model," Technical Report, Laboratoire d'Informatique, Univ. of Grenoble, France, Sept. 1974.
[N11] WANG, C. P.; AND WEDEKIND, H. "Segment synthesis in logical data base design," IBM J. R. & D. 19, 1 (Jan. 1975), pp. 71-77.
[N12] BERNSTEIN, P. A.; SWENSON, J. R.; AND TSICHRITZIS, D. "A unified approach to functional dependencies and relations," Proc. ACM-SIGMOD Conf., May 1975,* ACM, New York, 1975, pp. 237-245.
[N13] FADOUS, R. Y.; AND FORSYTH, J. "Finding candidate keys for relational data bases," Proc. ACM-SIGMOD Conf., May 1975,* ACM, New York, 1975, pp. 203-210.
[N14] FADOUS, R. Y. "Mathematical foundations for relational data bases," PhD. Thesis, Michigan State Univ., Lansing, 1975.
[N15] SHARMAN, G. C. H. "A new model of relational data base and high level languages," Technical Report TR. 12.136, IBM Hursley Park Laboratory, England, Feb. 1975.

Models and Theory
3) Relationships between CODASYL DDL/DBTG and Relational Model
[Z1] CODASYL Data Base Task Group Report, April 1971, ACM, New York.
[Z2] CANNING, R. G. "Problem areas in data management," EDP Analyzer 12, 3 (March 1974).
[Z3] EARNEST, C. P. "A comparison of the network and relational data structure models," Technical Report, Computer Sciences Corp., El Segundo, Calif., April 1974.
[Z4] NIJSSEN, G. M. "Data structuring in DDL and relational data model," Proc. IFIP TC-2 Working Conf. on Data Base Management Systems, April 1974, North-Holland Publ. Co., Amsterdam, The Netherlands, 1974.
[Z5] CODD, E. F.; AND DATE, C. J. "Interactive support for non-programmers: the relational and network approaches," Proc. 1974 ACM-SIGMOD Debate "Data Models: Data Structure Set versus Relational," May 1974,* ACM, New York, 1974.
[Z6] DATE, C. J.; AND CODD, E. F. "The relational and network approaches: comparison of the application programming interfaces," Proc. 1974 ACM-SIGMOD Debate "Data Models: Data Structure Set versus Relational," May 1974,* ACM, New York, 1974.
[Z7] BACHMAN, C. W. "The data structure set model," Proc. 1974 ACM-SIGMOD Debate "Data Models: Data Structure Set versus Relational," May 1974,* ACM, New York, 1974.
[Z8] SIBLEY, E. H. "On the equivalences of data based systems," Proc. ACM-SIGMOD Debate "Data Models: Data Structure Set versus Relational," May 1974,* ACM, New York, 1974.
[Z9] EVEREST, G. C. "The futures of data-base management," Proc. ACM-SIGMOD Workshop on Data Description, Access, and Control, May 1974, ACM, New York, 1974, pp. 445-462.
[Z10] OLLE, T. W. "Current and future trends in data base management systems," Information Processing 74, Proc. IFIP Congress, August 1974, Vol. 5, North-Holland Publ. Co., Amsterdam, The Netherlands, 1974, pp. 998-1006.
[Z11] DATE, C. J. "An introduction to data base systems," Addison-Wesley, Reading, Mass., 1975.
[Z12] KAY, M. H. "An assessment of the CODASYL DDL for use with a relational schema," Data Base Description, B. C. M. Douque and G. M. Nijssen (Eds.), North-Holland Publ. Co., Amsterdam, The Netherlands, 1975, pp. 199-214.
[Z13] ROBINSON, K. A. "An analysis of the uses of the CODASYL set concept," Data Base Description, B. C. M. Douque and G. M. Nijssen (Eds.), North-Holland Publ. Co., Amsterdam, The Netherlands, 1975, pp. 169-182.
[Z14] TAYLOR, R. W. "Observations on the attributes of database sets," Data Base Description, B. C. M. Douque and G. M. Nijssen (Eds.), North-Holland Publ. Co., Amsterdam, The Netherlands, 1975, pp. 73-84.
[Z15] OLLE, T. W. "An analysis of shortcomings in the schema DDL with an outline of proposed improvements," Data Base Description, B. C. M. Douque and G. M. Nijssen (Eds.), North-Holland Publ. Co., Amsterdam, The Netherlands, 1975, pp. 283-298.
[Z16] HUITS, M. "Requirements for languages in data-base systems," Data Base Description, B. C. M. Douque and G. M. Nijssen (Eds.), North-Holland Publ. Co., Amsterdam, The Netherlands, 1975, pp. 85-110.




[Z17] ROBINSON, K. A. "Data base--the ideas behind the ideas," Computer J. 18, 1 (Jan. 1975), pp. 7-12.
[Z18] HELD, G.; AND STONEBRAKER, M. "Networks, hierarchies, and relations in data base management systems," Proc. ACM Pacific 75 Regional Conf., April 1975, ACM, New York, 1975, pp. 1-9.
[Z19] MARTIN, J. T. "Computer data-base organization," Prentice-Hall, Englewood Cliffs, N.J., 1975.

Languages and Human Factors
[L1] CODD, E. F. "A data base sublanguage founded on the relational calculus," Proc. 1971 ACM-SIGFIDET Workshop on Data Description, Access, and Control, Nov. 1971,* ACM, New York, 1971, pp. 35-68.
[L2] CODD, E. F. "Relational algebra," Courant Computer Science Symposia 6, "Data Base Systems," New York, May 1971, Prentice-Hall, New York, 1971.
[L3] BRACCHI, G.; FEDELI, A.; AND PAOLINI, P. "A language for a relational data base," Sixth Annual Princeton Conf. on Information Sciences and Systems, March 1972, Princeton Univ., N.J., 1972.
[L4] BJORNER, D.; CODD, E. F.; DECKERT, K. L.; AND TRAIGER, I. L. "The GAMMA ZERO n-ary relational data base interface: specifications of objects and operations," IBM Research Report RJ1200, San Jose, Calif., April 1973.
[L5] EARLEY, J. "Relational level data structures for programming languages," Acta Informatica 2, 4 (1973), pp. 293-309.
[L6] DEE, E.; HILDER, W.; KING, P. J. H.; AND TAYLOR, E. "COBOL extensions to handle a relational data base," British Computer Society, Working Party #5, Oct. 1973.
[L7] FEHDER, P. L. "The representation-independent language," IBM Research Reports RJ1121 & RJ1251, San Jose, Calif., Nov. 1972 & July 1973 respectively.
[L8] BOYCE, R. F.; AND CHAMBERLIN, D. D. "Using a structured English query language as a data definition facility," IBM Research Report RJ1318, San Jose, Calif., Dec. 1973.
[L9] BOYCE, R. F.; CHAMBERLIN, D. D.; KING, W. F., III; AND HAMMER, M. M. "Specifying queries as relational expressions: SQUARE," Data Base Management, Proc. IFIP Working Conf., April 1974, North-Holland Publ. Co., Amsterdam, The Netherlands, 1974, pp. 169-177.
[L10] CHAMBERLIN, D. D.; AND BOYCE, R. F. "SEQUEL: A structured English query language," Proc. ACM-SIGMOD Workshop on Data Description, Access, and Control, May 1974,* ACM, New York, 1974, pp. 249-264.
[L11] JERVIS, B. "Query languages for relational data-base management systems," Masters Thesis, Univ. of British Columbia, Vancouver, B.C., May 1974.
[L12] COPELAND, G. P.; AND SU, S. Y. W. "A high level data sublanguage for a context-addressed segment-sequential memory," Proc. ACM-SIGMOD Workshop on Data Description, Access, and Control, May 1974,* ACM, New York, 1974, pp. 265-276.
[L13] ZLOOF, M. M. "Query by example," Research Report RC4917, IBM T. J. Watson Research Center, Yorktown Heights, N.Y., July 1974.
[L14] HALSTEAD, M. H. "Software physics comparison of a sample program in DSL ALPHA and COBOL," IBM Research Report RJ1460, San Jose, Calif., Oct. 1974.
[L15] PIROTTE, A.; AND WODON, P. "A comprehensive formal query language for a relational data base: FQL," Technical Report R283, M.B.L.E. Laboratoire de Recherches, Brussels, Belgium, Dec. 1974.
[L16] SUMMERS, R. C.; COLEMAN, C. D.; AND FERNANDEZ, E. B. "A programming language extension for access to a shared data base," Proc. ACM Pacific 75 Regional Conf., April 1975,** ACM, New York, 1975, pp. 114-118.
[L17] McDONALD, N.; AND STONEBRAKER, M. "CUPID: the friendly query language," Proc. ACM Pacific 75 Regional Conf., April 1975,** ACM, New York, 1975, pp. 127-131.
[L18] WESTGAARD, R. E. "A COBOL data base facility for the relational data model," Proc. ACM Pacific 75 Regional Conf., April 1975,** ACM, New York, 1975, pp. 132-139.
[L19] SHU, N. C.; HOUSEL, B. C.; AND LUM, V. Y. "CONVERT: a high level translation definition language for data conversion," Proc. ACM-SIGMOD Conf., May 1975,* ACM, New York, 1975, p. 3; Comm. ACM 18, 10 (Oct. 1975), pp. 557-567.
[L20] REISNER, P.; BOYCE, R. F.; AND CHAMBERLIN, D. D. "Human factors evaluation of two data base query languages: SQUARE and SEQUEL," Proc. AFIPS National Computer Conf., May 1975, Vol. 44, AFIPS Press, Montvale, N.J., 1975, pp. 447-452.
[L21] ZLOOF, M. M. "Query by Example," Proc. AFIPS National Computer Conf., May 1975, Vol. 44, AFIPS Press, Montvale, N.J., 1975, pp. 431-438.
[L22] THOMAS, J. C.; AND GOULD, J. D. "A psychological study of Query by Example," Proc. AFIPS National Computer Conf., May 1975, Vol. 44, AFIPS Press, Montvale, N.J., pp. 439-445.
[L23] HALSTEAD, M. H. "Software physics: basic principles," Research Report RJ1582, IBM Research Laboratory, San Jose, Calif., May 1975.
[L24] ZLOOF, M. M. "Query by Example: the invocation and definition of tables and forms," Proc. Internatl. Conf. on Very Large Data Bases, Sept. 1975, ACM, New York, 1975, pp. 1-24.
[L25] BOYCE, R. F.; CHAMBERLIN, D. D.; KING, W. F.; AND HAMMER, M. M. "Specifying queries as relational expressions: the SQUARE data sublanguage," Comm. ACM 18, 11 (Nov. 1975), pp. 621-628.




Implementations
1) Software
[S1] GOLDSTEIN, R. C.; AND STRNAD, A. L. "The MACAIMS data management system," Proc. 1970 ACM-SIGFIDET Workshop on Data Description and Access, Nov. 1970,* ACM, New York, 1970, pp. 201-229.
[S2] SYMONDS, A. J.; AND LORIE, R. A. "A schema for describing a relational data base," Proc. ACM-SIGFIDET Workshop on Data Description and Access, Nov. 1970,* ACM, New York, 1970, pp. 230-245.
[S3] LORIE, R. A.; AND SYMONDS, A. J. "A relational access method for interactive applications," Courant Computer Science Symposia 6, "Data Base Systems," Prentice-Hall, New York, 1971, pp. 99-124.
[S4] NOTLEY, M. G. "The Peterlee IS/1 system," IBM UK Scientific Centre Report UKSC-0018, March 1972.
[S5] WHITNEY, V. K. M. "RDMS: a relational data management system," Proc. Fourth Internatl. Symposium on Computer and Information Sciences (COINS IV), Dec. 1972, Plenum Press, New York, 1972.
[S6] LORIE, R. A. "XRM--an extended (n-ary) relational memory," IBM Scientific Center Report G320-2096, Cambridge, Mass., Jan. 1974.
[S7] McDONALD, N.; STONEBRAKER, M.; AND WONG, E. "Preliminary design of INGRES: Part I," Electronics Research Lab. Report ERL-M435, Univ. of California, Berkeley, April 1974.
[S8] CZARNIK, B.; SCHUSTER, S.; AND TSICHRITZIS, D. "ZETA: a relational data base management system," Proc. ACM Pacific 75 Regional Conf., April 1975,** ACM, New York, 1975, pp. 21-25.
[S9] HELD, G.; AND STONEBRAKER, M. "Storage structures and access methods in the relational data base management system INGRES," Proc. ACM Pacific 75 Regional Conf., April 1975,** ACM, New York, 1975, pp. 26-33.
[S10] ASTRAHAN, M. M.; AND LORIE, R. A. "SEQUEL-XRM: a relational system," Proc. ACM Pacific 75 Regional Conf., April 1975,** ACM, New York, 1975, pp. 34-38.
[S11] ASTRAHAN, M. M.; AND CHAMBERLIN, D. D. "Implementation of a structured English query language," Comm. ACM 18, 10 (Oct. 1975), pp. 580-588.
[S12] STEUERT, J.; AND GOLDMAN, J. "The relational data management system: a perspective," Proc. ACM-SIGMOD Workshop on Data Description, Access, and Control, May 1974,* ACM, New York, 1974, pp. 295-320.
[S13] McLEOD, D. J.; AND MELDMAN, M. J. "RISS: a generalized minicomputer relational data base management system," Proc. AFIPS National Computer Conf., May 1975, Vol. 44, AFIPS Press, Montvale, N.J., 1975, pp. 397-402.
[S14] MYLOPOULOS, J.; SCHUSTER, S. A.; AND TSICHRITZIS, D. "A multi-level relational system," Proc. AFIPS National Computer Conf., May 1975, Vol. 44, AFIPS Press, Montvale, N.J., 1975, pp. 403-408.
[S15] HELD, G. D.; STONEBRAKER, M. R.; AND WONG, E. "INGRES: a relational data base system," Proc. AFIPS National Computer Conf., May 1975, Vol. 44, AFIPS Press, Montvale, N.J., 1975, pp. 409-416.
[S16] TODD, S. J. P. "Peterlee relational test vehicle PRTV, a technical overview," IBM Scientific Centre Report UKSC 0075, Peterlee, England, July 1975.
[S17] WINSLOW, L. E. "An efficient implementation of Codd's relational model data base," Proc. COMPCON 75, 11th Annual IEEE Computer Society Conf., Sept. 1975, IEEE, New York, 1975.
[S18] MANACHER, G. K. "On the feasibility of implementing a large relational data base with optimal performance on a mini-computer," Proc. Internatl. Conf. on Very Large Data Bases, Sept. 1975, ACM, New York, 1975, pp. 175-201.
[S19] CODD, E. F. (Ed.), "Implementation of relational data base management systems," FDT, Quarterly Bulletin of ACM-SIGMOD 7, 3-4 (1975).
[S20] ASTRAHAN, M. M. et al., "System R: a relational approach to data-base management," Research Report RJ1738, IBM Research Laboratory, San Jose, Calif., Feb. 1976.

Implementations
2) Hardware
[H1] SU, S. Y. W.; COPELAND, G. P.; AND LIPOVSKI, G. J. "Retrieval operations and data representations in a context-addressed disk system," Proc. ACM-SIGPLAN-SIGIR Interface Meeting on Programming Languages and Information Retrieval, Nov. 1973, ACM, New York, 1973.
[H2] COPELAND, G. P.; LIPOVSKI, G. J.; AND SU, S. Y. W. "The architecture of CASSM: a cellular system for non-numeric processing," Proc. First Annual Symposium on Computer Architecture, Dec. 1973, IEEE, N.Y., 1973.
[H3] OZKARAHAN, E. A.; SCHUSTER, S. A.; AND SMITH, K. C. "RAP: an associative processor for data base management," Proc. AFIPS National Computer Conf., May 1975, Vol. 44, AFIPS Press, Montvale, N.J., 1975, pp. 379-387.
[H4] LIN, C. S.; AND SMITH, D. C. P. "The design of a rotating associative array memory for a relational data-base management application," Proc. Internatl. Conf. on Very Large Data Bases, Sept. 1975, ACM, New York, 1975, pp. 453-454.
[H5] SU, S. Y. W.; AND LIPOVSKI, G. J. "CASSM: a cellular system for very large data bases," Proc. Internatl. Conf. on Very Large Data Bases, Sept. 1975, ACM, New York, 1975, pp. 456-472.

Implementation Technology
[T1] BAYER, R.; AND McCREIGHT, E. "Organization and maintenance of large ordered indices," Proc. ACM-SIGFIDET Workshop, Nov. 1970, ACM, New York, 1970, pp. 107-141.
[T2] "IBM system/360 operating system data management services," IBM Publication No. GC26-3746, 1971.
[T3] DATE, C. J.; AND HOPEWELL, P. "File definition and logical data independence," Proc. 1971 ACM-SIGFIDET Workshop on Data Description, Access, and Control, Nov. 1971,* ACM, New York, 1971, pp. 117-138.
[T4] DATE, C. J.; AND HOPEWELL, P. "Storage structure and physical data independence," Proc. 1971 ACM-SIGFIDET Workshop on Data Description, Access, and Control, Nov. 1971,* ACM, New York, 1971, pp. 139-168.
[T5] ROTHNIE, J. B. "The design of generalized data management systems," PhD Thesis, MIT, Cambridge, Mass., Sept. 1972.
[T6] PALERMO, F. P. "A data base search problem," Fourth Internatl. Symposium on Computer and Information Science (COINS IV), Dec. 1972, Plenum Press, New York, 1972.
[T7] HALL, P. A. V.; AND TODD, S. J. P. "Factorisations of algebraic expressions," IBM Scientific Centre Report UKSC 0055, Peterlee, England, April 1974.
[T8] ROTHNIE, J. B. "An approach to implementing a relational data management system," Proc. ACM-SIGMOD Workshop on Data Description, Access, and Control, May 1974,* ACM, New York, 1974, pp. 277-294.
[T9] WHITNEY, V. K. M. "Relational data management implementation techniques," Proc. ACM-SIGFIDET Workshop on Data Description, Access, and Control, May 1974,* ACM, New York, 1974, pp. 321-350.
[T10] CASEY, R. G.; AND OSMAN, I. "Generalized page replacement algorithms in a relational data base," Proc. ACM-SIGFIDET Workshop on Data Description, Access, and Control, May 1974,* ACM, New York, 1974, pp. 101-124.
[T11] HALL, P. A. V. "Common sub-expression identification in general algebraic systems," IBM Scientific Centre Report UKSC 0060, Peterlee, England, Nov. 1974.
[T12] TSICHRITZIS, D. "A network framework for relation implementation," Proc. IFIP TC-2 Special Working Conf. on the DDL, Jan. 1975, published as Data Base Description, North-Holland Publ. Co., Amsterdam, The Netherlands, 1975, pp. 269-282.
[T13] GRAY, J. N.; AND WATSON, V. "A shared segment and inter-process communication facility for VM/370," Research Report RJ1579, IBM Research Laboratory, San Jose, Calif., Feb. 1975.
[T14] CHINA, Y. "A data base search algorithm based on complicated retrieval conditions," The Soken Kiyo 5, 1 (1975), pp. 159-175.
[T15] FARLEY, J. H. GILLES; AND SCHUSTER, S. A. "Query execution and index selection for relational data bases," Technical Report CSRG-53, Computer Systems Research Group, Univ. of Toronto, Toronto, Ont., Canada, March 1975.
[T16] PECHERER, R. M. "Efficient evaluation of expressions in a relational algebra," Proc. ACM Pacific 75 Regional Conf., April 1975,** ACM, New York, 1975, pp. 44-49.
[T17] GOTLIEB, L. R. "Computing joins of relations," Proc. ACM-SIGMOD Conf., May 1975,* ACM, New York, 1975.
[T18] SMITH, J. M.; AND CHANG, P. "Optimizing the performance of a relational algebra data base interface," Comm. ACM 18, 10 (Oct. 1975), pp. 568-579.
[T19] SCHKOLNICK, M. "Secondary index optimization," Proc. ACM-SIGMOD Conf., May 1975,* ACM, New York, 1975, pp. 186-192.
[T20] ROTHNIE, J. B. "Evaluating inter-entry retrieval expressions in a relational data base management system," Proc. AFIPS National Computer Conf., May 1975, Vol. 44, AFIPS Press, Montvale, N.J., 1975, pp. 417-423.
[T21] PALERMO, F. P. "An APL environment for testing relational operators and data base search algorithms," Proc. APL 75 Conf., June 1975, ACM, New York, 1975, pp. 249-256.
[T22] HALL, P. A. V. "Optimisation of a single relational expression in a relational data base system," IBM Scientific Centre Report UKSC 0076, Peterlee, England, July 1975.
[T23] SCHMID, H. A.; AND BERNSTEIN, P. A. "A multi-level architecture for relational data base systems," Proc. Internatl. Conf. on Very Large Data Bases, Sept. 1975, ACM, New York, 1975, pp. 202-226.
[T24] LIEN, Y. E.; TAYLOR, C. E.; REYNOLDS, M. L.; AND DRISCOLL, J. R. "Binary search tree complex--a realization of a relational database management system," Proc. Internatl. Conf. on Very Large Data Bases, Sept. 1975, ACM, New York, 1975, pp. 540-542.

Authorization, Views, and Concurrency
[C1] CHAMBERLIN, D. D.; BOYCE, R. F.; AND TRAIGER, I. L. "A deadlock-free scheme for resource locking in a data base environment," Information Processing 74, Proc. IFIP Congress, August 1974, North-Holland Publ. Co., Amsterdam, The Netherlands, 1974, pp. 340-343.
[C2] OWENS, R. C. "Evaluation of access authorization characteristics of derived data sets," Proc. ACM-SIGFIDET Workshop on Data Description, Access, and Control, Nov. 1971,* ACM, New York, 1971, pp. 263-278.
[C3] STONEBRAKER, M.; AND WONG, E. "Access control in a relational data base management system by query modification," Electronics Research Lab.,



Report ERL-M438, Univ. of Calif., Berkeley, May 1974.
[C4] ESWARAN, K. P.; GRAY, J. N.; LORIE, R. A.; AND TRAIGER, I. L. "On the notions of consistency and predicate locks in a data base system," IBM Research Report RJ1487, San Jose, Calif., Dec. 1974.
[C5] FERNANDEZ, E. B.; SUMMERS, R. C.; AND COLEMAN, C. D. "An authorization model for a shared data base," Proc. ACM-SIGMOD Conf., May 1975,* ACM, New York, 1975, pp. 23-31.
[C6] CHAMBERLIN, D. D.; GRAY, J. N.; AND TRAIGER, I. L. "Views, authorization, and locking in a relational data base system," Proc. AFIPS National Computer Conf., May 1975, Vol. 44, AFIPS Press, Montvale, N.J., 1975, pp. 425-430.
[C7] WEE, R. S. S. "Problems in the dynamic sharing of data in a relational data base environment," IBM Scientific Centre Report UKSC 0067, Peterlee, England, August 1975.
[C8] GRAY, J. N.; LORIE, R. A.; AND PUTZOLU, G. R. "Granularity of locks in a large shared data base," Proc. Internatl. Conf. on Very Large Data Bases, Sept. 1975, ACM, New York, 1975, pp. 428-451.

Integrity Control
[I1] FLORENTIN, J. J. "Consistency auditing of data bases," Computer J. 17, 1 (Feb. 1974), pp. 52-58.
[I2] STONEBRAKER, M. "High level integrity assurance in relational data base management systems," Electronics Research Lab. Report ERL-M473, Univ. of Calif. at Berkeley, August 1974.
[I3] GRAVES, R. W. "Integrity control in a relational data description language," Proc. ACM Pacific 75 Regional Conf., April 1975,** ACM, New York, 1975, pp. 108-113.
[I4] STONEBRAKER, M. "Implementation of integrity constraints and views by query modification," Proc. ACM-SIGMOD Conf., May 1975,* ACM, New York, 1975, pp. 65-78.
[I5] ESWARAN, K. P.; AND CHAMBERLIN, D. D. "Functional specifications of a subsystem for data base integrity," Proc. Internatl. Conf. on Very Large Data Bases, Sept. 1975, ACM, New York, 1975, pp. 48-68.
[I6] HAMMER, M. M.; AND McLEOD, D. J. "Semantic integrity in a relational data base system," Proc. Internatl. Conf. on Very Large Data Bases, Sept. 1975, ACM, New York, 1975, pp. 25-47.

Applications
[A1] SOOP, K.; SVENSSON, P.; AND WIKTORIN, L. "An experiment with a relational data base system in environmental research," Proc. Fourth Internatl. Symposium on Computer and Information Sciences (COINS IV), Dec. 1972, Plenum Press, New York, 1972.
[A2] KUNII, T. L.; AMANO, T.; ARISAWA, H.; AND OKADA, S. "An interactive fashion design system INFADS," abstract in Proc. Conf. on Computer Graphics & Interactive Techniques, July 1974; paper in Computers & Graphics 1 (1975), Pergamon Press, New York.
[A3] WILLIAMS, R. "On the application of relational data structures in computer graphics," Information Processing 74, Proc. IFIP Congress, August 1974, Vol. 4, North-Holland Publ. Co., Amsterdam, The Netherlands, pp. 722-726.
[A4] KUNII, T. L.; WEYL, S.; AND TENENBAUM, J. M. "A relational data base schema for describing complex pictures with color and texture," Proc. Second Jt. Conf. on Pattern Recognition, August 1974, IEEE Cat. No. 74CH0885-4C, IEEE, New York, 1974.
[A5] YALLE, G. "Interactive handling of data base relations: experiments with the relational approach," Technical Report, Univ. of Bologna, Bologna, Italy, March 1975.
[A6] DEJONG, S. P.; AND ZLOOF, M. M. "The system for business automation (SBA): programming language," IBM Research Report RC5302, Yorktown Heights, N.Y., March 1975.
[A7] DEJONG, S. P.; AND ZLOOF, M. M. "Application design within the system for business automation," IBM Research Report RC5366, Yorktown Heights, N.Y., April 1975.
[A8] NAVATHE, S. B.; AND MERTEN, A. G. "Investigations into the application of the relational model to data translation," Proc. ACM-SIGMOD Conf., May 1975,* ACM, New York, 1975, pp. 123-138.
[A9] BANDURSKI, A. E.; AND JEFFERSON, D. K. "Data description for computer-aided design," Proc. ACM-SIGMOD Conf., May 1975,* ACM, New York, 1975, pp. 193-202.
[A10] WILLIAMS, R.; AND GIDDINGS, G. M. "A picture building system," Proc. Conf. on Computer Graphics, Pattern Recognition, & Data Structure, May 1975, IEEE Cat. No. 75CH0981-1C, IEEE, New York, 1975.
[A11] GO, A.; STONEBRAKER, M.; AND WILLIAMS, C. "An approach to implementing a geo-data system," Proc. ACM-SIGDA-SIGMOD-SIGGRAPH Workshop on Data Bases for Interactive Design, Sept. 1975, ACM, New York, 1975, pp. 67-77.
[A12] DONOVAN, J.; FESSEL, R.; GREENBERG, S.; AND GUTENTAG, L. "An experimental VM/370 based information system," Proc. Internatl. Conf. on Very Large Data Bases, Sept. 1975, ACM, New York, 1975, pp. 549-553.

Deductive Inference and Approximate Reasoning
The references in this Section represent a small sample of the publications in deductive inference. Many additional references will be found in [D1].



[D1] CHANG, C. L.; AND LEE, R. C. T. Symbolic logic and mechanical theorem proving, Academic Press, New York, 1973.
[D2] MINKER, J. "Performing inferences over relational data bases," Proc. ACM-SIGMOD Conf., May 1975,* ACM, New York, 1975, pp. 79-91.
[D3] ZADEH, L. A. "Calculus of fuzzy restrictions," Report ERL-M562, Electronics Research Lab., Univ. of Calif., Berkeley, Calif., Feb. 1975.

Natural Language Support
[E1] SIMMONS, R. F. "Natural language question-answering systems: 1969," Comm. ACM 13, 1 (Jan. 1970), pp. 15-30.
[E2] SCHANK, R. C.; AND COLBY, K. M. (Eds.), Computer models of thought and language, W. H. Freeman, San Francisco, 1973.
[E3] RUSTIN, RANDALL (Ed.), "Natural language processing," Courant Computer Science Symposia 8, New York, Dec. 1971, Prentice-Hall, Englewood Cliffs, N.J., 1971.
[E4] THOMPSON, F. P.; LOCKEMANN, P. C.; DOSTERT, B. H.; AND DEVERILL, R. "REL: a rapidly extensible language system," Proc. 24th ACM National Conf., New York, 1969, ACM, New York, 1969, pp. 399-417.
[E5] KELLOGG, C. H.; BURGER, J.; DILLER, T.; AND FOGT, K. "The CONVERSE natural language data management system: current status and plans," Proc. ACM Symposium on Information Storage and Retrieval, 1971, ACM, New York, 1971, pp. 33-46.
[E6] WINOGRAD, T. "Procedures as a representation for data in a computer program for understanding natural language," MIT Project MAC Report MAC TR-84, Cambridge, Mass., 1971.
[E7] MONTGOMERY, C. A. "Is natural language an unnatural query language?" Proc. ACM National Conf., New York, 1972, ACM, New York, 1972, pp. 1075-1078.
[E8] PETRICK, S. R. "Semantic interpretation in the REQUEST system," IBM Research Report RC4457, IBM Research Center, Yorktown Heights, New York, July 1973.
[E9] CODD, E. F. "Seven steps to RENDEZVOUS with the casual user," Proc. IFIP TC-2 Working Conf. on Data Base Management Systems, April 1974, North-Holland Publ. Co., Amsterdam, The Netherlands, 1974.

Sets and Relations (prior to 1969)
These references are included to enable the reader to trace work published prior to 1969 on computer support for (mathematical) sets and relations.
[Y1] CODASYL Development Committee. "An information algebra," Comm. ACM 5, 4 (April 1962), pp. 190-204.
[Y2] LEVIEN, R. E.; AND MARON, M. E. "A computer system for inference execution and data retrieval," Comm. ACM 10, 11 (Nov. 1967), pp. 715-721.
[Y3] CHILDS, D. L. "Feasibility of a set-theoretical data structure--a general structure based on a reconstituted definition of relation," Proc. IFIP Congress 1968, North-Holland Publ. Co., Amsterdam, The Netherlands, 1968, pp. 162-172.
[Y4] CHILDS, D. L. "Description of a set-theoretic structure," Proc. AFIPS 1968 Fall Jt. Computer Conf., Vol. 33, AFIPS Press, Montvale, N.J., 1968, pp. 557-564.
[Y5] ASH, W. L.; AND SIBLEY, E. H. "TRAMP: an interpretive associative processor with deductive capabilities," Proc. ACM 23rd National Conf., August 1968, Brandon/Systems Press, Princeton, N.J., 1968, pp. 143-156.
[Y6] FELDMAN, J. A.; AND ROVNER, P. D. "An ALGOL-based associative language," Comm. ACM 12, 8 (August 1969), pp. 439-449.
[Y7] KUHNS, J. L. "Logical aspects of question answering by computer," ..., Dec. 1969, Academic Press, New York, 1969.
[Y8] KOCHEN, M. "Adaptive mechanisms in digital concept processing," in Discrete Adaptive Processes--Symposium and Panel Discussion, AIEE, New York, 1962, pp. 50-58.

