DBMS-1
DONALD D. CHAMBERLIN
IBM Research Laboratory, San Jose, California 95193
The essential concepts of the relational data model are defined, and normalization,
relational languages based on the model, as well as advantages and
implementations of relational systems are discussed.
Keywords and Phrases: Data base, data-base management, data independence,
data model, relational systems
CR Categories: 8.5I, ~.3~,~.~
Copyright © 1976, Association for Computing Machinery, Inc. General permission to republish,
but not for profit, all or part of this material is granted provided that ACM's copyright notice is
given and that reference is made to the publication, to its date of issue, and to the fact that reprinting
privileges were granted by permission of the Association for Computing Machinery.
An excellent introduction to relational concepts can also be found in Date's recent textbook [Z11].

In mathematics, the term relation may be defined as follows: Given sets D1, D2, ..., Dn (not necessarily distinct), a relation R is a set of n-tuples each of which has its first element from D1, second element from D2, etc. The sets Di are called domains. The number n is called the degree of R, and the number of tuples in R is called its cardinality.

It is customary (though not essential) when discussing relations to represent a relation as a table in which each row represents a tuple. An example of this representation is shown in Figure 1, which illustrates a relation describing Presidential elections. In the tabular representation of a relation, the following properties, which derive from the definition of a relation, should be observed:

1) no two rows are identical;
2) the ordering of rows is not significant; and
3) the ordering of columns is significant (i.e., the meanings of the tuples (1972, Nixon, McGovern) and (1972, McGovern, Nixon) are quite different).

When a relation is represented as a table, its degree is the number of columns and its cardinality is the number of rows.

In the tabular representation of a relation, it is customary to name the table and to name each column, as shown in Figure 1. The columns of the table are called attributes. (Sometimes the name of a column is referred to as a role name.) It is important to distinguish between attributes and domains. For example, in the ELECTIONS relation of Figure 1, the second and third columns are both based on the same domain: the set of names of Presidential candidates. However, each column has a different role-name to describe its meaning in this particular relation: WINNER-NAME and LOSER-NAME.

The individual entries in each tuple are called its components. Thus, we may say that in the tuple whose YEAR-component is "1952," the LOSER-NAME-component is "Stevenson."

A column or set of columns whose values uniquely identify a row of a relation is called a candidate key (often shortened to simply key) of the relation. In Figure 1, YEAR is a key for ELECTIONS since no two rows have the same YEAR. It is possible for a relation to have more than one key. For example, if the ELECTIONS relation had an additional column ADMINISTRATION-NUMBER, it would also be a key. When a relation has more than one key, it is customary to designate one as the primary key.

Often a column or set of columns in one relation will correspond to a key of another relation. For example, consider the PRESIDENTS relation of Figure 2, whose key is NAME. The values of WINNER-NAME in the ELECTIONS relation correspond to values of the key-column NAME in PRESIDENTS. Consequently, WINNER-NAME in ELECTIONS is called a foreign key. Two facts should be noted: 1) a foreign key need not be (and often is not) a key of its own relation; and 2) the foreign key need not have the same role-name (e.g., WINNER-NAME) as the corresponding key in the other relation (e.g., NAME).

In an integrated data-base management system, different users may have a need to
see different subsets of the universe of data. The term data model denotes the universe of data--the complete set of relations stored in the system. A schema is a set of declarations which describe the data model. The term data submodel denotes the set of relations which is available to a particular user, and a subschema is a set of declarations for the data submodel. A complete data-management system must provide a means for defining the schema and a subschema for each distinct class of users of the system.

NORMALIZATION

The issue of designing a schema and subschemas for a data base leads us to a discussion of normalization. The concept of normalization was introduced by Codd in [M2] and dealt with more rigorously in his later papers [N1] and [N2]. A number of other authors have also made contributions to the theory of normalization (see bibliography).

Normalization theory begins with the observation that certain collections of relations have better properties in an updating environment than do other collections of relations containing the same data. The theory then provides a rigorous discipline for the design of relations which have favorable update properties. The theory is based on a series of normal forms--first, second, and third normal form--which provide successive improvements in the update properties of a data base. We will discuss these normal forms on an intuitive basis; for a thorough treatment, see [N1], [N8], or [Z11].

Almost all references to relations implicitly deal with relations in first normal form. A relation in first normal form is a relation in which each component of each tuple is nondecomposable; i.e., the component is not a list or a relation. Relations in first normal form are sometimes called "flat tables". If we look carefully at the relation in Figure 1, we see that it is not in first normal form. This is because an election, while it has only one winner, may have several losing candidates. Thus, for example, the tuple for the election of 1968 contains the component {"Humphrey", "Wallace"}. In fact, the LOSER-NAME component of each election tuple is a list whose length depends on the number of votes a candidate must receive to merit inclusion in the data base.

We can convert the ELECTIONS relation into first normal form by breaking it up into two relations, one containing information on winning candidates and the other on losing candidates. This also gives us a good opportunity to record other attributes of interest about the candidates, such as their party and number of votes received. This leads us to the data base shown in Figure 3, which is in first normal form. The key of ELECTIONS-WON is YEAR; the key of ELECTIONS-LOST is (YEAR, LOSER-NAME).

To illustrate the advantages of the higher normal forms, we need to make updates to the data base by inserting new tuples, deleting existing tuples, and making changes to existing tuples. These updates are not particularly well motivated for our example data base, in which data is mostly static and unchanging. Of course, in an operational data base describing, for example, the inventory of a store, updates would be very frequent. For the sake of consistency, we will continue with our Presidential example. (You may imagine that some data was found to be in error and is being updated to correct the data base.)

Relations in first normal form may be used with any of the relational languages which are described in the next section.
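The conversion to first normal form and the notion of a key can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three sample rows, the function names to_first_normal_form and is_key, and the use of tuple positions for columns are all invented for the example and are not part of any system described here.

```python
# Illustrative rows only: ELECTIONS before normalization, where the
# third component is a list of losing candidates (not first normal form).
elections = [
    (1952, "Eisenhower", ["Stevenson"]),
    (1968, "Nixon", ["Humphrey", "Wallace"]),
    (1972, "Nixon", ["McGovern"]),
]


def to_first_normal_form(rows):
    """Split the list-valued component into two flat relations."""
    won = {(year, winner) for year, winner, _ in rows}
    lost = {(year, loser) for year, _, losers in rows for loser in losers}
    return won, lost


def is_key(relation, positions):
    """True if the columns at `positions` identify every tuple uniquely.

    This inspects one stored instance only; a real key must hold for
    every instance the relation may ever take.
    """
    projected = [tuple(row[i] for i in positions) for row in relation]
    return len(projected) == len(set(projected))


elections_won, elections_lost = to_first_normal_form(elections)
print(is_key(elections_won, [0]))      # YEAR alone
print(is_key(elections_lost, [0]))     # YEAR alone
print(is_key(elections_lost, [0, 1]))  # (YEAR, LOSER-NAME)
```

On these rows YEAR alone identifies each tuple of the winners' relation, but not of the losers' relation, because 1968 appears twice there; the pair (YEAR, LOSER-NAME) does.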
However, a relation in first normal form may exhibit three kinds of misbehavior, which are called update anomalies, insertion anomalies, and deletion anomalies. All these anomalies arise because more than one "concept" may be mixed together in the same tuple. Consider the ELECTIONS-WON relation of Figure 3. Mixed together in one tuple of this relation are facts about candidates (e.g., "Eisenhower came from Texas") and facts about elections (e.g., "In 1952 Eisenhower received 442 electoral votes"). In some applications it may be important that each of these facts be independently updated, inserted, and deleted. This gives rise to the three anomalies, which we can now illustrate by the following examples.

Update anomalies: Suppose the fact that "Eisenhower's home state is Texas" is found to be in error, and his home state must be changed to Nebraska. Since Eisenhower appears in more than one tuple of ELECTIONS-WON, this erroneous fact may be represented many times (in general, a time-varying number of times). This makes it difficult to update this particular fact, since all tuples where it is represented must be searched out and updated. Even worse, it leads to the possibility that different tuples may contain inconsistent values of HOME-STATE for the same President.

Insertion anomalies: Suppose we wish to insert a fact about a candidate which is independent of any election, e.g., "Dewey was a Republican." This is difficult in our example data base because there is no relation for candidates. We are forced to invent a tuple in ELECTIONS-LOST (or ELECTIONS-WON?) having null values for YEAR and the other irrelevant attributes. In many systems we would be unable to store this fact because null values are not permitted in the primary key.

Deletion anomalies: Suppose we wish to delete the information about elections as they fall beyond a certain number of years in the past. When we delete the 1952-tuple from ELECTIONS-WON, we still retain the fact that Eisenhower was a Republican. But when we delete the 1956-tuple, all facts about Eisenhower are lost. In some applications, this might have very serious consequences. For example, consider a relation describing orders for various items, shown in Figure 4. As orders are filled we delete their tuples from the relation. When we have deleted the last order for toasters,
[Table heading residue: ELECTIONS-LOST, with columns YEAR, LOSER-NAME, LOSER-VOTES, PARTY.]
[Figure 4: ORDERS relation, with columns ITEM, PRICE, QUANTITY-ORDERED, DATE.]
we find we no longer have any information about the price of toasters--possibly an unintended result. This kind of relation burdens the user with the responsibility of making sure that the tuple he deletes is not the last tuple of some "category" (e.g., toasters), and therefore the sole bearer of information about that category (e.g., price).

An important objective of normalization is the elimination of the update, insertion, and deletion anomalies. The most widely-known result of normalization theory is third normal form. Since second normal form is of little significance except as a stopping-off place on the way to third, we will proceed directly to the definition of third normal form.

In order to understand how third normal form avoids the three anomalies, we must discuss the concept of functional dependence among the attributes of a relation. We say that an attribute B of relation R is functionally dependent on attribute A if, at every instant of time, each A-value in R is associated with only one B-value. We express this relationship by the notation A → B, and say "A determines B" or "B depends on A." Similarly, a set of attributes in R may be functionally dependent on another attribute or set of attributes. The attribute (or set of attributes) on the left side of the arrow (A in our example) is called the determinant.

Clearly, from our definition of key in the previous section, every relation contains at least one functional dependence: all attributes of the relation are dependent on the key. (The dependence may be trivial if the relation contains only a key.) If a relation has more than one key, then all its attributes are dependent on each key.

Third normal form has been defined in a variety of ways. The original definition was given by Boyce and Codd in [N1]. Later writers, including Kent [N8], Codd [M14], and Sharman [N15], proposed alternate definitions which framed the same concept in simpler terminology. We present two of these equivalent definitions:

Definition, Boyce and Codd [M14]:
    A relation R is in third normal form if it is in first normal form and, for every attribute collection C of R, if any attribute not in C is functionally dependent on C, then all attributes in R are functionally dependent on C.

Definition, Sharman [N15]:
    A relation is in third normal form if every determinant is a key.

Both definitions are formal ways of expressing a very simple idea: that each relation should describe a single "concept," and if more than one "concept" is found in a relation, the relation should be split into smaller relations. The result of applying this "splitting" process to the sample data base of Figure 3 is shown in Figure 5. A moment's examination will show that the update, insertion, and deletion anomalies we discussed are not present in the data base of Figure 5.

The design of a data base in third normal form depends on knowledge of the functional dependencies among the attributes of the data. This knowledge cannot be discovered automatically by a system (unless the data base is completely static), but must be furnished by a data-base designer who understands the semantics of the information. In fact, there is not a unique third normal form representation for a given data base. In [N1] Codd briefly addressed the problem of choosing an "Optimal Third Normal Form" from among the various alternatives.
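Functional dependence and the "every determinant is a key" test can be illustrated with a short Python sketch. The helper names determines and is_key, the underscore spelling of the attribute names, and the three sample rows are assumptions made for the example. Because a dependence must hold "at every instant of time," a check against one stored instance can refute a dependence but can never establish it; that knowledge still has to come from the designer.

```python
def determines(rows, lhs, rhs):
    """Check whether the dependence lhs -> rhs holds in this one instance."""
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[a] for a in rhs)
        # The first right-hand value seen for a left-hand value must recur.
        if seen.setdefault(left, right) != right:
            return False
    return True


def is_key(rows, attrs):
    """A key determines every attribute of the relation."""
    all_attrs = sorted({a for row in rows for a in row})
    return determines(rows, attrs, all_attrs)


# Illustrative rows in the style of ELECTIONS-WON (values are assumptions).
elections_won = [
    {"YEAR": 1952, "WINNER_NAME": "Eisenhower",
     "PARTY": "Republican", "HOME_STATE": "Texas"},
    {"YEAR": 1956, "WINNER_NAME": "Eisenhower",
     "PARTY": "Republican", "HOME_STATE": "Texas"},
    {"YEAR": 1972, "WINNER_NAME": "Nixon",
     "PARTY": "Republican", "HOME_STATE": "California"},
]

name_is_determinant = determines(elections_won, ["WINNER_NAME"], ["HOME_STATE"])
name_is_key = is_key(elections_won, ["WINNER_NAME"])
year_is_key = is_key(elections_won, ["YEAR"])
```

In this instance WINNER_NAME determines HOME_STATE but is not a key, so the relation has a determinant that is not a key, which is the situation the definitions above rule out.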
1952    Stevenson    89
1956    Stevenson    73
1960    Nixon        219
1964    Goldwater    52
1968    Humphrey     191
1968    Wallace      46
1972    McGovern     17

Stevenson    Democrat
Nixon        Republican
Goldwater    Republican
Humphrey     Democrat
Wallace      Am. Indep.
McGovern     Democrat

FIGURE 5. Data base in third normal form.
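As a rough check on the claim that the split relations avoid the insertion and deletion anomalies, the rows surviving from Figure 5 can be loaded into an in-memory SQLite data base from Python. The table names losers and elections_lost and the lower-case column names are assumptions for this sketch; SQLite merely stands in for a relational system and is not one of the systems surveyed here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE losers (
        name  TEXT PRIMARY KEY,
        party TEXT
    );
    CREATE TABLE elections_lost (
        year        INTEGER,
        loser_name  TEXT REFERENCES losers(name),  -- foreign key
        loser_votes INTEGER,
        PRIMARY KEY (year, loser_name)
    );
""")
conn.executemany("INSERT INTO losers VALUES (?, ?)", [
    ("Stevenson", "Democrat"), ("Nixon", "Republican"),
    ("Goldwater", "Republican"), ("Humphrey", "Democrat"),
    ("Wallace", "Am. Indep."), ("McGovern", "Democrat"),
])
conn.executemany("INSERT INTO elections_lost VALUES (?, ?, ?)", [
    (1952, "Stevenson", 89), (1956, "Stevenson", 73), (1960, "Nixon", 219),
    (1964, "Goldwater", 52), (1968, "Humphrey", 191), (1968, "Wallace", 46),
    (1972, "McGovern", 17),
])

# Insertion: a fact about a candidate that is independent of any election.
conn.execute("INSERT INTO losers VALUES ('Dewey', 'Republican')")

# Deletion: dropping old elections no longer discards facts about candidates.
conn.execute("DELETE FROM elections_lost WHERE year <= 1956")

party = conn.execute(
    "SELECT party FROM losers WHERE name = 'Stevenson'").fetchone()[0]
remaining = conn.execute("SELECT COUNT(*) FROM elections_lost").fetchone()[0]
dewey = conn.execute("SELECT * FROM losers WHERE name = 'Dewey'").fetchone()
```

The fact about Dewey is stored without inventing an election tuple or a null YEAR, and Stevenson's party survives the deletion of both of his election tuples.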
...tors can serve both as a data sublanguage and as a query language.

This section will explore the approach taken by various relational languages to providing facilities for query, data manipulation (e.g., insertion, deletion, and update of tuples), data definition (e.g., creation of new relations and other structures), and data control (e.g., authorization and control of data integrity). We will then briefly consider some ways in which languages can be evaluated and compared, and discuss the role of natural language as a data-base interface.

Query Facilities

Query, or retrieval of information from the data base, is perhaps the aspect of relational languages which has received the most attention. We will illustrate the variety of approaches to query by presenting examples of four classes of languages: relational calculus, relational algebra, mapping-oriented languages, and graphics-oriented languages. Although we deal only with query facilities in this section, all the languages discussed have facilities for update and other operations in addition to query.

Relational Calculus

Codd's 1970 paper [M2] laid the groundwork for two families of relational languages which came to be called the relational calculus and the relational algebra. The relational calculus family grew from the observation that a first-order applied predicate calculus can be used as a data sublanguage for normalized relations. In [L1] Codd presented the details of such a calculus-based sublanguage, called ALPHA.

A typical query in ALPHA has two parts: a target, which specifies the particular attributes of the particular relation which are to be returned, and a qualification, which selects particular tuples from the target relation by giving a condition which they must satisfy. We will illustrate ALPHA (and other languages) by some sample queries based on the data base of Figure 5.

In Q1) below, the RANGE statement declares P to be a variable ranging over the rows of the PRESIDENTS relation. The next statement retrieves into workspace W the HOME-STATE of row P whenever the NAME of row P is "KENNEDY."

The qualification part of an ALPHA query may be quite complex and may use the universal and existential quantifiers: "for all" (∀), and "there exists" (∃). For example, see display Q2) below.

Various other languages based, like ALPHA, on the relational calculus, have been proposed. This class of languages includes QUEL [S15], COLARD [L3], and RIL [L7].

Relational Algebra

A second major class of languages is based on the relational algebra, which was introduced by Codd in [M2] and refined in [M3]. The relational algebra is a collection of operators that deal with whole relations, yielding new relations as a result. The major operators of relational algebra include the following:

• Projection: The projection operator returns only the specified columns of the given relation, and eliminates duplicates from the result. For example, to find all the unique (party, home-state) pairs in the PRESIDENTS relation,
Q2) List the election years in which a Republican from Illinois was elected.
    RANGE PRESIDENTS P
    RANGE ELECTIONS-WON E
    GET W E.YEAR: ∃P (P.NAME = E.WINNER-NAME &
        P.PARTY = 'REPUBLICAN' & P.HOME-STATE = 'ILLINOIS').
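The structure of the ALPHA query Q2) carries over almost directly to a comprehension with an existential test. The following Python sketch is an analogy, not ALPHA itself: the sample rows, the named-tuple layouts, and the function name q2 are assumptions introduced for the illustration.

```python
from collections import namedtuple

President = namedtuple("President", "name party home_state")
Election = namedtuple("Election", "year winner_name")

# Illustrative sample rows, not the contents of the article's figures.
presidents = [
    President("LINCOLN", "REPUBLICAN", "ILLINOIS"),
    President("KENNEDY", "DEMOCRAT", "MASSACHUSETTS"),
]
elections_won = [
    Election(1860, "LINCOLN"),
    Election(1864, "LINCOLN"),
    Election(1960, "KENNEDY"),
]


def q2(presidents, elections_won):
    """Election years in which a Republican from Illinois was elected."""
    return sorted(
        e.year                                   # target: E.YEAR
        for e in elections_won                   # RANGE ELECTIONS-WON E
        if any(                                  # "there exists" P such that
            p.name == e.winner_name
            and p.party == "REPUBLICAN"
            and p.home_state == "ILLINOIS"
            for p in presidents                  # RANGE PRESIDENTS P
        )
    )


years = q2(presidents, elections_won)
print(years)  # [1860, 1864]
```

The target corresponds to the expression being collected, the range variables to the two loops, and the existential quantifier to the call to any.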
[Figure residue: tabular query display for Q2); contents not recoverable.]
...ment in SEQUEL [L10] has the effect of giving a 10% raise to all programmers:

    UPDATE EMP
    SET SALARY = SALARY*1.1
    WHERE JOB = 'PROGRAMMER'

All the languages we have discussed so far have been high level and nonprocedural in nature. Indeed, one of the advantages of the relational model is that it is readily compatible with high-level languages. But it should not be concluded that the relational model is incompatible with a lower-level, more procedural programming interface. In fact, several low-level, host-language relational interfaces have been proposed, including GAMMA-0 [L4], XRM [S6], and MINIZ [S8]. These interfaces are well suited for writing programs that are to be called repeatedly and which update the data base according to parameters furnished with the call.

We will illustrate how one low-level relational language, GAMMA-0, might be used to write a transaction which finds the employee-tuple having a given employee number and updates its salary component according to some computation. GAMMA-0 consists of a set of operators which may be called from a host language such as PL/I. GAMMA-0 is based on the concept of a "scan," which is like a cursor that moves through a relation testing tuples for some condition. Our first call to GAMMA-0 uses the operator CREATE-SCAN, which creates a scan on the EMP relation to search for tuples according to their EMPNO attribute. The system returns an identifier, called a SCANID, by which we may refer to the newly created scan in future calls. Next we call the operator SET-SCAN and furnish the value which is to be searched for (in this case the EMPNO, which is the parameter of our transaction). Our next call is to the operator NEXT-SUBTUPLE, which returns an actual tuple satisfying the criterion we established by the previous calls. (NEXT-SUBTUPLE could be called repeatedly if we expected many tuples to satisfy the criterion.) Having obtained the desired employee-tuple, we can compute a new salary-value in our host program and then call UPDATE SUBTUPLE, which puts the new salary-value into the data base. GAMMA-0 allows a program to have as many active scans as it wishes, and to control the position of each by explicit calls. When a program has no further use for a scan, it may drop it by calling the operator DROP-SCAN.

Although it is a low-level, procedural language, GAMMA-0 is considered a relational language because the means of access to tuples is not predetermined. A relation may be accessed associatively through any of its attributes--the attribute to be matched is declared when a scan is opened.
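The sequence of calls described above can be mimicked by a small Python class. This is a hypothetical stand-in written for illustration only: the class Scan, its method names and signatures, and the EMP sample rows are assumptions, and nothing here reproduces the actual GAMMA-0 interface, of which only the operator names are given in the text.

```python
class Scan:
    """Hypothetical cursor over a relation held as a list of dict tuples."""

    def __init__(self, relation, attribute):   # plays the role of CREATE-SCAN
        self.relation = relation
        self.attribute = attribute              # attribute to be matched
        self.value = None
        self.position = -1

    def set_scan(self, value):                  # SET-SCAN: value to search for
        self.value = value
        self.position = -1

    def next_subtuple(self):                    # NEXT-SUBTUPLE
        for i in range(self.position + 1, len(self.relation)):
            if self.relation[i][self.attribute] == self.value:
                self.position = i
                return dict(self.relation[i])   # copy handed to the host program
        self.position = len(self.relation)      # scan exhausted
        return None

    def update_subtuple(self, **changes):       # UPDATE SUBTUPLE
        if not 0 <= self.position < len(self.relation):
            raise LookupError("scan is not positioned on a tuple")
        self.relation[self.position].update(changes)

    def drop(self):                             # DROP-SCAN
        self.relation = None


def raise_salary(emp, empno, amount):
    """Transaction: find the employee-tuple for empno and update its salary."""
    scan = Scan(emp, "EMPNO")
    scan.set_scan(empno)
    found = scan.next_subtuple()
    if found is not None:
        scan.update_subtuple(SALARY=found["SALARY"] + amount)
    scan.drop()
    return found is not None


emp = [
    {"EMPNO": 1, "JOB": "PROGRAMMER", "SALARY": 1000},
    {"EMPNO": 2, "JOB": "CLERK", "SALARY": 800},
]
updated = raise_salary(emp, 2, 50)
```

Because the attribute to be matched is chosen when the scan is created, the same class could search EMP by JOB or SALARY instead of EMPNO, which is the associative access the text describes.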
Data Definition and Control

In addition to query and data manipulation facilities, a complete data sublanguage needs facilities for data definition and data control. Data definition has two main aspects:

• Specification of the characteristics of data to be stored, e.g., the column-names and data-types for each relation; and
• definition of alternative "views" which are derived from the stored data. In relational terminology, a view is a dynamic "window" on the data base. Updates made to stored relations are visible through the various views which are defined on these relations.

Data control also has two main aspects:

• control over authorization of various users to perform various operations on the data base; and
• ability to make integrity assertions that protect the validity of data and define the set of permitted transitions in the data base.

The relational model permits a language to take a consistent, unified approach to query, data manipulation, data definition, and data control. Several relational languages have gone to great lengths to provide such a unified approach; these languages include SEQUEL [L10, LS, C6, I5], QUEL [S15, C3, I4], and Query By Example [L21, L24].

An important observation to be made in data definition is that the definition of a view is simply a process of deriving a relation from the set of stored relations, and that this is similar to the process of stating a query. Therefore, the full power of a query language may be applied to the definition of views. This is possible because all the relational query languages we have discussed have the property of closure, i.e., they operate on relations to construct or define new relations. A view may be a selected subset of a stored relation, or it may span over more than one stored relation, as in the case of a join. Once the definition of a view has been made, queries may be directed to the view as though it were a stored relation. The supportability of updates to the data base made by means of derived views is a complicated question, one which requires more research [M14].

The issue of authorization is closely related to the issue of derived views. In fact, one approach to authorization is to grant to each user a particular restricted view [C6]. Another approach is to automatically add certain predicates to the queries and updates issued by a user in order to restrict their scope to the set of authorized tuples [C3].

This unified approach to language design can be extended into the area of assertions concerning data integrity. An assertion is a statement about the data base which the system automatically enforces by refusing any update which fails to satisfy the assertion. In language terms, an assertion is simply a predicate, which is syntactically a fragment of a query, and which may contain other queries nested inside it. For example, suppose we wish to assert that for any given election the number of votes received by the winner is greater than the number of votes received by any loser. This assertion may be made as follows in SEQUEL (the variable X represents a tuple of the ELECTIONS-WON relation):

    ASSERT ON ELECTIONS-WON X:
    WINNER-VOTES >
        (SELECT MAX (LOSER-VOTES)
        FROM ELECTIONS-LOST
        WHERE YEAR = X.YEAR)

Language Evaluation

The great variety of proposed relational languages leads us to the question: How can languages be evaluated and compared? There are at least three criteria involved in any objective attempt to evaluate a language: completeness, level, and learnability. Space constraints permit us to touch only briefly on each of these.

Codd [M3] was the first to establish a careful definition of completeness for data-base sublanguages. He defined a language to be relationally complete if it permits expression of any query expressible in the
[SLID which uses XRM inversions to limit the search space for a given query. The SEQUEL prototype has been extended by IBM at Cambridge and by the MIT Sloan School of Management to accommodate a multiple-user environment. The resulting system, called GMIS, is being used at MIT as an information system for modeling New England energy resources [A12, S19].

Another prototype system based on XRM is being developed at IBM Research in Yorktown Heights, to implement Query By Example. The system contains an optimizer which interprets Query By Example queries in terms of operations similar to those of the relational algebra (join, restriction, etc.). At present, the system supports only a single user and does not provide update facilities.

A large-scale prototype data-base management system, called System R, is presently under construction at IBM Research in San Jose [S20]. System R is the first attempt to apply the relational data model to an environment of many concurrent users and a high volume of requests. It will provide an operationally complete data-management capability, with facilities for authorization, logging and recovery, definition of alternative views, and enforcement of data consistency and integrity. System R will support the SEQUEL language as an external interface, as well as a set of procedural operators for host-language programming. Requests to the system will be executed by an optimizer which chooses among various physical access methods, including inversions maintained in the form of B-trees [T1], physical pointer-chains, and a sort-merge facility. A user is not constrained to protect himself against the updates of other concurrent users by explicit locking statements; the system automatically generates locks as needed at the level of individual tuples. Deadlocks are automatically detected and resolved. Some of the locking techniques developed as part of the System R project have been described in [C1, C4, C8]. System R is being implemented on an IBM 370, using a VM/370 operating system modified for the data-base environment [T13].

Another large-scale attempt at constructing a relational prototype is the INGRES (Interactive Graphics and Retrieval System), of the University of California at Berkeley [S7, S9, S15]. INGRES, which runs on a PDP-11/40 under the UNIX operating system, implements QUEL, a relationally complete query language based on the relational calculus. The INGRES system implements a variety of features by automatic modification of the QUEL statement submitted by the user. Alternative views are supported by substituting the view-definition into the user's statement [I4]. Authorization and integrity control are provided by adding extra predicates to the user's statement which limit its scope [C3]. Concurrent update requests are kept from interfering with each other by analyzing their respective scopes and allowing an update to proceed only when it is "safe" [I2]. Finally, the QUEL statement, which may contain many variables, is broken up by a "decomposition" algorithm into a series of one-variable statements which are executed one at a time. The physical data structures used by INGRES include hashed tables (including "order-preserving" hash functions which permit sequential scanning in key-value order) and "generalized directories," which employ a tree-structure to map a key into an address interval, and then use an order-preserving function to compute an address within the interval [S9].

Implementation of another relational system, called ZETA, is presently under way at the University of Toronto [S8, S14]. The ZETA system is constructed in three levels. The lowest level is a language called MINIZ, which provides such basic operations as scanning a relation and accumulating a list of identifiers of tuples which satisfy a given condition. The middle level implements views ("derived relations") and has an optimizer/interpreter which accepts queries spanning multiple relations. Three types of end-user interfaces are supported by ZETA:

• a host-language facility which provides features similar to SEQUEL;
• a query language generator system whereby a user may create his own self-contained query language using
The author is also grateful to his colleagues at the IBM Research Laboratory in San Jose for their support and discussions.

CLASSIFICATION OF REFERENCES

Models and Theory
    M  1) General
    N  2) Normalization, Decomposition, and Synthesis
    Z  3) Relationships between CODASYL DDL/DBTG and the Relational Model
L  Languages and Human Factors
Implementations
    S  1) Software
    H  2) Hardware
T  Implementation Technology
C  Authorization, Views, and Concurrency
I  Integrity Control
A  Applications
D  Deductive Inference and Approximate Reasoning
E  Natural Language Support
Y  Sets and Relations (prior to 1969)

Certain references include asterisks with the following meaning:
*  Proceedings of ACM-SIGFIDET and ACM-SIGMOD Workshops are obtainable from ACM Headquarters, 1133 Avenue of the Americas, New York, N.Y. 10036
** Proceedings of the 1975 ACM Pacific Conference, San Francisco, April 17-18, 1975 are obtainable from: Mail Room, Boole & Babbage, 850 Stewart Drive, Sunnyvale, California 94086

Models and Theory
1) General
[M1] CODD, E. F. "Derivability, redundancy and consistency of relations stored in large data banks," IBM Research Report RJ599, August 1969.
[M2] CODD, E. F. "A relational model of data for large shared data banks," Comm. ACM 13, 6 (June 1970), pp. 377-397.
[M3] CODD, E. F. "Relational completeness of data-base sublanguages," Courant Computer Science Symposia 6, "Data Base Systems," New York, May 1971, Prentice-Hall, Englewood Cliffs, N.J., 1971, pp. 65-98.
[M4] STRNAD, A. L. "The relational approach to the management of data bases," Proc. IFIP Congress, August 1971, Vol. 2, North-Holland Publ. Co., Amsterdam, The Netherlands, 1971, pp. 901-904.
[M5] DURCHHOLZ, R. "Das Datenmodell bei Codd," Technical Report No. 69, Gesellschaft für Mathematik und Datenverarbeitung, Bonn, W. Germany, July 1972.
[M6] HAWRYSZKIEWYCZ, I. T.; AND DENNIS, J. B. "An approach to proving the correctness of data-base operations," Proc. ACM-SIGFIDET Workshop on Data Description, Access, and Control, Nov.-Dec. 1972,* ACM, New York, 1972, pp. 323-348.
[M7] DATE, C. J. "Relational data base systems: a tutorial," Proc. Fourth Internatl. Symposium on Computer and Information Sciences, Dec. 1972, Plenum Press, New York, 1972.
[M8] CODD, E. F. "Understanding relations," continuing series of articles published in FDT, the quarterly bulletin of ACM-SIGMOD, beginning with Vol. 5, 1 (June 1973),* ACM, New York, 1973.
[M9] HAWRYSZKIEWYCZ, I. T. "Semantics of data base systems," MIT Project MAC Report MAC TR-112, Cambridge, Mass., Dec. 1973.
[M10] BRACCHI, G.; FEDELI, A.; AND PAOLINI, P. "A multi-level relational model for data-base management systems," Data Base Management, Proc. IFIP TC-2 Working Conf. on Data-Base Management Systems, April 1974, North-Holland Publ. Co., Amsterdam, The Netherlands, 1974.
[M11] STONEBRAKER, M. "A functional view of data independence," Proc. ACM-SIGFIDET Workshop on Data Description, Access, and Control, May 1974,* ACM, New York, 1974, pp. 63-81.
[M12] MELTZER, H. S. "Relations and relational operations," IBM Report to GUIDE 38 Information Systems Division, Dallas, Texas, May 1974.
[M13] HITCHCOCK, P. "Fundamental operations on relations in a relational data base," IBM Scientific Centre Report UKSC 0051, Peterlee, England, May 1974.
[M14] CODD, E. F. "Recent investigations in relational data base systems," Information Processing 74, Proc. IFIP Congress, August 1974, Vol. 5, North-Holland Publ. Co., Amsterdam, The Netherlands, 1974, pp. 1017-1021.
[M15] WEDEKIND, H. "Datenbanksysteme 1," Reihe Informatik/16 (1974), Bibliographisches Institut, Mannheim, W. Germany.
[M16] HALL, P. A. V.; TODD, S. J. P.; AND HITCHCOCK, P. "An algebra of relations for machine computation," IBM Scientific Centre Report UKSC 0066, Peterlee, England, Jan. 1975.
[M17] SCHMID, H. A.; AND SWENSON, J. R. "On the semantics of the relational data model," Proc. ACM-SIGMOD Conf., May 1975,* ACM, New York, 1975, pp. 211-223.

Models and Theory
2) Normalization, Decomposition, and Synthesis
[N1] CODD, E. F. "Further normalization of the data base relational model," Courant Computer Science Symposia 6, "Data Base Systems," New York, May 1971, Prentice-Hall, New York, 1971, pp. 33-64.
[N2] CODD, E. F. "Normalized data base structure: a brief tutorial," Proc. 1971 ACM-SIGFIDET Workshop on Data Description, Access, and Control, Nov. 1971,* ACM, New York, 1971, pp. 1-17.
[N3] HEATH, I. J. "Unacceptable file operations in a relational data base," Proc. 1971 ACM-SIGFIDET Workshop on Data Description, Access, and Control, Nov. 1971, ACM, New York, 1971, pp. 19-33.
[N4] DELOBEL, C. "Aspects théoriques sur la structure de l'information dans une base de données," Revue Française d'Informatique et de Recherche Opérationnelle, B-3 (Sept. 1971).
[N5] DELOBEL, C. "A theory about data in an information system," IBM Research Report RJ964, San Jose, Calif., Jan. 1972.
[N6] RISSANEN, J.; AND DELOBEL, C. "Decomposition of files, a basis for data storage and retrieval," IBM Research Report RJ1220, San Jose, Calif., May 1973.
[N7] DELOBEL, C.; AND CASEY, R. G. "Decomposition of a data base and the theory of Boolean switching functions," IBM J. R. & D. 17, 5 (Sept. 1973), pp. 374-387.
[N8] KENT, W. "A primer of normal forms," IBM Technical Report TR 02.600, San Jose, Calif., Dec. 1973.
[N9] ARMSTRONG, W. W. "Dependency structures of data base relationships," Information Processing 74, Proc. IFIP Congress, August 1974, Vol. 3, North-Holland Publ. Co., Amsterdam, The Netherlands, 1974, pp. 580-584.
[N10] DELOBEL, C.; AND LEONARD, M. "The decomposition process in a relational model," Technical Report, Laboratoire d'Informatique, Univ. of Grenoble, France, Sept. 1974.
[N11] WANG, C. P.; AND WEDEKIND, H. "Segment synthesis in logical data base design," IBM J. R. & D. 19, 1 (Jan. 1975), pp. 71-77.
[N12] BERNSTEIN, P. A.; SWENSON, J. R.; AND TSICHRITZIS, D. "A unified approach to functional dependencies and relations," Proc. ACM-SIGMOD Conf., May 1975,* ACM, New York, 1975, pp. 237-245.
[N13] FADOUS, R. Y.; AND FORSYTH, J. "Finding candidate keys for relational data bases," Proc. ACM-SIGMOD Conf., May 1975,* ACM, New York, 1975, pp. 203-210.
[N14] FADOUS, R. Y. "Mathematical foundations for relational data bases," PhD. Thesis, Michigan State Univ., Lansing, 1975.
[N15] SHARMAN, G. C. H. "A new model of relational data base and high level languages," Technical Report TR. 12.136, IBM Hursley Park Laboratory, England, Feb. 1975.

Models and Theory
3) Relationships between CODASYL DDL/DBTG and Relational Model
[Z1] CODASYL Data Base Task Group Report, April 1971, ACM, New York.
[Z2] CANNING, R. G. "Problem areas in data management," EDP Analyzer 12, 3 (March 1974).
[Z3] EARNEST, C. P. "A comparison of the network and relational data structure models," Technical Report, Computer Sciences Corp., El Segundo, Calif., April 1974.
[Z4] NIJSSEN, G. M. "Data structuring in DDL and relational data model," Proc. IFIP TC-2 Working Conf. on Data Base Management Systems, April 1974, North-Holland Publ. Co., Amsterdam, The Netherlands, 1974.
[Z5] CODD, E. F.; AND DATE, C. J. "Interactive support for non-programmers: the relational and network approaches," Proc. 1974 ACM-SIGMOD Debate "Data Models: Data Structure Set versus Relational," May 1974,* ACM, New York, 1974.
[Z6] DATE, C. J.; AND CODD, E. F. "The relational and network approaches: comparison of the application programming interfaces," Proc. 1974 ACM-SIGMOD Debate "Data Models: Data Structure Set versus Relational," May 1974,* ACM, New York, 1974.
[Z7] BACHMAN, C. W. "The data structure set model," Proc. 1974 ACM-SIGMOD Debate "Data Models: Data Structure Set versus Relational," May 1974,* ACM, New York, 1974.
[Z8] SIBLEY, E. H. "On the equivalences of data based systems," Proc. ACM-SIGMOD Debate "Data Models: Data Structure Set versus Relational," May 1974,* ACM, New York, 1974.
[Z9] EVEREST, G. C. "The futures of data-base management," Proc. ACM-SIGMOD Workshop on Data Description, Access, and Control, May 1974, ACM, New York, 1974, pp. 445-462.
[Z10] OLLE, T. W. "Current and future trends in data base management systems," Information Processing 74, Proc. IFIP Congress, August 1974, Vol. 5, North-Holland Publ. Co., Amsterdam, The Netherlands, 1974, pp. 998-1006.
[Z11] DATE, C. J. "An introduction to data base systems," Addison-Wesley, Reading, Mass., 1975.
[Z12] KAY, M. H. "An assessment of the CODASYL DDL for use with a relational schema," Data Base Description, B. C. M. Douque and G. M. Nijssen (Eds.), North-Holland Publ. Co., Amsterdam, The Netherlands, 1975, pp. 199-214.
[Z13] ROBINSON, K. A. "An analysis of the uses of the CODASYL set concept," Data Base Description, B. C. M. Douque and G. M. Nijssen (Eds.), North-Holland Publ. Co., Amsterdam, The Netherlands, 1975, pp. 169-182.
[Z14] TAYLOR, R. W. "Observations on the attributes of database sets," Data Base Description, B. C. M. Douque and G. M. Nijssen (Eds.), North-Holland Publ. Co., Amsterdam, The Netherlands, 1975, pp. 73-84.
[Z15] OLLE, T. W. "An analysis of shortcomings in the schema DDL with an outline of proposed improvements," Data Base Description, B. C. M. Douque and G. M. Nijssen (Eds.), North-Holland Publ. Co., Amsterdam, The Netherlands, 1975, pp. 283-298.
[Z16] HUITS, M. "Requirements for languages in data-base systems," Data Base Description, B. C. M. Douque and G. M. Nijssen (Eds.), North-Holland Publ. Co., Amsterdam, The Netherlands, 1975, pp. 85-110.
[D1] CHANG, C. L.; AND LEE, R. C. T. Symbolic logic and mechanical theorem proving, Academic Press, New York, 1973.
[D2] MINKER, J. "Performing inferences over relational data bases," Proc. ACM-SIGMOD Conf., May 1975,* ACM, New York, 1975, pp. 79-91.
[D3] ZADEH, L. A. "Calculus of fuzzy restrictions," Report ERL-M562, Electronics Research Lab., Univ. of Calif., Berkeley, Calif., Feb. 1975.

Natural Language Support
[E1] SIMMONS, R. F. "Natural language question-answering systems: 1969," Comm. ACM 13, 1 (Jan. 1970), pp. 15-30.
[E2] SCHANK, R. C.; AND COLBY, K. M. (Eds.), Computer models of thought and language, W. H. Freeman, San Francisco, 1973.
[E3] RUSTIN, RANDALL (Ed.), "Natural language processing," Courant Computer Science Symposia 8, New York, Dec. 1971, Prentice-Hall, Englewood Cliffs, N.J., 1971.
[E4] THOMPSON, F. P.; LOCKEMANN, P. C.; DOSTERT, B. H.; AND DEVERILL, R. "REL: a rapidly extensible language system," Proc. 24th ACM National Conf., New York, 1969, ACM, New York, 1969, pp. 399-417.
[E5] KELLOGG, C. H.; BURGER, J.; DILLER, T.; AND FOGT, K. "The CONVERSE natural language data management system: current status and plans," Proc. ACM Symposium on Information Storage and Retrieval, 1971, ACM, New York, 1971, pp. 33-46.
[E6] WINOGRAD, T. "Procedures as a representation for data in a computer program for understanding natural language," MIT Project MAC Report MAC TR-84, Cambridge, Mass., 1971.
[E7] MONTGOMERY, C. A. "Is natural language an unnatural query language?" Proc. ACM National Conf., New York, 1972, ACM, New York, 1972, pp. 1075-1078.
[E8] PETRICK, S. R. "Semantic interpretation in the REQUEST system," IBM Research Report RC4457, IBM Research Center, Yorktown Heights, New York, July 1973.
[E9] CODD, E. F. "Seven steps to RENDEZVOUS with the casual user," Proc. IFIP TC-2 Working Conf. on Data Base Management Systems, April 1974, North-Holland Publ. Co., Amsterdam, The Netherlands, 1974.

Sets and Relations (prior to 1969)
These references are included to enable the reader to trace work published prior to 1969 on computer support for (mathematical) sets and relations.
[Y1] CODASYL Development Committee. "An information algebra," Comm. ACM 5, 4 (April 1962), pp. 190-204.
[Y2] LEVIEN, R. E.; AND MARON, M. E. "A computer system for inference execution and data retrieval," Comm. ACM 10, 11 (Nov. 1967), pp. 715-721.
[Y3] CHILDS, D. L. "Feasibility of a set-theoretical data structure: a general structure based on a reconstituted definition of relation," Proc. IFIP Congress 1968, North-Holland Publ. Co., Amsterdam, The Netherlands, 1968, pp. 162-172.
[Y4] CHILDS, D. L. "Description of a set-theoretic structure," Proc. AFIPS 1968 Fall Jt. Computer Conf., Vol. 33, AFIPS Press, Montvale, N.J., 1968, pp. 557-564.
[Y5] ASH, W. L.; AND SIBLEY, E. H. "TRAMP: an interpretive associative processor with deductive capabilities," Proc. ACM 23rd National Conf., August 1968, Brandon/Systems Press, Princeton, N.J., 1968, pp. 143-156.
[Y6] FELDMAN, J. A.; AND ROVNER, P. D. "An ALGOL-based associative language," Comm. ACM 12, 8 (August 1969), pp. 439-449.
[Y7] KUHNS, J. L. "Logical aspects of ques-
Dec. 1969, Academic Press, New York, 1969.
[Y8] KOCHEN, M. "Adaptive mechanisms in digital concept processing," in Discrete Adaptive Processes: Symposium and Panel Discussion, AIEE, New York, 1962, pp. 50-58.