16 Entity-Relationship Diagramming: 16.1 Semantic Data Models
Probably the most frequently cited of the semantic data models (SDMs) is the
Entity-Relationship data model (E-R model) (Chen, 1976). In the E-R model the
‘real world’ is represented in terms of entities, the relationships between
entities and the attributes associated with entities. Entities represent objects
of interest in the real world such as employees, departments and projects.
Relationships represent named associations between entities; for example, a
department employs many employees.
140 Information Systems Development
16.2.1 Entities
16.2.2 Relationships
More than one relationship can exist between any two entities. For instance, the
entities House and Person can be related by ownership and/or by occupation. In
theory, a set of n entities contains n(n - 1)/2 distinct pairs, so having
identified, say, 6 entities, up to 15 pairs of entities could be related. In
practice, it will usually be obvious that many entities are unrelated.
Furthermore, the object of entity modelling is to document only so-called direct
relationships. For instance, direct relationships exist between the entities
Parent and Child and between Child and School. The relationship between Parent
and School is indirect; it exists only by virtue of the Child entity (Shave,
1981).
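The count of 15 can be checked with a short sketch. The six entity names below are simply borrowed from examples in this chapter for illustration, and the set of direct relationships is limited to the two named in the text; neither is meant as a complete model.

```python
from itertools import combinations

# Illustrative set of six entities (names borrowed from the chapter's examples).
entities = ["Parent", "Child", "School", "House", "Person", "Department"]

# Every unordered pair of distinct entities is a candidate for a relationship.
pairs = list(combinations(entities, 2))
n = len(entities)
assert len(pairs) == n * (n - 1) // 2  # 6 * 5 / 2 = 15 candidate pairs

# Only the direct relationships named in the text are documented.
direct = {("Parent", "Child"), ("Child", "School")}

# Everything else is either unrelated or only indirectly related.
not_documented = [pair for pair in pairs if pair not in direct]
print(len(pairs), len(not_documented))
```

The pair (Parent, School) ends up in `not_documented`, which mirrors the point that it exists only through Child.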
16.2.3 Attributes
The data needed to support a given information system does not usually fall
irrevocably into one of the three categories: entity, relationship and attribute. A
classic example is the data needed to be stored on marriages. Marriage could be
regarded as an entity with attributes such as date, place, and names of bride and
groom. It could similarly be regarded as the attribute marital status associated
with the entity Person. Finally, it could be represented as a relationship between
the entities Man and Woman.
One of the tasks of the entity modeller is to decide which of these viewpoints
is the most important for the information system under consideration. Hence,
data analysis is frequently referred to as semantic modelling (Date, 1990). The
aim is to represent data as it is perceived in the organisation under
consideration (Klein and Hirschheim, 1987).
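The three viewpoints on marriage can be put side by side in a small sketch. All class, field and variable names here are hypothetical, chosen only to mirror the wording above, and the sample values are invented.

```python
from dataclasses import dataclass

# Viewpoint 1: marriage as an entity with its own attributes.
@dataclass
class Marriage:
    date: str
    place: str
    bride_name: str
    groom_name: str

# Viewpoint 2: marriage as the attribute "marital status" of the entity Person.
@dataclass
class Person:
    name: str
    marital_status: str = "single"

# Viewpoint 3: marriage as a relationship between the entities Man and Woman.
@dataclass(frozen=True)
class Man:
    name: str

@dataclass(frozen=True)
class Woman:
    name: str

# The relationship is held as a set of (Man, Woman) instance pairs.
married_to = {(Man("Dafydd"), Woman("Sian"))}

# The same real-world fact, recorded three different ways (invented values).
as_entity = Marriage("1990-06-02", "Cardiff", "Sian", "Dafydd")
as_attribute = Person("Sian", marital_status="married")
```

Which form is appropriate depends on what the information system needs to ask: the entity form keeps the date and place, the attribute form keeps only the status, and the relationship form keeps only who is married to whom.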
16.4 Notation
[Figure: E-R notation. Entities: Customer and BankAccount. Relationship:
Customer holds BankAccount. Attributes: customerNo and customerName (Customer);
accountNo and accountName (BankAccount).]
There are two properties of the concept of a relationship that are usually
considered important: we shall refer to them as cardinality and participation.
Cardinality (or degree) concerns the number of instances involved in a
relationship. A relationship can be said to be a 1:1 (one-to-one) relationship, a
1:M (one-to-many) relationship, or an M:N (many-to-many) relationship.
For instance, the relationship between bank accounts and customers can be
said to be one-to-one (1:1) if it can be defined in the following way:
There are a number of competing notational devices available for portraying the
cardinality of a relationship. We choose to represent cardinality by drawing a
crow's foot on the many end of a relationship (see figure 16.2a).
Participation (or optionality) concerns the involvement of entities in a
relationship. An entity’s participation is optional if there is at least one instance
of an entity that does not participate in the relationship. An entity’s
participation is mandatory if all instances of an entity must participate in the
relationship. The default participation is mandatory. If the participation is
optional we add a circle (an ‘O’ for optional) alongside the relevant entity (see
figure 16.2b).
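Both properties can be illustrated on sample data. The sketch below is not part of the E-R technique itself: the function names and the instance identifiers are hypothetical, and sample instances can only suggest a cardinality or participation, since both are really rules about the organisation rather than facts about one data set.

```python
from collections import Counter

def cardinality(pairs):
    """Classify a relationship from observed (left, right) instance pairs."""
    pairs = set(pairs)
    left_counts = Counter(left for left, _ in pairs)
    right_counts = Counter(right for _, right in pairs)
    # Some left instance is linked to several right instances.
    many_on_right = any(count > 1 for count in left_counts.values())
    # Some right instance is linked to several left instances.
    many_on_left = any(count > 1 for count in right_counts.values())
    return ("M" if many_on_left else "1") + ":" + ("N" if many_on_left and many_on_right
                                                   else "M" if many_on_right else "1")

def participation(instances, pairs, side):
    """Return 'optional' if at least one instance takes no part in the relationship."""
    participating = {pair[side] for pair in pairs}
    return "mandatory" if set(instances) <= participating else "optional"

# Customer holds BankAccount, as (customer, account) pairs (invented identifiers).
holds = [("c1", "a1"), ("c1", "a2"), ("c2", "a3")]

print(cardinality(holds))                               # one customer, many accounts
print(participation(["c1", "c2", "c3"], holds, side=0)) # c3 holds no account
print(participation(["a1", "a2", "a3"], holds, side=1)) # every account is held
```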
[Figure 16.2: (a) degree/cardinality and (b) optionality/participation, each
illustrated with variants of the relationship Customer holds BankAccount.]
[Figure: Generalisation. Panel (a) labels: Product, BankAccount, Bank, Customer,
Pension, Mortgage. Panel (b) labels: Product, Car.]
A first-pass E-R diagram represents the basic structure of data needed in a given
information system. Most of the adherents of the technique recommend that an
entity model should be validated against some definition of processing
requirements. This definition will identify which entities and relationships must
be accessed, in what order, by what means, and for what purpose. A detailed
discussion of this topic is given in Beynon-Davies (1992).
Suppose we have the following extract from an E-R diagram representing an
educational application (figure 16.5) (Shave, 1981). We wish to validate this
entity model against a requirement to produce the staff/student ratio for a given
department. To perform this processing we need to access:
Department
StaffMember
Module
Student
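[Figure 16.5: extract from an E-R diagram for an educational application.
Attributes shown: deptNo, deptName (Department); staffNo, staffName; moduleNo,
moduleName; studentNo, studentName. Other labels: Allocation, Registration.]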
From such a description, we get some initial idea of the important entities
involved in this system:
Clinic
ClinicSession
Appointment
Patient
Doctor
Operation
OperatingTheatre
TheatreSession
Note that we might have been tempted to add the entity Hospital to this list.
However, in this system there is only one instance of a hospital. Entities should
normally have many instances associated with them.
The text also gives some idea of relationships. A possible list is given below:
Patient - Operation
Doctor - Operation
Doctor - Appointment
Patient - Appointment
ClinicSession - Appointment
Clinic - ClinicSession
TheatreSession - Operation
OperatingTheatre - TheatreSession
Clinics(clinicName, ...)
ClinicSessions(clinicName, sessionNo, ...)
Appointments(clinicName, sessionNo, patientNo, doctorNo, ...)
Patients(patientNo, ...)
Doctors(doctorNo, ...)
Operations(theatreName, theatreSession, operationNo, patientNo, ...)
Schedule(doctorNo, operationNo, ...)
Theatres(theatreName, ...)
TheatreSessions(theatreName, theatreSession, ...)
Note how we have formed compound keys in a number of tables, where the
foreign keys form part of the primary key. Note also how the table Schedule
constitutes a link entity between doctors and operations.
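A compound key whose parts are foreign keys can be sketched with Python's built-in sqlite3 module. The listing above does not show which columns are underlined as keys, so the key choices below are assumptions, as are the column types and sample values. The elided columns are left out, and Operations is reduced to its operationNo for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
CREATE TABLE Clinics (clinicName TEXT PRIMARY KEY);

-- Compound key: the foreign key clinicName is part of the primary key (assumed).
CREATE TABLE ClinicSessions (
    clinicName TEXT    NOT NULL REFERENCES Clinics(clinicName),
    sessionNo  INTEGER NOT NULL,
    PRIMARY KEY (clinicName, sessionNo)
);

CREATE TABLE Doctors    (doctorNo    INTEGER PRIMARY KEY);
CREATE TABLE Operations (operationNo INTEGER PRIMARY KEY);  -- simplified

-- Link entity: its key is made up entirely of two foreign keys.
CREATE TABLE Schedule (
    doctorNo    INTEGER NOT NULL REFERENCES Doctors(doctorNo),
    operationNo INTEGER NOT NULL REFERENCES Operations(operationNo),
    PRIMARY KEY (doctorNo, operationNo)
);
""")

def try_insert(sql, params):
    """Return True if the row is accepted, False if a key constraint rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

conn.execute("INSERT INTO Clinics VALUES ('Cardiology')")
first = try_insert("INSERT INTO ClinicSessions VALUES (?, ?)", ("Cardiology", 1))
duplicate = try_insert("INSERT INTO ClinicSessions VALUES (?, ?)", ("Cardiology", 1))
orphan = try_insert("INSERT INTO ClinicSessions VALUES (?, ?)", ("Unknown", 1))
print(first, duplicate, orphan)
```

The duplicate row is rejected by the compound primary key and the orphan row by the foreign key, which is the behaviour the compound-key design is meant to guarantee.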
[Figure: E-R diagram for the hospital system. Surviving labels: Clinic,
Appointment, TheatreSession, Patient, Operation.]
Figure 16.8 illustrates an E-R diagram drawn for the case of the Goronwy
Galvanising system. Note how we have distinguished between delivery advice
notes and despatch advice notes. The entity Despatch is a breakdown of the
many-to-many relationship between Job and DespatchAdvice. In other words, a
given job can be recorded on more than one despatch advice note, and a given
despatch advice note can record more than one job.
[Figure 16.8: E-R diagram for the Goronwy Galvanising system. Entities:
DeliveryAdvice, DespatchAdvice, Product, Job, Despatch.]
DespatchAdvices(despatchAdviceNo, ...)
DeliveryAdvices(deliveryAdviceNo, ...)
Jobs(jobNo, deliveryAdviceNo, productCode, ...)
Products(productCode, ...)
Despatches(despatchAdviceNo, jobNo, ...)
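The role of Despatches as the breakdown of a many-to-many relationship can be seen by reading its rows in both directions. In this sketch the rows are invented sample values and the dictionary names are hypothetical; only the column order (despatchAdviceNo, jobNo) comes from the schema above.

```python
from collections import defaultdict

# Rows of Despatches(despatchAdviceNo, jobNo, ...), with invented values.
despatches = [("DA1", "J1"), ("DA2", "J1"), ("DA2", "J2")]

advices_for_job = defaultdict(set)  # jobNo -> despatch advice notes recording it
jobs_on_advice = defaultdict(set)   # despatchAdviceNo -> jobs recorded on it

for advice_no, job_no in despatches:
    advices_for_job[job_no].add(advice_no)
    jobs_on_advice[advice_no].add(job_no)

# Job J1 appears on two advice notes, and advice DA2 records two jobs:
# each side of the original relationship is "many".
print(sorted(advices_for_job["J1"]), sorted(jobs_on_advice["DA2"]))
```

Each row of Despatches is on the many end of two one-to-many relationships, one from Jobs and one from DespatchAdvices, which is why no single foreign key in either of those tables could carry the same information.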
Fortunately, this relational schema matches closely with the one generated from
the determinancy diagrams in chapter 10. In many circumstances this will not
be the case. The analyst frequently has to reconcile the results from top-down
data analysis with the results from bottom-up data analysis. See Beynon-Davies
(1996) for a more detailed discussion of reconciliation. Figure 16.9 indicates
how we might modify the E-R diagram in figure 16.8 to exploit generalisation.
Here we have made delivery advices and despatch advices both subtypes of an
advice class.
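[Figure 16.9: the E-R diagram of figure 16.8 modified to exploit generalisation.
Advice (adviceNo, adviceDate) is the supertype of DeliveryAdvice and
DespatchAdvice. Other entities: Product, Job, Despatch.]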
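One way to picture the generalisation is as class inheritance, with the attributes adviceNo and adviceDate held once on the supertype. This is only an analogy in Python, not a statement of how the generalisation would be implemented in a relational schema, and the sample values are invented.

```python
from dataclasses import dataclass

@dataclass
class Advice:
    """Supertype: attributes common to every advice note."""
    advice_no: str
    advice_date: str

@dataclass
class DeliveryAdvice(Advice):
    """Subtype: an advice note received with a delivery."""

@dataclass
class DespatchAdvice(Advice):
    """Subtype: an advice note sent out with a despatch."""

delivery = DeliveryAdvice("A100", "1996-03-01")
despatch = DespatchAdvice("A101", "1996-03-04")

# Both subtypes inherit the common attributes and can be treated as an Advice.
for note in (delivery, despatch):
    print(type(note).__name__, note.advice_no, note.advice_date)
```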
16.9 Conclusion
16.10 References
Howe D.R. (1986). Data Analysis for Database Design. (2nd Edn.). Edward
Arnold, London.
King R. and McLeod D. (1985). Semantic Data Models. In Bing Yao S. (ed.).
Principles of Database Design. Vol 1: Logical Organisations. Prentice-Hall,
Englewood Cliffs, N.J.
Klein H.K. and Hirschheim R.A. (1989). Four Paradigms of Information
Systems Development. CACM. 32(10) October 1199-1216.
Shave M.J.R. (1981). Entities, Functions and Binary Relations: steps to a
conceptual schema. The Computer Journal. 24(1).
Smith J.M. and Smith D.C.P. (1977). Database Abstractions: Aggregation and
Generalisation. ACM Trans. Database Sys. 2(2) 105-133.
Teorey T.J., Yang D. and Fry J.P. (1986). A Logical Design Methodology for
Relational Databases Using the Extended Entity-Relationship Model. ACM
Computing Surveys. 18 197-222.
16.13 Exercises