0% found this document useful (0 votes)
58 views

Week4 Workshop

The document provides information about designing a database for a university student transcript system. It outlines the entities like students, departments, courses, sections, and their attributes and relationships. These include students having a major and optional minor, departments offering courses, sections teaching courses, and students receiving grades for sections.

Uploaded by

huhan jiang
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
58 views

Week4 Workshop

The document provides information about designing a database for a university student transcript system. It outlines the entities like students, departments, courses, sections, and their attributes and relationships. These include students having a major and optional minor, departments offering courses, sections teaching courses, and students receiving grades for sections.

Uploaded by

huhan jiang
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 47

Housekeeping information:

SQL Assignment (only for COMP2400 students) is available on Wattle, and


the submission via Wattle is due 23:59 Aug 27 (Monday).
Individual, no group work!
Do not post any partial solution/result on Wattle!
The mid-exam (only for COMP6240 students) is scheduled on Aug 31
(Friday of Week 6).
Students who are scheduled in ENGN E101 need to go to CS-MSI Lab 2
(Building 145).
More SQL exercises are available at
https://cs.anu.edu.au/dab/bench/comp2400-s2-2018/login/
(log in using ANU ID and password)
Writing SQL queries in SQL Assignment or the mid-exam
submit your queries, not the query results
the correctness of queries does not depend on any database state
partial marks may be awarded
problem-solving strategies
Week 4 Workshop
Database Design – Four Phases

The database design process has four phases:

1 Requirements Collection and Analysis

2 Conceptual Design
Entity-Relationship Model

3 Logical Design
From Entity-Relationship Model to Relation Schemas

4 Physical Design
Phase 2: Conceptual Design

Conceptual design is the process of constructing a conceptual data model


that is

modeled at a high-level of abstraction;


sufficiently simple and often graphical;
used to communicate the requirements of a database with
nontechnical users.

A conceptual data model is built using the information in users’


requirements specification.

Note: The conceptual design is based on the Entity-Relationship Model in


this course.
Model and Modeling

What is a model?
A model is
a simplification of reality
often a graphical depiction of data
associated with a modeling language

What does modeling do?


Modeling
creates an understanding and relationships of components of a system
helps in conceptualising and visualising the structure of a system that
we may want to build.
facilitates specifications of the behaviour of a system
gives rise to a template that guides us in constructing a system
...
Entity-Relationship (ER) Model

ER diagrams (Peter Chen in 1976):

Attribute as oval;

Key attribute with underlined;

Entity as rectangles;

Relationship as diamonds.
(Exercise 1) Consider the following data requirements for a university
student database that is used to keep track of students’ transcripts.
The university keeps track of each student’s name, student number, social
security number, address, phone, and birthdate. Both social security
number and student number have unique values for each student.
Each student has exactly one major, and may have a minor (if any) with
departments.
Each department has name, department code, office number, office phone,
and college. Both name and code have unique values for each department.
Each course has a course name, description, course number, number of
semester hours, level, and offering department. The value of course number
is unique for each course.
Each section of a course has an instructor, semester, year, and section
number and the section number distinguishes different sections of the same
course that are taught during the same semester/year; its values are 1, 2, 3,
..., up to the number of sections taught during each semester.
A grade record refers to each student and a particular section, consisting of
a final mark and a letter grade from (F, D, C, B, A).
Entities, Relationships and Attributes

Entities: “Things” in the real world (with independent existence).


Relationships: Associations between entities.
Attributes: Properties that describe entities and relationships.

Each student has name, student number, social security number, address,
phone and birthdate. Both social security number and student number have
unique values for each student.
Question: What are the entities, relationships and attributes?
Entities: STUDENT
Relationships:
Attributes: name, student number, social security number, address,
phone and birthdate for STUDENT
Entities, Relationships and Attributes

Entities: “Things” in the real world (with independent existence).


Relationships: Associations between entities.
Attributes: Properties that describe entities and relationships.

Each student has exactly one major, and may have a minor (if any) with
departments
Question: What are the entities, relationships and attributes?
Entities: STUDENT, DEPARTMENT
Relationships: has major with between STUDENT and
DEPARTMENT , has minor with between STUDENT and DEPARTMENT
Attributes: name for has major with , name for has minor with
Constraints on Relationships

Cardinality ratios: Specifies the maximum number of relationships that an


entity can participate in.
Participation constraints (total, partial): Specifies whether the existence of
any entity depends on its being related to another entity via the relationship
type.

Each student has exactly one major, and may have a minor (if any) with
departments.
Question: What are the constraints on relationship “has major with”?
Cardinality ratios: Every student has at most one major and a department
may offer many majors (to different students)
Participation constraints: Every student must have one major (total) and
each department must (typically) offer one major (total).
Constraints on Relationships

Cardinality ratios: Specifies the maximum number of relationships that an


entity can participate in.
Participation constraints (total, partial): Specifies whether the existence of
any entity depends on its being related to another entity via the relationship
type.

Each student has exactly one major, and may have a minor (if any) with
departments.
Question: What are the constraints on relationship “has minor with”?
Cardinality ratios: Every student has at most one minor and a department
may offer many minor (to different students)
Participation constraints: Every student may or may not have one minor
(partial) and each department must (typically) offer one minor (total).
Entities, Relationships and Attributes

Entities: “Things” in the real world (with independent existence).


Relationships: Associations between entities.
Attributes: Properties that describe entities and relationships.

Each course has a course name, description, course number, number of


semester hours, level, and offering department.
Question: What are the entities, relationships and attributes?
Entities: course, department
Relationships: offer (between department and course)
Attributes: course name, description, course number, number of
semester hours, level, and offering department (of the entity course)
Constraints on Relationships

Cardinality ratios: Specifies the maximum number of relationships that an


entity can participate in.
Participation constraints (total, partial): Specifies whether the existence of
any entity depends on its being related to another entity via the relationship
type.

Each course has a course name, description, course number, number of


semester hours, level, and offering department.
Question: What are the constraints on relationship “offer”?
Cardinality ratios: Every course is offered by at most one department and
a department may offer many courses
Participation constraints: Every course must be offered by some
department (total) and each department may (or may not) offer any
courses (partial).
Entities, Relationships and Attributes
Entities: “Things” in the real world (with independent existence).
Relationships: Associations between entities.
Attributes: Properties that describe entities and relationships.
Each section of a course has an instructor, semester, year, and section
number and the section number distinguishes different sections of the same
course that are taught during the same semester/year; its values are 1, 2, 3,
..., up to the number of sections taught during each semester.
A grade record refers to each student and a particular section, consisting of
a final mark and a letter grade from (F, D, C, B, A).
Question: What are the entities, relationships and attributes?
Entities: section, course, student
Relationships: section taught (between section and course),
grade record (between student and section)
Attributes: instructor, semester, year, and section number (of the
weak entity section), final mark and letter grade (of the relationship
grade record)
(Exercise 1) Consider the following data requirements for a university
student database that is used to keep track of students’ transcripts.
The university keeps track of each student’s name, student number, social
security number, address, phone, and birthdate. Both social security
number and student number have unique values for each student.
Each student has exactly one major, and may have a minor (if any) with
departments.
Each department has name, department code, office number, office phone,
and college. Both name and code have unique values for each department.
Each course has a course name, description, course number, number of
semester hours, level, and offering department. The value of course number
is unique for each course.
Each section of a course has an instructor, semester, year, and section
number and the section number distinguishes different sections of the same
course that are taught during the same semester/year; its values are 1, 2, 3,
..., up to the number of sections taught during each semester.
A grade record refers to each student and a particular section, consisting of
a final mark and a letter grade from (F, D, C, B, A).
Constructing an ER or EER Model

Identify the entities (including weak entity types)


student, course, department, section (weak entity)

Identify the relationships


has major (between student and department)
has major (between student and department)
offer (between department and course)
section taught (between section and course)
grade record (between student and section)
Identify the attributes of entities and relationships and identify a primary key
for each entity type
Identify cardinality ratios and participation constraints on relationships
Software tool to draw ER diagram

We require students to use an academic tool, TerraER, to draw the ER


diagrams.
TerraER allows you to save your ER diagrams into xml files and export your
ER diagrams as a JPEG figure.
You can download the jar file from the following website:
https://github.com/rterrabh/TerraER/releases/download/TerraER3.
01/TerraER3.01beta.jar
You can double-click that file to execute on Windows/Mac/Linux (assume
that the Java Runtime Environment JRE has been installed).
More information on how to use TerraER will be provided next week.
(Exercise 2) A retailer company wants to build a database application for
managing information about its sale process. The company sells
products in both local shops and webstores on the Internet. Each local
shop has a name, contact details (e.g., phone number and email), and a
unique location. The database application also needs to store the
URL(unique), name and last updated date of each webstore. Every
product has a unique productID, a description, an item price, and a
quantity in stock. The database application should also record customers’
details such as their name, address and email. Every customer is
assigned a unique ID. A customer may place an order that consists of at
least one product and each order is from either a shop or a webstore.
Customers have three payment options (i.e., cash, paypal, and credit
card) but for each order only one payment option can be chosen. A
delivery may be requested for each order. After full-payment is received,
a delivery would be sent out subject to products’ availability. Every
delivery has a unique tracking number.
(Exercise 2) A retailer company wants to build a database application for
managing information about its sale process. The company sells
products in both local shops and webstores on the Internet. Each local
shop has a name, contact details (e.g., phone number and email), and a
unique location. The database application also needs to store the
URL(unique), name and last updated date of each webstore. Every
product has a unique productID, a description, an item price, and a
quantity in stock. The database application should also record
customers’ details such as their name, address and email. Every
customer is assigned a unique ID. A customer may place an order that
consists of at least one product and each order is from either a shop or
a webstore. Customers have three payment options (i.e., cash, paypal,
and credit card) but for each order only one payment option can be
chosen. A delivery may be requested for each order. After full-payment
is received, a delivery would be sent out subject to products’ availability.
Every delivery has a unique tracking number.
Constructing an ER or EER Model

Identify the entities (including weak entity types)


shop, webstore, product, customer, order, delivery

Identify subclass/superclass and the corresponding disjointness and


completeness constraints
The company sells products in both local shops and webstores on
the Internet.
Each order is associated with either a shop or a webstore.

subclass shop, webstore


superclass store
disjoint and complete

Identify the relationships


(Exercise 2) A retailer company wants to build a database application for
managing information about its sale process. The company sells
products in both local shops and webstores on the Internet. Each local
shop has a name, contact details (e.g., phone number and email), and a
unique location. The database application also needs to store the
URL(unique), name and last updated date of each webstore. Every
product has a unique productID, a description, an item price, and a
quantity in stock. The database application should also record
customers’ details such as their name, address and email. Every
customer is assigned a unique ID. A customer may place an order that
consists of at least one product and each order is from either a shop or
a webstore. Customers have three payment options (i.e., cash, paypal,
and credit card) but for each order only one payment option can be
chosen. A delivery may be requested for each order. After full-payment
is received, a delivery would be sent out subject to products’ availability.
Every delivery has a unique tracking number.
(Exercise 2) A retailer company wants to build a database application for
managing information about its sale process. The company sells
products in both local shops and webstores on the Internet. Each local
shop has a name, contact details (e.g., phone number and email), and a
unique location. The database application also needs to store the
URL(unique), name and last updated date of each webstore. Every
product has a unique productID, a description, an item price, and a
quantity in stock. The database application should also record
customers’ details such as their name, address and email. Every
customer is assigned a unique ID. A customer may place an order
that consists of at least one product and each order is from either a
shop or a webstore. Customers have three payment options (i.e., cash,
paypal, and credit card) but for each order only one payment option can
be chosen. A delivery may be requested for each order. After
full-payment is received, a delivery would be sent out subject to
products’ availability. Every delivery has a unique tracking number.
Constructing an ER or EER Model

Identify the entities (including weak entity types)


shop, webstore, product, customer, order, delivery

Identify subclass/superclass and the corresponding disjointness and


completeness constraints
subclass shop, webstore
superclass store

Identify the relationships


customer place order
order consists of product
each order is from store(superclass) (either subclass shop or
subclass webstore)
delivery is for order
Constructing an ER or EER Model

Identify the entities (including weak entity types)


shop, webstore, product, customer, order, delivery
Identify subclass/superclass and the corresponding disjointness and
completeness constraints
subclass shop, webstore
superclass store

Identify the relationships


Identify the attributes of entities and relationships and identify a primary key
for each entity type
Every product has a unique productID, a description, an item price,
and a quantity in stock.
Attributes for product: productID, description, item price, quantity
Primary key for product: productID
Constructing an ER or EER Model

Identify the entities (including weak entity types)


shop, webstore, product, customer, order, delivery
Identify subclass/superclass and the corresponding disjointness and
completeness constraints
subclass shop, webstore
superclass store
Identify the relationships
Identify the attributes of entities and relationships and identify a primary key
for each entity type
The database application should also record customers’ details such
as their name, address and email. Every customer is assigned a
unique ID.
Attributes for customer: name, address, email, CustomerID
Primary key for customer: CustomerID
Constructing an ER or EER Model
Identify the entities (including weak entity types)
shop, webstore, product, customer, order, delivery
Identify subclass/superclass and the corresponding disjointness and
completeness constraints
subclass shop, webstore
superclass store
Identify the relationships
Identify the attributes of entities and relationships and identify a primary key
for each entity type
Each local shop has a name, contact details (e.g., phone number and
email), and a unique location. The database application also needs to
store the URL(unique), name and last updated date of each webstore.
Attributes for superclass store: name, location/URL
Primary key for superclass store: location/URL
Attributes for subclass shop: phone number, email
Attributes for subclass webstore: last updated date
Constructing an ER or EER Model

Identify the entities (including weak entity types)


Identify subclass/superclass
Identify the relationships
customer place order
...
Identify the attributes of entities and relationships
Identify cardinality ratios and participation constraints on relationships
A customer may place an order
Cardinality ratios: A customer may place many orders and an order
is placed by one customer.
Participation constraints: A customer may or may not place any
orders (Partial). An order must be placed by one customer (Total).
Constructing an ER or EER Model

Identify the entities (including weak entity types)


Identify subclass/superclass
Identify the relationships
delivery is for order
...
Identify the attributes of entities and relationships
Identify cardinality ratios and participation constraints on relationships
A delivery may be requested for each order.
Cardinality ratios: A delivery is for at most one order and an order
has at most one delivery.
Participation constraints: A delivery must be for an order (Total).
An order may or may not have a delivery (Partial).
Constructing an ER or EER Model
Identify the entities (including weak entity types)
Identify subclass/superclass
Identify the relationships
order consists of product
...
Identify the attributes of entities and relationships
Identify cardinality ratios and participation constraints on relationships
Each order consists of at least one product
Cardinality ratios: An order may contain many products and an
product may be contained in many orders.
Participation constraints: A order must contain some product
(Total). A product may or may not be contained in an order
(Partial).
Constructing an ER or EER Model

Constructing an ER or EER Model


Identify the entities (including weak entity types)
Identify subclass/superclass
Identify the relationships
Identify the attributes of entities and relationships
Identify cardinality ratios and participation constraints on relationships

Not all the constraints can be expressed in the ER model


(Exercise 2) A retailer company wants to build a database application for
managing information about its sale process. The company sells
products in both local shops and webstores on the Internet. Each local
shop has a name, contact details (e.g., phone number and email), and a
unique location. The database application also needs to store the
URL(unique), name and last updated date of each webstore. Every
product has a unique productID, a description, an item price, and a
quantity in stock. The database application should also record customers’
details such as their name, address and email. Every customer is
assigned a unique ID. A customer may place an order that consists of at
least one product and each order is from either a shop or a webstore.
Customers have three payment options (i.e., cash, paypal, and
credit card) but for each order only one payment option can be
chosen. A delivery may be requested for each order. After full-payment
is received, a delivery would be sent out subject to products’
availability. Every delivery has a tracking number.
Phase 3: Logical Design

Logical design is the process of constructing a logical data model (e.g.


relational or object-oriented).

A conceptual data model is translated onto a logical data model, which can
be further refined (e.g., normalisation) to meet the data requirements. For
example,

From: An ER model

To: Relations with their primary and foreign keys, which facilitates SQL
to deal with retrieving, updating and deletion.
Note: The logical design is based on the relational data model in this
course.
ER-to-Relations Algorithm

7-step algorithm to convert the basic ER model into relations, and more
steps for the EER model.

Step 1: Mapping of Regular Entity Types


Step 2: Mapping of Weak Entity Types
Step 3: Mapping of Binary 1:1 Relationship Types
Foreign key approach
Merged relation approach
Cross-reference approach
Step 4: Mapping of Binary 1:N Relationship Types
Step 5: Mapping of Binary M:N Relationship Types
Step 6: Mapping of Multi-valued Attributes
Step 7: Mapping of N-ary Relationship Types
Step 8: Mapping of Superclass/Subclass
Step 1: Regular Entity types
For each regular entity type E, create a relation schema with the attributes
of E (ignore multi-valued attributes until Step 6), where
PK: the key attributes of E

C OURSE(course num, course name, description, num sem hours, level)


with PK: {course num}
Note: This is not necessarily the final relation schema of C OURSE.
Step 2: Weak Entity Types
For each weak entity type Ew , create a relation schema with the attributes
of Ew plus the PK of its identifying entity type, where
PK: the partial key attributes of Ew plus the PK of its identifying entity
type
FK: references the PK of its identifying entity type

S ECTION(section num, instructor, semester, year, course num)


with PK: {section num, course number}
with FK: [course num]⊆C OURSE[course num]
Step 3: Binary 1:1 Relationship Types - (Foreign key)
For a 1:1 relationship type R with one total participation, extend the
relation schema of the total-side entity type by the attributes of R and
the PK of the partial-side entity type, where
PK: still the PK of the total-side entity type
FK: references the PK of the partial-side entity type

D EPARTMENT(Name, Address, Mgr SSN, Start date) with


PK: {Name}
FK: [Mgr SSN]⊆E MPLOYEE[SSN].
How can we model the total participation?
Add NOT NULL constraint to Mgr SSN for total participation.
Step 3: Binary 1:1 Relationship Types - (Foreign key)
For a 1:1 relationship type R with one total participation, extend the
relation schema of the total-side entity type by the attributes of R and
the PK of the partial-side entity type, where
PK: still the PK of the total-side entity type
FK: references the PK of the partial-side entity type

D EPARTMENT(Name, Address, Mgr SSN, Start date) with


PK: {Name}
FK: [Mgr SSN]⊆E MPLOYEE[SSN].
Why don’t we extend the relation schema of the partial-side entity type?
This may cause many NULL values.
Step 3: Binary 1:1 Relationship Types - (Merged relation)
How can we translate the following kind of 1:1 relationship type?

If participation on both sides is total, we may merge the relation schemas


of both entity types and the attributes of the relationship type into a
single relation.
E MPLOYEE -D EP(SSN, Name, Salary, Start date, Dname, Address) with
PK: {SSN} or {Dname}
How can we model the total participations?
Add NOT NULL constraint to both SSN and Dname for total participations.
Is merging them always a good solution?
Step 3: Binary 1:1 Relationship Types - (Merged relation)
How can we translate the following kind of 1:1 relationship type?

If participation on both sides is total, we may merge the relation schemas


of both entity types and the attributes of the relationship type into a
single relation.
However, merging them is not always a good solution. Why?
(1) The two entity types represent different entities in the real world.
(2) The two entity types participate in different relationship types.
(3) Having separate relation schemas for two entity types often leads to
more efficient updates than a single relation schema.
(4) ...
Step 3: Binary 1:1 Relationship Types - (Cross-reference)
How can we translate the following kind of 1:1 relationship type?

If both sides are partial, we may create a (new) relation schema which
cross-references the PKs of the relation schemas of the two entity types.
M ANAGES(SSN, Dname, Start date) with
PK: {SSN} or {Dname}
FKs: [SSN]⊆E MPLOYEE[SSN] and [Dname]⊆D EPARTMENT[Name]
Can we still merge them into a single relation using previous approaches?
We cannot; otherwise what would be the primary key for the merged relation
schema?
Step 4: Binary 1:N Relationship Types
For each 1:N relationship type R, extend the relation schema of the
N-side entity type by the attributes of R and the PK of the 1-side entity
type, where
PK: still the PK of the N-side entity type
FK: references the PK of the 1-side entity type

S TUDENT(SSN, Name, Number, DoB, address, phone, major dept,


major name) with
PK: {SSN}
FK: [major dept]⊆D EPARTMENT[dept code]
Step 5: Binary M:N (N:N) Relationship Types
For each M:N (N:N) relationship type R, create a relation schema with the
attributes of R plus the PKs of the participating entity types, where
PK: the combination of the PKs of the participating entity types
FKs: references the PKs of the participating entity types

G RADE REPORT (ssn, section num, course num, letter grade, final mark)
PK: {ssn, section num, course num}
FK: [ssn] ⊆ S TUDENT[ssn]
FK: [section num, course num] ⊆ S ECTION[section num, course num].
Step 6: Multi-valued Attributes
For each multi-valued attribute A, create a relation schema with an
attribute corresponding to A plus the PK of the entity/relationship type that
has A as an attribute, where
PK: the combination of A and the PK of the entity/relationship type
that has A
FK: references the PK of the entity/relationship type that has A

E MPLOYEE ADDRESS (SSN, Address) with


PK: {SSN, Address}
FK: [SSN]⊆E MPLOYEE[SSN]
ER-to-Relations Algorithm (Recall)

The algorithm to first convert the basic ER model into relations, and then
convert superclass/subclass from the EER model into relations.

Step 1: Mapping of Regular Entity Types


Step 2: Mapping of Weak Entity Types
Step 3: Mapping of Binary 1:1 Relationship Types
Foreign key approach
Merged relation approach
Cross-reference approach
Step 4: Mapping of Binary 1:N Relationship Types
Step 5: Mapping of Binary M:N Relationship Types
Step 6: Mapping of Multi-valued Attributes
Step 7: Mapping of N-ary Relationship Types
Step 8: Mapping of Superclass/Subclass
(credit cookie) The first paper in ACM Transactions on Database
Systems (TODS)

Title: The entity-relationship model — toward a unified view of data


Author: Peter Pin-Shan Chen (Massachusetts Institute of Technology)
Journal: ACM Transactions on Database Systems (TODS)
Volume 1 Issue 1, Pages 9-36, 1976
This is one of the selected papers from the 1st international conference on
very large data bases (VLDB), 1975.

You might also like