Database Design & System Analysis
Database System
Databases
Database
A database is a collection of data, typically describing the activities of one or more related organizations. For example, a university database might contain information about the following:
Entities such as students, faculty, courses, and classrooms.
Relationships between entities, such as students' enrollment in courses, faculty teaching courses, and the use of rooms for courses.
Databases are useful
Many computing applications deal with large amounts of information. Database systems give a set of tools for storing, searching and managing this information.
Introduction
Data: refers to what is actually stored in the database.
Field: Group of characters with a specific meaning
Record: Logically connected fields that describe a person, place, or thing
File: Collection of related records
Information: refers to the meaning of that data as understood by some user.
Single-User System: a system in which at most one user can access the database at any given time.
Multi-User System: a system in which many users can access the database at the same time.
A major objective of Multi-User Systems is to allow each user to behave as if he or she were working with a single-user system instead.
The data in the database will be both integrated and shared.
Database Applications:
Banking: all transactions
Airlines: reservations, schedules
Universities: registration, grades
Sales: customers, products, purchases
Manufacturing: production, inventory, orders, supply chain
Human resources: employee records, salaries, tax deductions
Databases touch all aspects of our lives
[Figure: end users, application programs, the DBMS, and the database]
Database Models
Collection of logical constructs used to represent data structure and relationships within the database
Data Models
A data model is a collection of concepts for describing data. A schema is a description of a particular collection of data, using the given data model. The relational model of data is the most widely used model today.
Main concept: relation, basically a table with rows and columns. Every relation has a schema, which describes the columns, or fields.
Levels of Abstraction
Many views, single conceptual (logical) schema and physical schema.
Views describe how users see the data.
Conceptual schema defines logical structure.
Physical schema describes the files and indexes used.
[Figure: View 1, View 2 and View 3 above the Conceptual Schema, which is above the Physical Schema]
Conceptual schema:
Students (sid: string, name: string, login: string, age: integer, gpa: real)
Courses (cid: string, cname: string, credits: integer)
Enrolled (sid: string, cid: string, grade: string)
Physical schema:
Relations stored as unordered files.
Index on first column of Students.
External schema (view):
Course_info (cid: string, enrollment: integer)
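The three levels can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the index name idx_students_sid and the sample Enrolled rows are invented, and SQLite types (TEXT, INTEGER, REAL) stand in for string, integer and real.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Conceptual schema: the three relations.
CREATE TABLE Students (sid TEXT, name TEXT, login TEXT, age INTEGER, gpa REAL);
CREATE TABLE Courses  (cid TEXT, cname TEXT, credits INTEGER);
CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade TEXT);

-- Physical schema: an index on the first column of Students.
CREATE INDEX idx_students_sid ON Students (sid);

-- External schema: a view exposing only per-course enrollment counts.
CREATE VIEW Course_info AS
    SELECT cid, COUNT(*) AS enrollment FROM Enrolled GROUP BY cid;
""")

# Hypothetical sample rows, invented for the demonstration.
conn.executemany("INSERT INTO Enrolled VALUES (?, ?, ?)",
                 [("s1", "c1", "A"), ("s2", "c1", "B"), ("s1", "c2", "A")])

course_info = conn.execute(
    "SELECT cid, enrollment FROM Course_info ORDER BY cid").fetchall()
print(course_info)
```

A user of Course_info never sees grades or student ids, and dropping the index would change performance but not the result.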
Entity
A person, place, object, event or concept in the user environment about which the organization wishes to maintain data. Represented by a rectangle in E-R diagrams.
Entity Type
A collection of entities that share common properties or characteristics
Attribute
A named property or characteristic of an entity that is of interest to an organization
Identifier
A candidate key that has been selected as the unique identifying characteristic for an entity type
1. Avoid using intelligent keys
Notation Guide
[Figure: ER diagram symbols for each of the following]
ENTITY TYPE
WEAK ENTITY TYPE
RELATIONSHIP TYPE
ATTRIBUTE
KEY ATTRIBUTE (name underlined)
TOTAL PARTICIPATION OF E2 IN R
CARDINALITY RATIO 1:N FOR E1:E2 IN R
STRUCTURAL CONSTRAINT (min, max) ON PARTICIPATION OF E IN R (Alternative Notation)
ER Diagram Basics
[Figure: example ER diagram with callouts for Entity, Relationship and Attributes; labels shown: Store, sname, Locations, manager, Keeps, qty, Product, pname, price, descrip]
Real-world object distinguishable from other objects (e.g. a student, car, job, subject, building ...)
An entity is described using a set of attributes
In the Company database, an employee's car is of lesser importance
In the Department of Transportation's registration database, cars may be the most important concept
In both cases, cars will be represented as entities, but with different levels of detail
Entity Sets
A collection of similar entities (e.g. all employees)
All entities in an entity set have the same set of attributes
Each entity set has a key
Each attribute has a domain
Can map entity set to a relation easily
EMPLOYEES
SSN          NAME   SAL
321-23-3241  Kim    23,000
645-56-7895  Jones  45,000
Entity Type
Defines a set of entities that have the same attributes (e.g. EMPLOYEE)
Each Entity Type is described by its NAME and attributes
The Entity Type describes the Schema or Intension for a set of entities
The collection of all entities of a particular entity type at a given point in time is called the Entity Set or Extension of an Entity Type
Entity Type and Entity Set are customarily referred to by the same name
EMPLOYEE
Notation
Attributes
Notation
Key Attributes
Value Sets of Attributes
Null Valued Attributes
Attribute Types
Composite Vs. Simple Attributes
Single-valued Vs. Multi-valued Attributes
Derived Vs. Stored Attributes
EMPLOYEE
An attribute in the relational model is always single valued - Values are atomic!
[Figure: attribute examples; labels shown: M-sal, Y-sal, and Item with qty and price]
Representing Attributes
Parentheses ( ) for composite attributes
Braces { } for multi-valued attributes
Assume a person can have more than one residence and each residence can have multiple telephones:
{AddressPhone ({ Phone (AreaCode, PhoneNum) }, Address (StreetAddress (Number, Street, AptNo), City, State, PostalCode) ) }
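As a rough sketch of what the nested notation means, the structure can be written as Python data, with a list for each multi-valued { } attribute and a dict for each composite ( ) attribute. All values are hypothetical, and flatten_phones is an invented helper showing how such a structure reduces to atomic, single-valued rows.

```python
# One person entity; every value here is made up for illustration.
person = {
    "AddressPhone": [            # { } multi-valued: a list of residences
        {
            "Phone": [           # { } multi-valued: several phones per residence
                {"AreaCode": "020", "PhoneNum": "5550101"},   # ( ) composite
                {"AreaCode": "020", "PhoneNum": "5550102"},
            ],
            "Address": {         # ( ) composite
                "StreetAddress": {"Number": "12", "Street": "High St", "AptNo": "3"},
                "City": "Stratford", "State": "n/a", "PostalCode": "E15",
            },
        },
    ]
}

def flatten_phones(p):
    """Return one atomic (city, area_code, phone_num) row per telephone."""
    return [
        (res["Address"]["City"], ph["AreaCode"], ph["PhoneNum"])
        for res in p["AddressPhone"]
        for ph in res["Phone"]
    ]

phone_rows = flatten_phones(person)
print(phone_rows)
```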
Key Definitions
Primary Key:
One attribute (or a minimal combination of attributes, known as a composite key) whose value uniquely identifies a complete record (one row of data) within an entity.
Foreign Key
A copy of a primary key that exists in another entity for the purpose of forming a relationship between the entities involved.
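A minimal sketch of both kinds of key, using Python's sqlite3 module. The Department and Staff tables are assumptions modelled on the staff example used later in these notes, and try_insert is an invented helper. Note that SQLite only enforces foreign keys once PRAGMA foreign_keys is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE Department (dept INTEGER PRIMARY KEY, dname TEXT);
CREATE TABLE Staff (
    staffNo TEXT PRIMARY KEY,                      -- primary key
    job     TEXT,
    dept    INTEGER REFERENCES Department (dept)   -- foreign key
);
INSERT INTO Department VALUES (10, 'Sales');
INSERT INTO Staff VALUES ('SL10', 'Salesman', 10);
""")

def try_insert(sql):
    """Run an INSERT and report whether the database accepted it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"

# Same staffNo twice: violates the primary key.
duplicate_key = try_insert("INSERT INTO Staff VALUES ('SL10', 'Clerk', 10)")
# dept 99 does not exist in Department: violates the foreign key.
dangling_ref = try_insert("INSERT INTO Staff VALUES ('SA51', 'Manager', 99)")
print(duplicate_key, dangling_ref)
```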
Degrees of a Relationship
One-to-one (1:1): Man, Woman
One-to-many (1:n): Customer, Order
Many-to-many (n:m): Course, Subject
NOTE: Every many-to-many relationship consists of two one-to-many relationships working in opposite directions
Person
Car
A person must own at least one car. A car doesn't have to be owned by a person, but if it is, it is owned by at least one person. A person may own many cars.
optional relationship
mandatory relationship
A Sample ER Diagram
Student
Course
Subject
Entity A
Entity B
Entity C
Relationship
Add some attributes to entities here
Courses may have another course as pre-requisite
Relationship Degree
The degree of a relationship type is the number of participating entity types
2 entities: Binary Relationship
3 entities: Ternary Relationship
n entities: N-ary Relationship
Same entity type could participate in multiple relationship types
[Figure: examples labelled Binary, Multiple and Ternary; labels shown: Employees, Works_In, Assigned_to, Departments, Supplier, Supply, Project, Part, and Movies Stars-in Stars with attributes Name and Address]
Kinds of Constraints
What kind of constraints can be defined in the ER Model?
Cardinality Constraints
Participation Constraints
Together called Structural Constraints
Constraints are represented by specific notation in the ER diagram
Departments
Works_In
Employees
One-to-Many
A film is directed by at most one director.
A director can direct any number of films.
[Figure: Director (id, name) Directed Film (title)]
Many-to-Many
A film is directed by any number of directors.
A director can direct any number of films.
[Figure: Director (id, name) Directed Film (title)]
One-to-One
A film is directed by at most one director.
A director can direct at most one film.
[Figure: Director (id, name) Directed Film (title)]
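One common way to carry these three cardinalities into tables is sketched below with Python's sqlite3 module. It is an illustration, not the only mapping: the table names Film_1N, Film_11 and Film_MN and the sample rows are invented so that the three variants can sit side by side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Director (id INTEGER PRIMARY KEY, name TEXT);

-- One-to-many: each film stores at most one director id.
CREATE TABLE Film_1N (title TEXT PRIMARY KEY,
                      director_id INTEGER REFERENCES Director (id));

-- One-to-one: as above, but UNIQUE stops a director appearing on two films.
CREATE TABLE Film_11 (title TEXT PRIMARY KEY,
                      director_id INTEGER UNIQUE REFERENCES Director (id));

-- Many-to-many: a separate Directed table holds (director, film) pairs.
CREATE TABLE Film_MN (title TEXT PRIMARY KEY);
CREATE TABLE Directed (director_id INTEGER REFERENCES Director (id),
                       title TEXT REFERENCES Film_MN (title),
                       PRIMARY KEY (director_id, title));

INSERT INTO Director VALUES (1, 'Director A');
INSERT INTO Film_1N VALUES ('Film X', 1), ('Film Y', 1);  -- allowed under 1:N
INSERT INTO Film_11 VALUES ('Film X', 1);
""")

films_by_director = conn.execute(
    "SELECT COUNT(*) FROM Film_1N WHERE director_id = 1").fetchone()[0]

# Under 1:1 a second film for the same director must be refused.
try:
    conn.execute("INSERT INTO Film_11 VALUES ('Film Y', 1)")
    second_film_allowed = True
except sqlite3.IntegrityError:
    second_film_allowed = False

print(films_by_director, second_film_allowed)
```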
Another Example
Where would you put the arrow?
[Figure: Person (id, name, age) with relationship FatherOf and roles father and child]
Normalisation
Introduction
Normalization: the process of efficiently organizing data in a database.
Benefits of Normalization
Less storage space
Quicker updates
Less data inconsistency
Clearer data relationships
Easier to add data
Flexible Structure
Insert Anomaly: We cannot insert a dept without inserting a member of staff that works in that department.
Update Anomaly: We could change the name of the dept that SA51 works in without simultaneously changing the dept that DS40 works in.
Deletion Anomaly: By removing employee SL10 we have removed all information pertaining to the Sales dept.
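The update and deletion anomalies can be reproduced in a few lines. The sketch below loads the staff rows from the repeating-groups example into a single unnormalised table. The table name StaffDept and the replacement name 'Finance' are assumptions made for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE StaffDept "
             "(staffNo TEXT, job TEXT, dept INTEGER, dname TEXT, city TEXT)")
conn.executemany("INSERT INTO StaffDept VALUES (?, ?, ?, ?, ?)", [
    ("SL10", "Salesman", 10, "Sales", "Stratford"),
    ("SA51", "Manager", 20, "Accounts", "Barking"),
    ("DS40", "Clerk", 20, "Accounts", "Barking"),
    ("OS45", "Clerk", 30, "Operations", "Barking"),
])

def dept_names():
    """All department names the table still knows about."""
    return [r[0] for r in conn.execute(
        "SELECT DISTINCT dname FROM StaffDept ORDER BY dname")]

# Update anomaly: renaming dept 20 on SA51's row only leaves DS40 with the old name.
conn.execute("UPDATE StaffDept SET dname = 'Finance' WHERE staffNo = 'SA51'")
names_for_dept_20 = sorted(r[0] for r in conn.execute(
    "SELECT dname FROM StaffDept WHERE dept = 20"))

# Deletion anomaly: removing SL10 also removes the only record of the Sales dept.
conn.execute("DELETE FROM StaffDept WHERE staffNo = 'SL10'")
sales_still_known = "Sales" in dept_names()

print(names_for_dept_20, sales_still_known)
```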
Repeating Groups
A repeating group is an attribute (or set of attributes) that can have more than one value for a primary key value.
Example: We have the following relation that contains staff and department details and a list of telephone contact numbers for each member of staff.
staffNo  job       dept  dname       city       contact number
SL10     Salesman  10    Sales       Stratford  018111777, 018111888, 079311122
SA51     Manager   20    Accounts    Barking    017111777
DS40     Clerk     20    Accounts    Barking
OS45     Clerk     30    Operations  Barking    079311555
Repeating Groups are not allowed in a relational design, since all attributes have to be atomic - i.e., there can only be one value per cell in a table!
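Removing a repeating group amounts to moving the repeated values into their own rows, keyed by the owning record. A small sketch in plain Python, using the SL10 row and its three contact numbers; the function name remove_repeating_group is invented for the example.

```python
# The SL10 row, with its repeating group of contact numbers.
unnormalised = [
    ("SL10", "Salesman", ["018111777", "018111888", "079311122"]),
]

def remove_repeating_group(rows):
    """Split each staff row into a staff row plus one contact row per number."""
    staff, contacts = [], []
    for staff_no, job, numbers in rows:
        staff.append((staff_no, job))
        contacts.extend((staff_no, number) for number in numbers)
    return staff, contacts

staff, contacts = remove_repeating_group(unnormalised)
print(staff)
print(contacts)
```

Every cell in both resulting lists now holds exactly one value, and staffNo links each contact row back to its member of staff.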
Functional Dependency
Formal Definition: Attribute B is functionally dependent upon attribute A (or a collection of attributes) if a value of A determines a single value of attribute B at any one time.
Formal Notation: A → B. This should be read as "A determines B" or "B is functionally dependent on A". A is called the determinant and B is called the object of the determinant.
Example:
staffNo  job       dept  dname
SL10     Salesman  10    Sales
SA51     Manager   20    Accounts
DS40     Clerk     20    Accounts
OS45     Clerk     30    Operations
Functional Dependencies
staffNo → job
staffNo → dept
staffNo → dname
dept → dname
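A functional dependency can be checked against sample rows: if two rows agree on the determinant but differ on the dependent attribute, the dependency does not hold. The helper below (the name holds is invented) is a sketch of that test. Sample data can only refute a dependency; whether it really holds is a statement about the business rules, not about four rows.

```python
def holds(rows, determinant, dependent):
    """Return True if `determinant -> dependent` is not contradicted by `rows`."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependent)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True

staff = [
    {"staffNo": "SL10", "job": "Salesman", "dept": 10, "dname": "Sales"},
    {"staffNo": "SA51", "job": "Manager",  "dept": 20, "dname": "Accounts"},
    {"staffNo": "DS40", "job": "Clerk",    "dept": 20, "dname": "Accounts"},
    {"staffNo": "OS45", "job": "Clerk",    "dept": 30, "dname": "Operations"},
]

dept_determines_dname = holds(staff, ["dept"], ["dname"])
job_determines_dept = holds(staff, ["job"], ["dept"])
print(dept_determines_dname, job_determines_dept)
```

Here job does not determine dept, because the two clerks work in departments 20 and 30.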
Dependencies: Definitions
Partial Dependency: when a non-key attribute is determined by a part, but not the whole, of a COMPOSITE primary key.
CUSTOMER
Cust_ID  Name
101      AT&T
101      AT&T
125      Cisco
Partial Dependency
Transitive Dependency: when a non-key attribute determines another non-key attribute.
EMPLOYEE
Emp_ID  F_Name  L_Name  Dept_ID  Dept_Name
111     Mary    Jones   1        Acct
122     Sarah   Smith   2        Mktg
Here Dept_Name is determined by Dept_ID, which is itself a non-key attribute.
Example: Table 1
Title                      Author1               Author2         ISBN        Subject           Pages  Publisher
Database System Concepts   Abraham Silberschatz  Henry F. Korth  0072958863  MySQL, Computers  1168   McGraw-Hill
Operating System Concepts  Abraham Silberschatz  Henry F. Korth  0471694665  Computers         944    McGraw-Hill
Table 1 problems
First, this table is not very efficient with storage.
Second, this design does not protect data integrity.
Third, this table does not scale well.
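Title                      Author                ISBN        Subject    Pages  Publisher
Database System Concepts   Abraham Silberschatz  0072958863  MySQL      1168   McGraw-Hill
Database System Concepts   Henry F. Korth        0072958863  Computers  1168   McGraw-Hill
Operating System Concepts  Henry F. Korth        0471694665  Computers  944    McGraw-Hill
Operating System Concepts  Abraham Silberschatz  0471694665  Computers  944    McGraw-Hill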
We now have two rows for a single book. Additionally, we would be violating the Second Normal Form. A better solution to our problem would be to separate the data into separate tables, an Author table and a Subject table, to store our information, removing that information from the Book table:
Subject Table
Subject_ID  Subject
1           MySQL
2           Computers
Author Table
Author_ID  Last Name     First Name
1          Silberschatz  Abraham
2          Korth         Henry
Book Table
ISBN        Title                      Pages  Publisher
0072958863  Database System Concepts   1168   McGraw-Hill
0471694665  Operating System Concepts  944    McGraw-Hill
Each table has a primary key, used for joining tables together when querying the data. A primary key value must be unique within the table (no two books can have the same ISBN number), and a primary key is also an index, which speeds up data retrieval based on the primary key. Now to define relationships between the tables:
Relationships
Book_Author Table
ISBN        Author_ID
0072958863  1
0072958863  2
0471694665  1
0471694665  2
Book_Subject Table
ISBN        Subject_ID
0072958863  1
0072958863  2
0471694665  2
2NF Table
Publisher Table
Publisher_ID  Publisher Name
1             McGraw-Hill
Book Table
ISBN        Title                      Pages  Publisher_ID
0072958863  Database System Concepts   1168   1
0471694665  Operating System Concepts  944    1
2NF
Here we have a one-to-many relationship between the book table and the publisher. A book has only one publisher, and a publisher will publish many books. When we have a one-to-many relationship, we place a foreign key in the Book Table, pointing to the primary key of the Publisher Table.
The other requirement for Second Normal Form is that you cannot have any data in a table with a composite key that does not relate to all portions of the composite key.
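The normalised design can be tried out with Python's sqlite3 module. This sketch builds the Publisher, Author, Book and Book_Author tables from the example rows and joins them back together. Column names without spaces (LastName, Name) are an adaptation, and the Subject tables are left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Publisher (Publisher_ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Author (Author_ID INTEGER PRIMARY KEY, LastName TEXT, FirstName TEXT);
CREATE TABLE Book (ISBN TEXT PRIMARY KEY, Title TEXT, Pages INTEGER,
                   Publisher_ID INTEGER REFERENCES Publisher (Publisher_ID));
-- Junction table: its composite key resolves the many-to-many relationship.
CREATE TABLE Book_Author (ISBN TEXT REFERENCES Book (ISBN),
                          Author_ID INTEGER REFERENCES Author (Author_ID),
                          PRIMARY KEY (ISBN, Author_ID));

INSERT INTO Publisher VALUES (1, 'McGraw-Hill');
INSERT INTO Author VALUES (1, 'Silberschatz', 'Abraham'), (2, 'Korth', 'Henry');
INSERT INTO Book VALUES ('0072958863', 'Database System Concepts', 1168, 1),
                        ('0471694665', 'Operating System Concepts', 944, 1);
INSERT INTO Book_Author VALUES ('0072958863', 1), ('0072958863', 2),
                               ('0471694665', 1), ('0471694665', 2);
""")

authors_of_db_book = [row[0] for row in conn.execute("""
    SELECT a.LastName
    FROM Book b
    JOIN Book_Author ba ON ba.ISBN = b.ISBN
    JOIN Author a ON a.Author_ID = ba.Author_ID
    WHERE b.Title = 'Database System Concepts'
    ORDER BY a.LastName
""")]
publisher_rows = conn.execute("SELECT COUNT(*) FROM Publisher").fetchall()[0][0]
print(authors_of_db_book, publisher_rows)
```

The publisher name is now stored once, however many books refer to it.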
Stages of Normalisation
Unnormalised (UDF)
Remove repeating groups
DISTRIBUTED DATABASES
WHAT IS A DISTRIBUTED DATABASE?
A distributed database stores a logically related database across physically independent sites.
ADVANTAGES
Expandability
It is easier to accommodate increasing the size of the global (logical) database.
Local autonomy
The database is brought nearer to its users. This can effect a cultural change, as it potentially allows greater control over local data.
HORIZONTAL DATA FRAGMENTATION
e.g., an accounts relation (ACCOUNT, CUSTOMER, BRANCH, BALANCE), of which the CUSTOMER and BRANCH values are:
CUSTOMER  BRANCH
JONES     STRATFORD
GRAY      BARKING
SMITH     STRATFORD
GREEN     BARKING
ONO       BARKING
KHAN      STRATFORD
[Figure: horizontal fragments of the accounts relation; column labels shown: ACCT NO., BALANCE]
VERTICAL DATA FRAGMENTATION
[Figure: a student relation fragmented vertically; labels shown: S#, NAME, SITE, PHONE NO., LOGIN, PASSWORD, STUDENT ADMINISTRATION; sample values 200, 324, 456, JON200T]
Id   Name  Sal  Dept
100  A     10K  D1
200  B     20K  D2
300  C     30K  D3
Horizontal Fragmentation
Vertical Fragmentation
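Both kinds of fragmentation can be sketched on the Id, Name, Sal, Dept table in plain Python. The split points (Id <= 200 for rows, Id and Name versus Sal and Dept for columns) are arbitrary choices for the illustration, as are the site1 and site2 names.

```python
employees = [
    {"Id": 100, "Name": "A", "Sal": "10K", "Dept": "D1"},
    {"Id": 200, "Name": "B", "Sal": "20K", "Dept": "D2"},
    {"Id": 300, "Name": "C", "Sal": "30K", "Dept": "D3"},
]

# Horizontal fragmentation: each site holds a subset of the ROWS.
site1_rows = [r for r in employees if r["Id"] <= 200]
site2_rows = [r for r in employees if r["Id"] > 200]
rebuilt_from_rows = site1_rows + site2_rows           # union of the fragments

# Vertical fragmentation: each site holds a subset of the COLUMNS,
# and every fragment repeats the key (Id) so rows can be joined again.
site1_cols = [{"Id": r["Id"], "Name": r["Name"]} for r in employees]
site2_cols = [{"Id": r["Id"], "Sal": r["Sal"], "Dept": r["Dept"]} for r in employees]
by_id = {r["Id"]: r for r in site2_cols}
rebuilt_from_cols = [{**left, **by_id[left["Id"]]} for left in site1_cols]

print(rebuilt_from_rows == employees, rebuilt_from_cols == employees)
```

Horizontal fragments are recombined with a union, vertical fragments with a join on the key.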
Introduction to SQL
SQL is a standard language for accessing and manipulating databases.
What is SQL?
SQL stands for Structured Query Language
SQL lets you access and manipulate databases
SQL is an ANSI (American National Standards Institute) standard
SQL can execute queries against a database
SQL can retrieve data from a database
SQL can insert records in a database
SQL can update records in a database
SQL can delete records from a database
SQL can create new databases
SQL can create new tables in a database
SQL can create stored procedures in a database
SQL can create views in a database
SQL can set permissions on tables, procedures, and views
The SELECT statement is used to select data from a database. The result is stored in a result table, called the result-set.
SQL SELECT Syntax
SELECT column_name(s) FROM table_name
and
SELECT * FROM table_name
Now we want to select the content of the columns named "LastName" and "FirstName" from the "Persons" table. We use the following SELECT statement:
SELECT LastName, FirstName FROM Persons
The result-set will look like this:
LastName   FirstName
Hansen     Ola
Svendson   Tove
Pettersen  Kari
SELECT * Example
Now we want to select all the columns from the "Persons" table. We use the following SELECT statement:
SELECT * FROM Persons
Tip: The asterisk (*) is a quick way of selecting all columns!
The result-set will look like this:
P_Id  LastName   FirstName  Address       City
1     Hansen     Ola        Timoteivn 10  Sandnes
2     Svendson   Tove       Borgvn 23     Sandnes
3     Pettersen  Kari       Storgt 20     Stavanger
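The two SELECT forms can be run against an in-memory SQLite database from Python. This is a sketch: the column types are assumptions, since the notes give only the column names and values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Persons "
             "(P_Id INTEGER, LastName TEXT, FirstName TEXT, Address TEXT, City TEXT)")
conn.executemany("INSERT INTO Persons VALUES (?, ?, ?, ?, ?)", [
    (1, "Hansen", "Ola", "Timoteivn 10", "Sandnes"),
    (2, "Svendson", "Tove", "Borgvn 23", "Sandnes"),
    (3, "Pettersen", "Kari", "Storgt 20", "Stavanger"),
])

# Selected columns only.
names = conn.execute("SELECT LastName, FirstName FROM Persons").fetchall()
# All columns.
everything = conn.execute("SELECT * FROM Persons").fetchall()

for row in everything:
    print(row)
```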
Now we want to select only the persons living in the city "Sandnes" from the table above.
We use the following SELECT statement: SELECT * FROM Persons WHERE City='Sandnes'
SQL uses single quotes around text values (most database systems will also accept double quotes). Numeric values, however, should not be enclosed in quotes.
For text values:
This is correct: SELECT * FROM Persons WHERE FirstName='Tove'
This is wrong: SELECT * FROM Persons WHERE FirstName=Tove
For numeric values:
This is correct: SELECT * FROM Persons WHERE Year=1965
This is wrong: SELECT * FROM Persons WHERE Year='1965'
Operators that can be used in the WHERE clause: =, <>, >, <, >=, <=, BETWEEN, LIKE, IN
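Now we want to select only the persons with the first name equal to "Tove" AND the last name equal to "Svendson". We use the following SELECT statement:
SELECT * FROM Persons WHERE FirstName='Tove' AND LastName='Svendson'
The result-set will look like this:
P_Id  LastName  FirstName  Address    City
2     Svendson  Tove       Borgvn 23  Sandnes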
P_Id  LastName  FirstName  Address       City
1     Hansen    Ola        Timoteivn 10  Sandnes
2     Svendson  Tove       Borgvn 23     Sandnes
ORDER BY Example
P_Id  LastName   FirstName  Address       City
1     Hansen     Ola        Timoteivn 10  Sandnes
2     Svendson   Tove       Borgvn 23     Sandnes
3     Pettersen  Kari       Storgt 20     Stavanger
4     Nilsen     Tom        Vingvn 23     Stavanger
Now we want to select all the persons from the table above, sorted by their last name. We use the following SELECT statement:
SELECT * FROM Persons ORDER BY LastName
The result-set will look like this:
P_Id  LastName   FirstName  Address       City
1     Hansen     Ola        Timoteivn 10  Sandnes
4     Nilsen     Tom        Vingvn 23     Stavanger
3     Pettersen  Kari       Storgt 20     Stavanger
2     Svendson   Tove       Borgvn 23     Sandnes
ORDER BY DESC Example
Now we want to select all the persons from the table above, sorted descending by their last name. We use the following SELECT statement:
SELECT * FROM Persons ORDER BY LastName DESC
The result-set will look like this:
P_Id  LastName   FirstName  Address       City
2     Svendson   Tove       Borgvn 23     Sandnes
3     Pettersen  Kari       Storgt 20     Stavanger
4     Nilsen     Tom        Vingvn 23     Stavanger
1     Hansen     Ola        Timoteivn 10  Sandnes
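The WHERE and ORDER BY examples can be checked with Python's sqlite3 module. In this sketch the column types are assumed, and the city value is passed as a ? parameter rather than pasted into the SQL string, which is the usual way to supply values from a program.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Persons "
             "(P_Id INTEGER, LastName TEXT, FirstName TEXT, Address TEXT, City TEXT)")
conn.executemany("INSERT INTO Persons VALUES (?, ?, ?, ?, ?)", [
    (1, "Hansen", "Ola", "Timoteivn 10", "Sandnes"),
    (2, "Svendson", "Tove", "Borgvn 23", "Sandnes"),
    (3, "Pettersen", "Kari", "Storgt 20", "Stavanger"),
    (4, "Nilsen", "Tom", "Vingvn 23", "Stavanger"),
])

def ids(sql, params=()):
    """Run a query and return just the P_Id of each row, in result order."""
    return [row[0] for row in conn.execute(sql, params)]

in_sandnes = ids("SELECT P_Id FROM Persons WHERE City = ? ORDER BY P_Id", ("Sandnes",))
tove_svendson = ids("SELECT P_Id FROM Persons "
                    "WHERE FirstName = 'Tove' AND LastName = 'Svendson'")
ascending = ids("SELECT P_Id FROM Persons ORDER BY LastName")
descending = ids("SELECT P_Id FROM Persons ORDER BY LastName DESC")

print(in_sandnes, tove_svendson, ascending, descending)
```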
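Now we want to insert a new row in the "Persons" table. We use the following SQL statement:
INSERT INTO Persons VALUES (4, 'Nilsen', 'Johan', 'Bakken 2', 'Stavanger')
The "Persons" table will now look like this:
P_Id  LastName   FirstName  Address       City
1     Hansen     Ola        Timoteivn 10  Sandnes
2     Svendson   Tove       Borgvn 23     Sandnes
3     Pettersen  Kari       Storgt 20     Stavanger
4     Nilsen     Johan      Bakken 2      Stavanger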
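A row can also be inserted with values for only some of the columns, for example:
INSERT INTO Persons (P_Id, LastName, FirstName) VALUES (5, 'Tjessem', 'Jakob')
The "Persons" table will then look like this:
P_Id  LastName   FirstName  Address       City
1     Hansen     Ola        Timoteivn 10  Sandnes
2     Svendson   Tove       Borgvn 23     Sandnes
3     Pettersen  Kari       Storgt 20     Stavanger
4     Nilsen     Johan      Bakken 2      Stavanger
5     Tjessem    Jakob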
Now we want to update the person "Tjessem, Jakob" in the "Persons" table. We use the following SQL statement:
UPDATE Persons SET Address='Nissestien 67', City='Sandnes' WHERE LastName='Tjessem' AND FirstName='Jakob'
The "Persons" table will now look like this:
P_Id  LastName   FirstName  Address        City
1     Hansen     Ola        Timoteivn 10   Sandnes
2     Svendson   Tove       Borgvn 23      Sandnes
3     Pettersen  Kari       Storgt 20      Stavanger
4     Nilsen     Johan      Bakken 2       Stavanger
5     Tjessem    Jakob      Nissestien 67  Sandnes
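SQL UPDATE Warning
Be careful when updating records. If we had omitted the WHERE clause in the example above, like this:
UPDATE Persons SET Address='Nissestien 67', City='Sandnes'
then every row would have been updated, and the "Persons" table would have looked like this:
P_Id  LastName   FirstName  Address        City
1     Hansen     Ola        Nissestien 67  Sandnes
2     Svendson   Tove       Nissestien 67  Sandnes
3     Pettersen  Kari       Nissestien 67  Sandnes
4     Nilsen     Johan      Nissestien 67  Sandnes
5     Tjessem    Jakob      Nissestien 67  Sandnes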
Note: Notice the WHERE clause in the DELETE syntax. The WHERE clause specifies which record or records should be deleted. If you omit the WHERE clause, all records will be deleted!
P_Id  LastName   FirstName  Address        City
1     Hansen     Ola        Timoteivn 10   Sandnes
2     Svendson   Tove       Borgvn 23      Sandnes
3     Pettersen  Kari       Storgt 20      Stavanger
4     Nilsen     Johan      Bakken 2       Stavanger
5     Tjessem    Jakob      Nissestien 67  Sandnes
Now we want to delete the person "Tjessem, Jakob" in the "Persons" table. We use the following SQL statement:
DELETE FROM Persons WHERE LastName='Tjessem' AND FirstName='Jakob'
The "Persons" table will now look like this:
P_Id  LastName   FirstName  Address       City
1     Hansen     Ola        Timoteivn 10  Sandnes
2     Svendson   Tove       Borgvn 23     Sandnes
3     Pettersen  Kari       Storgt 20     Stavanger
4     Nilsen     Johan      Bakken 2      Stavanger
Delete All Rows
It is possible to delete all rows in a table without deleting the table. This means that the table structure, attributes, and indexes will be intact:
DELETE FROM table_name
(Some systems, such as Microsoft Access, also accept DELETE * FROM table_name, but this form is not standard SQL.)
Note: Be very careful when deleting records. Once the change has been committed, you cannot undo this statement!
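The INSERT, UPDATE and DELETE statements can be tried end to end with Python's sqlite3 module. This is a sketch with assumed column types and an invented helper count_rows. It also shows that, in SQLite at least, a DELETE without a WHERE clause can still be rolled back as long as the transaction has not been committed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Persons "
             "(P_Id INTEGER, LastName TEXT, FirstName TEXT, Address TEXT, City TEXT)")
conn.executemany("INSERT INTO Persons VALUES (?, ?, ?, ?, ?)", [
    (1, "Hansen", "Ola", "Timoteivn 10", "Sandnes"),
    (2, "Svendson", "Tove", "Borgvn 23", "Sandnes"),
    (3, "Pettersen", "Kari", "Storgt 20", "Stavanger"),
])

def count_rows():
    return conn.execute("SELECT COUNT(*) FROM Persons").fetchall()[0][0]

# INSERT: a full row, then a row with only some columns (the rest become NULL).
conn.execute("INSERT INTO Persons VALUES (4, 'Nilsen', 'Johan', 'Bakken 2', 'Stavanger')")
conn.execute("INSERT INTO Persons (P_Id, LastName, FirstName) VALUES (5, 'Tjessem', 'Jakob')")

# UPDATE with a WHERE clause touches only the matching row.
updated = conn.execute(
    "UPDATE Persons SET Address='Nissestien 67', City='Sandnes' "
    "WHERE LastName='Tjessem' AND FirstName='Jakob'").rowcount
conn.commit()

# DELETE without WHERE empties the table, but is undone by a rollback.
conn.execute("DELETE FROM Persons")
rows_inside_transaction = count_rows()
conn.rollback()
rows_after_rollback = count_rows()

# DELETE with WHERE removes just one person.
deleted = conn.execute(
    "DELETE FROM Persons WHERE LastName='Tjessem' AND FirstName='Jakob'").rowcount
conn.commit()
remaining = [r[0] for r in conn.execute("SELECT P_Id FROM Persons ORDER BY P_Id")]
print(updated, rows_inside_transaction, rows_after_rollback, deleted, remaining)
```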
The "Orders" table:
O_Id  OrderDate   OrderPrice  Customer
1     2008/11/12  1000        Hansen
2     2008/10/23  1600        Nilsen
3     2008/09/02  700         Hansen
4     2008/09/03  300         Hansen
5     2008/08/30  2000        Jensen
6     2008/10/04  100         Nilsen
Now we want to find the average value of the "OrderPrice" fields. We use the following SQL statement:
SELECT AVG(OrderPrice) AS OrderAverage FROM Orders
The result-set will look like this:
OrderAverage
950
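Now we want to count the number of orders from "Customer Nilsen". We use the following SQL statement:
SELECT COUNT(Customer) AS CustomerNilsen FROM Orders WHERE Customer='Nilsen'
The result of the SQL statement above will be 2, because the customer Nilsen has made 2 orders in total.
SQL COUNT(*) Example
If we omit the WHERE clause, like this:
SELECT COUNT(*) AS NumberOfOrders FROM Orders
The result-set will look like this:
NumberOfOrders
6
which is the total number of rows in the table.
Counting distinct values instead, with SELECT COUNT(DISTINCT Customer) AS NumberOfCustomers FROM Orders, gives 3, which is the number of unique customers (Hansen, Nilsen, and Jensen) in the "Orders" table.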
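The aggregate functions can be checked against the six order rows using Python's sqlite3 module. This is a sketch: the column names O_Id and OrderDate and the column types are assumptions, since the notes name only "OrderPrice", "Customer" and the "Orders" table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders "
             "(O_Id INTEGER, OrderDate TEXT, OrderPrice INTEGER, Customer TEXT)")
conn.executemany("INSERT INTO Orders VALUES (?, ?, ?, ?)", [
    (1, "2008/11/12", 1000, "Hansen"),
    (2, "2008/10/23", 1600, "Nilsen"),
    (3, "2008/09/02", 700, "Hansen"),
    (4, "2008/09/03", 300, "Hansen"),
    (5, "2008/08/30", 2000, "Jensen"),
    (6, "2008/10/04", 100, "Nilsen"),
])

def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchall()[0][0]

order_average = scalar("SELECT AVG(OrderPrice) AS OrderAverage FROM Orders")
nilsen_orders = scalar("SELECT COUNT(Customer) FROM Orders WHERE Customer='Nilsen'")
number_of_orders = scalar("SELECT COUNT(*) AS NumberOfOrders FROM Orders")
unique_customers = scalar("SELECT COUNT(DISTINCT Customer) FROM Orders")

print(order_average, nilsen_orders, number_of_orders, unique_customers)
```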
[Table residue from the INNER JOIN example; values shown: Hansen Ola 24562; Pettersen Kari 77895; Pettersen Kari 44678]
The INNER JOIN keyword returns rows when there is at least one match in both tables. If there are rows in "Persons" that do not have matches in "Orders", those rows will NOT be listed.
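The behaviour described above can be seen in a small sketch with Python's sqlite3 module. The layout of this "Orders" table (columns OrderNo and P_Id) and the assignment of order numbers to persons are assumptions made for the illustration, so only the general pattern should be relied on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Persons (P_Id INTEGER PRIMARY KEY, LastName TEXT, FirstName TEXT);
CREATE TABLE Orders  (OrderNo INTEGER, P_Id INTEGER);

INSERT INTO Persons VALUES (1, 'Hansen', 'Ola'),
                           (2, 'Svendson', 'Tove'),
                           (3, 'Pettersen', 'Kari');
-- Assumed mapping of orders to persons; Svendson has no orders.
INSERT INTO Orders VALUES (24562, 1), (77895, 3), (44678, 3);
""")

joined = conn.execute("""
    SELECT p.LastName, p.FirstName, o.OrderNo
    FROM Persons p
    INNER JOIN Orders o ON p.P_Id = o.P_Id
    ORDER BY p.LastName, o.OrderNo
""").fetchall()

for row in joined:
    print(row)
```

Svendson does not appear in the output because no row in Orders carries P_Id 2.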