
Database Systems

This document provides an overview of database systems and their main components. It discusses the system structure with applications, queries, and the physical database interacting through a database management system. The main components of a DBMS are described including the database manager, query processor, language compilers and file manager. Database users such as application programmers, casual users and administrators are defined. The roles and responsibilities of the database administrator are outlined. Database languages including DDL, DML, and datalog are introduced. Common data models like relational, hierarchical and object-oriented are listed. Key aspects of relational databases such as relations, relational algebra, and SQL are summarized at a high level.

Uploaded by

Ochena Manush

DATABASE SYSTEMS

SYSTEM STRUCTURE

Application programs    Queries    Database schema

Database Management System

Physical Database
MAIN COMPONENTS OF DBMS

• Database manager
• Query processor
• DML precompiler
• DDL compiler
• File manager
• Transaction manager
DATABASE USERS

• Application programmers
• Casual users
• Naive users
• Database administrator
DATABASE ADMINISTRATOR

The DBA is responsible for


• Information content of the database
• Storage structure and access strategy
• Granting of authorisation for data access
• Strategy for back-up and recovery
• Monitoring performance
• Cooperation with users
DATABASE LANGUAGES

• Data Definition Language
• Data Manipulation Language
   – Procedural Languages
   – Declarative Languages
DATA ABSTRACTION

External schema A    External schema B    …    External schema N

Conceptual Schema

Internal Schema
DATA INDEPENDENCE

• Physical data independence


• Logical data independence
DATA MODELS

• Data structures
• Operators
• Integrity constraints
DATA MODELS

• Relational
• Hierarchical
• Network
• Object-oriented
• Object-relational
RELATIONAL DATABASES
RELATIONS
• SCH = {A1, A2, …, An} – set of attributes
• DOM(A1) – domain of A1
• Relation R(A1, A2, …, An) – subset of
DOM(A1) × DOM(A2) × … × DOM(An)
• SCH = {A1, A2, …, An} – schema of
R(A1, A2, …, An)
RELATIONS

• Superkey SK of R(SCH) – a subset of SCH = {A1, A2, …, An} such that for any two distinct tuples t1, t2 ∈ R: t1(SK) ≠ t2(SK)
• Candidate key K of R – a minimal superkey, i.e. no proper subset K' of K is a superkey
• Foreign key A – an attribute of R which forms the primary key of another relation S
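
The superkey and candidate key definitions can be checked mechanically on a concrete relation instance. The sketch below is only an illustration: the function names are hypothetical, the attribute E stands for E#, and keys found this way hold for the given tuples only, not necessarily for the schema in general.

```python
from itertools import combinations

def is_superkey(rows, attrs, key):
    """True if no two distinct tuples agree on all attributes in `key`."""
    idx = [attrs.index(a) for a in key]
    seen = set()
    for row in set(rows):  # a relation is a set, so duplicates are ignored
        proj = tuple(row[i] for i in idx)
        if proj in seen:
            return False
        seen.add(proj)
    return True

def candidate_keys(rows, attrs):
    """All minimal superkeys of this particular relation instance."""
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            if any(set(k) <= set(combo) for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if is_superkey(rows, attrs, combo):
                keys.append(combo)
    return keys

# Three tuples taken from the EMPLOYEES example (Profession omitted).
attrs = ("E", "Ename", "Etown")
rows = [("E1", "Boyle", "Warsaw"),
        ("E2", "Clay", "London"),
        ("E5", "Snell", "London")]

print(is_superkey(rows, attrs, ("Etown",)))
print(candidate_keys(rows, attrs))
```

Etown repeats the value London, so it cannot be a superkey here, while E and Ename each identify the tuples of this small instance.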
EXAMPLE

• PROJECTS (P#, Pname, Ptown, Budget)
• EMPLOYEES (E#, Ename, Etown, Profession)
• CUSTOMERS (C#, Cname, Ctown)
• CONTRACTS (P#, E#, C#, Salary)
PROJECTS

P#   Pname      Ptown    Budget
P1   Visa       London   100 000
P2   Pharmacy   London    60 000
P3   Policy     Paris     30 000
P4   Education  Madrid    80 000
P5   Credit     Oslo      20 000
EMPLOYEES

E#   Ename   Etown    Profession
E1   Boyle   Warsaw   Lawyer
E2   Clay    London   Economist
E3   Harris  Paris    Programmer
E4   Long    Rome     Lawyer
E5   Snell   London   Chemist
E6   Wells   Paris    Economist
CUSTOMERS

C#   Cname    Ctown
C1   Evans    Glasgow
C2   Hayes    Leeds
C3   Jackson  Brussel
C4   Mills    Rome
C5   Robson   London
C6   Smith    Paris
CONTRACTS

P#   E#   C#   Salary
P1   E1   C1   1 000
P1   E1   C2   4 000
P1   E2   C3   3 000
P1   E3   C2   8 000
P2   E1   C1   2 000
P2   E2   C1   7 000
P2   E3   C1   2 000
P2   E4   C1   5 000
P2   E5   C3   5 000
P2   E6   C3   3 000
P3   E1   C4   9 000
P3   E2   C4   6 000
P3   E3   C4   5 000
P3   E4   C4   4 000
P4   E1   C1   4 000
P4   E3   C2   3 000
P4   E4   C4   7 000
P4   E5   C2   6 000
RELATIONAL ALGEBRA

• Union R ∪ S = {t: t ∈ R ∨ t ∈ S}
• Intersection R ∩ S = {t: t ∈ R ∧ t ∈ S}
• Difference R − S = {t: t ∈ R ∧ t ∉ S}
• Product
R(X) × S(Y) =
{t: ∃ r ∈ R ∃ s ∈ S: t(X) = r(X) ∧ t(Y) = s(Y)}
• Rename R(X) = ρ_{R(X)}(S(Y))
RELATIONAL ALGEBRA

• Selection σ_F(R) = {t: t ∈ R ∧ F(t)},
F – selection condition
• Projection π_X(R) = {t: ∃ r ∈ R: t(X) = r(X)}
• Join R *_F S = σ_F(R × S)
F – join condition
• Division
Let R and S be relations with attribute sets X and Y, respectively, and let Y ⊆ X
R ÷ S = π_{X−Y}(R) − π_{X−Y}((π_{X−Y}(R) × S) − R)
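
A minimal sketch of the operators above, assuming a relation is represented as a Python set of tuples, each tuple being a frozenset of (attribute, value) pairs. The helper names (rel, select, project, product, natural_join, divide) are illustrative and not part of the slides. Division follows the formula given above term by term.

```python
def rel(attrs, rows):
    """Build a relation: a set of frozensets of (attribute, value) pairs."""
    return {frozenset(zip(attrs, r)) for r in rows}

def select(r, pred):
    """Selection: keep the tuples for which the condition holds."""
    return {t for t in r if pred(dict(t))}

def project(r, attrs):
    """Projection onto the attribute set `attrs`."""
    return {frozenset((a, v) for a, v in t if a in attrs) for t in r}

def product(r, s):
    """Cartesian product; assumes R and S have disjoint attribute names."""
    return {t | u for t in r for u in s}

def natural_join(r, s):
    """Natural join: combine tuples that agree on all common attributes."""
    out = set()
    for t in r:
        dt = dict(t)
        for u in s:
            if all(dt.get(a, v) == v for a, v in u):
                out.add(t | u)
    return out

def divide(r, s):
    """Division: candidates minus those missing at least one combination with S."""
    x = {a for t in r for a, _ in t}
    y = {a for t in s for a, _ in t}
    rest = x - y
    candidates = project(r, rest)
    missing = product(candidates, s) - r
    return candidates - project(missing, rest)

# Small made-up instance: E1 works on both projects, E2 only on P1.
con = rel(("P#", "E#"), [("P1", "E1"), ("P2", "E1"), ("P1", "E2")])
proj = rel(("P#",), [("P1",), ("P2",)])

print(divide(con, proj))
```

Only E1 is paired with every project number in proj, so the division returns the single tuple with E# = E1.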
EXAMPLES
(EMP, CON and PROJ abbreviate EMPLOYEES, CONTRACTS and PROJECTS)
1. Names of employees working at P1
π_Ename(EMP * π_E#(σ_{P# = 'P1'}(CON)))
2. Numbers of employees who are not employed at any project
π_E#(EMP) − π_E#(CON)
3. Numbers of employees who take part in all projects
π_{P#,E#}(CON) ÷ π_P#(PROJ)
EXAMPLES
4. Names of employees working at least at one project with budget > 50 000
π_Ename(EMP * π_E#(CON * π_P#(σ_{Budget > 50 000}(PROJ))))
5. Numbers of employees working at least at one project at which Mr. Blake works
π_E#(CON * π_P#(CON * π_E#(σ_{Ename = 'Blake'}(EMP))))

6. Numbers of employees working at all the projects at which Mr. Blake works
π_{P#,E#}(CON * π_P#(CON * π_E#(σ_{Ename = 'Blake'}(EMP)))) ÷
π_P#(CON * π_E#(σ_{Ename = 'Blake'}(EMP)))
DATALOG

A datalog rule:
R(t) ← R1(t1), R2(t2), …, Rn(tn)
where R(t) – head of the rule
R1(t1), R2(t2), …, Rn(tn) – predicates (body of the rule)

Relational predicate: P(A, B, C, …, Z)

Arithmetic predicate: A θ B, where
A, B – arithmetic expressions, θ ∈ {=, ≠, >, ≥, ≤, <}
ALGEBRAIC OPERATIONS IN
DATALOG

UNION S ∪ T
R(X, Y) ← S(X, Y)
R(X, Y) ← T(X, Y)

INTERSECTION S ∩ T
R(X, Y) ← S(X, Y), T(X, Y)

DIFFERENCE S − T
R(X, Y) ← S(X, Y), ¬T(X, Y)
ALGEBRAIC OPERATIONS IN
DATALOG

PROJECTION π_X(S)
R(X) ← S(X, _)

SELECTION σ_{F1 AND F2}(S)
R(X, Y) ← S(X, Y), F1, F2

SELECTION σ_{F1 OR F2}(S)
R(X, Y) ← S(X, Y), F1
R(X, Y) ← S(X, Y), F2
ALGEBRAIC OPERATIONS IN
DATALOG

PRODUCT S × T
R(X, Y) ← S(X), T(Y)

JOIN S * T
R(X, Y, Z) ← S(X, Y), T(Y, Z)
EXAMPLES

1. Names of employees working at P1
R(Ename) ← EMP(E#, Ename, _, _), CON(P#, E#, _, _), P# = 'P1'

2. Names of employees working at least at one project with budget > 50 000
R(Ename) ← EMP(E#, Ename, _, _), CON(P#, E#, _, _),
PROJ(P#, _, _, Budget), Budget > 50 000

3. Numbers of employees working at least at one project at which Mr. Blake works
R(E#) ← CON(P#, E#, _, _), CON(P#, En#, _, _),
EMP(En#, Ename, _, _), Ename = 'Blake'
RECURSIVE RELATIONSHIPS

PAR(C, P)
C – child, P – parent
GRANDPARENT:
GRPAR(X, Y) ← PAR(X, Z), PAR(Z, Y)

In algebra
GRPAR(X, Y) = π_{X,Y}(ρ_{R(X,Z)}(PAR) * ρ_{S(Z,Y)}(PAR))
RECURSIVE RELATIONSHIPS
FOREFATHER
FOREF(X, Y) ← PAR(X, Y)
FOREF(X, Y) ← PAR(X, Z), FOREF(Z, Y)

FIXPOINT OPERATOR
FP(FOREF = ρ_{R(X,Y)}(π_{C,Y}(PAR *_{P=X} FOREF)) ∪ ρ_{PAR(X,Y)}(PAR))
EXAMPLE

[Figure: directed graph with nodes a, b, c, d, e]

X   Y
a   b
a   c
b   c
b   d
c   d
c   e
d   e

II step
New connections: abc, abd, acd, ace, bcd, bce, bde, cde
New tuples: <a, d>, <a, e>, <b, e>

X   Y   Ways
a   b
a   c   ac, abc
b   c
b   d   bd, bcd
c   d
c   e   ce, cde
d   e
a   d   abd, acd
a   e   ace
b   e   bde, bce
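
The fixpoint computation for the recursive FOREF rule can be sketched as a naive iteration that stops when a step derives no new tuples. The function name forefathers and the steps list are illustrative; the edge set is the seven-tuple relation of the example.

```python
def forefathers(par):
    """Least fixpoint of FOREF = PAR ∪ (PAR joined with FOREF), by naive iteration."""
    foref = set(par)          # first rule: FOREF(X, Y) <- PAR(X, Y)
    steps = []                # new tuples found in each further step
    while True:
        # second rule: FOREF(X, Y) <- PAR(X, Z), FOREF(Z, Y)
        derived = {(x, y) for (x, z) in par for (z2, y) in foref if z == z2}
        new = derived - foref
        if not new:
            return foref, steps
        steps.append(new)
        foref |= new

par = {("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"),
       ("c", "d"), ("c", "e"), ("d", "e")}
closure, steps = forefathers(par)

print(sorted(steps[0]))
print(len(closure))
```

On this graph the second step adds exactly the tuples <a, d>, <a, e> and <b, e>, and the following step adds nothing, so the iteration stops.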
SQL
SQL

• DATA DEFINITION LANGUAGE


• DATA MANIPULATION LANGUAGE
• DATA CONTROL LANGUAGE
CREATE TABLE
CREATE TABLE R
(attributes,
integrity conditions)

Description of attributes: Name Domain

Integrity Conditions:
Primary Key (list of attributes)
Check(Predicate)
CREATE TABLE

CREATE TABLE ACCOUNTS
(ACCOUNT_ID INTEGER PRIMARY KEY,
NAME CHAR(20),
TYPE INTEGER,
BALANCE FLOAT,
INTEREST_RATE DECIMAL(4,2),
CHECK (INTEREST_RATE > 0));
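
The effect of the integrity conditions can be observed by running the statement in an in-memory SQLite database from Python. This is a sketch under the assumption that SQLite is an acceptable stand-in for the DBMS of the slides. The inserted rows and the helper try_insert are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE ACCOUNTS
    (ACCOUNT_ID INTEGER PRIMARY KEY,
     NAME CHAR(20),
     TYPE INTEGER,
     BALANCE FLOAT,
     INTEREST_RATE DECIMAL(4,2),
     CHECK (INTEREST_RATE > 0))
""")
conn.execute("INSERT INTO ACCOUNTS VALUES (1, 'Clay', 1, 100.0, 2.5)")

def try_insert(row):
    """Return 'ok' if the row is accepted, 'rejected' if a constraint fails."""
    try:
        conn.execute("INSERT INTO ACCOUNTS VALUES (?, ?, ?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"

check_violation = try_insert((2, "Long", 1, 50.0, 0))    # violates CHECK
duplicate_key = try_insert((1, "Snell", 2, 10.0, 1.5))   # violates PRIMARY KEY
accepted = try_insert((3, "Wells", 2, 10.0, 1.5))

print(check_violation, duplicate_key, accepted)
```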
ALTER and DROP

ALTER TABLE ACCOUNTS
ADD PLENIPOTENTIARY CHAR(15);

DROP TABLE ACCOUNTS;
SIMPLE QUERIES

Numbers of employees working at P1 for C1

SELECT E#
FROM CON
WHERE P# = 'P1' AND C# = 'C1';

Numbers of employees working at least at one project

SELECT DISTINCT E#
FROM CON;
SIMPLE QUERIES

Numbers of projects in descending order of budget

SELECT P#
FROM PROJ
ORDER BY BUDGET DESC;

Full information of all projects

SELECT *
FROM PROJ;
SUBQUERIES

Names of employees working at P1

SELECT Ename
FROM EMP
WHERE E# IN
(SELECT E#
FROM CON
WHERE P# = 'P1');
SUBQUERIES
Names of employees working at P1
(another solution)

SELECT Ename
FROM EMP
WHERE 'P1' IN
(SELECT P#
FROM CON
WHERE E# = EMP.E#);
ANY and ALL
Numbers of projects that have greater budgets than
at least one project realized in London

SELECT P#
FROM PROJ
WHERE BUDGET > ANY
(SELECT BUDGET
FROM PROJ
WHERE Ptown = 'London');
ANY and ALL
Numbers of projects that have greater budgets than
all projects realized in London

SELECT P#
FROM PROJ
WHERE BUDGET > ALL
(SELECT BUDGET
FROM PROJ
WHERE Ptown = 'London');
EXISTS
Numbers of employees who do not work at any
project

SELECT E#
FROM EMP
WHERE NOT EXISTS
(SELECT *
FROM CON
WHERE E# = EMP.E#);
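
The IN and NOT EXISTS subqueries can be tried out with Python's sqlite3 module. This is a sketch, not the course's environment: identifiers containing # are quoted because SQLite does not accept them unquoted, and only a made-up subset of the EMP and CON rows is loaded, so the answers differ from those on the full tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMP ("E#" TEXT PRIMARY KEY, Ename TEXT);
    CREATE TABLE CON ("P#" TEXT, "E#" TEXT, "C#" TEXT, Salary INTEGER);
    INSERT INTO EMP VALUES ('E1', 'Boyle'), ('E2', 'Clay'), ('E3', 'Harris');
    INSERT INTO CON VALUES ('P1', 'E1', 'C1', 1000),
                           ('P1', 'E2', 'C3', 3000),
                           ('P2', 'E1', 'C1', 2000);
""")

# Names of employees working at P1 (uncorrelated subquery with IN).
at_p1 = [row[0] for row in conn.execute("""
    SELECT Ename FROM EMP
    WHERE "E#" IN (SELECT "E#" FROM CON WHERE "P#" = 'P1')
    ORDER BY Ename
""")]

# Numbers of employees without any contract (correlated NOT EXISTS).
idle = [row[0] for row in conn.execute("""
    SELECT "E#" FROM EMP
    WHERE NOT EXISTS (SELECT * FROM CON WHERE "E#" = EMP."E#")
""")]

print(at_p1, idle)
```

Inside the subquery the unqualified "E#" refers to CON, the innermost table, while EMP."E#" refers to the row of the outer query.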
SYNONYMS
Names of projects that have budgets greater than P1

SELECT Y.Pname
FROM PROJ X, PROJ Y
WHERE X.P# = 'P1' AND X.BUDGET < Y.BUDGET;
AGGREGATE FUNCTIONS

• COUNT
• SUM
• AVG
• MAX
• MIN
AGGREGATE FUNCTIONS
Number of projects

SELECT COUNT(*)
FROM PROJ;

Number of employees working at least at one project

SELECT COUNT(DISTINCT E#)
FROM CON;

Number of contracts at P1 for C1

SELECT COUNT(*)
FROM CON
WHERE P# = 'P1' AND C# = 'C1';
GROUP BY
Numbers of projects and counts of employees working at them

SELECT P#, COUNT(DISTINCT E#)
FROM CON
GROUP BY P#;
GROUP BY
Numbers of employees and counts of their contracts

SELECT E#, COUNT(*)
FROM CON
GROUP BY E#;
HAVING
Numbers of employees having more than one contract

SELECT E#
FROM CON
GROUP BY E#
HAVING COUNT(*) > 1;
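
The GROUP BY and HAVING queries can be run against the CONTRACTS table from the earlier example. The sketch below uses Python's sqlite3 module as an assumed stand-in for the DBMS. The quoted identifiers are needed because SQLite does not accept # in unquoted names.

```python
import sqlite3

# The 18 rows of the CONTRACTS example table (P#, E#, C#, Salary).
CONTRACTS = [
    ("P1", "E1", "C1", 1000), ("P1", "E1", "C2", 4000), ("P1", "E2", "C3", 3000),
    ("P1", "E3", "C2", 8000), ("P2", "E1", "C1", 2000), ("P2", "E2", "C1", 7000),
    ("P2", "E3", "C1", 2000), ("P2", "E4", "C1", 5000), ("P2", "E5", "C3", 5000),
    ("P2", "E6", "C3", 3000), ("P3", "E1", "C4", 9000), ("P3", "E2", "C4", 6000),
    ("P3", "E3", "C4", 5000), ("P3", "E4", "C4", 4000), ("P4", "E1", "C1", 4000),
    ("P4", "E3", "C2", 3000), ("P4", "E4", "C4", 7000), ("P4", "E5", "C2", 6000),
]

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE CON ("P#" TEXT, "E#" TEXT, "C#" TEXT, Salary INTEGER)')
conn.executemany("INSERT INTO CON VALUES (?, ?, ?, ?)", CONTRACTS)

# Projects with the number of distinct employees working at them.
staff_per_project = conn.execute(
    'SELECT "P#", COUNT(DISTINCT "E#") FROM CON GROUP BY "P#" ORDER BY "P#"'
).fetchall()

# Employees having more than one contract.
multi = [row[0] for row in conn.execute(
    'SELECT "E#" FROM CON GROUP BY "E#" HAVING COUNT(*) > 1 ORDER BY "E#"'
)]

print(staff_per_project)
print(multi)
```

E6 has a single contract and is therefore filtered out by the HAVING clause.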
VIEWS
CREATE VIEW PROJ_LON AS
SELECT *
FROM PROJ
WHERE Ptown = 'London';

SELECT P#
FROM PROJ_LON
WHERE Budget > 50000;
INSERT, UPDATE, DELETE
INSERT INTO EMP VALUES ('E7', 'Howard',
'Liege', 'Lawyer');

DELETE
FROM PROJ
WHERE Ptown = 'London';

UPDATE PROJ
SET Budget = Budget * 1.1;
GRANT and REVOKE
GRANT <privilege list> ON <element of a database>
TO <user list>

REVOKE <privilege list> ON <element of a database>
FROM <user list>

GRANT INSERT, SELECT ON PROJECTS TO U1;

EMBEDDED SQL

EXEC SQL <query>;

EXEC SQL DECLARE <cursor> CURSOR FOR
<query>;
EXEC SQL OPEN <cursor>;
EXEC SQL FETCH FROM <cursor> INTO <variables>;
EXEC SQL CLOSE <cursor>;
EMBEDDED SQL
EXEC SQL DECLARE C1 CURSOR FOR
SELECT Budget
FROM PROJ;

EXEC SQL DECLARE C2 CURSOR FOR
SELECT P#, Budget
FROM PROJ
WHERE Budget > :S;
EMBEDDED SQL
EXEC SQL DECLARE C CURSOR FOR
SELECT Budget
FROM PROJ
WHERE Ptown = 'London';

S = 0;
EXEC SQL OPEN C;
EXEC SQL FETCH FROM C INTO :PR_BUD;
while (NOT END_OF_CURSOR)
{ S = S + PR_BUD;
EXEC SQL FETCH FROM C INTO :PR_BUD;
}
EXEC SQL CLOSE C;
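
The cursor loop has a close analogue in Python's DB-API. The sketch below assumes SQLite through the sqlite3 module rather than an embedded-SQL precompiler: execute plays the role of DECLARE and OPEN, fetchone of FETCH, and a None result of END_OF_CURSOR.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE PROJ ("P#" TEXT PRIMARY KEY, Pname TEXT, '
             'Ptown TEXT, Budget INTEGER)')
conn.executemany("INSERT INTO PROJ VALUES (?, ?, ?, ?)", [
    ("P1", "Visa", "London", 100000),
    ("P2", "Pharmacy", "London", 60000),
    ("P3", "Policy", "Paris", 30000),
    ("P4", "Education", "Madrid", 80000),
    ("P5", "Credit", "Oslo", 20000),
])

cursor = conn.cursor()
# DECLARE + OPEN: the query is bound to the cursor and started.
cursor.execute("SELECT Budget FROM PROJ WHERE Ptown = ?", ("London",))

total = 0
row = cursor.fetchone()        # FETCH ... INTO
while row is not None:         # NOT END_OF_CURSOR
    total += row[0]
    row = cursor.fetchone()
cursor.close()                 # CLOSE

print(total)
```

Fetching before the test and again at the end of the loop body avoids adding a stale value after the last row has been read.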
ENTITY-RELATIONSHIP
MODEL
ENTITY SET

P# Pname Ptown Budget

PROJECTS

R (P#, Pname, Ptown, Budget)


BINARY RELATIONSHIPS

(a,b) (c,d)
A R B

a ≤ b, c ≤ d
MANY TO MANY

A1 … Am X B1 … Bn

(*,*) (*,*)
A R B

A (A1, A2, … , Am), B (B1, B2, …, Bn), R (A1, B1, X)


ONE TO MANY

A1 … Am B1 … Bn

(*,*) (1,1)
A R B

A (A1, A2, … , Am), B (B1, B2, …, Bn, A1)


TERNARY RELATIONSHIPS
B1 … Bn

A1 … Am (0, *) C1 … Cp

(0, *) (0, *)
A R C

A (A1, A2, … , Am), B (B1, B2, …, Bn), C (C1, C2, …, Cp),


R (A1, B1, C1, X)
RECURSIVE RELATIONSHIPS
MANY TO MANY

E1 … En
RA(*,*)

E R X

RB(*,*)

E (E1, E2, … , En), R (RA.E1, RB.E1, X)


RECURSIVE RELATIONSHIPS
ONE TO MANY

E1 … En
RA(0,1)

E R

RB(*,*)

E (RA.E1, E2, … , En , RB.E1)


ISA RELATIONSHIPS
A1 … Am

ISA
B1 … Bn C1 … Cp

B C

A (A1, A2, … , Am), B (A1, B1, …, Bn), C (A1, C1, …, Cp)


DEPENDENCIES
FUNCTIONAL DEPENDENCY
Given a relation R(SCH). Let X, Y ⊆ SCH.
Y is functionally dependent on X, X → Y, if
and only if for any two tuples t1 and t2 of R:

t1(X) = t2(X) ⇒ t1(Y) = t2(Y)

FUNCTIONAL DEPENDENCY
Y is fully functionally dependent on X if
and only if X → Y and there does not exist
a proper subset X' of X such that X' → Y.

Y is transitively functionally dependent on X
if and only if there exists Z such that X → Z and Z → Y.
RELATION KEY

Given a relation R(SCH). Let K ⊆ SCH.

K is a superkey of R if and only if K → SCH.

K is a candidate key if and only if K → SCH fully.

An attribute which belongs to any candidate key is called a prime attribute.
ARMSTRONG AXIOMS

• Y ⊆ X ⇒ X → Y (reflexivity)
• X → Y ∧ Z ⊆ W ⇒ XW → YZ (augmentation)
• X → Y ∧ Y → Z ⇒ X → Z (transitivity)
ARMSTRONG RULES

• X → Y ∧ X → Z ⇒ X → YZ (union)
• X → Y ∧ WY → Z ⇒ XW → Z (pseudotransitivity)
• X → Y ∧ Z ⊆ Y ⇒ X → Z (decomposition)
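
Whether a dependency follows from a given set by the axioms and rules above is usually decided with the attribute closure X+. A minimal sketch, assuming dependencies are given as pairs of Python sets. The names closure and implies are illustrative.

```python
def closure(attrs, fds):
    """Closure X+ of the attribute set `attrs` under the FDs in `fds`.

    `fds` is a list of (lhs, rhs) pairs of sets, each meaning lhs -> rhs.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is already derivable, so is the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def implies(fds, lhs, rhs):
    """True if lhs -> rhs follows from `fds`."""
    return set(rhs) <= closure(lhs, fds)

# A -> B and BC -> D; pseudotransitivity then gives AC -> D.
fds = [({"A"}, {"B"}), ({"B", "C"}, {"D"})]

print(sorted(closure({"A"}, fds)))
print(implies(fds, {"A", "C"}, {"D"}))
print(implies(fds, {"A"}, {"D"}))
```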
MULTI-VALUED DEPENDENCY
Given a relation R(X, Y, Z). The multivalued
dependency X →→ Y holds in R if and only if the set
of Y-values depends only on the X-value and is
independent of the Z-value.
For all pairs of tuples t1 and t2 of R such that t1(X) = t2(X)
there exist tuples t3 and t4 of R such that:
t1(X) = t2(X) = t3(X) = t4(X),
t3(Y) = t1(Y), t3(Z) = t2(Z),
t4(Y) = t2(Y), t4(Z) = t1(Z)
MULTI-VALUED DEPENDENCY

Given a relation R(X, Y, Z). The multivalued
dependency X →→ Y holds in R if and only if
R = π_{X,Y}(R) * π_{X,Z}(R)
DECOMPOSITION
Given a relation schema SCH.
A set of relation schemas SCHi ⊆ SCH such that
SCH = SCH1 ∪ SCH2 ∪ … ∪ SCHn
is a decomposition of SCH.
If for all relations R on SCH
R = π_SCH1(R) * π_SCH2(R) * … * π_SCHn(R)
then {SCH1, SCH2, …, SCHn} is a lossless-join
decomposition.
DECOMPOSITION

Given a relation R(SCH). Let X ⊆ SCH.
Let A be a single attribute, A ∈ SCH and A ∉ X.
If X → A then R = π_{SCH − {A}}(R) * π_{X,A}(R)
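
The lossless-join condition can be tested on a concrete instance by projecting, joining and comparing with the original relation. The sketch below handles a decomposition into two schemas whose union is the full schema. The helper names and the three-attribute instance are made up for illustration.

```python
def project(rows, attrs, onto):
    """Projection of `rows` (tuples over `attrs`) onto the attributes `onto`."""
    idx = [attrs.index(a) for a in onto]
    return {tuple(r[i] for i in idx) for r in rows}

def join(r_rows, r_attrs, s_rows, s_attrs):
    """Natural join; returns the joined tuples and their attribute order."""
    common = [a for a in r_attrs if a in s_attrs]
    extra = [a for a in s_attrs if a not in r_attrs]
    out = set()
    for r in r_rows:
        for s in s_rows:
            if all(r[r_attrs.index(a)] == s[s_attrs.index(a)] for a in common):
                out.add(r + tuple(s[s_attrs.index(a)] for a in extra))
    return out, tuple(r_attrs) + tuple(extra)

def is_lossless(rows, attrs, sch1, sch2):
    """True if joining the two projections gives back exactly `rows`."""
    joined, joined_attrs = join(project(rows, attrs, sch1), sch1,
                                project(rows, attrs, sch2), sch2)
    idx = [joined_attrs.index(a) for a in attrs]   # restore attribute order
    return {tuple(t[i] for i in idx) for t in joined} == set(rows)

attrs = ("A", "B", "C")
rows = {("a1", "b1", "c1"), ("a2", "b1", "c2")}    # A -> C holds, B -> C does not

print(is_lossless(rows, attrs, ("A", "B"), ("A", "C")))
print(is_lossless(rows, attrs, ("A", "B"), ("B", "C")))
```

The first decomposition matches the rule above with X = {A} and the single attribute C. The second one joins on B, which determines nothing, and produces spurious tuples. A check on one instance can refute losslessness but cannot prove it for all relations on the schema.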
JOIN DEPENDENCY

Given a relation schema SCH.
Let {SCH1, SCH2, …, SCHn} be a decomposition of SCH.
A join dependency
*(SCH1, SCH2, …, SCHn)
holds in SCH if for all relations R on SCH
R = π_SCH1(R) * π_SCH2(R) * … * π_SCHn(R)
NORMALISATION
NESTED RELATION
A    B    X
          C    D    E
a1   b1   c1   d2   e3
          c1   d2   e4
a2   b2   c1   d2   e4
          c1   d2   e5
          c2   d3   e5
a1   b3   c4   d4   e3
          c4   d4   e4
FIRST NORMAL FORM
A B C D E

a1 b1 c1 d2 e3
a1 b1 c1 d2 e4
a2 b2 c1 d2 e4
a2 b2 c1 d2 e5
a2 b2 c2 d3 e5
a1 b3 c4 d4 e3
a1 b3 c4 d4 e4
FIRST NORMAL FORM

A relation schema SCH is in 1NF, if and only if


every attribute value for any relation of SCH is
atomic.
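
The step from the nested relation to the first normal form table can be sketched as an unnest operation that repeats the atomic values A and B for every tuple of the nested attribute X. The list-of-tuples representation and the function name unnest are illustrative.

```python
# The nested relation from the example: (A, B, X) with X a set of (C, D, E) tuples.
nested = [
    ("a1", "b1", [("c1", "d2", "e3"), ("c1", "d2", "e4")]),
    ("a2", "b2", [("c1", "d2", "e4"), ("c1", "d2", "e5"), ("c2", "d3", "e5")]),
    ("a1", "b3", [("c4", "d4", "e3"), ("c4", "d4", "e4")]),
]

def unnest(rows):
    """Flatten the nested attribute so that every value is atomic (1NF)."""
    return [(a, b) + inner for a, b, group in rows for inner in group]

flat = unnest(nested)
for row in flat:
    print(row)
```

The result has seven tuples over (A, B, C, D, E), the same rows as the first normal form table.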
SECOND NORMAL FORM

A relation schema SCH is in 2NF, if and only if SCH
is in 1NF and every non-prime attribute of SCH is
fully functionally dependent on every candidate key.
THIRD NORMAL FORM

A relation schema SCH is in 3NF, if for any
functional dependency X → Y, where X, Y ⊆ SCH:
1. X → Y is trivial (Y ⊆ X) or
2. X is a superkey or
3. Y is a set of prime attributes.
BOYCE-CODD NORMAL FORM

A relation schema SCH is in BCNF, if for any
functional dependency X → Y, where X, Y ⊆ SCH:
1. X → Y is trivial (Y ⊆ X) or
2. X is a superkey
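
The 3NF and BCNF conditions can be tested against a list of functional dependencies with the help of attribute closures. The sketch below is an illustration under assumptions: it inspects only the dependencies given in the list, finds candidate keys by brute force, and uses the made-up schema {A, B, C} with AB → C and C → B, a standard case that is in 3NF but not in BCNF.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds` (list of (lhs, rhs) set pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_superkey(x, sch, fds):
    return closure(x, fds) >= set(sch)

def candidate_keys(sch, fds):
    """Minimal superkeys, found by trying attribute sets of growing size."""
    keys = []
    for size in range(1, len(sch) + 1):
        for combo in combinations(sorted(sch), size):
            if any(k <= set(combo) for k in keys):
                continue
            if is_superkey(set(combo), sch, fds):
                keys.append(set(combo))
    return keys

def violations(sch, fds, allow_prime_rhs):
    """FDs breaking BCNF (allow_prime_rhs=False) or 3NF (allow_prime_rhs=True)."""
    prime = set().union(*candidate_keys(sch, fds)) if allow_prime_rhs else set()
    bad = []
    for lhs, rhs in fds:
        if rhs <= lhs or is_superkey(lhs, sch, fds):
            continue                      # conditions 1 and 2
        if allow_prime_rhs and rhs <= prime:
            continue                      # condition 3 (3NF only)
        bad.append((lhs, rhs))
    return bad

sch = {"A", "B", "C"}
fds = [({"A", "B"}, {"C"}), ({"C"}, {"B"})]

print(candidate_keys(sch, fds))
print(violations(sch, fds, allow_prime_rhs=True))
print(violations(sch, fds, allow_prime_rhs=False))
```

The candidate keys are AB and AC, so B is prime and C → B is allowed in 3NF, but C is not a superkey, which violates BCNF.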
FOURTH NORMAL FORM

A relation schema SCH is in 4NF, if for any
multivalued dependency X →→ Y,
where X, Y ⊆ SCH:
1. X →→ Y is trivial (Y ⊆ X or X ∪ Y = SCH) or
2. X is a superkey
FIFTH NORMAL FORM

A relation schema SCH is in 5NF, if for any
join dependency
*(SCH1, SCH2, …, SCHn), where SCHi ⊆ SCH:
1. *(SCH1, SCH2, …, SCHn) is trivial (one of SCHi
is SCH) or
2. Every SCHi is a superkey
NESTED ATTRIBUTES
U – the set of all attributes
SCH – relation schema
A = {A1, A2, …, An}, where Ai
is a single or a nested attribute
Domain of A
P(DOM(A1) × DOM(A2) × … × DOM(An))
P(W) – the family of subsets of W (powerset)
L(A) – set of single attributes in A
N(A) – set of nested attributes in A
NESTED RELATIONS

A schema of a nested relation:
SCH = {SCH1, SCH2, …, SCHn}, where SCHi
is a single or a nested attribute.
For each pair of nested attributes SCHi, SCHj (i ≠ j)
(L(SCHi) ∪ N(SCHi)) ∩ (L(SCHj) ∪ N(SCHj)) = ∅

A nested relation R on SCH is a set
R ⊆ DOM(SCH1) × DOM(SCH2) × … × DOM(SCHn)
PARTIAL NORMAL FORM

Let SCH = {SCH1, SCH2, …, SCHn} be a schema
of a nested relation R.
The pair (SCH, R) is in partial normal form (PNF) if
1. L(SCH) → SCH
2. ∀ r ∈ R ∀ Y ∈ N(SCH): (Y, r(Y)) is in PNF
FILE ORGANIZATION
RECORDS

• Fixed-length records
• Variable-length records
VARIABLE-LENGTH RECORDS

Constant area Repeating group 1 ... Repeating group n

Reserved-space method
VARIABLE-LENGTH RECORDS

Constant area

Repeating group 1 … Repeating group n

Pointer method
ORGANIZATION OF RECORDS
INTO BLOCKS
block 1

Record 1 Record 2 Record 3 Record 4

Record 5 Record 6 Record 7 Record 8

block 2

Unspanned records
ORGANIZATION OF RECORDS
INTO BLOCKS
block 1

Record 1 Record 2 Record 3 Re-

cord 4 Record 5 Record 6 Record 7

block 2

Spanned records
FILES

• Unordered files
• Hash files
• Ordered files
HASH FILES

0
1

Hash table
INDICES

• Primary Index
• Clustering Index
• Secondary Index
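INDEX-SEQUENTIAL FILES

[Figure: index file whose entries point into an ordered data file with records A, D, F, H, K, M, P, S, W]
CLUSTERING INDEX

[Figure: index entries A, F, K, P, W, each pointing to the first data block holding records with that value]
SECONDARY INDEX

[Figure: ordered index file with entries A, D, F, H, K, M, P, S, W pointing to records of a data file that is not ordered on the indexed field]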
B-tree

[Figure: B-tree index file with index blocks B1–B4 over data pages P1–P8]
P1: 7, 9
P2: 11, 15, 20
P3: 29, 32
P4: 48, 52, 58
P5: 63, 65
P6: 73, 75
P7: 82, 84
P8: 90, 93
Insertion 34, 22, 6, 43

[Figure: the tree after the insertions, with index blocks B1–B7 and new data pages P9 and P10]
P1: 6, 7, 9
P2: 11, 15
P9: 20, 22
P3: 29, 32
P10: 34, 43
P4: 48, 52, 58
P5: 63, 65
P6: 73, 75
P7: 82, 84
P8: 90, 93
Deletion 52, 73, 65

[Figure: the tree after the deletions, with index blocks B1, B2, B3, B5]
P1: 6, 7, 9
P2: 11, 15
P9: 20, 22
P3: 29, 32
P10: 34, 43
P4: 48, 58, 63
P6: 75, 82, 84
P8: 90, 93
B+-tree

[Figure: B+-tree index file with index blocks B1–B4 over the same data pages P1–P8 as in the first B-tree figure]
QUERY PROCESSING
BASIC STEPS

• Parsing and translation


• Optimization
• Evaluation
EQUIVALENCE RULES

• R ∪ S = S ∪ R, (R ∪ S) ∪ T = R ∪ (S ∪ T)
• R ∩ S = S ∩ R, (R ∩ S) ∩ T = R ∩ (S ∩ T)
• R * S = S * R, (R * S) * T = R * (S * T)
• R * (S ∪ T) = (R * S) ∪ (R * T)
• σ_F(R * S) = σ_F(R) * σ_F(S) (F over attributes common to R and S)
• σ_F(R − S) = σ_F(R) − σ_F(S)
• σ_F(R ∪ S) = σ_F(R) ∪ σ_F(S)
• σ_F(R ∩ S) = σ_F(R) ∩ σ_F(S)
EQUIVALENCE RULES
• σ_{F ∧ G}(R) = σ_F(σ_G(R)) = σ_G(σ_F(R)) = σ_F(R) ∩ σ_G(R)
• σ_{F ∨ G}(R) = σ_F(R) ∪ σ_G(R)
• σ_F(R × S) = R *_F S
• R *_F S = S *_F R
• σ_F(R *_G S) = R *_{F ∧ G} S
• (R *_F S) *_{G ∧ H} T = R *_{F ∧ G} (S *_H T) (H over attributes of S and T only)
• σ_{F ∧ G}(R *_H S) = σ_F(R) *_H σ_G(S) (F over attributes of R only, G over attributes of S only)
EQUIVALENCE RULES

• π_X1(π_X2(… π_XN(R) …)) = π_X1(R), where X1 ⊆ X2 ⊆ … ⊆ XN
• Let X1 ⊆ SCH(R), X2 ⊆ SCH(S) and let F use only attributes of X1 ∪ X2:
π_{X1 ∪ X2}(R *_F S) = π_X1(R) *_F π_X2(S)
• π_X(R ∪ S) = π_X(R) ∪ π_X(S)
• π_X(σ_F(R)) = σ_F(π_X(R)) (F over attributes of X only)
TRANSFORMATION OF
RELATIONAL EXPRESSIONS
π_E#(σ_{Ptown = 'London'}(PROJ * (EMP * CON)))

          π_E#
           |
σ_{Ptown = 'London'}
           |
           *
          / \
      PROJ   *
            / \
         EMP   CON
TRANSFORMATION OF
RELATIONAL EXPRESSIONS
π_E#(σ_{Ptown = 'London'}(PROJ) * (EMP * CON))

                 π_E#
                  |
                  *
                /   \
σ_{Ptown = 'London'}  *
          |          / \
        PROJ      EMP   CON
CATALOG INFORMATION

• card(R) – number of tuples in relation R
• f(R) – blocking factor of R (number of tuples per block)
• b(R) – number of blocks in relation R
b(R) = ⌈card(R)/f(R)⌉
• val(R[A]) – number of distinct values of the attribute A
• sel(R[A]) – selectivity of the attribute A
sel(R[A]) = 1 / val(R[A])
• L(I) – number of levels in index I
COSTS OF SEARCHING

• Linear search
C = b(R)
with equality condition: Caverage = b(R)/2, Cmax = b(R)

• Binary search
C = log2 (b(R)) + card(R)*sel(R[A])/f(R) - 1
with equality condition on a key attribute
C = log2 (b(R))
COSTS OF SEARCHING

• Primary index C = L(I) +1

• Clustering index
C = L(I) + card(R)*sel(R[A])/f(R)

• Secondary index
equality C = L(I) + card(R)*sel(R[A])
EXAMPLE
σ_{Ptown = 'London'}(PROJ)
card(PROJ) = 1000, f(PROJ) = 5, val(PROJ[Ptown]) = 20
b(PROJ) = 1000/5 = 200

Linear search C = 200

Binary search
C = ⌈log2(200)⌉ + (1000/20)/5 − 1 = 8 + 10 − 1 = 17

Secondary index on Ptown with L(I) = 2
C = 2 + 1000/20 = 52
TRANSACTIONS
TRANSACTION
• A unit of work
• Collection of operations
• Result: a new consistent state of the database
• Operations include
beginning: BEGIN TRANSACTION
ending: END TRANSACTION or ROLLBACK
access to the database: READ(X) and WRITE(X)
PROPERTIES OF TRANSACTIONS

• Atomicity
• Consistency
• Isolation
• Durability
TRANSACTION STATE

• Active
• Partially committed
• Failed
• Aborted
• Committed
SERIALIZABILITY

An execution of concurrent transactions
(schedule) is serializable if it is equivalent to
some serial execution of these transactions.
CONFLICTS
Let us consider operations read(A) and write(A).

Operations I ∈ T1, J ∈ T2 conflict if they
refer to the same data item and at least one of
them is a write operation.
PRECEDENCE GRAPH
G(V, E) – precedence graph
V – set of vertices, transactions in the schedule
E – set of edges
An edge Ti → Tj exists if operations I ∈ Ti, J ∈ Tj are
in conflict and I precedes J.

If the precedence graph has a cycle, the schedule is
not serializable.
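
The precedence graph test can be sketched in a few lines: collect an edge for every pair of conflicting operations and look for a cycle with a depth-first search. The schedule encoding as (transaction, operation, item) triples and both example schedules are assumptions made for illustration.

```python
def precedence_graph(schedule):
    """Edges Ti -> Tj for conflicting operations where Ti's operation comes first.

    `schedule` is a list of (transaction, op, item) with op 'r' or 'w'.
    """
    edges = set()
    for i, (ti, op_i, x_i) in enumerate(schedule):
        for tj, op_j, x_j in schedule[i + 1:]:
            if ti != tj and x_i == x_j and "w" in (op_i, op_j):
                edges.add((ti, tj))
    return edges

def has_cycle(edges):
    """Depth-first search for a cycle in the directed graph given by `edges`."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set())
    state = {}  # 1 = on the current search path, 2 = finished

    def visit(u):
        state[u] = 1
        for v in graph[u]:
            if state.get(v) == 1 or (v not in state and visit(v)):
                return True
        state[u] = 2
        return False

    return any(u not in state and visit(u) for u in graph)

# T2 writes A between T1's read and T1's write: edges T1 -> T2 and T2 -> T1.
interleaved = [("T1", "r", "A"), ("T2", "w", "A"), ("T1", "w", "A")]
# T1 finishes before T2 starts: only the edge T1 -> T2.
serial = [("T1", "r", "A"), ("T1", "w", "A"), ("T2", "r", "A"), ("T2", "w", "A")]

print(has_cycle(precedence_graph(interleaved)))
print(has_cycle(precedence_graph(serial)))
```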
LOCKS
• Shared (S). If transaction T holds a shared lock on
some data item A, then T can only read A.
• Exclusive (X). If transaction T holds an exclusive
lock on some data item A, then T can both read and
write A.

Lock compatibility:
      S     X
S     Yes   No
X     No    No
TWO-PHASE LOCKING
PROTOCOL – 2PL

All transactions in the schedule satisfy the following rules:
• Access to any data item is possible only after acquiring a suitable lock
• After unlocking any data item the transaction cannot obtain any new locks

Each transaction has two phases:
Growing phase – obtaining locks, Shrinking phase – unlocking

The two-phase locking protocol ensures serializability.
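
The second 2PL rule is easy to check for a single transaction: once the first unlock has happened, no further lock may follow. A minimal sketch, assuming the transaction is given as a sequence of ('lock' or 'unlock', item) pairs. The function name and both example sequences are illustrative.

```python
def is_two_phase(ops):
    """True if no lock is acquired after the first unlock (growing, then shrinking)."""
    shrinking = False
    for action, _item in ops:
        if action == "unlock":
            shrinking = True
        elif action == "lock" and shrinking:
            return False
    return True

two_phase = [("lock", "A"), ("lock", "B"), ("unlock", "A"), ("unlock", "B")]
not_two_phase = [("lock", "A"), ("unlock", "A"), ("lock", "B"), ("unlock", "B")]

print(is_two_phase(two_phase))
print(is_two_phase(not_two_phase))
```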

