Advanced Database Models (ADBM)
2019
Organization
Weekly lecture
I http://www.dbse.ovgu.de/dbse/en/Lectures/Lehrveranstaltungen/Advanced+Database+Models.html
Previous titles
I Object-oriented Database Systems
I Object-relational Database Systems
Latest addition: semi-structured data models
Previously held and refined by
I Gunter Saake
I Can Türker
I Ingo Schmitt
I Kai-Uwe Sattler
Introduction
Overview
Some Basic Terms
History of Database Models
Problems with RDBMS
Aspects of Advanced Data Models
Data Model
Database Model
Database
Database System
Database Management System
Conceptual Model
Database Schema(s)
A data model
I is a system of concepts and their interrelations
I is the "language" used to describe data
I defines the syntax and semantics of data
I is fundamental to other aspects of a system, such as integrity
and the operations for access and manipulation
Hierarchical model
Network model
Relational model
NF² and eNF²
Object-Relational model
Object-oriented model (or short, object model)
Multi-dimensional model
Semi-structured model
Multimedia data model
Spatio-temporal data model
...
Conceptual Design
Conceptual Models
e.g. ER or UML
Database Models
Logical Design
e.g. RM, ORM, OO
Physical Design
Implementation
Conceptual Design
Implementation
(0,*)
controls
Supervision hours
works_on
(1,1)
dependents_of (1,*)
Project
(1,1)
name number location
dependent
Employee(fname, minit, lname, ssn, bdate, address, sex, salary, superssn, dno)
Dept_locations(dnumber, dlocation)
EMPLOYEE
FNAME MINIT LNAME SSN BDATE ADDRESS SEX SALARY SUPERSSN DNO
John B Smith 123456789 1965-01-09 731 Fondren, Houston, TX M 30000 333445555 5
Franklin T Wong 333445555 1955-12-08 638 Voss, Houston, TX M 40000 888665555 5
Alicia J Zelaya 999887777 1968-07-19 3321 Castle, Spring, TX F 25000 987654321 4
Jennifer S Wallace 987654321 1941-06-20 291 Berry, Bellaire, TX F 43000 888665555 4
Ramesh K Narayan 666884444 1962-09-15 975 Fire Oak, Humble, TX M 38000 333445555 5
Joyce A English 453453453 1972-07-31 5631 Rice, Houston, TX F 25000 333445555 5
Ahmad V Jabbar 987987987 1969-03-29 980 Dallas, Houston, TX M 25000 987654321 4
James E Borg 888665555 1937-11-10 450 Stone, Houston, TX M 55000 null 1
DEPARTMENT
DNAME DNUMBER MGRSSN MGRSTARTDATE
Research 5 333445555 1988-05-22
Administration 4 987654321 1995-01-01
Headquarters 1 888665555 1981-06-19
DEPT_LOCATIONS
DNUMBER DLOCATION
1 Houston
4 Stafford
5 Bellaire
5 Sugarland
5 Houston
[Figure: timeline of data models, arranged from implementation-oriented to abstract]
from 1960: HM, NWM
from 1970: RM, ER, SQL
from 1980: NF², SDM, eNF², OODM, OEM
from 1990: (C++), ODMG
from 2000: ORM / SQL-99
Database models
I Hierarchical model (HM)
I Network model (NM)
I Relational model (RM)
I (extended) Non-first normal form (NF² and eNF²)
I Object-relational model (ORM)
I Object-oriented database model (OODM)
Conceptual Models
I Entity-relationship model (ER)
I Semantic data models (SDM)
Standards
I Object Data Management Group (ODMG)
I Structured Query Language (SQL:1999 and SQL:2003)
[Figure: applications (AP) on top of a DBMS with extensions and object-relational SQL]
More
I complex,
I flexible, and
I meaningful
data structures
Integration of behavior
Extensions and extensibility
Solving the “impedance mismatch” problem
Query Processing
Data Representation
Conceptual Schema
Internal Schema
... Relation
Tuple
...
$\pi_{PLOCATION,DNUM}(r(PROJECT))$
PLOCATION DNUM
Bellaire 5
Sugarland 5
Houston 5
Stafford 4
Houston 1
R
A B
1 2
3 4
S
C D E
5 6 7
8 9 10
11 12 13
$r(R) \times r(S)$
A B C D E
1 2 5 6 7
1 2 8 9 10
1 2 11 12 13
3 4 5 6 7
3 4 8 9 10
3 4 11 12 13
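The Cartesian product above can be reproduced with a minimal Python sketch. The representation of a relation as a pair (schema, set of tuples) and the function name `cartesian_product` are assumptions made for illustration only; the data is taken from the example.

```python
from itertools import product

# A relation is modelled as (schema, set of tuples); this encoding is an assumption.
R = (("A", "B"), {(1, 2), (3, 4)})
S = (("C", "D", "E"), {(5, 6, 7), (8, 9, 10), (11, 12, 13)})


def cartesian_product(r, s):
    """Pair every tuple of r with every tuple of s (schemas assumed disjoint)."""
    (r_schema, r_rows), (s_schema, s_rows) = r, s
    rows = {t + u for t, u in product(r_rows, s_rows)}
    return r_schema + s_schema, rows


schema, rows = cartesian_product(R, S)
print(schema)
for row in sorted(rows):
    print(row)
```

The result has |r(R)| · |r(S)| = 2 · 3 = 6 tuples over the combined schema.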
Combine tuples from two relations r(R) and r(S) such that, for every
attribute a ∈ R ∩ S (defined in both relations),
t(R.a) = t(S.a) holds.
Basic operation for following key relationships
If there are no common attributes, the result is the Cartesian product:
$R \cap S = \emptyset \implies r(R) \bowtie r(S) = r(R) \times r(S)$
Can be expressed as a combination of π, σ and ×:
$r(R) \bowtie r(S) = \pi_{R \cup S}\left(\sigma_{\bigwedge_{a \in R \cap S} t(R.a) = t(S.a)}\left(r(R) \times r(S)\right)\right)$
R
A B
1 2
3 4
5 6
S
B C D
4 5 6
6 7 8
8 9 10
$r(R) \bowtie r(S)$
A B C D
3 4 5 6
5 6 7 8
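The decomposition of the natural join into product, selection and projection can be sketched in Python as follows. This is only an illustration under the assumption that a relation is a pair (schema, set of tuples); the function name `natural_join` is hypothetical, and the data is the example with R(A, B) and S(B, C, D).

```python
from itertools import product


def natural_join(r, s):
    """Natural join built from Cartesian product, selection and projection."""
    (r_schema, r_rows), (s_schema, s_rows) = r, s
    common = [a for a in r_schema if a in s_schema]
    # Result schema R ∪ S: attributes of R, then the remaining attributes of S.
    out_schema = r_schema + tuple(a for a in s_schema if a not in common)
    result = set()
    for t, u in product(r_rows, s_rows):               # Cartesian product
        t_map = dict(zip(r_schema, t))
        u_map = dict(zip(s_schema, u))
        if all(t_map[a] == u_map[a] for a in common):  # selection on common attributes
            merged = {**u_map, **t_map}
            result.add(tuple(merged[a] for a in out_schema))  # projection onto R ∪ S
    return out_schema, result


R = (("A", "B"), {(1, 2), (3, 4), (5, 6)})
S = (("B", "C", "D"), {(4, 5, 6), (6, 7, 8), (8, 9, 10)})
join_schema, join_rows = natural_join(R, S)
print(join_schema, sorted(join_rows))
```

With no common attributes the selection condition is empty and therefore always true, so the function degenerates to the Cartesian product, as stated above.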
PERSON
ID NAME
1273 Dylan
3456 Reed
CAR
OWNERID BRAND
1273 Cadillac
1273 VW Beetle
3456 Stutz Bearcat
$r(PERSON) \bowtie \beta_{OWNERID \to ID}(r(CAR))$
ID NAME BRAND
1273 Dylan Cadillac
1273 Dylan VW Beetle
3456 Reed Stutz Bearcat
Returns all tuples from one relation that have a (natural) join partner
in the other relation
$r(R) \ltimes r(S) = \pi_R(r(R) \bowtie r(S))$
PERSON
PID NAME
1273 Dylan
2244 Cohen
3456 Reed
CAR
PID BRAND
1273 Cadillac
1273 VW Beetle
3456 Stutz Bearcat
$r(PERSON) \ltimes r(CAR)$
PID NAME
1273 Dylan
3456 Reed
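A semi-join can be sketched in Python without materialising the full join: it is enough to check whether a join partner exists. The (schema, set of tuples) encoding and the name `semi_join` are assumptions for illustration; the data is the PERSON/CAR example.

```python
def semi_join(r, s):
    """Keep the tuples of r that have a natural-join partner in s."""
    (r_schema, r_rows), (s_schema, s_rows) = r, s
    common = [a for a in r_schema if a in s_schema]
    r_idx = [r_schema.index(a) for a in common]
    s_idx = [s_schema.index(a) for a in common]
    # Values of the common attributes that occur in s.
    partners = {tuple(u[i] for i in s_idx) for u in s_rows}
    kept = {t for t in r_rows if tuple(t[i] for i in r_idx) in partners}
    return r_schema, kept


PERSON = (("PID", "NAME"), {(1273, "Dylan"), (2244, "Cohen"), (3456, "Reed")})
CAR = (("PID", "BRAND"),
       {(1273, "Cadillac"), (1273, "VW Beetle"), (3456, "Stutz Bearcat")})

semi_schema, semi_rows = semi_join(PERSON, CAR)
print(semi_schema, sorted(semi_rows))
```

Cohen has no car and is dropped; Dylan appears only once although he owns two cars, because the result is projected onto the attributes of PERSON.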
Rule 6 - The view updating rule: All views that are theoretically updatable
must be updatable by the system.
Rule 7 - High-level insert, update, and delete: Insert, update, and delete
operators can be applied to sets of tuples.
Rule 8 - Physical data independence: Changes to the physical schema level
(how data is stored) must not require a change to the logical
schema.
Rule 9 - Logical data independence: Changes to the logical schema level
must not require a change to an application (external schema
level) based on the structure.
... Table
Row
...
Some significant differences exist
Duplicate rows: The same row can appear more than once in an SQL
table (bag or list), while relations are sets.
Anonymous columns: A column in an SQL table can be unnamed (e.g.
the result of an aggregate function).
Duplicate column names: Two or more columns of the same SQL
table can have the same name.
Column order significance: The order of columns in an SQL table is
defined and significant.
Row order significance: SQL provides operations that return ordered
results (lists).
NULL-values: can appear instead of a value. Implies the use of
three-valued logic.
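Two of these differences, duplicate rows and three-valued logic, can be observed with a small experiment. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the table `t` and its contents are hypothetical and chosen only for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
con.executemany("INSERT INTO t VALUES (?, ?)", [(1, 2), (1, 2), (3, None)])

# Bag semantics: the duplicate row (1, 2) is stored and returned twice.
bag = con.execute("SELECT a, b FROM t").fetchall()
# DISTINCT restores set semantics.
as_set = con.execute("SELECT DISTINCT a, b FROM t").fetchall()

# Three-valued logic: for the row with b = NULL, both "b = 2" and
# "NOT (b = 2)" evaluate to UNKNOWN, so neither query returns that row.
eq = con.execute("SELECT COUNT(*) FROM t WHERE b = 2").fetchone()[0]
neq = con.execute("SELECT COUNT(*) FROM t WHERE NOT (b = 2)").fetchone()[0]
is_null = con.execute("SELECT COUNT(*) FROM t WHERE b IS NULL").fetchone()[0]

print(len(bag), len(as_set), eq, neq, is_null)
```

Note that the counts for `b = 2` and `NOT (b = 2)` do not add up to the number of rows; the NULL row is only reachable through `IS NULL`.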
SELECT *
FROM EMPLOYEE
WHERE DNO=5 AND SALARY>30000
SELECT LNAME,FNAME
FROM EMPLOYEE
SELECT *
FROM EMPLOYEE, PROJECT
SELECT *
FROM DEPARTMENT
NATURAL JOIN DEPT_LOCATIONS
SELECT *
FROM EMPLOYEE, DEPARTMENT
WHERE SSN=MGRSSN
SELECT * FROM R
UNION
SELECT * FROM S
SELECT * FROM R
UNION ALL
SELECT * FROM S
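The difference between the two set operations can be checked with a short `sqlite3` sketch: `UNION` removes duplicates (set semantics), `UNION ALL` keeps them (bag semantics). The contents of R and S below are hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE R (a INTEGER, b INTEGER);
CREATE TABLE S (a INTEGER, b INTEGER);
INSERT INTO R VALUES (1, 2), (3, 4);
INSERT INTO S VALUES (3, 4), (5, 6);
""")

# (3, 4) occurs in both tables.
union = con.execute("SELECT * FROM R UNION SELECT * FROM S").fetchall()
union_all = con.execute("SELECT * FROM R UNION ALL SELECT * FROM S").fetchall()

print(sorted(union))
print(sorted(union_all))
```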
UPDATE mytable
SET ...
WHERE ...
Conceptual Design
Conceptual Models
e.g. ER or UML
Database Models
Logical Design
e.g. RM, ORM, OO
Physical Design
Implementation
MatrNr Semester ID
Firstname Title
Lastname
N M
Student attends Course
Alternative notation:
[0..n] [3..m]
Student attends Course
Manager managed by
Employee manages Project
Alternative for:
[1..1] [0..n]
Car has Manufacturer
Self-referential on type-level
Different instances of the same type are related
Wife
Person
Lecturer Course
teaches
Room
Room
N 1
Building has Room
Each entity of one type must participate in the relationship at least
once, i.e. relate to at least one entity of the other type
Graphical notation: thick or double line
Difference to functional relationship: an entity may participate in
the relationship several times
Student of University
Alternative for:
[1..n] [0..n]
Student of University
Standard case: each entity of one type may or may not relate to
one or more entities of the other entity type
Graphical notation: thin line
Optional N:M-relationship
Alternative for:
[0..n] [0..n]
Person owns Car
Person
SSN
City
Name
Street
Address
ZIPcode
Person
SSN
Name
PhoneNumbers
Person
SSN
Name
BirthDate
Age
SSN
Person
Name
is_a
MatrNr
Student
Faculty
SSN
Person
Name
MatrNr
Student
Faculty
SSN
Person
Name
MatrNr
Student
Faculty
Employee
Manager Apprentice
Person
Man Woman
Person
Lecturer Student
Person
Lecturer Student
Tutor
Person SSN
Name
MatrNr Institute
Student Lecturer
Faculty Salary
Person(SSN, Name)
Employee
+ID: integer
+Name: string
+Department: integer = 7
-Salary
-raiseSalary(percent:integer)
+changeDepartment()
MatrNr Semester ID
Firstname Title
Lastname
Student Course
0..* attends 0..*
MatrNr ID
Firstname Title
Lastname
Semester
Person
Firstname
Lastname
Student Lecturer
MatrNr Institute
Faculty Salary
Existential dependency:
1 *
Book Chapter
No existential dependency:
1 5..20
Team Member
Country Australia:Country
Name Australia
Continent Australia
China:Country
China
Asia
Germany:Country
Germany
Europe
Search Courses
Create Timetable
Select Courses
Student
Save Timetable
Create Courses
Update Courses
Employee
Delete Courses
Department Employees
SSN Name Telephones Salary
Telephone
038203-12230
4711 Todd 0381-498-3401 6000
0381-498-3427
5588 Whitman 0391-345677 6000
0391-5592-3800
Computer Science 7754 Miller 550
8832 Kowalski 2800
Mathematics 6834 Wheat 750
∪, −, π, ./ as in relational algebra
σ condition extended to support
I Relations as operands (instead of constants of basic data types)
I Set comparisons θ ∈ {=, ⊆, ⊂, ⊃, ⊇}
Recursively structured operation parameters, e.g.
I π : nested projection attribute lists
I σ : selection conditions on nested relations
Additional operations
I ν (Nu) = Nest
I µ (Mu) = Unnest
r
A B C
1 2 7
1 3 6
1 4 5
2 1 1
$r' = \nu_{B,C;D}(r)$, and conversely $r = \mu_D(r')$
A D(B, C)
1 {(2, 7), (3, 6), (4, 5)}
2 {(1, 1)}
Nest is not always the inverse of unnest:
r'
A D(B, C)
1 {(2, 7), (3, 6)}
1 {(4, 5)}
2 {(1, 1)}
$r = \mu_D(r')$
A B C
1 2 7
1 3 6
1 4 5
2 1 1
but $\nu_{B,C;D}(r) \neq r'$
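The two examples can be replayed with a minimal Python sketch of nest (ν) and unnest (µ), fixed to the attribute layout A, B, C of the examples. The encoding (flat relation as a set of triples, nested relation as a set of pairs with a frozenset of (B, C) tuples) and the function names are assumptions for illustration.

```python
from collections import defaultdict


def nest(flat):
    """ν_{B,C;D}: group (a, b, c) tuples by a and collect the (b, c) pairs into D."""
    groups = defaultdict(set)
    for a, b, c in flat:
        groups[a].add((b, c))
    return {(a, frozenset(d)) for a, d in groups.items()}


def unnest(nested):
    """µ_D: flatten every (a, D) row into one (a, b, c) tuple per element of D."""
    return {(a, b, c) for a, d in nested for b, c in d}


r = {(1, 2, 7), (1, 3, 6), (1, 4, 5), (2, 1, 1)}

# First example: unnesting a nested flat relation gives back the original.
roundtrip_ok = unnest(nest(r)) == r

# Second example: r_prime splits the tuples with A = 1 over two rows.
r_prime = {
    (1, frozenset({(2, 7), (3, 6)})),
    (1, frozenset({(4, 5)})),
    (2, frozenset({(1, 1)})),
}
same_flat = unnest(r_prime) == r
restored = nest(unnest(r_prime)) == r_prime

print(roundtrip_ok, same_flat, restored)
```

Nesting always produces exactly one row per value of A, so the relation r' with two rows for A = 1 cannot be recovered once it has been unnested.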
A D
A C
B C
B D
1 2 3
PNF relation: Non-PNF relation: 1 2
4 2
2 3
2 1 1
1 3
4 1
2 4
3 1 1
A D
B C A B C
1 2 3
1 2 3
1 4 2
4 2
2 1 1
2 1 1
2 4 1
4 1
3 1 1
3 1 1
Telephones: set(string)
Price_Supplier: set(tuple(
Supplier: string,
Article: string,
Price: integer))
Authors of a document:
UPDATE Contacts
SET PhoneNumbers[3]='91011'
WHERE Name='Myers';
Explicit position
Unnesting of array
Operations:
I MULTISET constructor
I UNNEST as for arrays
I COLLECT: special aggregate function to implement NEST operation
I FUSION: special aggregate function to build union of aggregated
multi-sets
I UNION and INTERSECT
I CARDINALITY as for arrays
I SET to remove duplicates
Predicates:
I MEMBER: containment
I SUBMULTISET: multi-set containment
I IS A SET: test if duplicates exist
UPDATE Departments
SET Buildings=Buildings
MULTISET UNION MULTISET[17]
WHERE Name='Computer Science';
Unnesting of a multi-set
SELECT *
FROM Contacts, TABLE(Contacts.PhoneList) tel
WHERE VALUE(tel)='2';
SELECT *
FROM Contacts,
TABLE(Contacts.PhoneList),
TABLE(Contacts.AddressList);
SELECT *
FROM Departments,
TABLE(Departments.EmployeeList) emps;
SELECT Companies.Name,COLLECT(Subsidiaries.Town)
FROM Companies NATURAL JOIN Subsidiaries
GROUP BY(Companies.Name);
Based on OOP data models: these ODBMS used the standardized data
models of popular OO programming languages such as C++
(ONTOS, Objectivity, ObjectStore, Versant, Poet) or Smalltalk
(GemStone), later on Java (Jasmine, JD4O), and added DBMS
functionality (persistence, TXNs, collection types, queries, . . . )
Extensions of relational models: introduce object-oriented concepts
(types/classes, inheritance, object identity, methods, . . . ) in a
relational model (Postgres, Illustra) or build on top of existing
DBMS (Oracle, DB2, . . . ) → object-relational DBMS
Innate OO database models: developed independently of existing models
and systems (O 2 , ORION, Itasca)
A type describes the intension, i.e. the internal (the set of encapsulated,
possibly complex attributes) and external structure (the interface of the type
with methods and their signatures).
Subtype Supertype
Smith:Student
Eva
Jones
146444
Mathematics Superclass Subclass
Schema implementation:
ODL File
Database
Schema Executable DBMS
Application
Database
Database Other
Classes .java File javac Classes
.class File
Postprocessor
modified other
.class File .class Files
Database
Schema
Executable DBMS
Database Application
d_Database* db;
d_Ref<Person> p1 = new(db,"Person") Person(SSmith","C
Root Objects:
NY
"NY" Boston
LA
"Paris"
Atlanta
SF
Berlin
Paris
1) Transient Objects
b
a
c
2) makePersistent
b a
c
3) Program Exit
? a
?
1) Transient Objects
b
a
c
2) makePersistent
b a
a) object itself
c
Relational query semantics: returns set, bag, or list of struct (tuple) of literal
data types as in relational query languages
Object selection semantics: selects existing persistent objects from the
database
Object creation semantics: creates new (transient) objects based on values
derived from persistent objects by calling a constructor in the
SELECT clause (π)
class Employee (
extent employees) {
attribute long employeeNr;
attribute struct Name {
string firstname;
string lastname } name;
attribute Date dob;
attribute List<string> tel;
...
void raise_salary (in short amount);
};
float avgGrade ()
raises (no_Grade);
void enlist (in string faculty)
raises (already_enlisted);
};
m:n-Relationship
attends(Student[0,*], Course[0,*])
class Course {
relationship set<Student> attended_by
inverse Student::attends;
...
};
class Student {
relationship set<Course> attends
inverse Course::attended_by;
...
};
employees;
newyork.neighbor_cities;
SELECT-clause
I Allows method calls and sub-queries
I Allows constructor calls
I DISTINCT for result type set<...>
Result type:
set<struct<name: string,
projects: bag<long>>>
ELEMENT (SELECT p
FROM projects p
WHERE projectID = 4711)
SELECT (Employee) a
FROM apprentices a;
FROM-clause
I Class extension, collection-valued attribute or reference, result of a
method call, or sub-query
I Automatic conversion to bag
SELECT e.name
FROM (SELECT p.managed_by
FROM projects p
WHERE p.status = 'finished') e
WHERE e.salary > 9000
Inner path steps cannot be multi-valued, e.g. the following is not possible:
proj.participants.name;
Instead:
SELECT e.name
FROM proj.participants e
EXISTS p IN projects:
p.status = 'finished'
Object tables are tables defined based on an object type, i.e. they
contain objects instead of tuples
I Have a "hidden" OID column
I Always have set semantics because of the unique OID
I Can be the scope of a typed reference
I Methods can be used on objects
I Can be part of a table hierarchy based on the corresponding type hierarchy
1-to-many relationship between types and tables, i.e. the same
type can be used to define several tables
MULTISET SET
ROW OBJECT
<attributename> REF(<typename>)
[ SCOPE scopedescription ]
Scope is
One specified object table
One specified object view
Any of the tables and views of this type, if not specified
SELECT DEREF(manager)
FROM departments;
I Arrow operator
implicit join !
No multiple inheritance
Control of further sub-typing
I NOT FINAL: definition of sub-types allowed
I FINAL: no sub-types allowed
Table of Type
under under
Sub−Table of Subtype
Super−Table
under
Super−Type
under
oid A1 ... An
Recursion:
Interpretation
I Table with columns for type attributes
I Each row gets assigned OID
Intensional specialization
...
Semi-structured data/documents
I Data with an internally encoded, often changing, and not strongly
typed structure
I Used for data exchange or the WWW
No explicit structure
I Unknown, ambiguous, not required/needed
I Explicit description of structure too complex and expensive
Data not easily representable in tables
I Optional or alternative parts
I Repetitions
I Order is relevant
Graph-based Models
Examples: Object Exchange Model (OEM), Extensible Markup
Language (XML)
Document modeled as graph with
I Edges with element tag names
I Nodes with attribute/value pairs
I Leaves with values (Strings)
I Root node
Node = object
Household
Address Person
Person
Household
Firstname Email
Firstname
Name Lastname
[email protected]
Lou Lastname
John
Nico Cale
Reed
a c a c c d c d c d
b b
a1 b1 c1 a2 b2 c2 c2 d2 c3 d3 c4 d4
<tag>contents </tag>
<address>
<town>Ketchikan</town>
<zip>99950</zip>
<street>Paul Street</street>
</address>
<telephone>56-789012</telephone>
<fax/>
content:
I EMPTY: no further content
I ANY: arbitrary text or elements from this DTD
I #PCDATA: "parsed character data", i.e. element text without further
structure
I regular expression for nested element type
<town state="IL">Chicago</town>
<tel no="123459876"/>
Declaration
<!ATTLIST name type restrict default>
Alternative 1:
<lot>
<lotnr>42-33-66</lotnr>
<address>...</address>
</lot>
Alternative 2:
<lot lotnr="42-33-66">
<address>...</address>
</lot>
                 Element    Attribute
Identification   –          ID / IDREF
Quantifier       1/?/*/+    REQUIRED / IMPLIED
Alternatives     √          –
Default values   –          √
Enumerations     –          √
Contents         complex    atomic
Fixed order      yes        no
Example
<!ENTITY unimd "University of Magdeburg">
Usage
<description>
He studied at the &unimd; ...
</description>
<xs:schema
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:complexType name="AddressType">
<xs:sequence>
<xs:element name="street" type="xs:string" />
<xs:element name="town" type="xs:string" />
<xs:element name="zip" type="ZIPType" />
...
</xs:sequence>
</xs:complexType>
...
</xs:schema>
[Figure: XML Schema type hierarchy. anySimpleType is the root of the built-in simple types (dateTime, string, date, decimal, ..., with normalizedString below string); user-defined complex types and user-defined list and union types are shown alongside.]
<xs:simpleType name="AmountType">
<xs:restriction base="xs:integer">
<xs:minInclusive value="1" />
<xs:maxInclusive value="100" />
</xs:restriction>
</xs:simpleType>
<xs:simpleType name="ZIPType">
<xs:restriction base="xs:string">
<xs:pattern
value="[0-9]{5}(-[0-9]{4})?" />
</xs:restriction>
</xs:simpleType>
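The effect of the two restrictions can be approximated in Python. This is a sketch only, not an XML Schema validator: the function names `is_valid_amount` and `is_valid_zip` are hypothetical, and the one schema rule relied on is that a pattern facet must match the entire value, which corresponds to `re.fullmatch`.

```python
import re

# Same regular expression as the pattern facet of ZIPType.
ZIP_PATTERN = re.compile(r"[0-9]{5}(-[0-9]{4})?")


def is_valid_amount(value):
    """AmountType: an integer with minInclusive 1 and maxInclusive 100."""
    return isinstance(value, int) and 1 <= value <= 100


def is_valid_zip(value):
    """ZIPType: a string that matches the pattern as a whole."""
    return ZIP_PATTERN.fullmatch(value) is not None


for candidate in ("99950", "99950-1234", "9995", "99950-12"):
    print(candidate, is_valid_zip(candidate))
```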
Type definition
<xs:complexType name="AreaType">
<xs:simpleContent>
<xs:extension base="xs:decimal">
<xs:attribute name="Unit"
type="xs:string"/>
</xs:extension>
</xs:simpleContent>
</xs:complexType>
<xs:complexType name="StreetType">
<xs:sequence>
<xs:element name="Lot"
minOccurs="1" maxOccurs="unbounded">
<xs:complexType>
<xs:sequence>
<xs:element name="no" type="xs:integer"/>
<xs:element name="area" type="AreaType"/>
<xs:element name="owner" type="xs:string"/>
</xs:sequence>
<xs:attribute name="name" type="xs:string"
use="required"/>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
Uniqueness: unique
Key: key
Reference to a key: keyref
Assigned to elements or attributes using XPath (discussed later on)
<xs:element name="Lot">
<xs:complexType>
<xs:sequence> ... </xs:sequence>
<xs:attribute name="lotno" type="xs:integer"/>
</xs:complexType>
<xs:unique name="LotNumber">
<xs:selector xpath="lot"/>
<xs:field xpath="@lotno"/>
</xs:unique>
</xs:element>
XSLT XLink
XPath
XPointer
XQuery
Consist of steps
Steps separated by "/"
Processed from left to right
Each step yields a sequence of nodes or a value
Absolute path starting from the root node starts with "/"
Relative path without preceding "/"
Examples:
/book/title
/book/author/lastname
//author
following::
parent::
preceding::
preceding-sibling:: following-sibling::
self::
child::
descendant::
descendant-or-self::
fn:doc("books.xml")
/child::books/child::book
fn:doc("books.xml")/child::books
/child::book[attribute::isbn='1-2345']
fn:doc("books.xml")
/descendant-or-self::book
[child::author/child::lastname='Sattler']
/child::title
Context position:
child::author[1]
Boolean predicate:
child::book[@isbn = '1234']
Expressions over sequences:
child::author[fn:position() = (1, fn:last())]
Combination of predicates:
child::author[1][child::name='Heuer']
fn:doc("books.xml")
/books/book[@isbn='1-55860-622-X']
fn:doc("books.xml")
//book/author[1]/lastname
fn:doc("books.xml")
//book[author/lastname='Sattler']/title
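The three abbreviated path expressions can be tried out with Python's standard `xml.etree.ElementTree` module, which supports only a limited subset of XPath. The document below is a hypothetical stand-in for books.xml, since its real content is not given. The third query uses a nested path inside the predicate, which is outside that subset, so this sketch filters in Python instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document standing in for books.xml.
BOOKS_XML = """
<books>
  <book isbn="1-55860-622-X">
    <title>Title A</title>
    <author><lastname>Sattler</lastname></author>
    <author><lastname>Saake</lastname></author>
  </book>
  <book isbn="1-2345">
    <title>Title B</title>
    <author><lastname>Heuer</lastname></author>
  </book>
</books>
"""

root = ET.fromstring(BOOKS_XML.strip())  # root is the <books> element

# /books/book[@isbn='1-55860-622-X'] (paths are relative to the root element here)
by_isbn = root.findall("./book[@isbn='1-55860-622-X']")

# //book/author[1]/lastname
first_authors = [e.text for e in root.findall(".//book/author[1]/lastname")]

# //book[author/lastname='Sattler']/title, with the predicate evaluated in Python
titles = [
    book.findtext("title")
    for book in root.iter("book")
    if any(a.findtext("lastname") == "Sattler" for a in book.findall("author"))
]

print(by_isbn[0].findtext("title"), first_authors, titles)
```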
with
for-expression ::= for $var in expression
let-expression ::= let $var := expression
Example:
let $b := fn:doc("books.xml")//book
return $b
Evaluation:
1 All book nodes from document books.xml
2 set of values assigned to $b
3 Output of set $b
Example:
for $b in fn:doc("books.xml")//book
return $b
Evaluation:
1 Binding for each element to $b
2 Further clauses, e.g. (where, return, . . . ) are processed for each
element, i.e. return executed for each book
for-clause
for $i in (1, 2, 3)
return <tuple><i>{ $i } </i></tuple>
<tuple><i>1</i></tuple>
<tuple><i>2</i></tuple>
<tuple><i>3</i></tuple>
let-clause
let $i := (1, 2, 3)
return <tuple><i>{ $i } </i></tuple>
<tuple><i>1 2 3</i></tuple>
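The difference between the two clauses can be mimicked in Python: `for` binds the variable to one item at a time and evaluates `return` once per binding, while `let` binds the variable to the whole sequence and evaluates `return` once. The helper `element` is a hypothetical stand-in for an XQuery element constructor and simply builds strings.

```python
def element(tag, content):
    """Hypothetical helper: serialise an element with the given text content."""
    return f"<{tag}>{content}</{tag}>"


items = (1, 2, 3)

# for $i in (1, 2, 3): one result per item.
for_result = [element("tuple", element("i", i)) for i in items]

# let $i := (1, 2, 3): a single result containing the whole sequence,
# whose atomic values are separated by spaces in the element content.
let_result = [element("tuple", element("i", " ".join(str(i) for i in items)))]

print(for_result)
print(let_result)
```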
element price {
attribute currency { "Euro"}, "59.00"}
Result:
<price currency="Euro">59.00</price>
for $b in fn:doc("books.xml")//book,
$r in fn:doc("reviews.xml")//reviews
where $b/title = $r/booktitle
return <bookreview> { $b/title, $r/score }
</bookreview>
let $b := fn:doc("books.xml")//book
return <costs>
{ fn:sum($b/price) }
</costs>
let $b := fn:doc("books.xml")//book
let $avg := fn:avg($b/price)
return $b[price > $avg]
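The two aggregate queries (total costs, and books priced above the average) correspond to the following Python sketch. The list `books` with its titles and prices is hypothetical sample data, not the content of books.xml.

```python
from statistics import mean

# Hypothetical sample data standing in for the book elements.
books = [
    {"title": "Title A", "price": 30.0},
    {"title": "Title B", "price": 50.0},
    {"title": "Title C", "price": 70.0},
]

# let $b := ...//book  return <costs>{ fn:sum($b/price) }</costs>
costs = sum(book["price"] for book in books)

# let $avg := fn:avg($b/price)  return $b[price > $avg]
avg = mean([book["price"] for book in books])
above_average = [book["title"] for book in books if book["price"] > avg]

print(costs, avg, above_average)
```

Both queries bind the whole sequence of books once with `let`, which is what allows the aggregate to be computed over all prices before the filter is applied.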
Syntax:
if (expr1) then expr2 else expr3
Example:
for $b in fn:doc("books.xml")//book
return <book> { $b/title }
{ for $a at $i in $b/author
where $i <= 2
return <author>{ string($a/lastname), ",",
string($a/firstname)}</author> }
{ if (count($b/author) > 2)
then <author>et al.</author> else () }
</book>
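The logic of this query (list at most two authors, then add "et al." if there are more) can be sketched in Python. The function name `format_authors` and the placeholder names are hypothetical, and the string formatting only approximates the serialisation of the XQuery result.

```python
def format_authors(authors, limit=2):
    """Return at most `limit` authors as "lastname, firstname", plus "et al." if truncated."""
    # for $a at $i in $b/author where $i <= 2
    shown = [f"{last}, {first}" for last, first in authors[:limit]]
    # if (count($b/author) > 2) then <author>et al.</author> else ()
    if len(authors) > limit:
        shown.append("et al.")
    return shown


# Placeholder (lastname, firstname) pairs, for illustration only.
three = [("Last1", "First1"), ("Last2", "First2"), ("Last3", "First3")]
print(format_authors(three))
print(format_authors(three[:2]))
```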
for $b in fn:doc("books.xml")//book
where some $a in $b/author/lastname
satisfies $a = "Ullman"
return $b
Aggregate functions
Pre-defined numeric functions: fn:abs(v ), fn:ceiling(v ),
fn:floor(v ), fn:round(v ), . . .
String functions : fn:compare(s1 , s2 ), fn:concat(s1 , s2 ),
fn:string-length(s), fn:upper-case(s),
fn:substring(s, b, e), fn:string-join(s, t), . . .
Regular expressions: fn:matches(s, p)
Functions for date and time: fn:current-date(),
fn:get-day-from-date(d), . . .
fn:ceiling(42.9) (: 43 :)
fn:substring("XQuery", 2, 2) (: "Qu":)
fn:get-day-from-date("2005-06-30") (: 30 :)