Database Management Systems Lecture Note
ANSI-SPARC Architecture
The purpose and origin of the Three-Level database architecture
All users should be able to access the same data. This is important because the database
is a shared resource: all the data is stored in one location, and each user
has their own customized way of interacting with it.
A user's view is unaffected by, or immune to, changes made in other views. Since the
requirements of one user are independent of those of another, a change made in one user's
view should not affect other users.
Users should not need to know physical database storage details. As there are naïve
users of the system, hardware level or physical details should be a black-box for
such users.
The DBA should be able to change database storage structures without affecting the
users' views. A change in file organization or access method should not affect the
structure of the data, which in turn means it has no effect on the users.
The internal structure of the database should be unaffected by changes to physical aspects of
storage, such as a change of hard disk.
The DBA should be able to change the conceptual structure of the database without affecting all
users. In any database system, the DBA has the privilege to change the
structure of the database, such as adding tables, adding or deleting an attribute, or
changing the specification of the objects in the database.
All of the above, and much more functionality, is made possible by the three-level
ANSI-SPARC architecture.
Three-level ANSI-SPARC Architecture of a Database
ANSI-SPARC Architecture and Database Design Phases
1. External Level: Users' view of the database. It describes the part of the database that is
relevant to a particular user. Different users have their own customized view of the
database, independent of other users.
2. Conceptual Level: Community view of the database. It describes what data is stored
in the database and the relationships among the data, along with the business constraints.
3. Internal Level: Physical representation of the database on the computer. It describes
how the data is stored in the database.
The following example illustrates the difference between the three
levels in the ANSI-SPARC database architecture, where:
The first level is concerned with groups of users and their respective data
requirements, each independent of the others.
The second level describes the whole content of the database, where each piece of
information is represented once.
The third level describes how the data is physically stored on the computer.
Differences between Three Levels of ANSI-SPARC Architecture
The architecture defines DBMS schemas at three levels:
Internal schema: at the internal level, describes physical storage structures and access
paths. It typically uses a physical data model, which is specific to a particular DBMS.
Conceptual schema: at the conceptual level, describes the structure and constraints of
the whole database for a community of users. It uses a conceptual or an implementation
data model.
External schema: at the external level, describes the various user views. It usually uses
the same data model as the conceptual level.
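The three schemas can be loosely mimicked with SQLite from Python's standard library. This is only a sketch under stated assumptions: the `staff` table, the two views, and the index name are hypothetical, and SQLite is used as a convenient stand-in rather than as a full ANSI-SPARC implementation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: one community-wide definition of what data is stored.
conn.execute("""
    CREATE TABLE staff (
        staff_no INTEGER PRIMARY KEY,
        name     TEXT NOT NULL,
        branch   TEXT NOT NULL,
        salary   REAL NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?, ?)",
    [(1, "Abebe", "B001", 4000.0), (2, "Sara", "B002", 5200.0)],
)

# External level: each user group gets its own view of the same stored data.
conn.execute("CREATE VIEW payroll_view AS SELECT staff_no, salary FROM staff")
conn.execute("CREATE VIEW directory_view AS SELECT name, branch FROM staff")

# Internal level: an access-path decision that neither view needs to know about.
conn.execute("CREATE INDEX idx_staff_branch ON staff (branch)")

payroll_rows = conn.execute(
    "SELECT * FROM payroll_view ORDER BY staff_no"
).fetchall()
directory_rows = conn.execute(
    "SELECT * FROM directory_view ORDER BY name"
).fetchall()
print(payroll_rows)
print(directory_rows)
```

Each view exposes only the columns relevant to its users, while the data itself is stored once in `staff`.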
Data Independence
Logical Data Independence:
The capacity to change the conceptual schema without having to change the external
schemas and their application programs.
Refers to the immunity of external schemas to changes in the conceptual schema.
Conceptual schema changes, e.g. the addition or removal of entities, should not require
changes to external schemas or rewrites of application programs.
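Logical data independence can be illustrated with a small SQLite sketch. The table, view, and column names below are hypothetical; the point is only that a query written against the external view keeps returning the same result after the conceptual schema grows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staff (staff_no INTEGER PRIMARY KEY, name TEXT, branch TEXT)"
)
conn.execute("INSERT INTO staff VALUES (1, 'Abebe', 'B001')")

# External schema: the only thing the application queries.
conn.execute("CREATE VIEW directory_view AS SELECT name, branch FROM staff")


def directory():
    """Application query, written once against the external view."""
    return conn.execute("SELECT name, branch FROM directory_view").fetchall()


before = directory()

# Conceptual schema changes: a new attribute and a new entity are added.
conn.execute("ALTER TABLE staff ADD COLUMN salary REAL")
conn.execute("CREATE TABLE project (project_no INTEGER PRIMARY KEY, title TEXT)")

after = directory()
print(before == after)
```

The application function `directory()` is not touched between the two calls, which is the property the definition above asks for.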
Physical Data Independence:
The capacity to change the internal (physical) schema without having to change the
conceptual (logical) schema.
Refers to the immunity of the conceptual schema to changes in the internal schema.
Internal schema changes, e.g. using different file organizations or storage
structures/devices, should not require changes to the conceptual or external schemas.
Applications depend on the logical schema, so they are unaffected when only the
physical schema changes.
In general, the interfaces between the various levels and components should be well
defined so that changes in some parts do not seriously influence others.
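Physical data independence can be sketched in the same way. In this minimal example (hypothetical table and index names, SQLite as a stand-in), an index is created and later dropped, which changes the access path but not the answer to the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staff (staff_no INTEGER PRIMARY KEY, name TEXT, branch TEXT)"
)
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [(i, f"name{i}", f"B{i % 3}") for i in range(1, 10)],
)

# The application's query never mentions how the data is stored.
QUERY = "SELECT staff_no FROM staff WHERE branch = 'B1' ORDER BY staff_no"

without_index = conn.execute(QUERY).fetchall()

# Internal schema change: add an access path on the branch column.
conn.execute("CREATE INDEX idx_staff_branch ON staff (branch)")
with_index = conn.execute(QUERY).fetchall()

# Internal schema change: remove the access path again.
conn.execute("DROP INDEX idx_staff_branch")
after_drop = conn.execute(QUERY).fetchall()

print(without_index == with_index == after_drop)
```

Only the cost of answering the query may differ between the three runs; the query text and its result stay the same.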
Data Independence and the ANSI-SPARC Three-level Architecture
The distinction between a Data Definition Language (DDL) and a Data
Manipulation Language (DML)
Database Languages
Data Definition Language (DDL)
Allows the DBA or user to describe and name the entities, attributes and relationships
required for the application.
A specification notation for defining the database schema.
Data Manipulation Language (DML)
Provides basic data manipulation operations on data held in the database.
A language for accessing and manipulating the data organized by the appropriate
data model.
A DML is also known as a query language.
Procedural DML: the user specifies what data is required and how to get the data.
Non-Procedural DML: the user specifies what data is required but not how it is to be
retrieved.
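The difference between the two styles can be sketched in Python. The sample rows and the `staff` table are hypothetical; the loop stands in for record-at-a-time procedural access, and the SQL statement for a non-procedural request.

```python
import sqlite3

rows = [(1, "Abebe", 4000.0), (2, "Sara", 5200.0), (3, "Kebede", 6100.0)]

# Procedural style: the program spells out HOW to find the data,
# visiting one record at a time and testing each one.
procedural_result = []
for staff_no, name, salary in rows:
    if salary > 5000:
        procedural_result.append(name)

# Non-procedural style: the statement says WHAT is wanted;
# the DBMS decides how to retrieve it.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staff (staff_no INTEGER PRIMARY KEY, name TEXT, salary REAL)"
)
conn.executemany("INSERT INTO staff VALUES (?, ?, ?)", rows)
declarative_result = [
    name
    for (name,) in conn.execute(
        "SELECT name FROM staff WHERE salary > 5000 ORDER BY staff_no"
    )
]

print(procedural_result, declarative_result)
```

Both approaches return the same names; only the first one fixes the retrieval strategy in the program itself.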
Data Control Language (DCL)
Allows a DBA to define access control and privileges for users.
It is a mechanism for implementing security at the database object level.
Uses the SQL GRANT and REVOKE statements.
SQL is the most widely used non-procedural query language
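The three language categories can be told apart by looking at typical SQL statements. The sketch below runs the DDL and DML statements against an in-memory SQLite database; the `course` table and the `registrar` user are hypothetical. SQLite has no user accounts and does not implement GRANT or REVOKE, so the DCL statements are shown as text only and are not executed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: defines the schema (names the entity and its attributes).
conn.execute("CREATE TABLE course (code TEXT PRIMARY KEY, title TEXT NOT NULL)")

# DML: inserts, updates, deletes and retrieves the data itself.
conn.execute("INSERT INTO course VALUES ('DB101', 'Databases')")
conn.execute("INSERT INTO course VALUES ('OS201', 'Operating Systems')")
conn.execute("UPDATE course SET title = 'Database Systems' WHERE code = 'DB101'")
conn.execute("DELETE FROM course WHERE code = 'OS201'")
courses = conn.execute("SELECT code, title FROM course").fetchall()

# DCL: text only, for a DBMS that supports users and privileges.
# 'registrar' is a hypothetical user name.
dcl_examples = [
    "GRANT SELECT ON course TO registrar",
    "REVOKE SELECT ON course FROM registrar",
]

print(courses)
```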
Fourth Generation Language (4GL)
Query Languages
Forms Generators
Report Generators
Graphics Generators
Application Generators
A Classification of data models
Data Model
A specific DBMS has its own Data Definition Language to define a database schema,
but this type of language is too low level to describe the data requirements of an organization
in a way that is readily understandable by a variety of users.
A higher-level description is needed.
Such a higher-level description of the database schema is called a data model.
Data Model: a set of concepts to describe the structure of a database, and certain constraints
that the database should obey.
A data model is a description of the way that data is stored in a database. A data model helps
to understand the relationships between entities and to create the most effective structure to
hold the data.
A data model is a collection of tools or concepts for describing:
Data
Data relationships
Data semantics
Data constraints
The main purpose of a data model is to represent the data in an understandable way.
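As a small illustration of data, relationships and constraints being described together, the following sketch declares two related tables in SQLite and lets the DBMS reject rows that break the declared constraints. The `department` and `employee` tables and their columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute(
    "CREATE TABLE department (dept_no INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute("""
    CREATE TABLE employee (
        emp_no  INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        salary  REAL CHECK (salary >= 0),                         -- data constraint
        dept_no INTEGER NOT NULL REFERENCES department (dept_no)  -- relationship
    )
""")
conn.execute("INSERT INTO department VALUES (10, 'Finance')")
conn.execute("INSERT INTO employee VALUES (1, 'Abebe', 4000.0, 10)")


def rejected(sql):
    """Return True if the DBMS refuses the statement as a constraint violation."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


negative_salary_rejected = rejected(
    "INSERT INTO employee VALUES (2, 'Sara', -1.0, 10)"
)
unknown_dept_rejected = rejected(
    "INSERT INTO employee VALUES (3, 'Kebede', 3000.0, 99)"
)
print(negative_salary_rejected, unknown_dept_rejected)
```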
Categories of data models include:
Object-based
Record-based
Physical
Record-based Data Models
Consist of a number of fixed-format records.
Each record type defines a fixed number of fields.
Each field is typically of a fixed length.
Hierarchical Data Model
Network Data Model
Relational Data Model
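The idea of a fixed-format record (a fixed number of fields, each of a fixed length) can be sketched with Python's `struct` module. The record layout below, a 4-byte number followed by a 20-byte name and an 8-byte department code, is a hypothetical example and not the storage format of any particular DBMS.

```python
import struct

# Hypothetical record type: 4-byte unsigned id, 20-byte name, 8-byte dept code.
# "<" selects little-endian byte order with no padding between fields.
STAFF_RECORD = struct.Struct("<I20s8s")


def pack_staff(staff_no, name, dept):
    """Encode one record; short strings are padded with zero bytes."""
    return STAFF_RECORD.pack(staff_no, name.encode("ascii"), dept.encode("ascii"))


def unpack_staff(raw):
    """Decode one record and strip the zero-byte padding."""
    staff_no, name, dept = STAFF_RECORD.unpack(raw)
    return (
        staff_no,
        name.rstrip(b"\x00").decode("ascii"),
        dept.rstrip(b"\x00").decode("ascii"),
    )


records = pack_staff(1, "Abebe", "FIN") + pack_staff(2, "Sara", "HR")

# Every record has the same length, so record n starts at n * record size.
size = STAFF_RECORD.size
second = unpack_staff(records[size : 2 * size])
print(size, second)
```

Because the length is fixed, any record can be located by simple arithmetic on its position, without scanning the ones before it.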
1. Hierarchical Model
The simplest data model
Record type is referred to as node or segment
The top node is the root node
Nodes are arranged in a hierarchical structure, as a sort of upside-down tree
A parent node can have more than one child node
A child node can only have one parent node
The relationship between parent and child is one-to-many
The relationship is established by creating physical links between stored records (each is
stored with a predefined access path to other records)
To add a new record type or relationship, the database must be redefined and then
stored in a new form.
[Figure: example hierarchy with Department at the root, Employee and Job at the second
level, and Time Card and Activity at the third level]
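The hierarchical structure described above can be sketched as a tree in Python. The record type names come from the example hierarchy; which parent Time Card and Activity belong to is an assumption made here for illustration (Time Card under Employee, Activity under Job), and the `Segment` class is hypothetical.

```python
class Segment:
    """A record type (node/segment) with exactly one parent, except the root."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent      # a child node has only one parent
        self.children = []        # a parent may have many children (one-to-many)
        if parent is not None:
            parent.children.append(self)


def preorder(node):
    """Visit a node, then each of its subtrees from left to right."""
    yield node.name
    for child in node.children:
        yield from preorder(child)


def path_from_root(node):
    """Follow parent links upward; this is the single access path to the node."""
    path = []
    while node is not None:
        path.append(node.name)
        node = node.parent
    return list(reversed(path))


department = Segment("Department")                  # root node
employee = Segment("Employee", parent=department)
job = Segment("Job", parent=department)
time_card = Segment("Time Card", parent=employee)   # assumed parent
activity = Segment("Activity", parent=job)          # assumed parent

order = list(preorder(department))
print(order)
print(path_from_root(activity))
```

Every record is reached by navigating from the root along one fixed path, which is why this style of processing is called navigational.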
ADVANTAGES of Hierarchical Data Model:
Hierarchical Model is simple to construct and operate on
Corresponds to a number of natural hierarchically organized domains - e.g.,
assemblies in manufacturing, personnel organization in companies
Language is simple; uses constructs like GET, GET UNIQUE, GET NEXT,
GET NEXT WITHIN PARENT etc.
DISADVANTAGES of Hierarchical Data Model:
Navigational and procedural nature of processing
Database is visualized as a linear arrangement of records
Little scope for "query optimization"
2. Network Model
Allows record types to have more than one parent, unlike the hierarchical model
The network data model sees records as set members
Each set has an owner and one or more members
Allows no many-to-many relationships between entities
Like the hierarchical model, the network model is a collection of physically linked records
Allows member records to have more than one owner
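The owner/member idea can be sketched in a few lines of Python. Everything here is hypothetical: the `Record` and `SetType` classes, the Supplier, Part and Supply record names, and the rule (assumed for this sketch) that a member has at most one owner within a given set type. Each set is one-to-many from owner to members, yet a Supply record ends up with two owners because it participates in two different set types.

```python
class Record:
    def __init__(self, name):
        self.name = name
        self.owners = {}          # set type name -> owner record


class SetType:
    """A named owner-to-members link; each owner has its own set of members."""

    def __init__(self, name):
        self.name = name
        self.members = {}         # owner name -> list of member records

    def connect(self, owner, member):
        # Assumption of this sketch: one owner per member within a set type.
        if self.name in member.owners:
            raise ValueError("member already has an owner in this set type")
        self.members.setdefault(owner.name, []).append(member)
        member.owners[self.name] = owner

    def find_members(self, owner):
        return [m.name for m in self.members.get(owner.name, [])]


supplier_a = Record("Supplier A")
part_x = Record("Part X")
part_y = Record("Part Y")
supply_1 = Record("Supply 1")
supply_2 = Record("Supply 2")

supplier_supply = SetType("SUPPLIER-SUPPLY")
part_supply = SetType("PART-SUPPLY")

# Each Supply record is a member of two sets, so it has two owners.
supplier_supply.connect(supplier_a, supply_1)
part_supply.connect(part_x, supply_1)
supplier_supply.connect(supplier_a, supply_2)
part_supply.connect(part_y, supply_2)

owners_of_supply_1 = sorted(o.name for o in supply_1.owners.values())
print(supplier_supply.find_members(supplier_a))
print(owners_of_supply_1)
```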
ADVANTAGES of Network Data Model:
Network Model is able to model complex relationships and represents semantics of
add/delete on the relationships.
Can handle most situations for modeling using record types and relationship types.
Language is navigational; uses constructs like FIND, FIND member, FIND owner,
FIND NEXT within set, GET etc. Programmers can do optimal navigation through
the database.
DISADVANTAGES of Network Data Model:
Navigational and procedural nature of processing
Database contains a complex array of pointers that thread through a set of records.
Little scope for automated "query optimization"
3. Relational Data Model
Developed by Dr. Edgar Frank Codd in 1970 (famous paper, 'A Relational Model of Data for
Large Shared Data Banks')
Its terminology originates from the branches of mathematics called set theory and
predicate logic, and it is based on the mathematical concept of a relation
Can define more flexible and complex relationships
Viewed as a collection of tables called "relations", equivalent to a collection of record
types
Relation: a two-dimensional table
Stores information or data in the form of tables with rows and columns
A row of the table is called a tuple, equivalent to a record
A column of a table is called an attribute, equivalent to a field
A data value is the value of an attribute
Records are related by the data stored jointly in the fields of records in two tables or
files. The related tables contain the information that creates the relationship
The tables seem to be independent but are related somehow.
No physical consideration of the storage is required by the user
Many tables are merged together to come up with a new virtual view of the
relationship
Alternative terminologies
Relation     Table     File
Tuple        Row       Record
Attribute    Column    Field
The rows represent records (collections of information about separate items)
The columns represent fields (particular attributes of a record)
Conducts searches by using data in specified columns of one table to find
additional data in another table
In conducting searches, a relational database matches information from a field in
one table with information in a corresponding field of another table to produce a
third table that combines requested data from both tables
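The matching of a field in one table with the corresponding field in another can be sketched with a join in SQLite. The `department` and `employee` tables and their contents are hypothetical; the shared `dept_no` column is the data that relates the two tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE department (dept_no INTEGER PRIMARY KEY, dept_name TEXT)"
)
conn.execute(
    "CREATE TABLE employee (emp_no INTEGER PRIMARY KEY, emp_name TEXT, dept_no INTEGER)"
)
conn.executemany(
    "INSERT INTO department VALUES (?, ?)", [(10, "Finance"), (20, "Sales")]
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Abebe", 10), (2, "Sara", 20), (3, "Kebede", 10)],
)

# Match employee.dept_no with department.dept_no to produce a third,
# virtual table that combines the requested data from both tables.
joined = conn.execute("""
    SELECT e.emp_name, d.dept_name
    FROM employee AS e
    JOIN department AS d ON e.dept_no = d.dept_no
    ORDER BY e.emp_no
""").fetchall()
print(joined)
```

No physical link is stored between the two tables; the relationship exists only through the matching values in the `dept_no` columns.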