UNIT 1 On Databases
Introduction
Contents
• Introduction to databases – Database system evolution, File Systems,
Database approach, Fundamental Concepts, Components of database
systems, Roles in the database environment, DBMS functions and
components, Advantages and disadvantages of using databases
• Database System Concepts and Architectures – Data Models, Schemas,
and Instances, DBMS Architecture and Data Independence, Database
Languages and Interfaces, The Database System Environment, Classification
of Database Management Systems, Database Applications
• Database Design: Database application life cycle, Overview of the design
process, Logical and physical database design, Design strategies and
methodologies
Database System Evolution
Flat files (1960s – 1980s)
• A flat file database stores information in a single file or table. In a text file, every line contains one record, and the fields
either have a fixed length or are separated by commas, whitespace, tabs or another delimiter character. There is no
structural relationship among the records, and the file cannot contain multiple tables.
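As a minimal sketch of this layout, the Python snippet below parses a comma-separated flat file held in a string. The field names and sample records are invented for illustration; the lookup is a plain linear scan, which is why retrieval slows down as the file grows.

```python
import csv
import io

# Hypothetical flat file: one record per line, fields separated by commas.
FLAT_FILE = """roll_no,name,city
1,Asha,Pune
2,Ravi,Delhi
3,Meena,Pune
"""

def read_records(text):
    """Return every record in the flat file as a dict keyed by field name."""
    return list(csv.DictReader(io.StringIO(text)))

def find_by_city(records, city):
    """Linear scan: with no index, every record has to be inspected."""
    return [r for r in records if r["city"] == city]

records = read_records(FLAT_FILE)
pune_students = find_by_city(records, "Pune")
print([r["name"] for r in pune_students])
```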
Advantages:
• A flat file database is best suited to small data sets.
• It is easy to understand and implement, so fewer skills are required to handle it.
• Little hardware or software expertise is required to maintain it.
Disadvantages:
• A flat file may contain fields that duplicate data, because nothing in a flat file automatically prevents redundancy.
• To delete one record, all the related information in other fields has to be deleted manually,
which makes data manipulation inefficient.
• A flat file database wastes storage space because it has to keep fields for information that logically cannot exist for some items.
• Retrieving information from a large flat file is very time consuming.
Implementation of a flat file database
Flat file databases are commonly said to be implemented in: Berkeley DB, SQLite, Mimesis
Hierarchical database (1970s – 1990s)
• As the name indicates, a hierarchical database arranges data in a hierarchy. It can be visualized as a
family tree with parent and child relationships. Each parent can have many children, but each child can have only one parent, i.e.
a one-to-many relationship. The hierarchical structure contains levels, or segments, which are equivalent to the file system’s record types. All
attributes of a specific record are listed under the entity type.
• In a hierarchical database, the entity type is the main table, the rows of a table represent the records and the columns represent the attributes.
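The parent and child rule can be sketched in a few lines of Python. The `Segment` class and the college hierarchy below are illustrative assumptions, not part of any real hierarchical DBMS; the point is that each node stores exactly one parent, so the path from the root to a record is fixed in advance.

```python
class Segment:
    """A node in a hierarchy: exactly one parent, any number of children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def path(self):
        """Walk up the single parent chain to build the predefined access path."""
        node, parts = self, []
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "/".join(reversed(parts))

# Hypothetical hierarchy: a college with departments and courses.
college = Segment("College")
science = Segment("Science", parent=college)
arts = Segment("Arts", parent=college)
physics = Segment("Physics", parent=science)

print(physics.path())
```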
Advantages:
• In a hierarchical database, information can be accessed quickly because the access paths are predefined. This improves the performance of the
database.
• The relationships among different entities are easy to understand.
Disadvantages:
• The hierarchical database model lacks flexibility. If a new relationship is to be established between two entities, a new and possibly
redundant database structure has to be built.
• Maintenance of data is inefficient in a hierarchical model. Any change in the relationships may require manual reorganization of the
data.
• This model is also inefficient for non-hierarchical access.
Network database (1970s – 1990s)
The network model was invented by Charles Bachman. Unlike the hierarchical database model, a network database allows a child record to have
multiple parents, i.e. it supports many-to-many relationships. A network database is basically a graph structure. The network database model was
created to achieve three main objectives:
• To represent complex data relationships more effectively.
• To improve the performance of the database.
• To implement a database standard.
In a network database, a relationship is referred to as a set. Each set consists of two types of records: an owner record, which corresponds to the
parent in the hierarchical model, and a member record, which corresponds to the child.
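The owner and member idea can be sketched as a toy in-memory structure. The `NetworkDB` class, the set name `teaches_in`, and the sample records are all hypothetical; the sketch only illustrates that a member may belong to several owners and that an owner must exist before its members.

```python
from collections import defaultdict

class NetworkDB:
    """Toy network model: named sets link one owner record to member records."""

    def __init__(self):
        self.records = set()
        # (set_name, owner) -> list of members
        self.sets = defaultdict(list)

    def add_record(self, record):
        self.records.add(record)

    def connect(self, set_name, owner, member):
        """Add a member to an owner's set; the owner must already exist."""
        if owner not in self.records:
            raise ValueError(f"owner {owner!r} must be created first")
        self.records.add(member)
        self.sets[(set_name, owner)].append(member)

    def members(self, set_name, owner):
        return list(self.sets[(set_name, owner)])

    def owners_of(self, member):
        """A member may appear under several owners, unlike in a hierarchy."""
        return sorted(o for (_, o), ms in self.sets.items() if member in ms)

db = NetworkDB()
db.add_record("Maths Dept")
db.add_record("Physics Dept")
db.connect("teaches_in", "Maths Dept", "Dr. Rao")
db.connect("teaches_in", "Physics Dept", "Dr. Rao")
print(db.owners_of("Dr. Rao"))
```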
Advantages:
• The network database model makes data access easy and efficient, because an application can access the owner record and all the
member records within a set.
• This model is conceptually easy to design.
• This model ensures data integrity because no member can exist without an owner, so the user must create the owner entry before the
member records.
• The network model also provides data independence, because the application works independently of the data.
Disadvantages:
• The model lacks structural independence: any change to the database structure requires
the application programs to be modified before they can access the data.
• It is difficult to build a user-friendly database management system on the network model.
Implementation of network database
• Network database is implemented in:
• Digital Equipment Corporation DBMS-10
• Digital Equipment Corporation DBMS-20
• RDM Embedded
• TurboIMAGE
• Univac DMS-1100 etc.
Relational database (1980s – present)
• The relational database model was proposed by E.F. Codd. After the hierarchical and network models, this
model was a huge step ahead. It allows entities to be related through a common attribute, so two tables
(entities) simply need to share a common attribute in order to be related. Tables have primary
keys and foreign keys, and a foreign key in one table refers to the primary key of another. This property makes the
model extremely flexible.
• Thus, using a relational database, a large amount of information can be stored in small tables. Data access is
also very efficient: the user only has to enter a query, and the application returns the requested
information.
• Relational databases are defined and queried using a computer language, Structured Query Language (SQL). This
language forms the basis of most relational database products available today, from Access to Oracle.
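Relating two tables through a common attribute can be sketched with Python's built-in `sqlite3` module. The `department` and `student` tables, their columns, and the sample rows are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (
        dept_id INTEGER PRIMARY KEY,
        dept_name TEXT NOT NULL
    );
    CREATE TABLE student (
        std_id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        dept_id INTEGER REFERENCES department(dept_id)
    );
    INSERT INTO department VALUES (1, 'Physics'), (2, 'History');
    INSERT INTO student VALUES (10, 'Asha', 1), (11, 'Ravi', 2), (12, 'Meena', 1);
""")

# The common attribute dept_id relates the two tables.
rows = conn.execute("""
    SELECT s.name, d.dept_name
    FROM student AS s
    JOIN department AS d ON s.dept_id = d.dept_id
    ORDER BY s.std_id
""").fetchall()
print(rows)
```

The user states only what is wanted (names with department names); how the rows are located and matched is left to the database engine.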
Advantages:
• A relational database supports mathematical set operations such as union, intersection, difference and
Cartesian product. It also supports the select, project, relational join and division operations.
• A relational database uses normalized structures, which help to achieve data independence more easily.
• Security control can also be implemented more effectively by imposing authorization controls on the
sensitive attributes in a table.
• A relational database uses a language that is easy to learn and human-readable.
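The set operations named in the first advantage can be tried out on small relations modelled as Python sets of tuples. This is only a sketch of the algebra, not how a DBMS implements it; the relations `r` and `s` and the helper functions `select` and `project` are hypothetical.

```python
from itertools import product

# Relations modelled as sets of tuples; the sample data is hypothetical.
r = {(1, "Asha"), (2, "Ravi")}
s = {(2, "Ravi"), (3, "Meena")}

union = r | s
intersection = r & s
difference = r - s
cartesian = set(product(r, s))

def select(relation, predicate):
    """Selection: keep the tuples that satisfy the predicate."""
    return {t for t in relation if predicate(t)}

def project(relation, *indexes):
    """Projection: keep only the chosen columns, dropping duplicates."""
    return {tuple(t[i] for i in indexes) for t in relation}

print(sorted(union))
print(select(r, lambda t: t[0] > 1))
print(sorted(project(union, 1)))
```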
Disadvantages:
• The response to a query becomes time-consuming and inefficient if the number of tables between which
the relationships are established increases.
Implementation of Relational Database:
• Oracle, Microsoft SQL Server, IBM DB2, MySQL, PostgreSQL, SQLite
Object-oriented database (1990s – present)
• An object-oriented database management system is a database system in which data is
represented in the form of objects, much like in an object-oriented programming language.
• Object-oriented DBMSs also offer transaction support, a query
language and indexing options. These database systems can also handle data
efficiently across multiple servers.
• Unlike a relational database, an object-oriented database works within the framework of real programming languages
like Java or C++.
Advantages:
• If there are complex (many-to-many) relationships between the entities, the object-oriented database
handles them much faster than any of the above discussed database models.
• Navigation through the data is much easier.
• Objects do not require assembly or disassembly hence saving the coding and execution time.
Disadvantages:
• Lower efficiency level when data or relationships are simple.
• Data is accessible only through a specific language using a particular API, which is not the case in relational
databases.
Object-relational database (1990s – present)
• Defined in simple terms, an object-relational database management system presents a modified object-oriented
interface on top of an already implemented relational database management system.
• When software interacts with this modified database management system, it customarily
operates as though the data were saved as objects.
• The system works by translating the object data into organized
tables of rows and columns, and from then onwards it manages the data in the same way as a
relational database system.
• Similarly, when the data is accessed by the user, it is translated back from the tabular form into the complex object form.
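The two translation steps described above can be sketched with a Python dataclass and the built-in `sqlite3` module. The `Student` class and the `save` and `load` helpers are hypothetical names chosen for this illustration; real object-relational systems perform this mapping internally and far more generally.

```python
import sqlite3
from dataclasses import dataclass, astuple

@dataclass
class Student:
    std_id: int
    name: str
    city: str

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (std_id INTEGER PRIMARY KEY, name TEXT, city TEXT)")

def save(student):
    """Translate an object into a row of the underlying relational table."""
    conn.execute("INSERT INTO student VALUES (?, ?, ?)", astuple(student))

def load(std_id):
    """Translate a stored row back into an object for the application."""
    row = conn.execute(
        "SELECT std_id, name, city FROM student WHERE std_id = ?", (std_id,)
    ).fetchone()
    return Student(*row) if row else None

save(Student(1, "Asha", "Pune"))
print(load(1))
```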
Advantages:
• Data remains encapsulated in object-relational database.
• Concept of inheritance and polymorphism can also be implemented in this database.
Disadvantages:
• An object-relational database is complex.
• Proponents of the relational approach believe the simplicity and purity of the relational model are lost.
• It is costly as well.
Web enabled database (1990s – present):
• A web-enabled database is, simply put, a database with a web-based interface.
• This implies a separation of concerns: the web designer does not need to know
the details of the DB’s underlying design. Similarly, the DB designer does not need to be concerned with the
DB’s web interface.
• A web enabled database uses three layers to function: a presentation layer, a middle layer and the database
layer.
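The three layers can be sketched as three groups of Python functions, with no real web server involved. The `product` table and the function names are assumptions for illustration; what matters is that only the database layer contains SQL and only the presentation layer contains HTML.

```python
import sqlite3

# Database layer: the only code that knows the table design.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (name TEXT, price REAL)")
conn.execute("INSERT INTO product VALUES ('Pen', 1.5), ('Notebook', 3.0)")

def fetch_products():
    return conn.execute("SELECT name, price FROM product ORDER BY name").fetchall()

# Middle layer: application logic, independent of storage and of HTML.
def list_products(max_price):
    return [{"name": n, "price": p} for n, p in fetch_products() if p <= max_price]

# Presentation layer: renders whatever the middle layer returns.
def render_html(products):
    items = "".join(f"<li>{p['name']}: {p['price']:.2f}</li>" for p in products)
    return f"<ul>{items}</ul>"

page = render_html(list_products(max_price=2.0))
print(page)
```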
Advantages:
• A web-enabled database allows users to get the information they need from a central repository on
demand.
• The database is easy and simple to use.
• The data accessibility is easy via web-enabled database.
Disadvantages:
• The main disadvantage is that a database exposed to the web is more easily attacked.
• Web-enabled databases support the full range of DB operations, but in order to make them easy to use, they
must be “dumbed down”.
File Systems
• A file-based data management system (also called a file system) is a type
of software that allows users to access and organize small groups of data.
• It is usually integrated into a computer’s operating system and is responsible
for storing and retrieving files from a storage medium, such as a hard disk or flash
drive.
• File systems are effectively a digitized version of paper-based filing systems for a
wider range of file types.
ADVANTAGES OF FILE-BASED SYSTEMS
• Advantages of file-based systems include:
• Easy to use
• Inexpensive
• Faster performance
• Suitable for personal data management
DISADVANTAGES OF FILE-BASED SYSTEMS
• Disadvantages of file-based systems include:
• Limited capacity
• Limited functionality
• Less security
• Greater data inconsistency
• No backup or recovery capabilities
Database Approach
• A DBMS, on the other hand, is a much larger application that can manipulate large quantities of
data in complex ways. It usually has more advanced security features to protect the data it
contains and offers backup and recovery in the event of data loss, unlike a file system. A DBMS is
usually much more expensive and complicated to implement than a file system, however.
Prominent DBMS products include MySQL, IBM DB2, and Amazon RDS.
• A database is a shared collection of logically related data, designed to meet
the information needs of an organization. A database is a computer-based record-keeping system
whose overall purpose is to record and maintain information. The database is a single, large
repository of data that can be used simultaneously by many departments and users. Instead
of disconnected files with redundant data, all data items are integrated with a minimum amount
of duplication.
• A database implies separation of physical storage from use of the data by an application program
to achieve program/data independence. Using a database system, the user or programmer or
application specialist need not know the details of how the data are stored and such details are
“transparent to the user”.
Fundamental Concepts
Database Properties
A database has the following properties:
• It is a representation of some aspect of the real world or a collection of data
elements (facts) representing real-world information.
• A database is logical, coherent and internally consistent.
• A database is designed, built and populated with data for a specific purpose.
• Each data item is stored in a field.
• A combination of fields makes up a table. For example, each field in an employee
table contains data about an individual employee.
Building blocks of a Database
• The following three components form the building blocks of a database. They store the data that
we want to save in our database.
• Columns. Columns are similar to fields, that is, individual items of data that we wish to store. A
student’s roll number, name, address etc. are all examples of columns. They are also similar to
the columns found in spreadsheets (the A, B, C etc. along the top).
• Rows. Rows are similar to records as they contain data of multiple columns (like the 1, 2, 3 etc. in
a spreadsheet). A row can be made up of as many or as few columns as you want. This makes
reading data much more efficient – you fetch what you want.
• Tables. A table is a logical group of columns. For example, you may have a table that stores details
of customers’ names and addresses. Another table would be used to store details of parts and yet
another would be used for suppliers’ names and addresses.
Characteristics of database
Metadata itself follows a layered architecture, so that when we change data at one layer, it
does not affect the data at another layer. The layers are independent of, but mapped to, each other.
Data Independence
Logical Data Independence
• Logical data is data about the database, that is, it stores information about how data
is managed inside. For example, a table (relation) stored in the database and all
the constraints applied on that relation.
• Logical data independence is a mechanism that decouples the logical data from the
actual data stored on the disk. If we make changes to the table format, it should
not change the data residing on the disk.
Physical Data Independence
• All the schemas are logical, and the actual data is stored in bit format on the disk.
Physical data independence is the power to change the physical data without
impacting the schema or logical data.
• For example, in case we want to change or upgrade the storage system itself −
suppose we want to replace hard disks with SSDs − it should not have any impact
on the logical data or schemas.
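Swapping disks cannot be shown in a few lines, but a smaller physical change can: adding an index. The sketch below, using Python's built-in `sqlite3` module with a hypothetical `student` table, runs the same query before and after the index is created and gets the same answer, because the logical schema the application sees has not changed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (std_id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Asha", "Pune"), (2, "Ravi", "Delhi"), (3, "Meena", "Pune")],
)

QUERY = "SELECT name FROM student WHERE city = ? ORDER BY std_id"

before = conn.execute(QUERY, ("Pune",)).fetchall()

# Physical change only: add an index on city. The schema seen by the
# application (table and column names) is untouched.
conn.execute("CREATE INDEX idx_student_city ON student(city)")

after = conn.execute(QUERY, ("Pune",)).fetchall()
print(before == after)
```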
Database Languages
There are three types of database languages.
• In practice, the data definition and data manipulation languages are not two separate languages. Instead, they
simply form parts of a single database language such as Structured Query Language (SQL). SQL represents a
combination of DDL and DML, as well as statements for constraint specification and schema evolution.
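A short sketch with Python's built-in `sqlite3` module shows DDL and DML statements issued through the same language and the same connection. The `course` table and its contents are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: statements that define or change the schema.
conn.execute("CREATE TABLE course (code TEXT PRIMARY KEY, title TEXT NOT NULL)")
conn.execute("ALTER TABLE course ADD COLUMN credits INTEGER DEFAULT 3")

# DML: statements that insert, update, delete, and query the data.
conn.execute("INSERT INTO course (code, title) VALUES ('DB101', 'Databases')")
conn.execute("UPDATE course SET credits = 4 WHERE code = 'DB101'")
rows = conn.execute("SELECT code, title, credits FROM course").fetchall()
print(rows)
```

The `ALTER TABLE` statement is also a small instance of schema evolution: the column is added without recreating the table.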
Database Interfaces
• A database management system (DBMS) interface is a user interface that allows
for the ability to input queries to a database without using the query language
itself.
• User-friendly interfaces provided by DBMS may include the following:
• In the above student table, Std ID, Name and City are called attributes.
Std ID is the primary key attribute, which uniquely identifies
each record in the student table.
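The uniqueness guarantee of a primary key can be sketched with Python's built-in `sqlite3` module. The rows below are made up for illustration and are not the contents of the student table referred to above; the second insert reuses an existing Std ID and is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (std_id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.execute("INSERT INTO student VALUES (1, 'Asha', 'Pune')")

try:
    # A second record with the same std_id violates the primary key.
    conn.execute("INSERT INTO student VALUES (1, 'Ravi', 'Delhi')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(duplicate_rejected, count)
```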
Classification of DBMS
Object Oriented Database
• It is a system where information or data is represented in the form of
objects, as used in object-oriented programming.
• It is a combination of relational database concepts and object-oriented
principles.
• Relational database concepts are concurrency control, transactions, etc.
• OOP principles are data encapsulation, inheritance, and polymorphism.
• It requires less code and is easy to maintain.
• For example − Object DB software.
• The object oriented database is represented in diagram format below −
Hierarchical Database
• It is a system where the data elements have a one to many
relationship (1: N). Here data is organized like a tree which is similar
to a folder structure in your computer system.
• The hierarchy starts from the root node, connecting all the child
nodes to the parent node.
• It is used in industry on mainframe platforms.
• For example− IMS(IBM), Windows registry (Microsoft).
• An example of a hierarchical database is given below −
Network database
• A Network database management system is a system where the data
elements maintain one to one relationship (1: 1) or many to many
relationship (N: N).
• It also has a hierarchical structure, but the data is organized like a
graph and it is allowed to have more than one parent for one child
record.
• Example
• Teachers can teach in multiple departments. This is shown below −
An Example of a Database Application
“Database application” can mean two things: