0% found this document useful (0 votes)
27 views

ACU MSC Database

Database
Copyright
© © All Rights Reserved
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
27 views

ACU MSC Database

Database
Copyright
© © All Rights Reserved
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 71

What is Data?

Data is a collection of a distinct small unit of information. It can be used in a variety of


forms like text, numbers, media, bytes, etc. it can be stored in pieces of paper or
electronic memory, etc.

Word 'Data' is originated from the word 'datum' that means 'single piece of
information.' It is plural of the word datum.

In computing, Data is information that can be translated into a form for efficient
movement and processing. Data is interchangeable.

657
hism in Java | Dynamic Method Dispatch

What is Database?
A database is an organized collection of data, so that it can be easily accessed and
managed.

You can organize data into tables, rows, columns, and index it to make it easier to find
relevant information.

Database handlers create a database in such a way that only one set of software
program provides access of data to all the users.

The main purpose of the database is to operate a large amount of information by


storing, retrieving, and managing data.

There are many dynamic websites on the World Wide Web nowadays that are handled
through databases. For example, a model that checks the availability of rooms in a hotel.
It is an example of a dynamic website that uses a database.

There are many databases available like MySQL, Sybase, Oracle, MongoDB, Informix,
PostgreSQL, SQL Server, etc.

Modern databases are managed by the database management system (DBMS).

SQL or Structured Query Language is used to operate on the data stored in a database.
SQL depends on relational algebra and tuple relational calculus.
A cylindrical structure is used to display the image of a database.

Evolution of Databases
The database has completed more than 50 years of journey of its evolution from flat-file
system to relational and objects relational systems. It has gone through several
generations.

The Evolution

File-Based

1968 was the year when File-Based database were introduced. In file-based databases,
data was maintained in a flat file. Though files have many advantages, there are several
limitations.

One of the major advantages is that the file system has various access methods, e.g.,
sequential, indexed, and random.

It requires extensive programming in a third-generation language such as COBOL,


BASIC.

Hierarchical Data Model

1968-1980 was the era of the Hierarchical Database. Prominent hierarchical database
model was IBM's first DBMS. It was called IMS (Information Management System).

In this model, files are related in a parent/child manner.

Below diagram represents Hierarchical Data Model. Small circle represents objects.
Like file system, this model also had some limitations like complex implementation, lack
structural independence, can't easily handle a many-many relationship, etc.

Network data model


Charles Bachman developed the first DBMS at Honeywell called Integrated Data Store
(IDS). It was developed in the early 1960s, but it was standardized in 1971 by the
CODASYL group (Conference on Data Systems Languages).

In this model, files are related as owners and members, like to the common network
model.

Network data model identified the following components:

o Network schema (Database organization)


o Sub-schema (views of database per user)
o Data management language (procedural)

This model also had some limitations like system complexity and difficult to design and
maintain.

Relational Database

1970 - Present: It is the era of Relational Database and Database Management. In 1970,
the relational model was proposed by E.F. Codd.

Relational database model has two main terminologies called instance and schema.

The instance is a table with rows or columns

Schema specifies the structure like name of the relation, type of each column and name.

This model uses some mathematical concept like set theory and predicate logic.

The first internet database application had been created in 1995.

During the era of the relational database, many more models had introduced like
object-oriented model, object-relational model, etc.

Cloud database
Cloud database facilitates you to store, manage, and retrieve their structured,
unstructured data via a cloud platform. This data is accessible over the Internet. Cloud
databases are also called a database as service (DBaaS) because they are offered as a
managed service.

Some best cloud options are:

o AWS (Amazon Web Services)


o Snowflake Computing
o Oracle Database Cloud Services
o Microsoft SQL server
o Google cloud spanner
Advantages of cloud database

Lower costs

Generally, company provider does not have to invest in databases. It can maintain and
support one or more data centers.

Automated

Cloud databases are enriched with a variety of automated processes such as recovery,
failover, and auto-scaling.

Increased accessibility

You can access your cloud-based database from any location, anytime. All you need is
just an internet connection.

NoSQL Database
A NoSQL database is an approach to design such databases that can accommodate a
wide variety of data models. NoSQL stands for "not only SQL." It is an alternative to
traditional relational databases in which data is placed in tables, and data schema is
perfectly designed before the database is built.

NoSQL databases are useful for a large set of distributed data.

Some examples of NoSQL database system with their category are:

o MongoDB, CouchDB, Cloudant (Document-based)


o Memcached, Redis, Coherence (key-value store)
o HBase, Big Table, Accumulo (Tabular)

Advantage of NoSQL
High Scalability

NoSQL can handle an extensive amount of data because of scalability. If the data grows,
NoSQL database scale it to handle that data in an efficient manner.
High Availability

NoSQL supports auto replication. Auto replication makes it highly available because, in
case of any failure, data replicates itself to the previous consistent state.

Disadvantage of NoSQL
Open source

NoSQL is an open-source database, so there is no reliable standard for NoSQL yet.

Management challenge

Data management in NoSQL is much more complicated than relational databases. It is


very challenging to install and even more hectic to manage daily.

GUI is not available

GUI tools for NoSQL database are not easily available in the market.

Backup

Backup is a great weak point for NoSQL databases. Some databases, like MongoDB,
have no powerful approaches for data backup.

The Object-Oriented Databases


The object-oriented databases contain data in the form of object and classes. Objects
are the real-world entity, and types are the collection of objects. An object-oriented
database is a combination of relational model features with objects oriented principles.
It is an alternative implementation to that of the relational model.

Object-oriented databases hold the rules of object-oriented programming. An object-


oriented database management system is a hybrid application.

The object-oriented database model contains the following properties.

Object-oriented programming properties

o Objects
o Classes
o Inheritance
o Polymorphism
o Encapsulation

Relational database properties

o Atomicity
o Consistency
o Integrity
o Durability
o Concurrency
o Query processing

Graph Databases
A graph database is a NoSQL database. It is a graphical representation of data. It
contains nodes and edges. A node represents an entity, and each edge represents a
relationship between two edges. Every node in a graph database represents a unique
identifier.

Graph databases are beneficial for searching the relationship between data because
they highlight the relationship between relevant data.

Graph databases are very useful when the database contains a complex relationship and
dynamic schema.

It is mostly used in supply chain management, identifying the source of IP telephony.


DBMS (Data Base Management System)
Database management System is software which is used to store and retrieve the
database. For example, Oracle, MySQL, etc.; these are some popular DBMS tools.

o DBMS provides the interface to perform the various operations like creation,
deletion, modification, etc.
o DBMS allows the user to create their databases as per their requirement.
o DBMS accepts the request from the application and provides specific data
through the operating system.
o DBMS contains the group of programs which acts according to the user
instruction.
o It provides security to the database.

Advantage of DBMS
Controls redundancy

It stores all the data in a single database file, so it can control data redundancy.

Data sharing

An authorized user can share the data among multiple users.

Backup

It provides Backup and recovery subsystem. This recovery system creates automatic data
from system failure and restores data if required.

Multiple user interfaces

It provides a different type of user interfaces like GUI, application interfaces.

Disadvantage of DBMS
Size

It occupies large disk space and large memory to run efficiently.

Cost

DBMS requires a high-speed data processor and larger memory to run DBMS software,
so it is costly.

Complexity

DBMS creates additional complexity and requirements.

RDBMS (Relational Database Management System)


The word RDBMS is termed as 'Relational Database Management System.' It is
represented as a table that contains rows and column.

RDBMS is based on the Relational model; it was introduced by E. F. Codd.

A relational database contains the following components:

o Table
o Record/ Tuple
o Field/Column name /Attribute
o Instance
o Schema
o Keys

An RDBMS is a tabular DBMS that maintains the security, integrity, accuracy, and
consistency of the data.

Types of Databases
There are various types of databases used for storing different varieties of data:
1) Centralized Database
It is the type of database that stores data at a centralized database system. It comforts
the users to access the stored data from different locations through several applications.
These applications contain the authentication process to let users access data securely.
An example of a Centralized database can be Central Library that carries a central
database of each library in a college/university.

Advantages of Centralized Database

o It has decreased the risk of data management, i.e., manipulation of data will not affect
the core data.
o Data consistency is maintained as it manages data in a central repository.
o It provides better data quality, which enables organizations to establish data standards.
o It is less costly because fewer vendors are required to handle the data sets.

Disadvantages of Centralized Database

o The size of the centralized database is large, which increases the response time for
fetching the data.
o It is not easy to update such an extensive database system.
o If any server failure occurs, entire data will be lost, which could be a huge loss.
2) Distributed Database
Unlike a centralized database system, in distributed systems, data is distributed among
different database systems of an organization. These database systems are connected
via communication links. Such links help the end-users to access the data
easily. Examples of the Distributed database are Apache Cassandra, HBase, Ignite, etc.

We can further divide a distributed database system into:

o Homogeneous DDB: Those database systems which execute on the same operating
system and use the same application process and carry the same hardware devices.
o Heterogeneous DDB: Those database systems which execute on different operating
systems under different application procedures, and carries different hardware devices.

Advantages of Distributed Database

o Modular development is possible in a distributed database, i.e., the system can be


expanded by including new computers and connecting them to the distributed system.
o One server failure will not affect the entire data set.
3) Relational Database
This database is based on the relational data model, which stores data in the form of
rows(tuple) and columns(attributes), and together forms a table(relation). A relational
database uses SQL for storing, manipulating, as well as maintaining the data. E.F. Codd
invented the database in 1970. Each table in the database carries a key that makes the
data unique from others. Examples of Relational databases are MySQL, Microsoft SQL
Server, Oracle, etc.

Properties of Relational Database


There are following four commonly known properties of a relational model known as
ACID properties, where:

A means Atomicity: This ensures the data operation will complete either with success
or with failure. It follows the 'all or nothing' strategy. For example, a transaction will
either be committed or will abort.

C means Consistency: If we perform any operation over the data, its value before and
after the operation should be preserved. For example, the account balance before and
after the transaction should be correct, i.e., it should remain conserved.

I means Isolation: There can be concurrent users for accessing data at the same time
from the database. Thus, isolation between the data should remain isolated. For
example, when multiple transactions occur at the same time, one transaction effects
should not be visible to the other transactions in the database.

D means Durability: It ensures that once it completes the operation and commits the
data, data changes should remain permanent.

4) NoSQL Database
Non-SQL/Not Only SQL is a type of database that is used for storing a wide range of
data sets. It is not a relational database as it stores data not only in tabular form but in
several different ways. It came into existence when the demand for building modern
applications increased. Thus, NoSQL presented a wide variety of database technologies
in response to the demands. We can further divide a NoSQL database into the following
four types:
a. Key-value storage: It is the simplest type of database storage where it stores every
single item as a key (or attribute name) holding its value, together.
b. Document-oriented Database: A type of database used to store data as JSON-like
document. It helps developers in storing data by using the same document-model
format as used in the application code.
c. Graph Databases: It is used for storing vast amounts of data in a graph-like structure.
Most commonly, social networking websites use the graph database.
d. Wide-column stores: It is similar to the data represented in relational databases. Here,
data is stored in large columns together, instead of storing in rows.

Advantages of NoSQL Database

o It enables good productivity in the application development as it is not required to store


data in a structured format.
o It is a better option for managing and handling large data sets.
o It provides high scalability.
o Users can quickly access data from the database through key-value.
5) Cloud Database
A type of database where data is stored in a virtual environment and executes over the
cloud computing platform. It provides users with various cloud computing services
(SaaS, PaaS, IaaS, etc.) for accessing the database. There are numerous cloud platforms,
but the best options are:

o Amazon Web Services(AWS)


o Microsoft Azure
o Kamatera
o PhonixNAP
o ScienceSoft
o Google Cloud SQL, etc.

6) Object-oriented Databases
The type of database that uses the object-based data model approach for storing data
in the database system. The data is represented and stored as objects which are similar
to the objects used in the object-oriented programming language.

7) Hierarchical Databases
It is the type of database that stores data in the form of parent-children relationship
nodes. Here, it organizes data in a tree-like structure.
Data get stored in the form of records that are connected via links. Each child record in
the tree will contain only one parent. On the other hand, each parent record can have
multiple child records.

8) Network Databases
It is the database that typically follows the network data model. Here, the representation
of data is in the form of nodes connected via links between them. Unlike the hierarchical
database, it allows each record to have multiple children and parent nodes to form a
generalized graph structure.

9) Personal Database
Collecting and storing data on the user's system defines a Personal Database. This
database is basically designed for a single user.

Advantage of Personal Database

o It is simple and easy to handle.


o It occupies less storage space as it is small in size.

10) Operational Database


The type of database which creates and updates the database in real-time. It is basically
designed for executing and handling the daily data operations in several businesses. For
example, An organization uses operational databases for managing per day
transactions.

11) Enterprise Database


Large organizations or enterprises use this database for managing a massive amount of
data. It helps organizations to increase and improve their efficiency. Such a database
allows simultaneous access to users.

Advantages of Enterprise Database:

o Multi processes are supportable over the Enterprise database.


o It allows executing parallel queries on the system.
DBMS Architecture
o The DBMS design depends upon its architecture. The basic client/server architecture is
used to deal with a large number of PCs, web servers, database servers and other
components that are connected with networks.
o The client/server architecture consists of many PCs and a workstation which are
connected via the network.
o DBMS architecture depends upon how users are connected to the database to get their
request done.

Types of DBMS Architecture

Database architecture can be seen as a single tier or multi-tier. But logically, database
architecture is of two types like: 2-tier architecture and 3-tier architecture.
1-Tier Architecture

o In this architecture, the database is directly available to the user. It means the user can
directly sit on the DBMS and uses it.
o Any changes done here will directly be done on the database itself. It doesn't provide a
handy tool for end users.
o The 1-Tier architecture is used for development of the local application, where
programmers can directly communicate with the database for the quick response.

2-Tier Architecture

o The 2-Tier architecture is same as basic client-server. In the two-tier architecture,


applications on the client end can directly communicate with the database at the server
side. For this interaction, API's like: ODBC, JDBC are used.
o The user interfaces and application programs are run on the client-side.
o The server side is responsible to provide the functionalities like: query processing and
transaction management.
o To communicate with the DBMS, client-side application establishes a connection with the
server side.
Fig: 2-tier Architecture

3-Tier Architecture

o The 3-Tier architecture contains another layer between the client and server. In this
architecture, client can't directly communicate with the server.
o The application on the client-end interacts with an application server which further
communicates with the database system.
o End user has no idea about the existence of the database beyond the application server.
The database also has no idea about any other user beyond the application.
o The 3-Tier architecture is used in case of large web application.

Fig: 3-tier Architecture


Three schema Architecture
o The three schema architecture is also called ANSI/SPARC architecture or three-level
architecture.
o This framework is used to describe the structure of a specific database system.
o The three schema architecture is also used to separate the user applications and physical
database.
o The three schema architecture contains three-levels. It breaks the database down into
three different categories.

The three-schema architecture is as follows:

In the above diagram:

o It shows the DBMS architecture.


o Mapping is used to transform the request and response between various database levels
of architecture.
o Mapping is not good for small DBMS because it takes more time.
o In External / Conceptual mapping, it is necessary to transform the request from external
level to conceptual schema.
o In Conceptual / Internal mapping, DBMS transform the request from the conceptual to
internal level.

Objectives of Three schema Architecture


The main objective of three level architecture is to enable multiple users to access the
same data with a personalized view while storing the underlying data only once. Thus, it
separates the user's view from the physical structure of the database. This separation is
desirable for the following reasons:K, JRE, and JVM

o Different users need different views of the same data.


o The approach in which a particular user needs to see the data may change over time.
o The users of the database should not worry about the physical implementation and
internal workings of the database such as data compression and encryption techniques,
hashing, optimization of the internal structures etc.
o All users should be able to access the same data according to their requirements.
o DBA should be able to change the conceptual structure of the database without affecting
the user's
o Internal structure of the database should be unaffected by changes to physical aspects of
the storage.

1. Internal Level
o The internal level has an internal schema which describes the physical storage structure
of the database.
o The internal schema is also known as a physical schema.
o It uses the physical data model. It is used to define that how the data will be stored in a
block.
o The physical level is used to describe complex low-level data structures in detail.

The internal level is generally is concerned with the following activities:

o Storage space allocations.


For Example: B-Trees, Hashing etc.
o Access paths.
For Example: Specification of primary and secondary keys, indexes, pointers and
sequencing.
o Data compression and encryption techniques.
o Optimization of internal structures.
o Representation of stored fields.

2. Conceptual Level
o The conceptual schema describes the design of a database at the conceptual level.
Conceptual level is also known as logical level.
o The conceptual schema describes the structure of the whole database.
o The conceptual level describes what data are to be stored in the database and also
describes what relationship exists among those data.
o In the conceptual level, internal details such as an implementation of the data structure
are hidden.
o Programmers and database administrators work at this level.

3. External Level

o At the external level, a database contains several schemas that sometimes called as
subschema. The subschema is used to describe the different view of the database.
o An external schema is also known as view schema.
o Each view schema describes the database part that a particular user group is interested
and hides the remaining database from that user group.
o The view schema describes the end user interaction with database systems.

Mapping between Views


The three levels of DBMS architecture don't exist independently of each other. There
must be correspondence between the three levels i.e. how they actually correspond with
each other. DBMS is responsible for correspondence between the three types of
schema. This correspondence is called Mapping.

There are basically two types of mapping in the database architecture:

o Conceptual/ Internal Mapping


o External / Conceptual Mapping

Conceptual/ Internal Mapping

The Conceptual/ Internal Mapping lies between the conceptual level and the internal
level. Its role is to define the correspondence between the records and fields of the
conceptual level and files and data structures of the internal level.

External/ Conceptual Mapping

The external/Conceptual Mapping lies between the external level and the Conceptual
level. Its role is to define the correspondence between a particular external and the
conceptual view.

Data model Schema and Instance


o The data which is stored in the database at a particular moment of time is called an
instance of the database.
o The overall design of a database is called schema.
o A database schema is the skeleton structure of the database. It represents the logical
view of the entire database.
o A schema contains schema objects like table, foreign key, primary key, views, columns,
data types, stored procedure, etc.
o A database schema can be represented by using the visual diagram. That diagram shows
the database objects and relationship with each other.
o A database schema is designed by the database designers to help programmers whose
software will interact with the database. The process of database creation is called data
modeling.
A schema diagram can display only some aspects of a schema like the name of record
type, data type, and constraints. Other aspects can't be specified through the schema
diagram. For example, the given figure neither show the data type of each data item nor
the relationship among various files.

In the database, actual data changes quite frequently. For example, in the given figure,
the database changes whenever we add a new grade or add a student. The data at a
particular moment of time is called the instance of the database.

Data Independence
o Data independence can be explained using the three-schema architecture.
o Data independence refers characteristic of being able to modify the schema at one level
of the database system without altering the schema at the next higher level.
There are two types of data independence:

1. Logical Data Independence


o Logical data independence refers characteristic of being able to change the conceptual
schema without having to change the external schema.
o Logical data independence is used to separate the external level from the conceptual
view.
o If we do any changes in the conceptual view of the data, then the user view of the data
would not be affected.
o Logical data independence occurs at the user interface level.

2. Physical Data Independence


o Physical data independence can be defined as the capacity to change the internal
schema without having to change the conceptual schema.
o If we do any changes in the storage size of the database system server, then the
Conceptual structure of the database will not be affected.
o Physical data independence is used to separate conceptual levels from the internal levels.
o Physical data independence occurs at the logical interface level.
Fig: Data Independence

DATA MODELING
ER (Entity Relationship) Diagram in
DBMS
o ER model stands for an Entity-Relationship model. It is a high-level data model. This
model is used to define the data elements and relationship for a specified system.
o It develops a conceptual design for the database. It also develops a very simple and easy
to design view of data.
o In ER modeling, the database structure is portrayed as a diagram called an entity-
relationship diagram.
For example, Suppose we design a school database. In this database, the student will
be an entity with attributes like address, name, id, age, etc. The address can be another
entity with attributes like city, street name, pin code, etc and there will be a relationship
between them.

Component of ER Diagram
1. Entity:
An entity may be any object, class, person or place. In the ER diagram, an entity can be
represented as rectangles.

Consider an organization as an example- manager, product, employee, department etc.


can be taken as an entity.
a. Weak Entity

An entity that depends on another entity called a weak entity. The weak entity doesn't
contain any key attribute of its own. The weak entity is represented by a double
rectangle.

2. Attribute
The attribute is used to describe the property of an entity. Eclipse is used to represent
an attribute.

For example, id, age, contact number, name, etc. can be attributes of a student.

a. Key Attribute

The key attribute is used to represent the main characteristics of an entity. It represents
a primary key. The key attribute is represented by an ellipse with the text underlined.
b. Composite Attribute

An attribute that composed of many other attributes is known as a composite attribute.


The composite attribute is represented by an ellipse, and those ellipses are connected
with an ellipse.

c. Multivalued Attribute

An attribute can have more than one value. These attributes are known as a multivalued
attribute. The double oval is used to represent multivalued attribute.
For example, a student can have more than one phone number.

d. Derived Attribute

An attribute that can be derived from other attribute is known as a derived attribute. It
can be represented by a dashed ellipse.

For example, A person's age changes over time and can be derived from another
attribute like Date of birth.

3. Relationship
A relationship is used to describe the relation between entities. Diamond or rhombus is
used to represent the relationship.
Types of relationship are as follows:

a. One-to-One Relationship

When only one instance of an entity is associated with the relationship, then it is known
as one to one relationship.

For example, A female can marry to one male, and a male can marry to one female.

b. One-to-many relationship

When only one instance of the entity on the left, and more than one instance of an
entity on the right associates with the relationship then this is known as a one-to-many
relationship.

For example, Scientist can invent many inventions, but the invention is done by the
only specific scientist.
c. Many-to-one relationship

When more than one instance of the entity on the left, and only one instance of an
entity on the right associates with the relationship then it is known as a many-to-one
relationship.

For example, Student enrolls for only one course, but a course can have many students.

d. Many-to-many relationship

When more than one instance of the entity on the left, and more than one instance of
an entity on the right associates with the relationship then it is known as a many-to-
many relationship.

For example, Employee can assign by many projects and project can have many
employees.

Notation of ER diagram
Database can be represented using the notations. In ER diagram, many notations are
used to express the cardinality. These notations are as follows:
Fig: Notations of ER diagram

Cardinality in DBMS (Mapping


Constraints)
DBMS
DBMS stands for Database Management System, which is a tool, or a software used to
do various operations on a Database like the Creation of the Database, Deletion of the
Database, or Updating the current Database. To simplify processing and data querying,
the most popular types of Databases currently in use typically model their data as rows
and columns in a set of tables. The data may then be handled, updated, regulated, and
structured with ease. For writing and querying data, most Databases employ Structured
Query Language (SQL).
Cardinality
Cardinality means how the entities are arranged to each other or what is the relationship
structure between entities in a relationship set. In a Database Management System,
Cardinality represents a number that denotes how many times an entity is participating
with another entity in a relationship set. The Cardinality of DBMS is a very important
attribute in representing the structure of a Database. In a table, the number of rows or
tuples represents the Cardinality.

Cardinality Ratio
Cardinality ratio is also called Cardinality Mapping, which represents the mapping of
one entity set to another entity set in a relationship set. We generally take the example
of a binary relationship set where two entities are mapped to each other.

Difference between JDK, JRE, and JVM

Cardinality is very important in the Database of various businesses. For example, if we


want to track the purchase history of each customer then we can use the one-to-many
cardinality to find the data of a specific customer. The Cardinality model can be used in
Databases by Database Managers for a variety of purposes, but corporations often use it
to evaluate customer or inventory data.

There are four types of Cardinality Mapping in Database Management Systems:

1. One to one
2. Many to one
3. One to many
4. Many to many

One to One
One to one cardinality is represented by a 1:1 symbol. In this, there is at most one
relationship from one entity to another entity. There are a lot of examples of one-to-one
cardinality in real life databases.
For example, one student can have only one student id, and one student id can belong
to only one student. So, the relationship mapping between student and student id will
be one to one cardinality mapping.

Another example is the relationship between the director of the school and the school
because one school can have a maximum of one director, and one director can belong
to only one school.

Note: it is not necessary that there would be a mapping for all entities in an entity set
in one-to-one cardinality. Some entities cannot participate in the mapping.

Many to One Cardinality:


In many to one cardinality mapping, from set 1, there can be multiple sets that can
make relationships with a single entity of set 2. Or we can also describe it as from set 2,
and one entity can make a relationship with more than one entity of set 1.

One to one Cardinality is the subset of Many to one Cardinality. It can be represented
by M:1.

For example, there are multiple patients in a hospital who are served by a single doctor,
so the relationship between patients and doctors can be represented by Many to one
Cardinality.
One to Many Cardinalities:
In One-to-many cardinality mapping, from set 1, there can be a maximum single set that
can make relationships with a single or more than one entity of set 2. Or we can also
describe it as from set 2, more than one entity can make a relationship with only one
entity of set 1.

One to one cardinality is the subset of One-to-many Cardinality. It can be represented


by 1: M.

For Example, in a hospital, there can be various compounders, so the relationship


between the hospital and compounders can be mapped through One-to-many
Cardinality.
Many to Many Cardinalities:
In many, many cardinalities mapping, there can be one or more than one entity that can
associate with one or more than one entity of set 2. In the same way from the end of set
2, one or more than one entity can make a relation with one or more than one entity of
set 1.

It is represented by M: N or N: M.

One to one cardinality, One to many cardinalities, and Many to one cardinality is the
subset of the many to many cardinalities.

For Example, in a college, multiple students can work on a single project, and a single
student can also work on multiple projects. So, the relationship between the project and
the student can be represented by many to many cardinalities.
Appropriate Mapping Cardinality
Evidently, the real-world context in which the relation set is modeled determines the
Appropriate Mapping Cardinality for a specific relation set.

o We can combine relational tables with many involved tables if the Cardinality is one-to-
many or many-to-one.
o One entity can be combined with a relation table if it has a one-to-one relationship and
total participation, and two entities can be combined with their relation to form a single
table if both of them have total participation.
o We cannot mix any two tables if the Cardinality is many-to-many.

Relational Model in DBMS


Relational model can represent as a table with columns and rows. Each row is known as
a tuple. Each table of the column has a name or attribute.

Domain: It contains a set of atomic values that an attribute can take.

Attribute: It contains the name of a column in a particular table. Each attribute Ai must
have a domain, dom(Ai).Diference between JDK, JRE, and JVM

Relational instance: In the relational database system, the relational instance is


represented by a finite set of tuples. Relation instances do not have duplicate tuples.

Relational schema: A relational schema contains the name of the relation and name of
all columns or attributes.

Relational key: In the relational key, each row has one or more attributes. It can identify
the row in the relation uniquely.

Example: STUDENT Relation

NAME ROLL_NO PHONE_NO ADDRESS AGE

Ram 14795 7305758992 Noida 24


Shyam 12839 9026288936 Delhi 35

Laxman 33289 8583287182 Gurugram 20

Mahesh 27857 7086819134 Ghaziabad 27

Ganesh 17282 9028 9i3988 Delhi 40

o In the given table, NAME, ROLL_NO, PHONE_NO, ADDRESS, and AGE are the attributes.
o The instance of schema STUDENT has 5 tuples.
o t3 = <Laxman, 33289, 8583287182, Gurugram, 20>

Properties of Relations
o Name of the relation is distinct from all other relations.
o Each relation cell contains exactly one atomic (single) value
o Each attribute contains a distinct name
o Attribute domain has no significance
o tuple has no duplicate value
o Order of tuple can have a different sequence

Relational Algebra
Relational algebra is a procedural query language. It gives a step by step process to
obtain the result of the query. It uses operators to perform queries.

Types of Relational operation


1. Select Operation:

o The select operation selects tuples that satisfy a given predicate.


o It is denoted by sigma (σ).

1. Notation: σ p(r)

Where:

σ is used for selection prediction


r is used for relation
p is used as a propositional logic formula which may use connectors like: AND OR and
NOT. These relational can use as relational operators like =, ≠, ≥, <, >, ≤.

For example: LOAN Relation

BRANCH_NAME LOAN_NO AMOUNT

Downtown L-17 1000

Redwood L-23 2000

Perryride L-15 1500


Downtown L-14 1500

Mianus L-13 500

Roundhill L-11 900

Perryride L-16 1300

Input:

1. σ BRANCH_NAME="perryride" (LOAN)

Output:

BRANCH_NAME LOAN_NO AMOUNT

Perryride L-15 1500

Perryride L-16 1300

2. Project Operation:

o This operation shows the list of those attributes that we wish to appear in the result. Rest
of the attributes are eliminated from the table.
o It is denoted by ∏.

1. Notation: ∏ A1, A2, An (r)

Where

A1, A2, A3 is used as an attribute name of relation r.

Example: CUSTOMER RELATION


NAME STREET CITY

Jones Main Harrison

Smith North Rye

Hays Main Harrison

Curry North Rye

Johnson Alma Brooklyn

Brooks Senator Brooklyn

Input:

1. ∏ NAME, CITY (CUSTOMER)

Output:

NAME CITY

Jones Harrison

Smith Rye

Hays Harrison

Curry Rye

Johnson Brooklyn

Brooks Brooklyn
3. Union Operation:

o Suppose there are two tuples R and S. The union operation contains all the tuples that
are either in R or S or both in R & S.
o It eliminates the duplicate tuples. It is denoted by ∪.

1. Notation: R ∪ S

A union operation must hold the following condition:

o R and S must have the attribute of the same number.


o Duplicate tuples are eliminated automatically.

Example:
DEPOSITOR RELATION

CUSTOMER_NAME ACCOUNT_NO

Johnson A-101

Smith A-121

Mayes A-321

Turner A-176

Johnson A-273

Jones A-472

Lindsay A-284

BORROW RELATION
CUSTOMER_NAME LOAN_NO

Jones L-17

Smith L-23

Hayes L-15

Jackson L-14

Curry L-93

Smith L-11

Williams L-17

Input:

1. ∏ CUSTOMER_NAME (BORROW) ∪ ∏ CUSTOMER_NAME (DEPOSITOR)

Output:

CUSTOMER_NAME

Johnson

Smith

Hayes

Turner

Jones
Lindsay

Jackson

Curry

Williams

Mayes

4. Set Intersection:

o Suppose there are two tuples R and S. The set intersection operation contains all tuples
that are in both R & S.
o It is denoted by intersection ∩.

1. Notation: R ∩ S

Example: Using the above DEPOSITOR table and BORROW table

Input:

1. ∏ CUSTOMER_NAME (BORROW) ∩ ∏ CUSTOMER_NAME (DEPOSITOR)

Output:

CUSTOMER_NAME

Smith

Jones

5. Set Difference:
o Suppose there are two tuples R and S. The set intersection operation contains all tuples
that are in R but not in S.
o It is denoted by intersection minus (-).

1. Notation: R - S

Example: Using the above DEPOSITOR table and BORROW table

Input:

1. ∏ CUSTOMER_NAME (BORROW) - ∏ CUSTOMER_NAME (DEPOSITOR)

Output:

CUSTOMER_NAME

Jackson

Hayes

Willians

Curry

6. Cartesian product

o The Cartesian product is used to combine each row in one table with each row in the
other table. It is also known as a cross product.
o It is denoted by X.

1. Notation: E X D

Example:
EMPLOYEE
EMP_ID EMP_NAME EMP_DEPT

1 Smith A

2 Harry C

3 John B

DEPARTMENT

DEPT_NO DEPT_NAME

A Marketing

B Sales

C Legal

Input:

1. EMPLOYEE X DEPARTMENT

Output:

EMP_ID EMP_NAME EMP_DEPT DEPT_NO DEPT_NAME

1 Smith A A Marketing

1 Smith A B Sales

1 Smith A C Legal
2 Harry C A Marketing

2 Harry C B Sales

2 Harry C C Legal

3 John B A Marketing

3 John B B Sales

3 John B C Legal

7. Rename Operation:
The rename operation is used to rename the output relation. It is denoted by rho (ρ).

Example: We can use the rename operator to rename STUDENT relation to STUDENT1.

1. ρ(STUDENT1, STUDENT)
Note: Apart from these common operations Relational algebra can be used in Join
operation

Join Operations:
A Join operation combines related tuples from different relations, if and only if a given
join condition is satisfied. It is denoted by ⋈.

Example:
EMPLOYEE

EMP_CODE EMP_NAME

101 Stephan
102 Jack

103 Harry

SALARY

Difference between JDK, JRE, and JVM

EMP_CODE SALARY

101 50000

102 30000

103 25000

1. Operation: (EMPLOYEE ⋈ SALARY)

Result:

EMP_CODE EMP_NAME SALARY

101 Stephan 50000

102 Jack 30000

103 Harry 25000

Types of Join operations:


1. Natural Join:

o A natural join is the set of tuples of all combinations in R and S that are equal on their
common attribute names.
o It is denoted by ⋈.

Example: Let's use the above EMPLOYEE table and SALARY table:

Input:

1. ∏EMP_NAME, SALARY (EMPLOYEE ⋈ SALARY)

Output:

EMP_NAME SALARY
Stephan 50000

Jack 30000

Harry 25000

2. Outer Join:
The outer join operation is an extension of the join operation. It is used to deal with
missing information.

Example:

EMPLOYEE

EMP_NAME STREET CITY

Ram Civil line Mumbai

Shyam Park street Kolkata

Ravi M.G. Street Delhi

Hari Nehru nagar Hyderabad

FACT_WORKERS

EMP_NAME BRANCH SALARY

Ram Infosys 10000


Shyam Wipro 20000

Kuber HCL 30000

Hari TCS 50000

Input:

1. (EMPLOYEE ⋈ FACT_WORKERS)

Output:

EMP_NAME STREET CITY BRANCH SALARY

Ram Civil line Mumbai Infosys 10000

Shyam Park street Kolkata Wipro 20000

Hari Nehru nagar Hyderabad TCS 50000

An outer join is basically of three types:

a. Left outer join


b. Right outer join
c. Full outer join

a. Left outer join:

o Left outer join contains the set of tuples of all combinations in R and S that are equal on
their common attribute names.
o In the left outer join, tuples in R have no matching tuples in S.
o It is denoted by ⟕.
Example: Using the above EMPLOYEE table and FACT_WORKERS table

Input:

1. EMPLOYEE ⟕ FACT_WORKERS

EMP_NAME STREET CITY BRANCH SALARY

Ram Civil line Mumbai Infosys 10000

Shyam Park street Kolkata Wipro 20000

Hari Nehru street Hyderabad TCS 50000

Ravi M.G. Street Delhi NULL NULL

b. Right outer join:

o Right outer join contains the set of tuples of all combinations in R and S that are equal
on their common attribute names.
o In right outer join, tuples in S have no matching tuples in R.
o It is denoted by ⟖.

Example: Using the above EMPLOYEE table and FACT_WORKERS Relation

Input:

1. EMPLOYEE ⟖ FACT_WORKERS

Output:

EMP_NAME BRANCH SALARY STREET CITY

Ram Infosys 10000 Civil line Mumbai

Shyam Wipro 20000 Park street Kolkata


Hari TCS 50000 Nehru street Hyderabad

Kuber HCL 30000 NULL NULL

c. Full outer join:

o Full outer join is like a left or right join except that it contains all rows from both tables.
o In full outer join, tuples in R that have no matching tuples in S and tuples in S that have
no matching tuples in R in their common attribute name.
o It is denoted by ⟗.

Example: Using the above EMPLOYEE table and FACT_WORKERS table

Input:

1. EMPLOYEE ⟗ FACT_WORKERS

Output:

EMP_NAME STREET CITY BRANCH SALARY

Ram Civil line Mumbai Infosys 10000

Shyam Park street Kolkata Wipro 20000

Hari Nehru street Hyderabad TCS 50000

Ravi M.G. Street Delhi NULL NULL

Kuber NULL NULL HCL 30000

3. Equi join:
It is also known as an inner join. It is the most common join. It is based on matched data
as per the equality condition. The equi join uses the comparison operator(=).

Example:

CUSTOMER RELATION

CLASS_ID NAME

1 John

2 Harry

3 Jackson

PRODUCT

PRODUCT_ID CITY

1 Delhi

2 Mumbai

3 Noida

Input:

1. CUSTOMER ⋈ PRODUCT

Output:

CLASS_ID NAME PRODUCT_ID CITY

1 John 1 Delhi
2 Harry 2 Mumbai

3 Harry 3 Noida

TRANSACTION PROCESSING

Transaction
o The transaction is a set of logically related operations. It contains of a group of tasks.
o A transaction is an action or series of actions. It is performed by a single user to perform
operations for accessing the contents of the database.

Example: Suppose an employee of bank transfers N8000 from X's account to Y's
account. This small transaction contains several low-level tasks:

X's Account

1. Open_Account(X)
2. Old_Balance = X.balance
3. New_Balance = Old_Balance - 8000
4. X.balance = New_Balance
5. Close_Account(X)

Y's Account

Features of Java - Javatpoint

1. Open_Account(Y)
2. Old_Balance = Y.balance
3. New_Balance = Old_Balance + 8000
4. Y.balance = New_Balance
5. Close_Account(Y)
Operations of Transaction:
Following are the main operations of transaction:

Read(X): Read operation is used to read the value of X from the database and stores it
in a buffer in main memory.

Write(X): Write operation is used to write the value back to the database from the
buffer.

Let's take an example to debit transaction from an account which consists of following
operations:

1. 1. R(X);
2. 2. X = X - 500;
3. 3. W(X);

Let's assume the value of X before starting of the transaction is 4000.

o The first operation reads X's value from database and stores it in a buffer.
o The second operation will decrease the value of X by 500. So buffer will contain 3500.
o The third operation will write the buffer's value to the database. So X's final value will be
3500.

But it may be possible that because of the failure of hardware, software or power, etc.
that transaction may fail before finished all the operations in the set.

For example: If in the above transaction, the debit transaction fails after executing
operation 2 then X's value will remain 4000 in the database which is not acceptable by
the bank.

To solve this problem, we have two important operations:

Commit: It is used to save the work done permanently.

Rollback: It is used to undo the work done.


Transaction property
The transaction has the four properties. These are used to maintain consistency in a
database, before and after the transaction.

Property of Transaction
1. Atomicity
2. Consistency
3. Isolation
4. Durability

Atomicity
o It states that all operations of the transaction take place at once if not, the transaction is
aborted.
o There is no midway, i.e., the transaction cannot occur partially. Each transaction is treated
as one unit and either run to completion or is not executed at all.

Atomicity involves the following two operations:

Abort: If a transaction aborts then all the changes made are not visible.

Features of Java - Javatpoint

Commit: If a transaction commits then all the changes made are visible.

Example: Let's assume that following transaction T consisting of T1 and T2. A consists of
Rs 600 and B consists of Rs 300. Transfer Rs 100 from account A to account B.

T1 T2

Read(A) Read(B)
A:= A-100 Y:= Y+100
Write(A) Write(B)

After completion of the transaction, A consists of Rs 500 and B consists of Rs 400.

If the transaction T fails after the completion of transaction T1 but before completion of
transaction T2, then the amount will be deducted from A but not added to B. This shows
the inconsistent database state. In order to ensure correctness of database state, the
transaction must be executed in entirety.

Consistency
o The integrity constraints are maintained so that the database is consistent before and
after the transaction.
o The execution of a transaction will leave a database in either its prior stable state or a
new stable state.
o The consistent property of database states that every transaction sees a consistent
database instance.
o The transaction is used to transform the database from one consistent state to another
consistent state.
For example: The total amount must be maintained before or after the transaction.

1. Total before T occurs = 600+300=900


2. Total after T occurs= 500+400=900

Therefore, the database is consistent. In the case when T1 is completed but T2 fails, then
inconsistency will occur.

Isolation
o It shows that the data which is used at the time of execution of a transaction cannot be
used by the second transaction until the first one is completed.
o In isolation, if the transaction T1 is being executed and using the data item X, then that
data item can't be accessed by any other transaction T2 until the transaction T1 ends.
o The concurrency control subsystem of the DBMS enforced the isolation property.

Durability
o The durability property is used to indicate the performance of the database's consistent
state. It states that the transaction made the permanent changes.
o They cannot be lost by the erroneous operation of a faulty transaction or by the system
failure. When a transaction is completed, then the database reaches a state known as the
consistent state. That consistent state cannot be lost, even in the event of a system's
failure.
o The recovery subsystem of the DBMS has the responsibility of Durability property.

States of Transaction
In a database, the transaction can be in one of the following states -
Active state

o The active state is the first state of every transaction. In this state, the transaction is being
executed.
o For example: Insertion or deletion or updating a record is done here. But all the records
are still not saved to the database.

Partially committed

o In the partially committed state, a transaction executes its final operation, but the data is
still not saved to the database.
o In the total mark calculation example, a final display of the total marks step is executed in
this state.

Committed
A transaction is said to be in a committed state if it executes all its operations
successfully. In this state, all the effects are now permanently saved on the database
system.

Failed state
o If any of the checks made by the database recovery system fails, then the transaction is
said to be in the failed state.
o In the example of total mark calculation, if the database is not able to fire a query to
fetch the marks, then the transaction will fail to execute.

Aborted

o If any of the checks fail and the transaction has reached a failed state then the database
recovery system will make sure that the database is in its previous consistent state. If not
then it will abort or roll back the transaction to bring the database into a consistent state.
o If the transaction fails in the middle of the transaction then before executing the
transaction, all the executed transactions are rolled back to its consistent state.
o After aborting the transaction, the database recovery module will select one of the two
operations:
1. Re-start the transaction
2. Kill the transaction

Schedule
A series of operation from one transaction to another transaction is known as schedule.
It is used to preserve the order of the operation in each of the individual transaction.

1. Serial Schedule
The serial schedule is a type of schedule where one transaction is executed completely
before starting another transaction. In the serial schedule, when the first transaction
completes its cycle, then the next transaction is executed.

For example: Suppose there are two transactions T1 and T2, which have some
operations. If it has no interleaving of operations, then there are the following two
possible outcomes:

3.3Meatures of Java - Javatpoint

1. Execute all the operations of T1 which was followed by all the operations of T2.
2. Execute all the operations of T1 which was followed by all the operations of T2.

o In the given (a) figure, Schedule A shows the serial schedule where T1 followed by T2.
o In the given (b) figure, Schedule B shows the serial schedule where T2 followed by T1.

2. Non-serial Schedule
o If interleaving of operations is allowed, then there will be non-serial schedule.
o It contains many possible orders in which the system can execute the individual
operations of the transactions.
o In the given figure (c) and (d), Schedule C and Schedule D are the non-serial schedules. It
has interleaving of operations.

3. Serializable schedule
o The serializability of schedules is used to find non-serial schedules that allow the
transaction to execute concurrently without interfering with one another.
o It identifies which schedules are correct when executions of the transaction have
interleaving of their operations.
o A non-serial schedule will be serializable if its result is equal to the result of its
transactions executed serially.
Here,

Schedule A and Schedule B are serial schedule.

Schedule C and Schedule D are Non-serial schedule.


DBMS Concurrency Control
Concurrency Control is the management procedure that is required for controlling
concurrent execution of the operations that take place on a database.

But before knowing about concurrency control, we should know about concurrent
execution.

Concurrent Execution in DBMS


o In a multi-user system, multiple users can access and use the same database at one time,
which is known as the concurrent execution of the database. It means that the same
database is executed simultaneously on a multi-user system by different users.
o While working on the database transactions, there occurs the requirement of using the
database by multiple users for performing different operations, and in that case,
concurrent execution of the database is performed.
o The thing is that the simultaneous execution that is performed should be done in an
interleaved manner, and no operation should affect the other executing operations, thus
maintaining the consistency of the database. Thus, on making the concurrent execution
of the transaction operations, there occur several challenging problems that need to be
solved.

Problems with Concurrent Execution


In a database transaction, the two main operations are READ and WRITE operations. So,
there is a need to manage these two operations in the concurrent execution of the
transactions as if these operations are not performed in an interleaved manner, and the
data may become inconsistent. So, the following problems occur with the Concurrent
Execution of the operations:

Hello Java Program for Beginners

Problem 1: Lost Update Problems (W - W Conflict)


The problem occurs when two different database transactions perform the read/write
operations on the same database items in an interleaved manner (i.e., concurrent
execution) that makes the values of the items incorrect hence making the database
inconsistent.

For example:

Consider the below diagram where two transactions T X and TY, are performed on
the same account A where the balance of account A is $300.

o At time t1, transaction TX reads the value of account A, i.e., $300 (only read).
o At time t2, transaction TX deducts $50 from account A that becomes $250 (only deducted
and not updated/write).
o Alternately, at time t3, transaction T Y reads the value of account A that will be $300 only
because TX didn't update the value yet.
o At time t4, transaction TY adds $100 to account A that becomes $400 (only added but
not updated/write).
o At time t6, transaction T X writes the value of account A that will be updated as $250 only,
as TY didn't update the value yet.
o Similarly, at time t7, transaction T Y writes the values of account A, so it will write as done
at time t4 that will be $400. It means the value written by T X is lost, i.e., $250 is lost.

Hence data becomes incorrect, and database sets to inconsistent.

Dirty Read Problems (W-R Conflict)


The dirty read problem occurs when one transaction updates an item of the database,
and somehow the transaction fails, and before the data gets rollback, the updated
database item is accessed by another transaction. There comes the Read-Write Conflict
between both transactions.

For example:

Consider two transactions TX and TY in the below diagram performing read/write


operations on account A where the available balance in account A is $300:

o At time t1, transaction TX reads the value of account A, i.e., $300.


o At time t2, transaction TX adds $50 to account A that becomes $350.
o At time t3, transaction TX writes the updated value in account A, i.e., $350.
o Then at time t4, transaction TY reads account A that will be read as $350.
o Then at time t5, transaction TX rollbacks due to server problem, and the value changes
back to $300 (as initially).
o But the value for account A remains $350 for transaction T Y as committed, which is the
dirty read and therefore known as the Dirty Read Problem.

Unrepeatable Read Problem (W-R Conflict)


Also known as Inconsistent Retrievals Problem that occurs when in a transaction, two
different values are read for the same database item.

For example:

Consider two transactions, TX and TY, performing the read/write operations on


account A, having an available balance = $300. The diagram is shown below:

o At time t1, transaction TX reads the value from account A, i.e., $300.
o At time t2, transaction TY reads the value from account A, i.e., $300.
o At time t3, transaction TY updates the value of account A by adding $100 to the available
balance, and then it becomes $400.
o At time t4, transaction TY writes the updated value, i.e., $400.
o After that, at time t5, transaction T X reads the available value of account A, and that will
be read as $400.
o It means that within the same transaction T X, it reads two different values of account A,
i.e., $ 300 initially, and after updation made by transaction T Y, it reads $400. It is an
unrepeatable read and is therefore known as the Unrepeatable read problem.

Thus, in order to maintain consistency in the database and avoid such problems that
take place in concurrent execution, management is needed, and that is where the
concept of Concurrency Control comes into role.

Concurrency Control
Concurrency Control is the working concept that is required for controlling and
managing the concurrent execution of database operations and thus avoiding the
inconsistencies in the database. Thus, for maintaining the concurrency of the database,
we have the concurrency control protocols.

Concurrency Control Protocols


The concurrency control protocols ensure the atomicity, consistency, isolation,
durability and serializability of the concurrent execution of the database transactions.
Therefore, these protocols are categorized as:

o Lock Based Concurrency Control Protocol


o Time Stamp Concurrency Control Protocol
o Validation Based Concurrency Control Protocol

You might also like