DBMS Module I
Introduction:
Data:
Data is defined as a collection of raw facts about a place, person, thing or
object involved in the transactions of an organization.
Data can be represented in various forms like text, numbers, images, audio,
video, graphs, document files, etc.
Data constitutes the building blocks of information.
Data is one of the important assets of the modern business.
Data becomes relevant based on the context.
Information
Information can be defined as processed data that increases the knowledge
of the end user.
Information is used to reveal the meaning of data.
Good, accurate and timely information is used in decision making.
The quality of data influences the quality of information.
Information can be presented in tabular form, as a bar graph, or as an image.
Metadata
Metadata is data that describes the characteristics or properties of other
data.
Metadata typically consists of name, data type, length, minimum and maximum
values, description, and special constraints.
Metadata allows database designers and users to understand what data
exists and what the data means.
Metadata is generally stored in a repository.
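As a rough illustration of metadata, the sketch below uses Python's built-in sqlite3 module to ask a database for the description of a table. The table name student and its columns are hypothetical, chosen only for this example; in SQLite, PRAGMA table_info returns one row of metadata per column.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " roll_no INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " age INTEGER CHECK (age >= 0))"
)

# PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
metadata = conn.execute("PRAGMA table_info(student)").fetchall()
for cid, name, col_type, notnull, default, pk in metadata:
    print(f"{name}: type={col_type}, not_null={bool(notnull)}, primary_key={bool(pk)}")
```

The rows printed here describe the data (names, types, constraints) rather than being student data themselves, which is the distinction the definition above draws.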
Database:
A database can be defined as an organized collection of logically related data.
A database can be of any size and complexity.
Data are structured so as to be easily stored, manipulated, and retrieved by
users.
Example: A salesperson can store customer contacts on a laptop, amounting to
a few megabytes of data, while a big company can store data on all the
activities in the organization to support decision making.
DBMS:
A database management system can be defined as an organized collection of
logically related data together with a set of programs used for creating,
storing, updating and retrieving data from the database.
The DBMS acts as a mediator between the end user and the database.
A database management system (DBMS) can also be defined as a collection of
programs that manages the database structure and controls access to the data.
DBMS enables data to be shared.
DBMS integrates many users’ views of the data.
Repository
A Database Management System (DBMS) is a software system that allows users to
create, maintain, and manage databases. It is a collection of programs that enables
users to access and manipulate data in a database. A DBMS is used to store,
retrieve, and manipulate data in a way that provides security, privacy, and
reliability.
Repository Vs Database: A repository is a centralized storehouse for all data
definitions, data relationships, and other system components, while a database is an
organized collection of logically related data.
Data warehouse: An organization often needs to build a separate database that
contains historical and summarized information. Such a database is usually called a
data warehouse, or in some cases a data mart.
Analysts need specialized decision support tools to query and analyze the database.
One class of tools used for this purpose is called on-line analytical processing
(OLAP) tools.
Database Systems
A database system consists of logically related data stored in a single logical
data repository.
A database system may be physically distributed among multiple storage
facilities.
A DBMS eliminates most of the file system’s problems.
The current generation of DBMS software stores data structures, the
relationships between those structures, and access paths. It also defines,
stores, and manages all access paths and components.
Purpose of Database System:
A database system serves several key purposes in managing and organizing data.
Here’s a rundown of its primary functions:
1. Data Storage and Management: It provides a structured way to store large
amounts of data efficiently. This includes managing data types, relationships,
and integrity.
2. Data Retrieval: It allows for quick and efficient retrieval of data through
queries. This helps users access specific pieces of information quickly,
which is crucial for decision-making and reporting.
3. Data Manipulation: Users can insert, update, and delete data easily. This
ensures that the database can be kept current and relevant as new
information becomes available or as old information changes.
4. Data Integrity: Database systems enforce rules to ensure that data is accurate
and consistent. This includes constraints like primary keys (unique
identifiers for records), foreign keys (relationships between tables), and
other validation rules.
5. Data Security: They provide mechanisms to protect data from unauthorized
access and breaches. This includes user authentication, access controls, and
encryption.
6. Concurrent Access: They support multiple users accessing and manipulating
the data simultaneously without conflicts. This is crucial for collaborative
environments where many people may need to interact with the data at once.
7. Backup and Recovery: They offer solutions for data backup and recovery to
prevent data loss in case of hardware failures, corruption, or other issues.
This ensures data durability and availability.
8. Data Relationships: They enable the modeling of complex relationships
between different data entities. This is useful for applications that require
multi-dimensional data analysis and reporting.
9. Scalability: They can handle increasing amounts of data and users by scaling
up (more powerful hardware) or scaling out (adding more servers), ensuring
that performance remains acceptable as demands grow.
10. Data Analysis: They support analytical functions and reporting tools that
help users analyze data trends, patterns, and insights. This is essential for
business intelligence and strategic planning.
Overall, a database system is fundamental for organizing, maintaining, and
utilizing data effectively in a variety of applications, from small-scale personal
projects to large-scale enterprise systems.
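The data integrity point above can be sketched with Python's built-in sqlite3 module. This is only a minimal illustration: the department and employee tables and the helper try_insert are hypothetical, and SQLite enforces foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " dept_id INTEGER NOT NULL REFERENCES department(dept_id))"
)
conn.execute("INSERT INTO department VALUES (1, 'Sales')")
conn.execute("INSERT INTO employee VALUES (100, 'Asha', 1)")

def try_insert(sql):
    """Run an INSERT and report whether the DBMS rejected it (hypothetical helper)."""
    try:
        conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"

# Same primary key as an existing row: violates uniqueness.
duplicate_key = try_insert("INSERT INTO employee VALUES (100, 'Ravi', 1)")
# Department 99 does not exist: violates the foreign key.
missing_parent = try_insert("INSERT INTO employee VALUES (101, 'Ravi', 99)")
print(duplicate_key)
print(missing_parent)
```

Both invalid rows are refused by the database itself, so the rule does not have to be re-implemented in every application that writes to these tables.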
Applications of Database Management Systems
Important sectors where DBMS finds application include:
Reservation Systems
Planning travel and making reservations for trains and flights are simplified with
DBMS. DBMS applications help manage the schedule and track delays, boarding,
and departure of flights or trains.
Online Shopping
With the accelerated growth of the eCommerce sector, web-based shopping has
increased on platforms such as Amazon, eBay, Walmart, etc. Data management
systems store and track all the important information related to the orders, such as
invoices, shipping details, refund status, etc.
Healthcare System
DBMSs are of critical importance in the healthcare system as they help organize,
track and extract patient information such as appointments, treatment records,
payments, doctor schedules, invoices, etc.
Finance Sector
In addition to its immense application in banking, DBMS is important in storing
and managing accounting and finance-related information such as sales reports,
financial statements, financial assets, stocks, bonds, etc.
HR Management System
Database management systems help HR executives store and manage employee
information such as name, designation, salary, tax, insurance details, etc.
Scientific Database
Researchers utilize DBMS to store scientific research data, track projects, conduct
comparative studies, develop experiment protocols, and more.
Advantages of DBMS
Data organization: A DBMS allows for the organization and storage of data
in a structured manner, making it easy to retrieve and query the data as
needed.
Data integrity: A DBMS provides mechanisms for enforcing data integrity
constraints, such as constraints on the values of data and access controls that
restrict who can access the data.
Concurrent access: A DBMS provides mechanisms for controlling
concurrent access to the database, to ensure that multiple users can access
the data without conflicting with each other.
Data security: A DBMS provides tools for managing the security of the
data, such as controlling access to the data and encrypting sensitive data.
Backup and recovery: A DBMS provides mechanisms for backing up and
recovering the data in the event of a system failure.
Data sharing: A DBMS allows multiple users to access and share the same
data, which can be useful in a collaborative work environment.
Disadvantages of DBMS
Complexity: DBMS can be complex to set up and maintain, requiring
specialized knowledge and skills.
Performance overhead: The use of a DBMS can add overhead to the
performance of an application, especially in cases where high levels of
concurrency are required.
Scalability: The use of a DBMS can limit the scalability of an application,
since it requires the use of locking and other synchronization mechanisms to
ensure data consistency.
Cost: The cost of purchasing, maintaining and upgrading a DBMS can be
high, especially for large or complex systems.
Limited Use Cases: Not all use cases are suitable for a DBMS. Some
solutions don’t need high reliability, consistency or security and may be
better served by other types of data storage.
Views of data
In a database system, a "view" is a virtual table that provides a specific
representation of data from one or more tables. Views are used to simplify data
access, improve security, and present data in a way that meets the needs of
different users or applications. Here’s a breakdown of how views work and their
various types and purposes:
Types of Views
1. Simple View:
o Definition: Represents data from a single table.
o Purpose: Simplifies data access by focusing on specific columns or
rows from that table.
o Example: A view that shows only employee names and job titles from
an employee table.
2. Complex View:
o Definition: Combines data from multiple tables using joins, unions, or
other operations.
o Purpose: Provides a consolidated view of data from different tables,
often used for reporting or data analysis.
o Example: A view that joins customer and order tables to show
customers along with their order history.
3. Materialized View:
o Definition: A physical copy of the data stored on disk, created from a
query.
o Purpose: Improves performance by storing the results of complex
queries, which can be refreshed periodically.
o Example: A view that aggregates sales data by month and stores the
results for quick access.
4. Dynamic View:
o Definition: Always reflects the current data in the underlying tables
because it is not physically stored but generated on-the-fly.
o Purpose: Ensures that the data displayed is always up-to-date.
o Example: A view that shows the latest inventory levels for products.
Purposes of Views
1. Data Abstraction:
o Purpose: Hides the complexity of the underlying database schema
from users. Users interact with views rather than the raw tables, which
simplifies data manipulation and presentation.
o Example: Presenting a simplified view of customer information with
only relevant details, like name and contact info, while hiding
sensitive data.
2. Security and Access Control:
o Purpose: Restricts access to specific subsets of data. By using views,
administrators can limit the visibility of sensitive data.
o Example: Creating a view that excludes salary information from an
employee table when accessed by non-HR personnel.
3. Data Consistency:
o Purpose: Ensures consistent presentation of data. Views can enforce
certain data formats or aggregations, providing a standardized way of
accessing data.
o Example: A view that formats dates in a consistent manner for all
users.
4. Simplified Querying:
o Purpose: Simplifies complex queries by encapsulating them in a
view. Users can query the view as if it were a single table.
o Example: A view that combines several joins and filters to provide a
single, easy-to-query dataset for analysis.
5. Data Aggregation and Reporting:
o Purpose: Facilitates reporting and data analysis by presenting
aggregated data or pre-processed results.
o Example: A view that shows sales totals by region and product
category, ready for reporting.
6. Performance Optimization:
o Purpose: Materialized views can improve query performance by
precomputing and storing results, especially for complex queries.
o Example: A materialized view that pre-aggregates data for frequent
reporting queries.
Updating: Some views are updatable, meaning you can perform INSERT,
UPDATE, and DELETE operations on them if they are based on a single
table and meet certain criteria.
In summary, views are a powerful feature in database systems that provide
flexibility, security, and efficiency in how data is accessed and presented. They
help users interact with data in a more meaningful and controlled way, tailored to
specific needs and roles.
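A simple view can be sketched with Python's built-in sqlite3 module. The employee table, its rows, and the view name employee_public are assumptions made for this example. SQLite has no user permissions, so the sketch shows only the column-hiding and always-current behaviour of a view, not access control.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, job_title TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Asha", "Engineer", 90000.0), (2, "Ravi", "Analyst", 70000.0)],
)

# Simple view over a single table that leaves out the salary column.
conn.execute("CREATE VIEW employee_public AS SELECT emp_id, name, job_title FROM employee")

cursor = conn.execute("SELECT * FROM employee_public ORDER BY emp_id")
view_columns = [col[0] for col in cursor.description]
view_rows = cursor.fetchall()

# The view stores no data of its own: a change to the base table
# is visible the next time the view is queried.
conn.execute("UPDATE employee SET job_title = 'Senior Engineer' WHERE emp_id = 1")
updated_title = conn.execute(
    "SELECT job_title FROM employee_public WHERE emp_id = 1"
).fetchone()[0]
print(view_columns, view_rows, updated_title)
```

Users who query employee_public never see the salary column, yet they always see current job titles because the view is evaluated against the base table on each query.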
Data models
Data models are frameworks used to structure, organize, and manage data in a
database system. They define how data is connected, stored, and manipulated. Here
are the primary types of data models and their key characteristics:
1. Hierarchical Data Model
Structure: Represents data in a tree-like structure where each record has a
single parent and can have multiple children.
Example: An organizational chart where each employee has one direct
manager but may manage multiple employees.
Pros: Simple and easy to understand for certain use cases. Suitable for
applications with a natural hierarchical relationship.
Cons: Rigid structure; difficult to handle many-to-many relationships.
Inflexible for queries that require multiple paths.
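The single-parent rule of the hierarchical model can be sketched in plain Python. The job titles and the names manager_of, chain_of_command, and direct_reports are hypothetical; a real hierarchical DBMS stores and navigates such trees itself.

```python
# Each employee maps to exactly one manager; None marks the root of the tree.
manager_of = {
    "CEO": None,
    "VP Sales": "CEO",
    "VP Eng": "CEO",
    "Engineer": "VP Eng",
}

def chain_of_command(employee):
    """Follow the single parent link upward until the root is reached."""
    chain = []
    current = manager_of[employee]
    while current is not None:
        chain.append(current)
        current = manager_of[current]
    return chain

def direct_reports(manager):
    """Children of a node: every employee whose parent is this manager."""
    return sorted(name for name, boss in manager_of.items() if boss == manager)

print(chain_of_command("Engineer"))
print(direct_reports("CEO"))
```

Because every record has exactly one parent, there is only one path from any employee to the root, which is why many-to-many relationships do not fit this model.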
2. Network Data Model
Structure: Uses a graph structure with nodes (records) and edges
(relationships), allowing more complex relationships than the hierarchical
model.
Example: A network of nodes representing interconnected cities in a
transportation system, where each city can have multiple routes to and from
other cities.
Pros: More flexible than the hierarchical model. Supports many-to-many
relationships and complex querying.
Cons: More complex to design and navigate. Less intuitive compared to
relational models.
3. Relational Data Model
Structure: Represents data in tables (relations) where each table consists of
rows and columns. Tables can be related to each other using keys.
Example: An employee database where one table contains employee details,
and another contains department details. These tables can be linked by a
common key (e.g., department ID).
Pros: Highly flexible and intuitive. Supports powerful querying using SQL.
Well-suited for a wide range of applications.
Cons: Performance can be an issue with very large datasets or complex
queries. Requires normalization to avoid redundancy.
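The employee and department example above can be sketched with Python's built-in sqlite3 module. Table names, column names, and the sample rows are assumptions for illustration; the two tables are linked through the common key dept_id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT)")
conn.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER PRIMARY KEY,"
    " emp_name TEXT,"
    " dept_id INTEGER REFERENCES department(dept_id))"
)
conn.executemany("INSERT INTO department VALUES (?, ?)", [(10, "Sales"), (20, "Research")])
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Asha", 20), (2, "Ravi", 10), (3, "Meena", 20)],
)

# Join the two relations on the shared key to pair each employee with a department name.
rows = conn.execute(
    "SELECT e.emp_name, d.dept_name"
    " FROM employee AS e JOIN department AS d ON e.dept_id = d.dept_id"
    " ORDER BY e.emp_id"
).fetchall()
print(rows)
```

The department name is stored once in the department table and looked up through the key, which is how the relational model avoids repeating it in every employee row.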
4. Entity-Relationship (ER) Model
Structure: Uses entities (objects) and relationships to represent data.
Entities are objects with attributes, and relationships define how entities
interact with each other.
Example: A university database with entities like Students, Courses, and
Professors, and relationships like “enrolls in” and “teaches”.
Pros: Provides a high-level conceptual view of the data. Useful for
designing and documenting database schemas.
Cons: Needs to be translated into a relational or other physical data model
for implementation.
5. Object-Oriented Data Model
Structure: Represents data using objects, similar to object-oriented
programming languages. Objects encapsulate both data and behaviors
(methods).
Example: A graphics application where shapes (objects) have properties
(e.g., color, size) and methods (e.g., draw, resize).
Pros: Aligns closely with object-oriented programming concepts. Good for
applications with complex data interactions.
Cons: Less mature and standardized compared to relational models. Can be
complex to implement and manage.
6. Document Data Model
Structure: Represents data as documents, typically in formats like JSON,
XML, or BSON. Documents can be nested and have flexible schemas.
Example: A NoSQL database for an e-commerce application where each
product is stored as a JSON document with varying attributes.
Pros: Flexible schema. Suitable for hierarchical and semi-structured data.
Often used in NoSQL databases.
Cons: Less structured querying compared to relational models. Can lead to
data redundancy if not managed carefully.
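The flexible schema of the document model can be sketched with Python's standard json module. The product documents and field names below are invented for illustration, and a list of JSON strings stands in for a real document store.

```python
import json

# Two product documents with different sets of attributes (illustrative data).
products = [
    {"sku": "B-1", "name": "Novel", "price": 12.5, "author": "A. Writer"},
    {
        "sku": "T-7",
        "name": "T-shirt",
        "price": 9.0,
        "sizes": ["S", "M", "L"],
        "supplier": {"name": "Acme", "country": "IN"},  # nested document
    },
]

# Serialise each document on its own, roughly as a document store would.
stored = [json.dumps(doc) for doc in products]
loaded = [json.loads(text) for text in stored]

# Query on a field that only some documents contain.
with_sizes = [doc["sku"] for doc in loaded if "sizes" in doc]
supplier_country = loaded[1]["supplier"]["country"]
print(with_sizes, supplier_country)
```

Neither document had to declare its fields in advance, which is the flexibility noted above. The cost is that queries must cope with fields that may be absent.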
7. Key-Value Data Model
Structure: Represents data as a collection of key-value pairs, where each
key is unique and maps to a value.
Example: A caching system where keys are user session IDs, and values are
session data.
Pros: Extremely fast for lookups. Simple and scalable.
Cons: Limited querying capabilities beyond basic key-based retrieval. Less
suitable for complex relationships.
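The session cache example can be sketched with a Python dictionary standing in for a key-value store. The function names put, get, and delete and the session IDs are hypothetical; real key-value systems expose similar operations over a network.

```python
# A dict stands in for the key-value store: each unique key maps to one value.
session_store = {}

def put(session_id, data):
    """Store or overwrite the value for a key."""
    session_store[session_id] = data

def get(session_id):
    """Return the value for a key, or None if the key is absent."""
    return session_store.get(session_id)

def delete(session_id):
    """Remove a key if it is present."""
    session_store.pop(session_id, None)

put("sess-42", {"user": "asha", "cart_items": 3})
put("sess-43", {"user": "ravi", "cart_items": 0})
current = get("sess-42")
delete("sess-43")
print(current, get("sess-43"))
```

Every operation goes through the key. Finding, say, all sessions with a non-empty cart would require scanning every value, which illustrates the limited querying noted in the cons.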
8. Column-Family Data Model
Structure: Stores data in columns rather than rows. Data is organized into
column families, each containing rows of data.
Example: A column-family NoSQL database for a large-scale analytics
application, where columns can be added dynamically.
Pros: Efficient for reading and writing large volumes of data. Scales well
horizontally.
Cons: More complex data modeling compared to relational databases. Can
be less intuitive for developers familiar with relational models.
9. Graph Data Model
Structure: Uses graph structures with nodes (entities) and edges
(relationships) to represent and query data. Excellent for handling complex
and interconnected data.
Example: A social network database where users are nodes and their
relationships (friends, followers) are edges.
Pros: Highly effective for traversing and querying relationships. Ideal for
applications involving complex networks.
Cons: Can be less performant for non-graph-based queries. Requires
specialized graph database systems.
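The social network example can be sketched in plain Python with an adjacency mapping and a breadth-first traversal. The user names and the function within_hops are assumptions for illustration; a graph database would run this kind of traversal natively.

```python
from collections import deque

# Nodes are users; each edge is a mutual "friend" relationship (illustrative data).
friends = {
    "asha": {"ravi", "meena"},
    "ravi": {"asha", "john"},
    "meena": {"asha"},
    "john": {"ravi", "lila"},
    "lila": {"john"},
}

def within_hops(start, max_hops):
    """Breadth-first traversal: users reachable from start in at most max_hops edges."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        user = queue.popleft()
        if distance[user] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in friends[user]:
            if neighbour not in distance:
                distance[neighbour] = distance[user] + 1
                queue.append(neighbour)
    del distance[start]
    return distance

print(within_hops("asha", 2))
```

A "friends of friends" query is a two-hop traversal here. Expressing the same query over relational tables would take one self-join per hop.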
Choosing a Data Model
The choice of data model depends on the specific requirements of the application,
including data complexity, querying needs, scalability, and performance
considerations. Each model has its strengths and is suited to different types of use
cases, from traditional relational databases to modern NoSQL systems.
Database management system
A Database Management System (DBMS) is software that facilitates the creation,
management, and manipulation of databases. It provides an interface between users
or applications and the database, ensuring that data is stored, retrieved, and
managed efficiently and securely. Here’s an overview of the key features, types,
and functions of DBMS:
Key Features of a DBMS
1. Data Storage and Retrieval: Manages how data is stored on disk and
retrieves it efficiently when needed. It provides mechanisms for querying
and retrieving data.
2. Data Integrity and Security: Ensures data accuracy and consistency
through constraints, rules, and validation. It also provides security features to
restrict unauthorized access and protect data.
3. Data Manipulation: Supports operations such as inserting, updating, and
deleting data. It typically uses a query language like SQL for these
operations.
4. Transaction Management: Ensures that database transactions are processed
reliably and adhere to ACID properties (Atomicity, Consistency, Isolation,
Durability). This helps in maintaining data integrity during concurrent
access and system failures.
5. Concurrency Control: Manages simultaneous data access by multiple users
to prevent conflicts and ensure data consistency.
6. Backup and Recovery: Provides tools and processes to back up data
regularly and restore it in case of loss or corruption.
7. Data Independence: Abstracts the physical storage details from the user,
allowing changes in the database structure without affecting the application
programs.
8. User Management: Handles user roles and permissions, controlling access
levels and ensuring that users have appropriate access to the data.
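The atomicity part of transaction management can be sketched with Python's built-in sqlite3 module. The account table, the balances, and the transfer function are hypothetical. The sketch relies on the connection's context manager, which commits when the block succeeds and rolls back when an exception escapes it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (acc_id INTEGER PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

def transfer(amount, source, target):
    """Move money between accounts: either both updates apply or neither does."""
    try:
        with conn:  # commit on success, roll back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE acc_id = ?", (amount, target)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE acc_id = ?", (amount, source)
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(30, 1, 2)       # succeeds: both updates are committed together
failed = transfer(500, 1, 2)  # debit violates the CHECK, so the credit is rolled back too
balances = dict(conn.execute("SELECT acc_id, balance FROM account"))
print(ok, failed, balances)
```

In the failed transfer the credit to account 2 had already been executed when the debit was rejected, yet the final balances show no trace of it. That all-or-nothing behaviour is what atomicity means.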
Types of DBMS
1. Relational DBMS (RDBMS):
o Description: Uses a table-based structure where data is organized into
rows and columns. Tables can be related to each other through keys.
o Examples: MySQL, PostgreSQL, Oracle, Microsoft SQL Server.
o Pros: Strong querying capabilities using SQL. Supports complex
transactions and data integrity.
2. NoSQL DBMS:
o Description: Designed for non-relational data models and handles a
variety of data types (document, key-value, column-family, graph).
o Examples: MongoDB (Document), Redis (Key-Value), Cassandra
(Column-Family), Neo4j (Graph).
o Pros: Flexible schema, scalable, and optimized for specific use cases
like big data and real-time analytics.
3. Object-Oriented DBMS (OODBMS):
o Description: Integrates object-oriented programming principles with
database management. Data is stored as objects similar to how it is
represented in object-oriented languages.
o Examples: db4o, ObjectDB.
o Pros: Seamless integration with object-oriented programming
languages. Useful for applications requiring complex data
representations.
4. Hierarchical DBMS:
o Description: Organizes data in a tree-like structure with parent-child
relationships.
o Examples: IBM's Information Management System (IMS).
o Pros: Simple and efficient for certain hierarchical data models.
o Cons: Less flexible for complex relationships and querying.
5. Network DBMS:
o Description: Uses a graph structure where data entities can have
multiple relationships.
o Examples: Integrated Data Store (IDS), UNIFY.
o Pros: More flexible than hierarchical models and supports
many-to-many relationships.
o Cons: More complex to design and manage compared to relational
models.
Functions of a DBMS
1. Data Definition: Allows users to define the structure of the database,
including tables, fields, and relationships. This is typically done using Data
Definition Language (DDL).
2. Data Manipulation: Facilitates the querying and modification of data. This
is achieved through Data Manipulation Language (DML) commands such as
SELECT, INSERT, UPDATE, and DELETE.
3. Data Administration: Provides tools for database administration tasks such
as performance monitoring, tuning, and optimization.
4. Data Integrity: Enforces rules to ensure that the data is accurate and
consistent. This includes implementing constraints like primary keys,
foreign keys, and unique constraints.
5. Data Security: Manages user access and permissions, ensuring that only
authorized users can perform certain actions on the data.
6. Data Recovery: Offers mechanisms to recover data in case of failure or
corruption, including backup and restore functions.
7. Data Backup: Regularly creates copies of the database to prevent data loss
and ensure recovery in case of hardware failures or other issues.
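The difference between data definition and data manipulation can be sketched with Python's built-in sqlite3 module. The course table and its rows are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structure of the database.
conn.execute(
    "CREATE TABLE course (course_id INTEGER PRIMARY KEY, title TEXT NOT NULL, credits INTEGER)"
)

# DML: insert, update, delete, and query the data held in that structure.
conn.execute("INSERT INTO course VALUES (1, 'Databases', 4)")
conn.execute("INSERT INTO course VALUES (2, 'Networks', 3)")
conn.execute("UPDATE course SET credits = 5 WHERE course_id = 1")
conn.execute("DELETE FROM course WHERE course_id = 2")
remaining = conn.execute("SELECT course_id, title, credits FROM course").fetchall()
print(remaining)
```

CREATE TABLE changes what the database can hold, while INSERT, UPDATE, DELETE, and SELECT work only with the rows inside that structure.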
Choosing a DBMS
When selecting a DBMS, consider the following factors:
Data Model: The type of data and its relationships determine whether a
relational, NoSQL, or other type of DBMS is appropriate.
Scalability: Assess the system's ability to handle increasing data volumes
and user loads.
Performance: Evaluate how the DBMS handles query performance and
transaction throughput.
Ease of Use: Consider the ease of administration, querying, and integration
with other systems.
Cost: Factor in licensing costs, maintenance, and operational expenses.
In summary, a DBMS is a crucial component of modern data management,
providing a structured environment for data storage, manipulation, and security.
The choice of DBMS depends on the specific needs of the application and the
nature of the data being managed.
Three-schema architecture of DBMS
The three-schema architecture is a framework for designing and managing a
database system that aims to separate the user's view of the data from the physical
storage of the data. It provides a way to ensure data independence, which means
that changes to the database schema at one level do not necessarily affect other
levels. The three-schema architecture consists of three levels of schema
abstraction: the external (view) level, the conceptual (logical) level, and the
internal (physical) level.
The conceptual data model is the one used in the requirement-gathering process,
i.e. before the database designers start building a particular database. One
popular conceptual model is the entity-relationship model (ER model). The ER
model describes data in terms of entities, relationships, and attributes.
Characteristics of a conceptual data model
It offers organization-wide coverage of the business concepts.
Data models of this type are designed and developed for a business
audience.
The conceptual model is developed independently of hardware
specifications, like data storage capacity or location, and of software
specifications, like DBMS vendor and technology. The focus is to represent
data as a user will see it in the “real world.”
Entity-Relationship Model (ER Model): It is a high-level data model which is
used to define the data and the relationships between them. It is essentially a
conceptual design of a database that makes the view of the data easy to design.
Components of ER Model:
Entity: An entity is a real-world object. It can be a person, place,
object, class, etc. Entities are represented by a rectangle in an ER diagram. An
entity can be animate or inanimate, and it is easily identifiable.
An entity set is a collection of similar types of entities. An entity set may contain
entities whose attributes share similar values. For example, a student set may
contain all the students of a college; likewise, a teacher set may contain all the
teachers of a college from all faculties. Entity sets need not be disjoint.
Entities are represented by means of rectangles. Rectangles are named with the
entity set they represent.
For example, in a college database, students, teachers, classes, and courses offered
can be considered as entities. All these entities have some attributes or properties
that give them their identity.
Attributes: An attribute is a characteristic that describes an entity. Entities
are represented by means of their properties, called attributes. All attributes
have values.
For example, a student entity may have name, class, and age as attributes;
roll number and marks are other possibilities. There exists a domain or range of
values that can be assigned to each attribute.
A student's name cannot be a numeric value. It has to be alphabetic.
A student's age cannot be negative, etc.
An attribute is represented by an oval.
Relationship: Relationships are used to define relations among different entities.
A diamond (rhombus) is used to show a relationship.
Relationship
The association among entities is called a relationship. For example, an employee
works at a department, and a student enrolls in a course. Here, "works at" and
"enrolls in" are called relationships.
A relationship is represented by a diamond shape.
Relationship Set: A set of relationships of similar type is called a relationship set.
Like entities, a relationship too can have attributes. These attributes are called
descriptive attributes.
Types of Entities
Weak Entity: A weak entity is an entity that depends on another entity. A weak
entity does not have a key attribute (primary key) of its own. In other words, an
entity set which does not have sufficient attributes to form a primary key is
called a weak entity set. A double rectangle represents a weak entity.
Strong Entity: An entity which has an independent existence is called a strong
entity. A strong entity set has its own primary key.
Types of Attributes:
Simple attribute − Simple attributes are atomic values, which cannot be divided
further.
For example, a student's phone number is an atomic value of 10 digits.
Composite attribute − Composite attributes are made of more than one simple
attribute.
For example, a student's complete name may have first_name and last_name.
Derived attribute − Derived attributes are attributes that do not exist in the
physical database; their values are derived from other attributes present in the
database. For example, average_salary in a department should not be saved directly
in the database; instead it can be derived. Similarly, age can be derived
from date_of_birth.
Single-value attribute − Single-value attributes contain single value. For
example − Social_Security_Number.
Multi-value attribute − Multi-value attributes may contain more than one
value. For example, a person can have more than one phone number,
email_address, etc.
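The attribute types above can be sketched as a small Python class. The class Student, its field names, and the sample values are hypothetical; the point is only that composite and multi-valued attributes are stored, while a derived attribute such as age is computed when needed.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Student:
    first_name: str        # part of the composite attribute "name"
    last_name: str         # part of the composite attribute "name"
    date_of_birth: date    # stored, single-valued attribute
    phone_numbers: list = field(default_factory=list)  # multi-valued attribute

    @property
    def full_name(self):
        """Composite attribute assembled from its simple parts."""
        return f"{self.first_name} {self.last_name}"

    def age_on(self, today):
        """Derived attribute: computed from date_of_birth, never stored."""
        had_birthday = (today.month, today.day) >= (
            self.date_of_birth.month,
            self.date_of_birth.day,
        )
        return today.year - self.date_of_birth.year - (0 if had_birthday else 1)

s = Student("Asha", "Rao", date(2004, 6, 15), ["555-0101", "555-0102"])
print(s.full_name, s.age_on(date(2024, 6, 15)), len(s.phone_numbers))
```

Storing age directly would make it wrong after the next birthday. Deriving it from date_of_birth keeps it correct without any update.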
When one attribute of a particular entity refers to another attribute of another
entity, some relationship exists between both entities. How the elements of one
entity are related to another entity can be defined using the Mapping Cardinality.
There are four types of Mapping cardinality in a database, which are:
One-to-One cardinality
One-to-Many cardinality
Many-to-One cardinality
Many-to-Many cardinality
1. One-to-One Cardinality
An entity in A is associated with at most one entity in B, and an entity in B is
associated with at most one entity in A.
2. One-to-Many Cardinality
An entity in A is associated with any number (zero or more) of entities in B.
An entity in B, however, can be associated with at most one entity in A.
3. Many-to-One Cardinality
An entity in A is associated with at most one entity in B. An entity in B,
however, can be associated with any number of entities in A.
4. Many-to-Many Cardinality
An entity in A is associated with any number of entities in B, and an entity in