Database Unit 1

The document provides an overview of databases and database management systems (DBMS), highlighting their importance in organizing data, ensuring data integrity, and facilitating efficient access. It distinguishes between data and information, explains various types of databases, and outlines the advantages of using a DBMS over traditional file systems. Additionally, it discusses issues related to file systems, such as data redundancy and structural dependence.

DATABASE

A collection of data, usually referred to as a database, contains information
relevant to an enterprise.
DBMS
A database management system (DBMS) is a collection of interrelated data and a
set of programs to access those data.
Goals of DBMS
• Providing a way to store and retrieve database information that is both convenient and
efficient.
• Ensuring the safety of the information.

1.1 WHY DATABASE?

Databases provide better data organization, reduced redundancy, easier data access, and
improved data integrity compared to simple file storage.

Every organization needs a good database. Databases support internal business processes
and record communications with suppliers and customers. They include more specialized data,
including economic or technical models, as well as administrative data. Systems for digital
libraries, vacation reservations, and inventory are a few examples. Databases are important for
the reasons listed below:

✧ Efficient scaling: Database applications can scale to billions of records, making them crucial
for digital data storage.

✧ Data integrity: Data consistency can be maintained via database rules and conditions.

✧ Data security: Databases support privacy and compliance requirements associated with any
data.

✧ Data analytics: Modern software systems use databases to analyze data.


1.2 DATA VS INFORMATION

Data are raw facts and figures without context, while information is processed data that is
meaningful and useful for decision-making.

1. Data

Data refers to raw, unprocessed facts and figures without context or meaning. Data can be in
the form of numbers, text, images, or sounds, but on its own, it doesn’t provide any clear
understanding.

Examples of Data:
✧ A list of numbers: 30,35,77,70
✧ A set of names: sakshi, yogi, ruthvik, arun
✧ A series of dates: 2023-12-15, 2023-12-16, 2023-12-17
✧ Individual sensor readings (e.g., temperature in degrees or heartbeat in beats per minute).
Types of Data Stored in a Database

1. Structured Data
Data that is organized into predefined formats, such as rows and columns in tables.

Examples include customer names, order details, and product prices.

2. Unstructured Data
Data that doesn’t fit neatly into tables.
Examples include images, videos, documents, and social media posts.

3. Semi-Structured Data
Data that doesn’t fit fully into a table structure but contains tags or markers to separate
elements, such as XML files or JSON.

2. Information

Information is data that has been processed or organized in a way that it is meaningful and
useful to the recipient. Information provides context, relevance, and purpose to data, enabling
people to make decisions, draw conclusions, or take actions.

Example:

• The age of Prakash is 25.

Data becomes information when it is processed, interpreted, and organized in a
meaningful way.

Raw Data → Processed Data → Information
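
The pipeline above can be sketched in a few lines of Python. This is only an illustration: it assumes the numbers from the data examples are exam marks and that the names pair with them in order. Both pairings are assumptions made for the example, not something the raw data states.

```python
# Raw data: values with no context (taken from the data examples above).
marks = [30, 35, 77, 70]
names = ["sakshi", "yogi", "ruthvik", "arun"]

# Processed data: context is attached (assumed: each number is that student's mark).
records = dict(zip(names, marks))

# Information: a statement that can support a decision.
average = sum(marks) / len(marks)
top_student = max(records, key=records.get)
summary = f"Class average is {average:.1f}; highest mark: {top_student} ({records[top_student]})"
print(summary)
```

The same four numbers carry no meaning on their own; the summary becomes useful only after they are tied to students and aggregated.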

1.3 INTRODUCTION TO DATABASES

A database is a structured collection of data that is stored and managed in a way that allows
efficient retrieval, updating, and manipulation.

1. Database:
A database is a structured collection of data that is stored and managed in a way that
allows efficient retrieval, updating, and manipulation.
2. DBMS:
Database Management System (DBMS) is software that provides a systematic and
efficient way to store, retrieve, and manage data in databases.
3. Data Model:
A framework that defines how data is structured, stored, and related (e.g., relational,
hierarchical).
4. Schema:
The logical structure or blueprint of a database, defining tables, fields, and relationships.
5. Instance:
The actual data stored in the database at a particular moment.
6. Data Independence:
The ability to change the schema at one level without affecting other levels. Includes:
o a. Physical Data Independence
o b. Logical Data Independence
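7. Keys:
Attributes that identify records or link them together: a primary key uniquely identifies
each record in a table, and a foreign key refers to the primary key of another table.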
8. Normalization:
Process of organizing data to reduce redundancy and improve data integrity.
9. Query Language:
Language used to interact with the database, such as SQL (Structured Query Language).
10.Transaction:
A unit of work performed on the database ensuring ACID properties (Atomicity,
Consistency, Isolation, Durability).
11.Concurrency Control:
Mechanisms to manage simultaneous data access without conflicts.
12.Backup and Recovery:
Methods to safeguard data and restore it after failures.
13. Data Integrity:
Ensuring accuracy and correctness of data using constraints.
14.Views:
Virtual tables derived from base tables to provide customized data access.
15.Security:
Protecting data from unauthorized access through authentication and authorization.
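
The transaction and integrity concepts in the list above can be illustrated with a small sketch using Python's built-in sqlite3 module. The account table, the balances, and the transfer helper are hypothetical names invented for this example; the point is only to show atomicity, where either every statement in a transaction takes effect or none does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK constraint is an integrity rule: a balance may never go negative.
conn.execute(
    "CREATE TABLE account ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one transaction (hypothetical helper)."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, 1, 2, 30)       # succeeds: both updates are committed
failed = transfer(conn, 1, 2, 500)  # second update violates the CHECK: both are undone
balances = dict(conn.execute("SELECT id, balance FROM account"))
print(ok, failed, balances)
```

In the failed transfer the credit to account 2 had already been applied when the debit was rejected; the rollback undoes it, so the database never shows money that came from nowhere.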

1.3.1 Types of Databases

 Hierarchical databases
 Network databases
 Object-oriented databases
 Relational databases
 NoSQL databases

1. Hierarchical Databases:

Just as in any hierarchy, this database organizes data in ranks or levels, forming a tree.
Records are linked through parent-child relationships: records that share a common point of
linkage sit at a lower rank, and the record they have in common sits at a higher rank. Each
child record has exactly one parent, while a parent may have many children.
2. Network Databases:

A network database is a hierarchical database, but with a major tweak. The child records
are given the freedom to associate with multiple parent records. As a result, a network or net of
database files linked with multiple threads is observed.

3. Object-Oriented Databases:

Those familiar with the Object-Oriented Programming Paradigm would be able to relate to
this model of databases easily. Information is stored as objects, each an instance of a class
defined in the database model. An object can therefore be referenced and used directly from
application code, which can reduce the work of translating between the application and the
database.

4. Relational Databases:

Considered the most mature of all databases, relational databases and their management
systems are the most widely used in production. In this database, data is organized into tables
(relations) made up of rows and columns. Every row has a unique identity in the form of a key,
and tables are related to one another through shared key values.
5. NoSQL Databases:

A NoSQL database (originally meaning "non-SQL" or "non-relational") provides a mechanism
for the storage and retrieval of data that is modeled by means other than the tabular relations
used in relational databases.
The motivations for using a NoSQL database include simplicity of design, simpler horizontal
scaling to clusters of machines, and finer control over availability. The data structures used by
NoSQL databases differ from those used by default in relational databases, which makes some
operations faster in NoSQL. The suitability of a given NoSQL database depends on the problem
it should solve. Data structures used by NoSQL databases are sometimes also viewed as more
flexible than relational database tables.
MongoDB is an example of a document-based NoSQL database.
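
To make the document model concrete, the sketch below represents two orders as JSON-like documents using plain Python dictionaries. It does not use MongoDB or any database driver, and the field names (order_id, customer, items, gift_note) and quantities are assumptions chosen for illustration.

```python
import json

# One order stored as a single self-describing document.
order_document = {
    "order_id": 1,
    "customer": {"name": "Akshay", "address": "123 Main St."},
    "items": [
        {"product": "Laptop", "quantity": 1},
        {"product": "Smartphone", "quantity": 1},
    ],
}

# Documents in the same collection do not have to share a fixed schema.
other_document = {
    "order_id": 2,
    "customer": {"name": "Banu"},
    "gift_note": "Happy birthday",
}

collection = [order_document, other_document]

# A query is a filter over documents rather than a join across tables.
laptop_orders = [
    doc["order_id"]
    for doc in collection
    if any(item["product"] == "Laptop" for item in doc.get("items", []))
]

serialized = json.dumps(order_document)
print(laptop_orders, len(serialized))
```

Nesting the customer and items inside the order avoids joins when reading an order, at the cost of repeating customer details in every order document.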

Advantages of NoSQL –

There are many advantages of working with NoSQL databases such as MongoDB and
Cassandra. The main advantages are high scalability and high availability.

Disadvantages of NoSQL –

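NoSQL has the following disadvantages:

• Many NoSQL databases are open-source projects, and there is no single reliable standard
across them.
• GUI tools for accessing the database are not readily available for every product.
• Backup is a weak point for some NoSQL databases like MongoDB.
• Large document size: document stores that keep data as JSON-like documents repeat the
field names in every document, which increases storage size.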
Advantages of DBMS

DBMS helps in efficient organization of data in a database, which has the following
advantages over a typical file system:

• Minimized redundancy and data inconsistency:
Data is normalized in a DBMS to minimize redundancy, which helps keep data
consistent. For example, student information can be kept in one place and accessed by
different users. Primary keys and foreign keys make this possible: data is stored once
and referenced from other tables instead of being copied.
• Simplified data access:
A user needs only the name of the relation, not the exact location, to access data, so the
process is very simple.
• Multiple data views:
Different views of the same data can be created to cater to the needs of different users.
For example, faculty salary information can be hidden from the student view of the data
but shown in the admin view.
• Data security:
Only authorized users are allowed to access the data in a DBMS. Also, data can be
encrypted by the DBMS, which makes it secure.
• Concurrent access to data:
Data can be accessed by different users at the same time in a DBMS.
• Backup and recovery mechanism:
The DBMS backup and recovery mechanism helps to avoid data loss and data
inconsistency in case of catastrophic failures.
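
The faculty salary example for multiple data views can be sketched with Python's sqlite3 module. The table layout, the view names, and the sample row are hypothetical. SQLite has no user accounts, so this only shows how a view hides a column; in a multiuser DBMS the views would be combined with access privileges.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE faculty (
        faculty_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        department TEXT NOT NULL,
        salary     INTEGER NOT NULL
    );
    INSERT INTO faculty VALUES (1, 'Dr. Rao', 'Physics', 90000);

    -- Student view: same base table, salary column left out.
    CREATE VIEW faculty_student_view AS
        SELECT faculty_id, name, department FROM faculty;

    -- Admin view: every column.
    CREATE VIEW faculty_admin_view AS
        SELECT * FROM faculty;
    """
)

student_cursor = conn.execute("SELECT * FROM faculty_student_view")
student_cols = [col[0] for col in student_cursor.description]

admin_cursor = conn.execute("SELECT * FROM faculty_admin_view")
admin_cols = [col[0] for col in admin_cursor.description]

print(student_cols)
print(admin_cols)
```

Both views read the same stored row, so there is still only one copy of the data to keep consistent.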

INTRODUCING THE DATABASE


A database is a shared, integrated computer structure that stores a collection of the
following:

• End-user data—that is, raw facts of interest to the end user.


• Metadata, or data about data, through which the end-user data are integrated and managed.

A database management system (DBMS) is a collection of programs that manages the database
structure and controls access to the data stored in the database.
The number of users determines whether the database is classified as single user or multiuser.

A single-user database supports only one user at a time. In other words, if user A is using the
database, users B and C must wait until user A is done. A single-user database that runs on a
personal computer is called a desktop database.

A multiuser database supports multiple users at the same time. When the multiuser database
supports a relatively small number of users (usually fewer than 50) or a specific department
within an organization, it is called a workgroup database.

When the database is used by the entire organization and supports many users (more than 50,
usually hundreds) across many departments, the database is known as an enterprise database.

Location might also be used to classify the database. For example, a database that supports data
located at a single site is called a centralized database.

A database that supports data distributed across several different sites is called a distributed
database.

A cloud database is a database that is created and maintained using cloud data services, such
as Microsoft Azure or Amazon Web Services (AWS).

Databases can also be classified by the type of data stored in them. Using this criterion,
databases are grouped into two categories:

1. general-purpose databases
2. discipline-specific databases

General-purpose databases

General-purpose databases contain a wide variety of data used in multiple disciplines, for
example, a census database that contains general demographic data, and the LexisNexis and
ProQuest databases that contain newspaper, magazine, and journal articles on a variety of
topics.
Discipline-specific databases

Discipline-specific databases contain data focused on specific subject areas. The data in this
type of database are used mainly for academic or research purposes within a small set of
disciplines. Examples of discipline-specific databases include medical databases that store
confidential medical history data.
A database that is designed primarily to support a company’s day-to-day operations is
classified as an operational database, also known as an online transaction processing
(OLTP), transactional, or production database.

An analytical database focuses primarily on storing historical data and business metrics used
exclusively for tactical or strategic decision making.

The data warehouse is a specialized database that stores data in a format optimized for
decision support. The data warehouse contains historical data obtained from the operational
databases as well as data from other external sources.

Online analytical processing (OLAP) is a set of tools that work together to provide an
advanced data analysis environment for retrieving, processing, and modeling data from the
data warehouse.
Extensible Markup Language (XML) is a special language used to represent and manipulate
data elements in a textual format.

An XML database supports the storage and management of semi-structured XML data.
1.4 FILE SYSTEMS

A file system is a way of storing and organizing files on storage devices, where data is stored in separate files
without any structured relationship between them.

1.4.1 Manual File Systems

A manual file system is a way of organizing and storing data that does not involve the use of automated
tools or software. Records are kept on physical media such as paper, for example in folders and filing
cabinets, rather than on a computer storage device.

Examples:

 Filing Cabinets
 Address Books
 Yellow Pages
 Telephone Directories
 Diaries
 Guest Lists
 Portfolios, etc.

Advantages:

 It is very simple and straightforward.


 It requires minimal technical knowledge.
 It gives users complete control over the organization of their files.

Disadvantages:

 It can be time-consuming and error-prone if a large number of files need to be managed.


 It does not offer the same level of automation, backup and recovery, and security as an automated file
system.

1.4.2 Computerized File Systems

Data and information can be stored and arranged on a computer using computerized file systems. They offer a
method for classifying, searching, and retrieving data, which simplifies handling large volumes of data.
There are several types of Computerized file systems, including:

• Hierarchical File System (HFS): Apple’s Macintosh operating systems used a file system called HFS
(later succeeded by HFS+ and APFS). Each directory can hold additional subdirectories and files, and
it arranges files and directories in a structure like a tree.
• New Technology File System (NTFS): Microsoft Windows operating systems use the NTFS file
system. It offers advanced functions like encryption, compression, and permissions for files and
folders.
• Extended File System (EXT): Many Linux-based operating systems use the EXT family of file
systems. Its later versions (ext3 and ext4) are journaling file systems: they record pending file system
changes in a journal before applying them, which improves stability and data integrity after a crash.

1.4.3 Basic File Terminology

 Data – Raw facts, such as a telephone number, a birth date, a customer name, and a year-to-date (YTD)
sales value.
 Field – A character or group of characters (alphabetic or numeric) that has a specific meaning. A field
is used to define and store data.
 Record – A logically connected set of one or more fields that describes a person, place, or thing. For
example, the fields that constitute a record for a customer might consist of the customer’s name,
address, phone number, date of birth, credit limit, and unpaid balance.
 File – A collection of related records. For example, a file might contain data about the students
currently enrolled at Gigantic University.
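
The terms above map naturally onto simple program structures. The sketch below is one possible mapping in Python, using the customer record from the Record definition; the class name and the sample values (phone number, date of birth, amounts) are invented for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class CustomerRecord:
    # Each attribute is a field: a named slot that defines and stores one piece of data.
    name: str
    address: str
    phone: str
    date_of_birth: str
    credit_limit: float
    unpaid_balance: float


# A record: one logically connected set of field values describing one customer.
record = CustomerRecord("Akshay", "123 Main St.", "555-0100", "1999-04-02", 5000.0, 120.5)

# A file: a collection of related records.
customer_file = [record]

field_names = [f.name for f in fields(CustomerRecord)]
print(field_names)
print(len(customer_file), "record(s) in the file")
```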

1.5 PROBLEMS WITH FILE SYSTEM DATA PROCESSING

Studying the file system method serves two purposes:

1. It helps in understanding the development of modern databases.
2. Many of the problems that arise in file systems are not unique to them. Even if database technology
makes it possible to avoid such problems, failure to fully understand them may lead to their
replication in a database environment.

The following problems are associated with file systems:

 Lengthy development times


 Difficulty of getting quick answers
 Complex system administration
 Lack of security and limited data sharing
 Extensive programming
1.5.1 Structural and Data Dependence

• Structural dependence: Access to a file depends on its structure, so changing the file structure requires
changes to all programs that access the file.
• Structural independence: The file structure can be changed without affecting the programs’ ability to
access the data.
• Data dependence: A data condition in which data representation and manipulation are dependent on
the physical data characteristics.
• Data independence: A condition in which data access is unaffected by changes in the physical data
storage characteristics.

The two types of data independence are:

o Physical data independence: The physical schema can be modified without affecting the logical schema.
o Logical data independence: The logical schema can be modified without affecting application programs.
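
Structural dependence can be demonstrated with a small Python sketch. The two file layouts and the Phone field are hypothetical; the sketch compares a program that reads fields by position with one that looks them up by name after a field has been added to the file.

```python
import csv
import io

# Original file structure: CustomerID, Name, Address
old_file = "1,Akshay,123 Main St.\n2,Banu,456 Old St.\n"
# The structure changes: a Phone field is inserted before Address.
new_file = "1,Akshay,555-0100,123 Main St.\n2,Banu,555-0101,456 Old St.\n"


def addresses_by_position(text):
    """Structurally dependent: assumes Address is always the third field."""
    return [row[2] for row in csv.reader(io.StringIO(text))]


def addresses_by_name(text, header):
    """Closer to structural independence: fields are looked up by name."""
    reader = csv.DictReader(io.StringIO(text), fieldnames=header)
    return [row["Address"] for row in reader]


before = addresses_by_position(old_file)
after = addresses_by_position(new_file)  # silently returns phone numbers
by_name = addresses_by_name(new_file, ["CustomerID", "Name", "Phone", "Address"])

print(before)
print(after)
print(by_name)
```

The positional reader does not fail after the change; it returns the wrong field, which is why every access program has to be found and modified. A DBMS avoids this by keeping the structure description (metadata) in one place and resolving names on the programs' behalf.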

1.5.2 Data Redundancy

The term data redundancy describes the unnecessary duplication of information in a database. This usually
happens when the same piece of data is kept in several tables or locations.

Redundant data can result in:

 Data anomalies
 Lower query performance
 Raised storage costs
 Difficulty maintaining consistency

Reasons for Data Redundancy

 Poor database design: When tables are not normalized properly, the same data can end up in multiple
tables.
 Lack of proper relationship between tables: For example, storing customer information in multiple
places (orders, payments, and customer profiles).

Example of Data Redundancy

Imagine a database that stores information about customers, orders, and products.

CustomerID  Name    Address       ProductName  OrderDate
1           Akshay  123 Main St.  Laptop       2024-12-15
1           Akshay  123 Main St.  Smartphone   2024-12-15
2           Banu    456 Old St.   Laptop       2024-12-16
2           Banu    456 Old St.   Tablet       2024-12-16

In this example, the customer information (Name, Address) is repeated for each order they place. This is a
classic case of data redundancy, where Akshay’s name and address appear twice, and Banu’s name and
address appear twice.

Note:
Data integrity refers to the accuracy and consistency of data. In other words, data integrity means that:

 Data is accurate — there are no data inconsistencies.


 Data is verifiable — the data will always yield consistent results.

1.5.3 Data Anomalies

Data anomalies refer to inconsistencies, inaccuracies, or unexpected results that occur when performing
operations like insert, update, or delete on a database. Data anomalies are often caused by redundant data or
poor database design.

There are three primary types of data anomalies:

1. Insert Anomaly
2. Update Anomaly
3. Delete Anomaly

1. Insert Anomaly

An insert anomaly occurs when we are unable to add data to the database due to how tables are designed.
This often happens when the database schema does not allow certain records unless other data is also present.

Example of Insert Anomaly:

In the unnormalized table below, adding a new customer who has not placed any order would be problematic,
as the customer's data would still need to be inserted alongside an empty order:

CustomerID  Name  Address           ProductName  OrderDate
1           Vino  123 college road  Laptop       2024-12-15
2           Banu  456 Old St.       Laptop       2024-12-16

If a new customer, Chandru, is added but has not yet placed any order, we would have to insert dummy values into
the ProductName and OrderDate columns, which is problematic and results in an insert anomaly.
Solution:

Normalization eliminates this problem by ensuring that only necessary data is entered in each table.
For example, you can add a Customers table and an Orders table. If Chandru has no orders yet, his record can
be added in the Customers table without needing to enter data in the Orders table.
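
The two-table solution can be tried out with Python's sqlite3 module. This is a sketch under assumptions: the column names, the order IDs, and the choice to leave Chandru's address empty are not from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        address     TEXT
    );
    CREATE TABLE orders (
        order_id     INTEGER PRIMARY KEY,
        customer_id  INTEGER NOT NULL REFERENCES customers(customer_id),
        product_name TEXT NOT NULL,
        order_date   TEXT NOT NULL
    );
    """
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Vino", "123 college road"), (2, "Banu", "456 Old St.")],
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [(1, 1, "Laptop", "2024-12-15"), (2, 2, "Laptop", "2024-12-16")],
)

# Chandru has no orders yet: no dummy ProductName or OrderDate is needed.
conn.execute("INSERT INTO customers (customer_id, name) VALUES (3, 'Chandru')")
conn.commit()

customers_without_orders = [
    row[0]
    for row in conn.execute(
        """
        SELECT c.name
        FROM customers c
        LEFT JOIN orders o ON o.customer_id = c.customer_id
        WHERE o.order_id IS NULL
        """
    )
]
print(customers_without_orders)
```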

2. Update Anomaly

An update anomaly occurs when data is not updated consistently in all places where it appears.
This typically happens in situations where data is redundant.

Example of Update Anomaly:

Consider the unnormalized table:

CustomerID  Name    Address       ProductName  OrderDate
1           Akshay  123 Main St.  Laptop       2024-12-15
1           Akshay  123 Main St.  Smartphone   2024-12-15

If Akshay moves to a new address and we only update one of the records, the other record will still have his
old address, leading to an inconsistent state in the database.

Solution:

To resolve this, the database should be normalized so that each customer’s information is stored only once.

Customers Table:

CustomerID  Name    Address
1           Akshay  456 New St.

Orders Table:

OrderID  CustomerID  ProductName  OrderDate
1        1           Laptop       2024-12-15
2        1           Smartphone   2024-12-15

Updating Akshay’s address in the Customers table ensures that all of Akshay’s orders reflect the correct
address.
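
An update anomaly in the unnormalized table can be reproduced and detected with a short sqlite3 sketch. The table name flat_orders and the detection query are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE flat_orders (
        customer_id  INTEGER,
        name         TEXT,
        address      TEXT,
        product_name TEXT,
        order_date   TEXT
    );
    INSERT INTO flat_orders VALUES
        (1, 'Akshay', '123 Main St.', 'Laptop',     '2024-12-15'),
        (1, 'Akshay', '123 Main St.', 'Smartphone', '2024-12-15');
    """
)

# A partial update: only the Laptop row receives the new address.
conn.execute(
    "UPDATE flat_orders SET address = '456 New St.' "
    "WHERE customer_id = 1 AND product_name = 'Laptop'"
)

# Any customer with more than one distinct address reveals the anomaly.
inconsistent = conn.execute(
    """
    SELECT customer_id, COUNT(DISTINCT address)
    FROM flat_orders
    GROUP BY customer_id
    HAVING COUNT(DISTINCT address) > 1
    """
).fetchall()
print(inconsistent)
```

With the address stored once in a separate customers table, this query could never return a row, because there would be only one place to update.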

3. Delete Anomaly

A delete anomaly occurs when deleting a piece of data inadvertently leads to the loss of other valuable
data.

Example of Delete Anomaly:

Consider the unnormalized table:

CustomerID  Name    Address       ProductName  OrderDate
1           Akshay  123 Main St.  Laptop       2024-12-15
1           Akshay  123 Main St.  Smartphone   2024-12-15

If both of Akshay’s orders are deleted (for example, because they were cancelled), his name and address are
deleted along with them, because the customer’s information is stored only in the order rows. Deleting the
Laptop order alone still leaves the Smartphone row, but once the last order row is removed, no record of
Akshay remains.
In this case, deleting order data removes valuable information about Akshay.

Solution:

Normalization prevents this problem by ensuring that customer data and order data are stored in separate
tables.

Customers Table:

CustomerID  Name    Address
1           Akshay  123 Main St.

Orders Table:

OrderID  CustomerID  ProductName  OrderDate
1        1           Laptop       2024-12-15
2        1           Smartphone   2024-12-15

Now, deleting Akshay’s Laptop order (OrderID 1) does not affect his name and address in the Customers
table.
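
The separation can be checked with a sqlite3 sketch that deletes every one of Akshay's orders, the worst case for a delete anomaly. Lower-case table and column names are an assumption of the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT,
        address     TEXT
    );
    CREATE TABLE orders (
        order_id     INTEGER PRIMARY KEY,
        customer_id  INTEGER REFERENCES customers(customer_id),
        product_name TEXT,
        order_date   TEXT
    );
    INSERT INTO customers VALUES (1, 'Akshay', '123 Main St.');
    INSERT INTO orders VALUES
        (1, 1, 'Laptop',     '2024-12-15'),
        (2, 1, 'Smartphone', '2024-12-15');
    """
)

# Delete all of the customer's orders; the customer row is stored separately.
conn.execute("DELETE FROM orders WHERE customer_id = 1")

remaining_orders = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
customer = conn.execute(
    "SELECT name, address FROM customers WHERE customer_id = 1"
).fetchone()
print(remaining_orders, customer)
```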
1.6 DATABASE SYSTEMS
1.6.1 The Database System Environment

A database system is a grouping of elements that specify and control how data is gathered, stored, managed,
and used in a database environment.

The database system is made up of five main components: hardware, software, people, procedures, and
data.

1. Hardware

 Includes:
Computers (PCs, tablets, workstations, servers, and supercomputers),
Storage devices, printers, network devices (hubs, switches, routers, fiber optics),
and other devices (automated teller machines, ID readers, etc.)
 All are considered hardware.

2. Software

Three different types of software are required for a database system to function properly.
Even though DBMS software is the most recognized, the full list includes:

 Operating system software


 DBMS software
 Application programs and utilities

Operating System Software

 Controls all hardware components


 Enables the operation of all other software on computers

Examples:
Microsoft Windows, Linux, Mac OS, UNIX, and MVS
DBMS Software

 DBMS software manages the database within the database system.

Examples:
Microsoft’s SQL Server, Oracle Corporation’s Oracle, Oracle’s MySQL, and IBM’s DB2.

Application Programs and Utility Software

 The most popular way to access data in a database and create reports, tabulations, and other
information for decision-making is through application programs.

Utilities

 Utilities are software tools used to help manage the database system’s computer components.

For example:
Major DBMS vendors now provide graphical user interfaces (GUIs) to:

 Create database structures


 Control database access
 Monitor database operations

3. People

This component includes all users of the database system.

There are five types of users:

1. System Administrators
o Manage the general operations of the database system.
2. Database Administrators (DBAs)
o Manage the DBMS
o Ensure the database operates correctly
o Work with database designers to create the database structure
3. Database Designers
o Responsible for designing the structure of the database
o Often referred to as database architects
4. System Analysts and Programmers
o Design and implement application programs
o Design and develop:
   • Procedures
   • Reports
   • Data-entry screens used by end users
5. End Users
o Use the applications and interfaces developed by analysts and programmers to interact with the database

End Users

 End users are individuals who use application programs to carry out the day-to-day operations of
the organization.
 Examples include:
o Managers
o Directors
o Supervisors
o Sales clerks

These individuals are all considered end users.

4. Procedures

 Procedures are the guidelines and directives that control how the database system is designed and
used.
 While sometimes overlooked, procedures are:
o Essential to system functionality
o Enforce business standards
o Ensure systematic operations

They help:

 Guarantee systematic approaches to auditing


 Monitor the data entering the system
 Control information output from the database

5. Data

 Data refers to the collection of facts kept in the database.


 The process of:
o Choosing what data to include
o Arranging it properly

...is a key responsibility of the database designer.

Data is the foundation from which information is created.


1.6.2 DBMS Functions

A Database Management System (DBMS) ensures the integrity and consistency of data through several
core functions, which are usually transparent to end users:

✧ Data Dictionary Management

 Involves maintaining and controlling metadata (data about data).


 Includes:
o Definitions
o Relationships
o Formats of data elements

✧ Data Storage Management

 Concerns the processes and technologies used to:


o Store
o Organize
o Protect
o Retrieve data
 Collectively referred to as data storage management.

✧ Data Transformation and Presentation

 The process of converting data to required data structures.


 Involves distinguishing between:
o Logical data format
o Physical data format
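
A small Python sketch can illustrate the difference between how a value is stored and how it is presented. It assumes dates are stored as ISO 8601 text, as in the order tables in this unit; the two display styles are examples of presentation formats an application might request.

```python
from datetime import date

# Physical format: how the value is kept in storage (assumed: ISO 8601 text).
stored = "2024-12-15"

# Logical value: the date itself, independent of any display format.
value = date.fromisoformat(stored)

# Presentation: the same value formatted for different audiences.
us_style = value.strftime("%m/%d/%Y")
uk_style = value.strftime("%d/%m/%Y")
print(us_style, uk_style)
```

In a DBMS this conversion happens inside the system, so applications ask for the logical value and do not depend on the stored representation.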

✧ Security Management

 Ensures data privacy and user security


 Controlled by the security system created by the DBMS
 Manages:
o Who can access the database
o What data items can be accessed
o Operations allowed (read, add, delete, modify)
o Enforced through security rules

✧ Multiuser Access Control

 Uses complex algorithms to:


o Allow multiple users to access the database simultaneously
o Maintain data integrity during concurrent access

✧ Backup and Recovery Management

 Ensures data safety and integrity


 Provides backup and recovery services
 The DBA (Database Administrator) performs:
o Standard and special backup
o Restore operations
 Recovery management is about restoring system functionality after failure (e.g. disk error, power loss)

✧ Data Integrity Management

 Reduces data redundancy


 Maximizes data consistency
 Uses integrity rules and data relationships defined in the data dictionary
 Especially important in transaction-oriented systems

✧ Database Access Languages

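• Specialized programming languages for interacting with databases
• Most common:
o SQL (Structured Query Language), the de facto standard
o SQL statements are commonly grouped into a data definition language (DDL), which defines
structures, and a data manipulation language (DML), which queries and changes data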
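
SQL statements are commonly grouped by purpose: data definition (DDL) statements such as CREATE TABLE define structures, while data manipulation (DML) statements such as INSERT, UPDATE, DELETE, and SELECT work with the data. The sketch below runs both kinds through Python's sqlite3 module; the product table and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: defines the structure that will hold the data.
conn.execute(
    "CREATE TABLE product ("
    "product_id INTEGER PRIMARY KEY, "
    "product_name TEXT NOT NULL, "
    "price INTEGER NOT NULL)"
)

# DML: adds, changes, reads, and removes data within that structure.
conn.execute("INSERT INTO product VALUES (1, 'Laptop', 1000)")
conn.execute("UPDATE product SET price = price - 100 WHERE product_id = 1")
rows = conn.execute("SELECT product_name, price FROM product").fetchall()
conn.execute("DELETE FROM product WHERE product_id = 1")
count_after_delete = conn.execute("SELECT COUNT(*) FROM product").fetchone()[0]

print(rows, count_after_delete)
```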

✧ Application Programming Interfaces (APIs)

 Sets of rules and tools for enabling software applications to communicate with one another
✧ Database Communication Interfaces

 Protocols for applications to interact with databases


 Enable:
o Users/software to send requests
o Retrieve data from the database
 Common examples:
o ODBC (Open Database Connectivity)
o JDBC (Java Database Connectivity)

1.6.3 Managing the Database System


✔️Overview

 Implementing a database system (vs. a file system) allows for the application of rigorous policies and
guidelines.
 Focus shifts from:
o Programming tasks (in file systems)
o To resource management and software administration in database systems

✔️Advantages

 Enables more complex applications to be addressed efficiently


 Performance depends on:
o Types of data structures
o Relationships among those structures

✔️Disadvantages of Database Systems

Despite their benefits, database systems also bring challenges:

 ⛔ Increased costs
 ⛔ Management complexity
 ⛔ Maintaining currency
 ⛔ Vendor dependence
 ⛔ Frequent upgrade/replacement cycles

1.7 DATA MODELS


✔️Definition

 Data modeling (process): The act of creating a structured data model


 Data model (result): A visual representation or blueprint from the modeling process
✔️Purpose

 Helps understand complex real-world systems


 Represents:
o Data structures
o Attributes
o Relationships
o Constraints
o Transformations, etc.

✔️Characteristics

 Supports specific problem domains


 Modeling is a progressive and iterative process
 Starts with a basic understanding and grows in detail and complexity as comprehension improves

1.8 THE IMPORTANCE OF DATA MODELS

Data models are essential for effective database design and implementation. Their importance is highlighted in
the following ways:

 ✦ Clear Communication: Enhances understanding between technical and non-technical teams.


 ✦ Data Consistency & Integrity: Ensures the data is reliable and accurate.
 ✦ Efficient Database Design: Leads to optimized data structures for better performance.
 ✦ System Development: Acts as a blueprint for developers to build applications.
 ✦ Decision-Making: Enables data-driven decisions in business processes.
 ✦ Maintenance & Scalability: Simplifies future updates and growth.
 ✦ Error Prevention: Minimizes risks of data errors and system failures.
 ✦ Query Performance: Enhances speed and efficiency in retrieving data.
 ✦ Advanced Technologies: Supports integration with technologies like AI and Big Data.
 ✦ Compliance & Security: Assists in regulatory compliance and ensures secure data management.

1.9 BASIC BUILDING BLOCKS

The four fundamental building blocks of all data models are:

1. Entities
2. Attributes
3. Relationships
4. Constraints

These components form the foundation of data modeling and help define how data is structured and connected
within a database.
1.9.1 Entities

 Definition: An entity is a key concept representing a distinct object, concept, or thing that exists in the
domain being modeled.
 Example (E-Commerce System):
o Customer: Represents a customer in the system.
o Product: Represents a product available for purchase.
o Order: Represents a customer order.

1.9.2 Attributes

 Definition: Attributes are the properties or characteristics of the entity.


 Example (E-Commerce System):
o Customer: CustomerID, FirstName, LastName, Email
o Product: ProductID, ProductName, Price
o Order: OrderID, OrderDate, CustomerID (foreign key)

1.9.3 Relationships

 Definition: Relationships in data models define how two or more entities are connected.
 Example: A Customer can place multiple Orders → This is a 1:N (One-to-Many) relationship.

Types of Relationships:

1. One-to-One (1:1):
o One entity is related to only one other entity.
o Example: One person has one passport.
2. One-to-Many (1:N):
o One entity is related to many other entities.
o Example: One department has many employees.
3. Many-to-Many (M:N):
o Many entities are related to many other entities.
o Example: Students can enroll in many courses, and each course can have many students.
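
In a relational database, a many-to-many relationship is usually implemented with a third table that links the two entities. The sketch below shows this for the student and course example using Python's sqlite3 module; the enrollment table name, the IDs, and the course titles are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
    CREATE TABLE course  (course_id  INTEGER PRIMARY KEY, title TEXT NOT NULL);

    -- Linking table: each row connects one student to one course.
    CREATE TABLE enrollment (
        student_id INTEGER NOT NULL REFERENCES student(student_id),
        course_id  INTEGER NOT NULL REFERENCES course(course_id),
        PRIMARY KEY (student_id, course_id)
    );

    INSERT INTO student VALUES (1, 'Sakshi'), (2, 'Arun');
    INSERT INTO course  VALUES (10, 'Databases'), (20, 'Networks');
    INSERT INTO enrollment VALUES (1, 10), (1, 20), (2, 10);
    """
)

courses_of_sakshi = [
    row[0]
    for row in conn.execute(
        """
        SELECT c.title
        FROM course c
        JOIN enrollment e ON e.course_id = c.course_id
        WHERE e.student_id = 1
        ORDER BY c.title
        """
    )
]
students_in_databases = [
    row[0]
    for row in conn.execute(
        """
        SELECT s.name
        FROM student s
        JOIN enrollment e ON e.student_id = s.student_id
        WHERE e.course_id = 10
        ORDER BY s.name
        """
    )
]
print(courses_of_sakshi, students_in_databases)
```

The linking table turns one M:N relationship into two 1:N relationships, each of which a foreign key can represent.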

1.9.4 Constraints

 Definition: A restriction placed on data, usually expressed in the form of rules.


 Purpose: Ensures data integrity.
 Example: “A student’s GPA must be between 0.00 and 4.00.”
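
The GPA rule can be expressed as a CHECK constraint so that the database itself rejects invalid values. This is a sketch using Python's sqlite3 module; the student table and the sample GPA values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        gpa        REAL NOT NULL CHECK (gpa BETWEEN 0.00 AND 4.00)
    )
    """
)

conn.execute("INSERT INTO student VALUES (1, 3.6)")  # within the allowed range

try:
    conn.execute("INSERT INTO student VALUES (2, 4.7)")  # violates the constraint
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(rejected, row_count)
```

Because the rule lives in the schema, every application that writes to the table is held to it, instead of each program re-implementing the check.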
1.10 BUSINESS RULES

 Definition: A brief, simple, and unambiguous description that defines a policy, procedure, or idea
within an organization.

Examples:

 A customer may generate many invoices.


 An invoice is generated by only one customer.

Importance:

 Ensures:
o Data integrity
o Consistency
o Compliance with organizational processes
 Helps the database reflect real-world operations accurately.

1.10.1 Discovering Business Rules

 Objective: To ensure the database properly reflects business logic, restrictions, and relationships.

Methods for Discovering Business Rules:

 Interviews with Stakeholders


 Reviewing Documentation
 Observation of Business Operations
 Analyzing Existing Systems
 Use Cases and Scenarios

1.10.2 Translating Business Rules into Data Model Components

 Business rules help identify:


o Entities
o Attributes
o Relationships
o Constraints
 Key Concepts:
o Nouns in a business rule typically become entities.
o Verbs represent relationships between those entities.
 Example:
o Business Rule: "A customer may generate many invoices."
 Nouns: Customer, Invoices → become entities.
 Verb: Generate → becomes the relationship between them.

From this, we deduce:

o Customer and Invoice should be modeled as entities.


o There is a generate relationship from Customer to Invoice.
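
As a rough sketch of where this translation leads, the two business rules (a customer may generate many invoices, and an invoice is generated by only one customer) can be written as two tables linked by a foreign key. The example uses Python's built-in sqlite3 module. The column names and sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE Customer (
    Customer_ID INTEGER PRIMARY KEY,
    Last_Name   TEXT NOT NULL
);
-- Each invoice row carries exactly one Customer_ID (the "one" side),
-- while the same Customer_ID may appear on many invoices (the "many" side).
CREATE TABLE Invoice (
    Invoice_ID   INTEGER PRIMARY KEY,
    Invoice_Date TEXT NOT NULL,
    Customer_ID  INTEGER NOT NULL REFERENCES Customer(Customer_ID)
);
INSERT INTO Customer VALUES (1, 'Rao');
INSERT INTO Invoice VALUES (1001, '2024-01-15', 1), (1002, '2024-02-03', 1);
""")

invoice_count = conn.execute(
    "SELECT COUNT(*) FROM Invoice WHERE Customer_ID = 1").fetchone()[0]

try:
    # An invoice that points to a customer that does not exist
    conn.execute("INSERT INTO Invoice VALUES (1003, '2024-03-01', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

The nouns became tables, and the verb "generate" became the Customer_ID foreign key in Invoice.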

1.10.3 Naming Conventions

Entity Names:

 Use singular, descriptive names.


 ✅ Example: Customer, Order, Product
 ❌ Not: Customers, Orders

Attribute Names:

 Use clear, self-explanatory names that reflect their purpose.


 ✅ Example: Customer_ID, First_Name, Last_Name, Order_Date

Relationship Names:

 Should clarify the nature of the relationship.


 ✅ Example: A relationship between Customer and Order can be named Places, indicating that a
customer places an order.

Avoid Abbreviations:

 Do not use overly shortened names.


 ✅ Use: Employee
 ❌ Avoid: Emp

Consistency:

Follow a consistent pattern throughout the database schema. If you start using underscores (e.g., First_Name),
continue using them throughout your database.

1.11 THE EVOLUTION OF DATA MODELS

The major data models, in the order in which they evolved, are:

1. Hierarchical and Network Models


2. The Relational Data Models
3. The Entity Relationship Models
4. The Object-Oriented (OO) Models
5. Object/Relational and XML
6. Emerging Data Models: Big Data and NoSQL

1.11.1 Hierarchical Models

 It was developed by Rockwell and IBM in the 1960s.


 Early database system.
 This model is based on an upside-down tree structure in which each record is called a segment.
 The top record is the root segment. Each segment has a 1:M relationship to the segment directly
below it.
 Data is organized through parent/child relationships: a parent segment can have many child segments, but
each child segment has only one parent.
 Example: IBM’s IMS (Information Management System)
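
The tree of segments described above can be sketched in a few lines of Python. This is a simplified illustration and not how IMS is implemented. The Segment class, the find_path helper, and the segment names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    name: str
    children: list["Segment"] = field(default_factory=list)

    def add_child(self, name: str) -> "Segment":
        # 1:M relationship: one parent, many children; each child has one parent
        child = Segment(name)
        self.children.append(child)
        return child


def find_path(segment, target, path=()):
    """Search from the root downward and return the hierarchical path to target."""
    path = path + (segment.name,)
    if segment.name == target:
        return path
    for child in segment.children:
        found = find_path(child, target, path)
        if found:
            return found
    return None


root = Segment("Final assembly")          # root segment
component_a = root.add_child("Component A")
component_a.add_child("Assembly A1")
root.add_child("Component B")

path = find_path(root, "Assembly A1")
```

Reaching a segment means walking down from the root along one path, which is why such systems are called navigational.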

Advantages (of Hierarchical Model)

1. It promotes data sharing.


2. Parent/child relationship promotes conceptual simplicity.
3. Database security is provided and enforced by DBMS.
4. Parent/child relationship promotes data integrity.
5. It is efficient with 1:M relationships.
Disadvantages

1. Understanding the physical properties of data storage is necessary for complex implementation.
2. Navigational systems require an understanding of the hierarchical path and result in complex
application development, management, and use.
3. All application programs must adapt to structural changes.
4. There are restrictions on implementation (no M:N or multiparent relationships).
5. The DBMS lacks a language for data definition and data manipulation.
6. Standards are lacking.

1.11.2 Network Models

 An early data model that represented data as a collection of record types in 1:M relationships.
 The network model allows a record to have more than one parent. It is a graph-like structure.
 The schema is the conceptual organization of the entire database as viewed by the database
administrator.
 The subschema defines the portion of the database “seen” by the application programs that actually
produce the desired information from the data within the database.

 A data manipulation language (DML) defines the environment in which data can be managed and
is used to work with the data in the database.

 A schema data definition language (DDL) enables the database administrator to define the schema
components.
 Example: CODASYL DBTG model.
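
The difference from the hierarchical model, a record with more than one parent, can be sketched with owner/member sets. This is a toy illustration and not the CODASYL interface. The set names, record names, and helper functions are hypothetical.

```python
from collections import defaultdict

# set name -> owner record -> list of member records (each set is a 1:M relationship)
sets = defaultdict(lambda: defaultdict(list))


def connect(set_name, owner, member):
    """Attach a member record to an owner record within a named set."""
    sets[set_name][owner].append(member)


def owners_of(member):
    """Return every (set, owner) pair that the member record belongs to."""
    return sorted(
        (set_name, owner)
        for set_name, owners in sets.items()
        for owner, members in owners.items()
        if member in members
    )


connect("Customer-Invoice", "Customer C1", "Invoice I1")
connect("Customer-Invoice", "Customer C1", "Invoice I2")
connect("SalesRep-Invoice", "SalesRep S1", "Invoice I1")

parents = owners_of("Invoice I1")
```

Here Invoice I1 is a member of two sets and therefore has two owners, which a strict tree cannot represent.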

Advantages

1. Conceptual simplicity is at least equal to that of the hierarchical model.
2. It handles more relationship types, including multiparent and M:N.
3. Data access is more flexible than in the file system and hierarchical models.
4. The owner/member relationship promotes data integrity.
5. It conforms to standards.
6. The DBMS includes a Data Definition Language (DDL) and a Data Manipulation Language (DML).

Disadvantages

1. System complexity limits efficiency; it is still a navigational system.
2. Navigational systems result in complex application development, management, and implementation.
3. Any structural change requires modifying all application programs.
1.11.3 Relational Models

Developed by E. F. Codd of IBM in 1970, the relational model is based on mathematical set theory and
represents data as independent relations.
Each relation (table) is conceptually represented as a two-dimensional structure of intersecting rows and
columns.
The relations are related to each other through the sharing of common entity characteristics (values in
columns).

table (relation):
A logical construct perceived to be a two-dimensional structure composed of intersecting rows (entities) and
columns (attributes) that represents an entity set in the relational model.

tuple:
In the relational model, a table row.

Relational Database Management System (RDBMS):

A collection of programs that manages a relational database.


The RDBMS software translates a user’s logical requests (queries) into commands that physically locate and
retrieve the requested data.
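
A small sketch can show both ideas: tables related through a shared column, and a logical request that the RDBMS turns into the physical retrieval. It uses Python's built-in sqlite3 module. The Department and Employee tables and their sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Department (
    Department_ID   INTEGER PRIMARY KEY,
    Department_Name TEXT
);
CREATE TABLE Employee (
    Employee_ID   INTEGER PRIMARY KEY,
    First_Name    TEXT,
    Department_ID INTEGER REFERENCES Department(Department_ID)  -- shared column
);
INSERT INTO Department VALUES (1, 'Sales'), (2, 'Support');
INSERT INTO Employee VALUES (1, 'Asha', 1), (2, 'Ben', 1), (3, 'Chen', 2);
""")

# The query states WHAT is wanted; the RDBMS decides HOW to locate the rows.
staff_per_department = conn.execute("""
    SELECT Department_Name, COUNT(*)
    FROM Department JOIN Employee USING (Department_ID)
    GROUP BY Department_Name
    ORDER BY Department_Name
""").fetchall()
```

No path through the data is spelled out in the query; the two tables are matched purely on the values in Department_ID.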

Advantages

 Structural independence is promoted by the use of independent tables. Changes in a table’s structure do not
affect data access or application programs.
 The tabular view substantially improves conceptual simplicity, thereby promoting easier database design,
implementation, management, and use.
 Ad hoc query capability is based on SQL.
 A powerful RDBMS isolates the end user from physical-level details and improves implementation and
management simplicity.
Disadvantages

 The RDBMS requires substantial hardware and system software overhead.


 Conceptual simplicity gives relatively untrained people the tools to use a good system poorly, and if
unchecked, it may produce the same data anomalies found in file systems.
 It may lead to "islands of information" problems, because individuals and departments can easily develop
their own applications.
