Chapter 1 Introduction to DB

The document discusses the fundamentals of database systems, emphasizing the limitations of traditional flat file databases, such as poor accessibility, data redundancy, and security issues. It outlines the importance of Database Management Systems (DBMS) for efficient data management, highlighting the advantages of databases over file systems, including controlled redundancy and improved data integrity. Additionally, it covers database terminology, types, components, and the significance of proper database design to prevent data anomalies.

Uploaded by

floydannold
Copyright
© All Rights Reserved
SSE 2106 Fundamentals of Database Systems
PROFESSIONAL CAREERS IN DATABASES
SCENARIO
▪ Consider the KyU which has over 10 000 students.
▪ The University has been keeping a large collection of information about its
campuses, employees, students and resources. Imagine if it were
keeping this information in traditional flat files.
▪ What do you think the state of the files would be?
▪ Imagine that this data must be accessed concurrently by several employees
in their day-to-day work.
Questions about the data must be answered quickly, changes made
to the data by different users must be applied consistently, and
access to certain parts of the data (e.g., salaries) must be restricted.
▪ What will be the limitations of using files?
OBJECTIVES
▪ By the end of this chapter you should be able to answer
the following questions:

▪ What is a Database Management System (DBMS)?
▪ Why is a DBMS important in managing data?
▪ What are the main components of a DBMS?
▪ Why is database design important?
▪ What is the difference between Database Design and the DBMS?
FILE TERMINOLOGY
▪ Character: a character is the most
basic element of data that can
be observed and
manipulated. A character
is a single symbol such as a
digit, letter, or other special
character (e.g., $, #, and ?).
▪ Field: a character or group of
characters (alphabetic or
numeric) that has a specific
meaning. A field is used to
define and store data.
▪ Record: a logically connected set of
one or more fields that describes
a person, place, or thing.
▪ File: a collection of related records.
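The four terms above can be sketched in code. This is only an illustration: the `StudentRecord` type, its field names, and the sample students are made up for the example and do not come from any real system.

```python
from dataclasses import dataclass


@dataclass
class StudentRecord:
    """A record: a logically connected set of fields describing one student."""
    stud_no: str  # a field: a group of characters with a specific meaning
    name: str     # hypothetical field
    course: str   # hypothetical field


# A file: a collection of related records (held in memory here).
student_file = [
    StudentRecord("S1", "Amina", "CSC224"),
    StudentRecord("S2", "Brian", "CSC224"),
]

first = student_file[0]
characters = list(first.stud_no)  # the individual characters of one field
print(characters, first.course, len(student_file))
```

Each level builds on the one below it: characters form a field, fields form a record, and records form a file.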
FILES AND FILE SYSTEMS
▪ We are going to start by looking at file systems because:
▪ The complexity of database design is easier to understand after
understanding the file system
▪ Understanding file system problems helps to avoid the
same problems when using Database Management System (DBMS)
software
▪ Knowledge of the file system is useful for converting a file system to
a database system
▪ A manual file system is typically composed of file
folders, each tagged and kept in a cabinet. The folders are:
▪ Organised by expected use
▪ Filled with logically related contents
▪ This manual file system served as a data repository for small
data collections, but it is cumbersome for large collections.
FILES AND FILE SYSTEMS
A flat file database is a database that stores data in a plain
text file.
▪ Each line of the text file holds one record, with fields separated by
delimiters, such as commas or tabs.
▪ While it uses a simple structure, a flat file database cannot contain
multiple tables like a relational database can.
▪ A flat file database is a database designed around a single table.
In other words, a flat file is a simple text file with no
structure beyond its records and delimiters:
▪ A flat file may contain many fields, often with duplicate data that
is prone to corruption.
▪ To merge data between two flat files, you need to
copy and paste the relevant information from one file to the other.
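A small sketch of a flat file, assuming a comma-delimited layout with one record per line. The column names and the sample rows are invented for illustration. Note how the lecturer's name is repeated on every line, which is the kind of duplicate data described above.

```python
import csv
import io

# Hypothetical flat file: one record per line, fields separated by commas.
FLAT_FILE = """stud_no,name,course,lecturer
S1,Amina,CSC224,Dr. Okello
S2,Brian,CSC224,Dr. Okello
S3,Carol,CSC224,Dr. Okello
"""


def read_flat_file(text):
    """Parse the delimited text into a list of records (dictionaries)."""
    return list(csv.DictReader(io.StringIO(text)))


records = read_flat_file(FLAT_FILE)

# The same fact (who teaches CSC224) is stored once per student.
lecturer_copies = sum(1 for r in records if r["lecturer"] == "Dr. Okello")
print(len(records), lecturer_copies)
```

With a single table there is nowhere else to put the lecturer, so the value has to be copied into every student line.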
FLAT FILE DATABASE VS RELATIONAL DATABASE
FILES AND FILE SYSTEMS
▪ Flat-file databases (also known as file system data management)
served as the only method of file storage and retrieval before the
advent of database management systems (such as relational
databases).
▪ While retaining some use, flat-file databases suffer from:
▪ poor accessibility
▪ data redundancy
▪ lack of standard file access
▪ the inability to organize data
▪ inadequate security restrictions
PROBLEMS WITH FILE SYSTEM DATA PROCESSING
▪ Lack of Transactions:
▪ Data-retrieval tasks require extensive programming. Requesting and retrieving data
from various files as a single unit of work (called a "transaction") is not supported by the file system.
▪ Lack of Storage and Access Standards:
▪ Changes in an existing structure can be difficult in a file system environment.
▪ Limited User Access: multiple users at different workstations cannot access the same
data simultaneously.
▪ Security Restrictions:
▪ Security features are difficult to program and are, therefore, often omitted in a file system
environment, so they are likely to be inadequate. System administration
becomes more difficult. A file system cannot restrict the type of data or
information a user can access. Once the file is opened, all of its data or information is
revealed.
PROBLEMS WITH FILE SYSTEM DATA PROCESSING
▪ Data Redundancy: exists when the same data are stored unnecessarily in different
places. Uncontrolled data redundancy leads to poor data security, data inconsistency
and anomalies. For example, a file system has no way to detect or reject the insertion of duplicate
data.
▪ Data Inconsistency: exists when different and conflicting versions of the same data appear in
different places.

▪ Data anomalies: abnormalities that arise when not all changes to redundant data are made
correctly:
▪ Update anomalies – a data inconsistency that results from a partial update of redundant
data
▪ Insertion anomalies – data needs to be entered more than once if it is located in multiple
files
▪ Deletion anomalies – the unintended loss of data due to the deletion of other data.
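The update and deletion anomalies can be reproduced with a few redundant records. This is a minimal sketch; the rows, course codes and lecturer names are hypothetical.

```python
# Redundant flat records: the lecturer is repeated for every student.
rows = [
    {"stud_no": "S1", "course": "CSC224", "lecturer": "Dr. Okello"},
    {"stud_no": "S2", "course": "CSC224", "lecturer": "Dr. Okello"},
    {"stud_no": "S3", "course": "SSE2106", "lecturer": "Dr. Namu"},
]

# Update anomaly: only one of the two copies is changed.
rows[0]["lecturer"] = "Dr. Mwangi"
lecturers_for_csc224 = {r["lecturer"] for r in rows if r["course"] == "CSC224"}
inconsistent = len(lecturers_for_csc224) > 1  # two conflicting versions exist

# Deletion anomaly: removing student S3 also removes the only record
# of who teaches SSE2106.
rows = [r for r in rows if r["stud_no"] != "S3"]
lost_course = "SSE2106" not in {r["course"] for r in rows}

print(inconsistent, lost_course)
```

Nothing in the file itself warns about either problem, because the file has no notion of which values are copies of the same fact.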
PROBLEMS WITH FILE SYSTEM DATA PROCESSING
▪ Structural Dependence: means that access to a file is dependent on its
structure, and any change in the file structure affects the application
program’s ability to access the data.
▪ Data Dependence: after changes to the data storage characteristics (e.g.
data type), the application program accessing the data must also be
changed.
▪ Data Mapping and Access: Although related information is
grouped and stored in different files, there is no mapping between
the files.
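Structural dependence can be seen in a program that reads a field by its position in the line. The function `get_course_v1` and both sample lines are hypothetical. Once a new field is added to the file, the unchanged program silently returns the wrong value.

```python
def get_course_v1(line):
    """Hypothetical application code: assumes the course is the 2nd field."""
    return line.split(",")[1]


old_line = "S1,CSC224,70"        # original structure: stud_no, course, mark
new_line = "S1,Amina,CSC224,70"  # a name field was inserted into the file

before_change = get_course_v1(old_line)
after_change = get_course_v1(new_line)  # now returns the name, not the course

print(before_change, after_change)
```

Every program that reads the file has to be found and rewritten whenever the file structure changes.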
PROBLEMS WITH FILE SYSTEM DATA PROCESSING

▪ Limitations of the file system include:
▪ Requires extensive programming
▪ Cannot perform ad hoc queries
▪ System administration is complex and difficult
▪ Difficult to make changes to existing structures
▪ Security features are likely to be inadequate
DATA VS INFORMATION
▪ Good decisions require good information that is derived from
raw facts. These raw, meaningless, unprocessed, and
unorganised facts are known as data, e.g. text, graphics,
audio, video, animation, etc.
▪ Data are likely to be managed most efficiently when they
are stored in a database.
▪ Information is organised and meaningful data, or the result
of processing raw data to reveal its meaning. Put more
simply, information reveals the
meaning of data.
▪ Data can be described as raw facts, the building blocks
of information, or unprocessed information.
▪ Information in this regard is data that has been processed to
reveal meaning.
▪ Accurate, relevant, and timely information is key to good
decision making.
EXAMPLE: DATA VS INFORMATION
Raw Data
StudNo  Course  Test 1  Test 2  Test 3
S1      CSC224  70      45      60
S2      CSC224  85      25      45
S3      CSC224  77      35      64
S4      CSC224  27      7       40
S5      CSC224  0       0       0
S6      CSC224  15      55      30
[Figure: "Information in graphical format", a bar chart of the TEST 1, TEST 2 and TEST 3 marks for students S1 to S6]
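As a sketch of how raw data becomes information, the marks from the Raw Data table for CSC224 (S1 to S6, three tests each) can be processed into averages. The variable names are chosen for this example only.

```python
from statistics import mean

# Raw data: (Test 1, Test 2, Test 3) marks per student for CSC224.
raw_data = {
    "S1": (70, 45, 60),
    "S2": (85, 25, 45),
    "S3": (77, 35, 64),
    "S4": (27, 7, 40),
    "S5": (0, 0, 0),
    "S6": (15, 55, 30),
}

# Information: processing the raw facts reveals their meaning.
student_average = {s: round(mean(marks), 2) for s, marks in raw_data.items()}
test_average = [round(mean(column), 2) for column in zip(*raw_data.values())]
top_student = max(student_average, key=student_average.get)

print(student_average)
print(test_average)
print(top_student)
```

The individual marks say little on their own. The averages and the top student are information that can support a decision, for example which test most students found hardest.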
INTRODUCTION TO DATABASES
DATABASE TERMINOLOGY (2)
What is a Database?
▪ A database is a collection of data that is structured so that it can easily be
accessed, managed, and updated.
▪ It is a shared, integrated computer structure that stores a collection of:
▪ End-user data, that is, raw facts of interest to the end user.
▪ Metadata, that is, data that describe the properties or characteristics of end-user data
and the context of that data.
▪ Metadata provide a description of the data characteristics and the set of relationships that link
the data found within the database.
▪ Metadata present a more complete picture or description of the data in the database.
▪ Metadata include: data elements, data type (date, text, or numeric), relationship description, etc.
▪ A database resembles a very well-organized electronic file or filing cabinet
(e.g. a text file).
▪ A database is an organized collection of logically related data
TYPES OF DATABASES
▪ Databases are classified in terms of:
▪ the number of users who access the database (single-user or multi-user),
▪ the database location,
▪ the expected type and extent of use, and
▪ the structure of the data
Classification by Users:
▪ Single-user Database:
▪ Only one user is allowed to access the database at a time.
▪ Desktop Database is a single-user database that runs on a personal computer.
▪ Multi-user Database:
▪ Multiple users are allowed to access the database at the same time.
▪ Workgroup Database: supports a relatively small number of users (usually fewer than 50) or a
specific department within an organization e.g. employees of a small department.
▪ Enterprise Database: supports many users (more than 50, usually hundreds) across many
departments e.g. employees of an entire organisation.
TYPES OF DATABASES (2)
Classification by Location:
▪ Centralized Database:
▪ Is a database that supports data located at a single site.
▪ Distributed Database:
▪ Is a database that supports data distributed across several different sites.
Classification by Use:
▪ Operational/Transactional/Production Database:
▪ The database is designed to support a company’s day-to-day operations, e.g. for banks, retail, etc.
▪ Data Warehouse:
▪ Stores data used to generate information required to make tactical or strategic
decisions.
▪ It is often used to store historical data.
TYPES OF DATABASES (3)
Databases can also be classified to reflect the degree to which the
data are structured:
▪ Unstructured Data: are data that exist in their original (raw) state, that is, in the
format in which they were collected.
▪ Unstructured data doesn’t fit neatly into the traditional row and column structure of
relational databases.
▪ Examples of unstructured data include: emails, videos, audio files, web pages, and
social media messages.
▪ Structured Data: are the result of taking unstructured data and formatting
(structuring) it to facilitate storage, use, and the generation of information.
▪ Structured data refers to information with a high degree of organization, such that
inclusion in a relational database is seamless and readily searchable by simple,
straightforward search engine algorithms or other search operations.
ADVANTAGES OF DATABASES
▪ Redundancy can be controlled: each given fact in the real world corresponds to one data
entry in the database.
▪ Inconsistency can be avoided (to some extent) through controlled redundancy.
▪ Data can be shared: new applications can operate with the same data (with only a new
view).
▪ Standards can be enforced: because data in a database system are stored in one central place,
standards can easily be enforced by the DBA. This ensures standardised data formats to facilitate
data transfers between systems. Applicable standards might include any or all of the following:
departmental, installation, organizational, industry, corporate, national or international.
▪ Security restrictions can be implemented: illegal access to data can be avoided.
▪ Integrity can be maintained: data correspond to facts of the real world.
▪ Conflicting requirements (access speed, availability, reliability, trustworthiness) can be balanced:
the best overall performance can be found.
DISADVANTAGES OF DATABASES
▪ The database approach is costly due to higher hardware and software requirements.
▪ Database systems are complex (due to data independence), difficult, and time-
consuming to design.
▪ Damage to the database affects virtually all application programs.
▪ Extensive conversion costs are incurred in moving from a file-based system to a database
system.
▪ Initial training is required for all programmers and users.
▪ It increases the opportunity for persons or groups outside the organization to gain access
to information about the firm’s operation.
▪ It increases the opportunity for fully trained persons within the organization to misuse
the data resources intentionally.
DATABASE SYSTEMS
▪ The problems inherent in file systems make using a database system
very desirable.
▪ Unlike the file system, with its many separate and unrelated files, the
database system consists of logically related data stored in a single
logical data repository.
▪ The “logical” label reflects the fact that the data repository appears to be a single
unit to the end user, even though data might be physically distributed among
multiple storage facilities and locations.

▪ Because the database’s data repository is a single logical unit,
the database represents a major change in the way end-user data
are stored, accessed, and managed.
DATABASE SYSTEMS
▪ The term database system refers to an organization of components that
define and regulate the collection, storage, management, and use of
data within a database environment.
▪ A database system is a computer based record keeping system
whose overall purpose is to record and maintain information that is
relevant to the organization and necessary for making decisions
DATABASE SYSTEM
CHARACTERISTICS OF DATABASE SYSTEMS
▪ Many data sets are stored in a single database (all data sets of a firm).
▪ Each data set has a unique description (stored in a data dictionary).
▪ Data can be accessed only through the Database Management System (which acts like
a "firewall").
▪ Relationships between data are defined in the database.
▪ The database system offers tools for backup, screen design, etc.
DATABASE SYSTEM
COMPONENTS OF DATABASE SYSTEM
▪ Hardware:
▪ Refers to all the DBMS physical devices e.g. workstations, servers, storage devices, network
devices, etc.
▪ Software:
▪ Refers to the software required for the DBMS to function, e.g. the operating system, DBMS
software (MySQL server), and application programs and utilities (MySQL Workbench). The DBMS acts as
a bridge between the user and the database.
▪ People/Users:
▪ Refers to all users of the DBMS (e.g. System Administrator, DBA, Database Designer, System
Analysts and Programmers, End-Users). Users are those persons who need the information from the
database to carry out their primary business responsibilities.
▪ Procedures:
▪ Refers to the instructions and rules that govern the design and use of the DBMS.
COMPONENTS OF DATABASE SYSTEM
▪ Data:
▪ Refers to the data that will be stored in the database. Most organizations
generate, store and process large amounts of data. The data act as a bridge
between the machine parts (i.e. hardware and software) and the users, who
access it directly or through application programs (end-user data
and metadata).
▪ User Data - It consists of one or more tables of data called Relations, where columns are
called fields or attributes and rows are called records. A Relation must be
structured properly.
▪ Metadata - Metadata present a more complete picture or description of the data in
the database. System Tables store the Metadata, which includes:
▪ data elements, data type (date, text, or numeric), relationship description
▪ Number of Tables and Table Names
▪ Number of fields and field Names
▪ Primary Key Fields
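Metadata can be inspected directly in most DBMSs. The sketch below uses SQLite through Python's built-in `sqlite3` module only because it needs no server. Other products such as MySQL expose the same kind of information through their own system tables. The `student` table is a made-up example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # temporary in-memory database
conn.execute(
    "CREATE TABLE student (stud_no TEXT PRIMARY KEY, name TEXT NOT NULL, dob DATE)"
)

# SQLite keeps its metadata in the system table sqlite_master.
table_names = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# PRAGMA table_info returns one row per field:
# (cid, name, type, notnull, dflt_value, pk)
columns = conn.execute("PRAGMA table_info(student)").fetchall()
field_names = [c[1] for c in columns]
field_types = [c[2] for c in columns]
primary_key = [c[1] for c in columns if c[5] > 0]

print(table_names, field_names, field_types, primary_key)
```

The table names, field names, data types and primary key listed above are all read from the database itself, not from the program.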
IMPORTANCE OF DATABASE DESIGN
▪ A database design defines the database structure that is used to store,
retrieve and manage data. It focuses on how the database structure will
be used to store and manage end-user data.
▪ A bad database design leads to data redundancies and data anomalies
▪ The following are basically the reasons for doing database design:
▪ Good application programs can’t overcome bad database designs.
▪ The existence of a DBMS does not guarantee good data management, nor
does it ensure that the database will be able to generate correct and timely
information.
▪ A database created without the benefit of a detailed blueprint (data
model) is unlikely to be satisfactory. Would you think it smart to build a
house without the benefit of a blueprint (house plans)?
IMPORTANCE OF DATABASE DESIGN
▪ Database design focuses on how the database structure will be used to
store and manage end-user data. In short database design:
▪ Defines the database’s expected use
▪ Takes a different approach for different types of databases
▪ Avoids redundant data
▪ A poorly designed database generates errors, which leads to bad decisions,
which can lead to failure of the organization
IMPORTANCE OF DATABASE DESIGN
▪ A well-designed database:
▪ Facilitates data management and generates accurate and valuable
information.
▪ Ensures consistent data.
▪ Eliminates data redundancy.
▪ Improves the operational speed.
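A sketch of how design controls redundancy, assuming SQLite via Python's `sqlite3` module and two hypothetical tables, `course` and `enrolment`. The lecturer is stored once in `course`, so a single update is seen by every enrolled student.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE course (
    code     TEXT PRIMARY KEY,
    lecturer TEXT NOT NULL
);
CREATE TABLE enrolment (
    stud_no TEXT NOT NULL,
    code    TEXT NOT NULL REFERENCES course(code),
    PRIMARY KEY (stud_no, code)
);
INSERT INTO course VALUES ('CSC224', 'Dr. Okello');
INSERT INTO enrolment VALUES ('S1', 'CSC224'), ('S2', 'CSC224');
""")

# One update in one place: no partial update is possible.
conn.execute("UPDATE course SET lecturer = 'Dr. Mwangi' WHERE code = 'CSC224'")

rows = conn.execute("""
    SELECT e.stud_no, c.lecturer
    FROM enrolment AS e
    JOIN course AS c ON c.code = e.code
    ORDER BY e.stud_no
""").fetchall()
print(rows)
```

Compare this with the flat file, where the lecturer's name would be copied onto every student line and each copy would need changing.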
INTRODUCTION TO DBMS
▪ The database management system (DBMS) is a computer software
program that is designed as the means of managing all databases that are
currently installed on a system hard drive or network.
▪ Alternatively, a database management system (DBMS) is a collection of
programs that manages the database structure and controls access to the
data stored in the database.
▪ Different types of database management systems exist, with some of them
designed for the oversight and proper control of databases that are
configured for specific purposes.
▪ In a database management system (DBMS), data files are the files that store
the database information, whereas other files, such as index files and data
dictionaries, store administrative information, known as metadata.
INTRODUCTION TO DBMS
▪ Examples of DBMSs include:
▪ Microsoft Access
▪ MySQL
▪ Oracle
▪ Microsoft SQL Server
▪ The DBMS receives all application requests and translates them into the
complex operations required to fulfil those requests.
▪ It hides much of the database’s internal complexity from the application
programs and users.
DBMS FUNCTIONS
▪ Data Dictionary Management:
▪ To store definitions of the data elements and their relationships (metadata).
▪ Data Storage Management:
▪ To store not only the data, but also related data entry forms or screen definitions,
report definitions, data validation rules, procedural code, structures to handle video and picture
formats, and so on.
▪ Data Transformation and Presentation:
▪ To translate logical requests into commands to physically locate and retrieve the requested data.
▪ Security Management:
▪ To create a security system that enforces user security and data privacy.
▪ Multiuser Access Control:
▪ To provide data integrity and data consistency, the DBMS uses sophisticated algorithms to
ensure that multiple users can access the database concurrently without compromising the integrity
of the database.
DBMS FUNCTIONS (2)
▪ Backup and Recovery Management:
▪ To provide backup and data recovery to ensure data safety and integrity.
▪ Data Integrity Management:
▪ To promote and enforce integrity rules, thus minimizing data redundancy and
maximizing data consistency. Data integrity is defined as the condition in
which all of the data in the database are consistent with real-world
events and conditions. Data integrity means data are accurate and
verifiable.
▪ Database Access Languages and Application Programming Interfaces:
▪ To provide data access through a query language.
▪ Database Communication Interfaces:
▪ To allow the database to accept end-user requests via multiple, different network
environments.
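Data integrity management can be sketched with integrity rules declared in the table definition. The example assumes SQLite via Python's `sqlite3` module. The `result` table and the helper `try_insert` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE result (
        stud_no TEXT    NOT NULL,
        course  TEXT    NOT NULL,
        mark    INTEGER NOT NULL CHECK (mark BETWEEN 0 AND 100),
        PRIMARY KEY (stud_no, course)
    )
""")
conn.execute("INSERT INTO result VALUES ('S1', 'CSC224', 70)")


def try_insert(values):
    """Return True if the DBMS accepts the row, False if a rule rejects it."""
    try:
        conn.execute("INSERT INTO result VALUES (?, ?, ?)", values)
        return True
    except sqlite3.IntegrityError:
        return False


out_of_range = try_insert(("S2", "CSC224", 140))  # violates the CHECK rule
duplicate = try_insert(("S1", "CSC224", 55))      # violates the primary key
valid = try_insert(("S2", "CSC224", 85))

row_count = conn.execute("SELECT COUNT(*) FROM result").fetchone()[0]
print(out_of_range, duplicate, valid, row_count)
```

The rules live in the database, so every application that inserts data is checked in the same way.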
FIVE COMPONENTS OF THE DBMS
▪ DBMS Engine: accepts logical requests from the various other DBMS
subsystems, converts them into their physical equivalents, and actually
accesses the database and data dictionary as they exist on a storage
device.
▪ Data Definition Subsystem: helps users to create and maintain the data
dictionary and define the structure of the files in a database.
▪ Data Administration Subsystem: helps users to manage the overall
database environment by providing facilities for backup and recovery,
security management, query optimization, concurrency control, and
change management.
FIVE COMPONENTS OF THE DBMS
▪ Data Manipulation Subsystem: helps users to add, change, and delete
information in a database and query it for valuable information.
Software tools within the data manipulation subsystem are most often
the primary interface between users and the information contained in a
database. It allows users to specify their logical information requirements.
▪ Application Generation Subsystem: contains facilities to help users to
develop transaction-intensive applications. Such applications usually require that
users perform a detailed series of tasks to process a transaction. The subsystem
provides facilities such as easy-to-use data entry screens, programming languages, and
interfaces.
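The difference between data definition and data manipulation can be sketched with SQL statements. The example assumes SQLite via Python's `sqlite3` module, and the `employee` table and its rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition: describe the structure of the data.
conn.execute(
    "CREATE TABLE employee (emp_no INTEGER PRIMARY KEY, name TEXT, salary REAL)"
)

# Data manipulation: add, change, delete and query the data.
conn.execute("INSERT INTO employee VALUES (1, 'Amina', 1200.0)")
conn.execute("INSERT INTO employee VALUES (2, 'Brian', 900.0)")
conn.execute("UPDATE employee SET salary = 1000.0 WHERE emp_no = 2")
conn.execute("DELETE FROM employee WHERE emp_no = 1")

remaining = conn.execute("SELECT name, salary FROM employee").fetchall()
print(remaining)
```

The statements say what is wanted (a logical request). How the rows are located on disk is left to the DBMS engine.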
DBMS ADVANTAGES
▪ Reduces Data Redundancy:
▪ Data redundancy exists when the same data are stored unnecessarily in different places.
▪ Reduces Data Inconsistency:
▪ Data inconsistency exists when different versions of the same data appear in
different places.
▪ The DBMS minimizes the existence of different versions of the same data in different places.
▪ Better Data Integration:
▪ In a DBMS, data in the database are stored in tables. A single database contains multiple
tables, and relationships can be created between tables (or
associated data entities). This makes it easy to retrieve and update data.
▪ Improved Data Sharing:
▪ The DBMS helps create an environment in which end users have better access to
more data and better-managed data.
DBMS ADVANTAGES (2)
▪ Improved Data Security:
▪ A DBMS provides a framework for better enforcement of data privacy and security
policies.
▪ Improved Data Access:
▪ The DBMS makes it possible to produce quick answers to ad hoc queries.
▪ Improved Decision Making:
▪ Better-managed data and improved data access make it possible to generate better
quality information, on which better decisions are based.
▪ Increased End-user Productivity:
▪ Empowers end users to make quick, informed decisions that can make the difference
between success and failure.
▪ Data Independence: The separation of the data structure of the database from the
application programs that use the data is called data independence. In a DBMS, you
can often change the structure of the database without modifying the application
programs.
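Data independence can be sketched by changing a table's structure while the application code stays the same. The example assumes SQLite via Python's `sqlite3` module. The `student` table and the function `get_course` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (stud_no TEXT PRIMARY KEY, course TEXT)")
conn.execute("INSERT INTO student VALUES ('S1', 'CSC224')")


def get_course(stud_no):
    """Hypothetical application code: asks for the field by name, not position."""
    row = conn.execute(
        "SELECT course FROM student WHERE stud_no = ?", (stud_no,)
    ).fetchone()
    return row[0] if row else None


before = get_course("S1")

# The structure changes: a new field is added to the table.
conn.execute("ALTER TABLE student ADD COLUMN name TEXT")

after = get_course("S1")  # the unchanged program still works
print(before, after)
```

Because the program names the field it needs, adding another field does not affect it. Removing or renaming `course` would still require a program change.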
DBMS ADVANTAGES (3)
▪ Providing Backup & Recovery:
▪ After a system crash that occurs while changes are being made, data should be
restored to a consistent state. If hardware or software fails in the middle of an
update program, the recovery subsystem of the DBMS ensures that the database is
returned to a consistent state, for example by undoing the incomplete update or
resuming it from the point of failure.
▪ Complex Relationships:
▪ A DBMS must have the capability to represent a variety of complex relationships
among the data, to define new relationships as they arise, and to
retrieve and update the related data easily and efficiently.
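A sketch of how a DBMS keeps data consistent when an update fails part-way, assuming SQLite via Python's `sqlite3` module. The `account` table and the function `transfer` are hypothetical. The sketch shows a transaction being undone (rolled back) inside a running program. It does not simulate a real hardware crash.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()


def transfer(src, dst, amount):
    """Move money as one transaction: both updates succeed or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer("A", "B", 30)       # succeeds
failed = transfer("A", "B", 500)  # second update breaks the CHECK rule

balances = dict(conn.execute("SELECT name, balance FROM account"))
print(ok, failed, balances)
```

In the failed transfer the credit to B had already been applied when the debit from A was rejected. The rollback removes the credit, so the balances remain as they were after the first transfer.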
DBMS DISADVANTAGES
▪ Increased Costs:
▪ Database systems require sophisticated hardware and software and highly skilled personnel (e.g. training,
licensing, and regulation compliance costs).
▪ Management Complexity:
▪ Adoption of a database system must be properly managed to ensure that it helps advance the company’s
objectives (e.g. security issues must be assessed constantly).
▪ Maintaining Currency:
▪ To maximize the efficiency of the database system, you must keep your system current. Therefore, you must
perform frequent updates and apply the latest patches and security measures to all components.
▪ Vendor Dependence:
▪ Due to heavy investment in technology and personnel training, companies might be reluctant to change database
vendors.
▪ Frequent Upgrade:
▪ DBMS vendors frequently upgrade their products by adding new functionality. Such new features often come
bundled in new upgrade versions of the software. Some of these versions require hardware upgrades and
personnel training to properly use and manage the new features.
SUMMARY
PEOPLE WHO WORK WITH DATABASES
▪ Database implementers – people who build the DBMS software. These
people work for vendors such as Oracle and Microsoft.
▪ Application programmers – application programmers develop packages
that facilitate data access for end users, who are usually not computer
professionals, using the host or data languages and software tools that
DBMS vendors provide.
▪ Database administrators (DBA) – individuals responsible for the
maintenance and operation of databases. Database administrators
usually use a DBMS to assist with database management and
troubleshooting. Most often, database administrators have some sort of
certification or a degree relating to computer-driven database systems.
▪ End users – people who wish to store and use data in a DBMS.
ROLES OF A DATABASE ADMINISTRATOR
▪ Design of the conceptual and physical schemas
▪ Security and authorization
▪ Data availability and recovery from failures
▪ Database Tuning
DATABASE DESIGN PROCESS
1. Requirement analysis
2. Conceptual Database Design
3. Logical Database Design
4. Schema Refinement
5. Physical database design
6. Application and Security Design
SYSTEM DEVELOPMENT LIFE CYCLE
▪ The traditional methodology used to develop, maintain, and replace information systems is called the
Systems Development Life Cycle (SDLC). The database development activities during the SDLC are
shown in the following diagram:
END OF CHAPTER ONE
