ch1_database-system-concepts

Chapter 1 introduces database systems, defining data and the role of Database Management Systems (DBMS) in managing large bodies of information efficiently. It discusses the disadvantages of traditional file processing systems, such as data redundancy, difficulty in data access, and integrity issues, while highlighting the advantages of DBMS, including data independence, centralized control, and security. The chapter also covers applications of databases across various sectors and explains concepts like data abstraction, schemas, and the overall structure of database systems.

Uploaded by

aherpushkar1

SNJB’s S.H.H.J.B.

Polytechnic, Chandwad

CH.1 Database System Concept .

Chapter 1 Marks 12
Database System Concept

1.1 An introduction to database:

(1) Data is distinct pieces of information, usually formatted in a special way. Data can exist in a
variety of forms -- as numbers or text on pieces of paper, as bits and bytes stored in
electronic memory, or as facts stored in a person's mind.

Strictly speaking, data is the plural of datum, a single piece of information. In practice,
however, people use data as both the singular and plural form of the word.

(2) The term data is often used to distinguish binary machine-readable information from
textual human-readable information. For example, some applications make a distinction
between data files (files that contain binary data) and text files (files that contain ASCII
data).

(3) In database management systems, data files are the files that store the database
information, whereas other files, such as index files and data dictionaries, store
administrative information, known as metadata.

A database management system (DBMS) is a collection of interrelated data and a set of
programs to access those data. The collection of data, usually referred to as the database,
contains information relevant to an enterprise.
The primary goal of a DBMS is to provide a way to store and retrieve database
information that is both convenient and efficient.
Database systems are designed to manage large bodies of information. Management of
data involves both defining structures for storage of information and providing mechanisms
for the manipulation of information. In addition, the database system must ensure the safety
of the information stored, despite system crashes or attempts at unauthorized access. If data
are to be shared among several users, the system must avoid possible anomalous results.

Prof P R Sali 1 DMS (22319)




1.1.2 Disadvantages of file processing system:
Before the advent of DBMSs, organizations typically stored information using file-processing
systems. Permanent records are stored in various files and different application
programs are written to extract records from, and to add records to, the appropriate files.
Keeping organizational information in a file-processing system has a number of major
disadvantages:

1. Data redundancy and inconsistency.


Since different programmers create the files and application programs over a long
period, the various files are likely to have different structures and the programs may be
written in several programming languages. Moreover, the same information may be
duplicated in several places (files). For example, the address and telephone number of a
particular customer may appear in a file that consists of savings-account records and in a file
that consists of checking-account records. This redundancy leads to higher storage and access
cost. In addition, it may lead to data inconsistency; that is, the various copies of the same data
may no longer agree. For example, a changed customer address may be reflected in
savings-account records but not elsewhere in the system.
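The inconsistency described above can be sketched in a few lines of Python. This is only an illustration: the two dictionaries stand in for the savings-account and checking-account files, and the customer id, names, and addresses are hypothetical.

```python
# Two "files" kept by separate applications, each holding its own copy
# of the customer's address (all values here are hypothetical).
savings_file = {"C-101": {"name": "Manoj", "address": "Old Road, Chandwad"}}
checking_file = {"C-101": {"name": "Manoj", "address": "Old Road, Chandwad"}}


def update_savings_address(cust_id, new_address):
    """Update routine of the savings application; it knows nothing about checking_file."""
    savings_file[cust_id]["address"] = new_address


update_savings_address("C-101", "New Road, Nashik")

# The two copies of the same fact no longer agree: data inconsistency.
inconsistent = savings_file["C-101"]["address"] != checking_file["C-101"]["address"]
print("copies disagree:", inconsistent)
```

Storing the address once and letting both applications read it removes the need to keep the copies in step.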

2. Difficulty in accessing data.


Suppose that one of the bank officers needs to find out the names of all customers who
live within a particular postal-code area. The officer asks the data-processing department to
generate such a list. Because the designers of the original system did not anticipate this
request, there is no application program on hand to meet it. There is, however, an application
program to generate the list of all customers. The bank officer now has two choices: either
obtain the list of all customers and extract the needed information manually or ask a system
programmer to write the necessary application program. Both alternatives are obviously
unsatisfactory. Suppose that such a program is written, and that, several days later, the same
officer needs to trim that list to include only those customers who have an account balance of
$10,000 or more. As expected, a program to generate such a list does not exist. Again, the
officer has the preceding two options, neither of which is satisfactory.
The point here is that conventional file-processing environments do not allow needed data
to be retrieved in a convenient and efficient manner. More responsive data-retrieval systems
are required for general use.
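For contrast, here is a minimal sketch of how a DBMS with a query language handles the bank officer's two requests without a new application program. It uses Python's built-in sqlite3 module; the table layout, postal codes, and balances are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (name TEXT, postal_code TEXT, balance INTEGER)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [("Manoj", "423101", 12000), ("Deepak", "422001", 8000), ("Vinod", "423101", 5000)],
)


def customers_in_area(postal_code, min_balance=0):
    """Ad hoc request: customers in one postal-code area, optionally above a balance."""
    rows = conn.execute(
        "SELECT name FROM customer WHERE postal_code = ? AND balance >= ? ORDER BY name",
        (postal_code, min_balance),
    )
    return [name for (name,) in rows]


print(customers_in_area("423101"))         # first request: everyone in the area
print(customers_in_area("423101", 10000))  # later request: balance of 10,000 or more
```

The second request only changes a condition in the query; no additional program has to be written.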

3. Data isolation.
Because data are scattered in various files, and files may be in different formats, writing
new application programs to retrieve the appropriate data is difficult.

4. Integrity problems.
The data values stored in the database must satisfy certain types of consistency
constraints. For example, the balance of certain types of bank accounts may never fall below a
prescribed amount (say, Rs 2500). Developers enforce these constraints in the system by
adding appropriate code in the various application programs. However, when new constraints
are added, it is difficult to change the programs to enforce them. The problem is compounded
when constraints involve several data items from different files.
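In a DBMS the same rule can be declared once, next to the data, instead of being coded into every program. The sketch below uses Python's sqlite3 module and a CHECK constraint; the table name, account number, and amounts are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The minimum-balance rule is declared with the table, not in each application.
conn.execute(
    "CREATE TABLE account ("
    "acct_no TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 2500))"
)
conn.execute("INSERT INTO account VALUES ('A-101', 3000)")
conn.commit()


def withdraw(acct_no, amount):
    """Return True if the withdrawal was accepted, False if it broke the constraint."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE acct_no = ?",
                (amount, acct_no),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balance_of(acct_no):
    return conn.execute(
        "SELECT balance FROM account WHERE acct_no = ?", (acct_no,)
    ).fetchone()[0]


print(withdraw("A-101", 400), balance_of("A-101"))  # accepted
print(withdraw("A-101", 200), balance_of("A-101"))  # rejected, balance unchanged
```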

5. Atomicity problems.
A computer system, like any other mechanical or electrical device, is subject to failure.
In many applications, it is crucial that, if a failure occurs, the data be restored to the consistent
state that existed prior to the failure. Consider a program to transfer Rs 50 from account A to
account B. If a system failure occurs during the execution of the program, it is possible that the
Rs 50 was removed from account A but was not credited to account B, resulting in an
inconsistent database state. Clearly, it is essential to database consistency that either both the
credit and debit occur, or that neither occur. That is, the funds transfer must be atomic: it must
happen in its entirety or not at all. It is difficult to ensure atomicity in a conventional
file-processing system.
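The transfer example can be sketched with a database transaction. This is an illustrative sketch using Python's sqlite3 module, not a description of any particular banking system; the starting balances and the simulated failure are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 500), ("B", 200)])
conn.commit()


def transfer(amount, fail_midway=False):
    """Move `amount` from A to B as one transaction."""
    try:
        with conn:  # commit if the block finishes, roll back if it raises
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = 'A'", (amount,)
            )
            if fail_midway:
                # Stand-in for a system failure between the debit and the credit.
                raise RuntimeError("simulated failure")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = 'B'", (amount,)
            )
    except RuntimeError:
        pass  # the debit has been rolled back


def balances():
    return dict(conn.execute("SELECT name, balance FROM account"))


transfer(50, fail_midway=True)
after_failure = balances()   # neither the debit nor the credit took effect
transfer(50)
after_success = balances()   # both took effect
print(after_failure, after_success)
```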

6. Concurrent-access anomalies.

For the sake of overall performance of the system and faster response, many systems
allow multiple users to update the data simultaneously. Indeed, today, the largest Internet
retailers may have millions of accesses per day to their data by shoppers. In such an
environment, interaction of concurrent updates is possible and may result in inconsistent data.
Consider bank account A, containing Rs 500. If two customers withdraw funds (say Rs 50 and
Rs 100, respectively) from account A at about the same time, the result of the concurrent
executions may leave the account in an incorrect (or inconsistent) state. Suppose that the
programs executing on behalf of each withdrawal read the old balance, reduce that value by
the amount being withdrawn, and write the result back. If the two programs run concurrently,
they may both read the value Rs 500, and write back Rs 450 and Rs 400, respectively.
Depending on which one writes the value last, the account may contain either Rs 450 or Rs
400, rather than the correct value of Rs 350. To guard against this possibility, the system must
maintain some form of supervision. But supervision is difficult to provide because data may be
accessed by many different application programs that have not been coordinated previously.
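The lost update can be reproduced deterministically by writing out the bad interleaving by hand, and then avoided by supervising access with a lock. The sketch below is an illustration in Python; the Account class and the use of threading.Lock are assumptions chosen for the example, not how a DBMS is implemented.

```python
import threading


def interleaved_withdrawals(balance, amount_1, amount_2):
    """Replay the bad schedule: both programs read before either writes."""
    read_1 = balance             # program 1 reads the old balance
    read_2 = balance             # program 2 reads the same old balance
    balance = read_1 - amount_1  # program 1 writes back
    balance = read_2 - amount_2  # program 2 overwrites program 1's update
    return balance


class Account:
    """Account whose read-modify-write step is protected by a lock."""

    def __init__(self, balance):
        self.balance = balance
        self._lock = threading.Lock()

    def withdraw(self, amount):
        with self._lock:  # only one withdrawal at a time may read and write
            self.balance = self.balance - amount


lost_update_result = interleaved_withdrawals(500, 50, 100)

account = Account(500)
threads = [threading.Thread(target=account.withdraw, args=(amt,)) for amt in (50, 100)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(lost_update_result, account.balance)
```

With the lock, each withdrawal sees the result of the other, so the final balance reflects both.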

7. Security problems.
Not every user of the database system should be able to access all the data. For
example, in a banking system, payroll personnel need to see only that part of the database
that has information about the various bank employees. They do not need access to
information about customer accounts. But, since application programs are added to the
file-processing system in an ad hoc manner, enforcing such security constraints is difficult.



1.1.3 Advantages of DBMS
The facilities offered by DBMS vary a great deal, depending on their level of sophistication. In
general, however, a good DBMS should provide the following advantages over a conventional
system:

1. Independence of Data and Program: This is a prime advantage of a database. Both the
database and the user program can be altered independently of each other thus saving time
and money which would be required to retain consistency.

2. Data Shareability and Non-redundancy of Data: The ideal situation is to enable
applications to share an integrated database containing all the data needed by the
applications and thus eliminate as much as possible the need to store data redundantly.

3. Integrity: With many different users sharing various portions of the database, it is
impossible for each user to be responsible for the consistency of the values in the database
and for maintaining the relationships of the user's data items to all other data items, some of
which may be unknown to the user or even prohibited for the user to access. Responsibility
for integrity therefore rests with the DBMS and the DBA rather than with individual users.

4. Centralized Control: With central control of the database, the database administrator
(DBA) can ensure that standards are followed in the representation of data.

5. Security: Having control over the database, the DBA can ensure that access to the database
is through proper channels and can define the access rights of any user to any data items or
defined subset of the database. The security system must prevent corruption of the existing
data, whether accidental or malicious.

6. Performance and Efficiency: In view of the size of databases and of demanding database
accessing requirements, good performance and efficiency are major requirements. Knowing
the overall requirements of the organization, as opposed to the requirements of any individual
user, the DBA can structure the database system to provide an overall service that is 'best for
the enterprise'.

1.1.4 Database-System Applications

Databases are widely used. Here are some representative applications:

• Banking: For customer information, accounts, loans, and banking transactions.

• Airlines: For reservations and schedule information. Airlines were among the first to use
databases in a geographically distributed manner.

• Universities: For student information, course registrations, and grades.


• Credit card transactions: For purchases on credit cards and generation of monthly
statements.

• Telecommunication: For keeping records of calls made, generating monthly bills, maintaining
balances on prepaid calling cards, and storing information about the communication networks.

• Finance: For storing information about holdings, sales, and purchases of financial instruments
such as stocks and bonds; also for storing real-time market data to enable on-line trading by
customers and automated trading by the firm.

• Sales: For customer, product, and purchase information.

• On-line retailers: For sales data noted above plus on-line order tracking, generation of
recommendation lists, and maintenance of on-line product evaluations.

• Manufacturing: For management of the supply chain and for tracking production of items in
factories, inventories of items in warehouses and stores, and orders for items.

• Human resources: For information about employees, salaries, payroll taxes, benefits, and for
generation of paychecks.

1.3 View of Data


A database system is a collection of interrelated data and a set of programs that allow
users to access and modify these data. A major purpose of a database system is to provide
users with an abstract view of the data. That is, the system hides certain details of how the
data are stored and maintained.

1.3.1 Data Abstraction


For the system to be usable, it must retrieve data efficiently. The need for efficiency has
led designers to use complex data structures to represent data in the database. Since many
database-system users are not computer trained, developers hide the complexity from users
through several levels of abstraction, to simplify users' interactions with the system:

1. Physical level. The lowest level of abstraction describes how the data are actually
stored. The physical level describes complex low-level data structures in detail.

2. Logical level. The next-higher level of abstraction describes what data are stored in the
database, and what relationships exist among those data. The logical level thus
describes the entire database in terms of a small number of relatively simple structures.
Although implementation of the simple structures at the logical level may involve
complex physical-level structures, the user of the logical level does not need to be
aware of this complexity. Database administrators, who must decide what information
to keep in the database, use the logical level of abstraction.

Figure 1.1 The three levels of data abstraction.

3. View level. The highest level of abstraction describes only part of the entire database.
Even though the logical level uses simpler structures, complexity remains because of the
variety of information stored in a large database. Many users of the database system do
not need all this information; instead, they need to access only a part of the database.
The view level of abstraction exists to simplify their interaction with the system. The
system may provide many views for the same database.
Figure 1.1 shows the relationship among the three levels of abstraction.

1.3.2 Instances and Schemas


Databases change over time as information is inserted and deleted. The collection of
information stored in the database at a particular moment is called an instance of the
database. The overall design of the database is called the database schema. Schemas are
changed infrequently, if at all.
The concept of database schemas and instances can be understood by analogy to a
program written in a programming language. A database schema corresponds to the variable
declarations (along with associated type definitions) in a program. Each variable has a
particular value at a given instant. The values of the variables in a program at a point in time
correspond to an instance of a database schema.
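The programming-language analogy can be made concrete. In this minimal Python sketch the dataclass declaration plays the role of the schema and each list of records is one instance; the Account type and the sample values are assumptions for illustration.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Account:
    """The 'schema': which attributes an account record has, and their types."""
    acct_no: str
    balance: int


# Two 'instances' of the same schema, taken at different moments.
instance_before = [Account("A-101", 500), Account("A-215", 700)]
instance_after = instance_before + [Account("A-305", 400)]  # a record was inserted

schema = [(f.name, f.type) for f in fields(Account)]
print(schema)                                     # unchanged by the insertion
print(len(instance_before), len(instance_after))  # the instance has changed
```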
Database systems have several schemas, partitioned according to the levels of
abstraction. The physical schema describes the database design at the physical level, while the
logical schema describes the database design at the logical level. A database may also have
several schemas at the view level, sometimes called subschemas, that describe different views
of the database.

1.3.3 Data Independence


The ability to modify a schema definition in one level without affecting a schema
definition in the next higher level is called data independence. There are two levels of data
independence:
1. Physical data independence is the ability to modify the physical schema without causing
application programs to be rewritten. Modifications at the physical level are occasionally
necessary to improve performance.
2. Logical data independence is the ability to modify the logical schema without causing
application programs to be rewritten. Modifications at the logical level are necessary
whenever the logical structure of the database is altered.
Logical data independence is more difficult to achieve than is physical data
independence, since application programs are heavily dependent on the logical structure of
the data that they access.
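Physical data independence can be observed in a small experiment: adding an index changes how the data are stored and searched, but the application's query and its answer stay the same. The sketch below uses Python's sqlite3 module; the table, the index name idx_balance, and the sample rows are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (acct_no TEXT, balance INTEGER)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?)",
    [("A-101", 500), ("A-215", 700), ("A-305", 400), ("A-201", 900), ("A-207", 750)],
)

# The application's query: written once, against the logical schema only.
QUERY = "SELECT acct_no FROM account WHERE balance >= 700 ORDER BY acct_no"

before = conn.execute(QUERY).fetchall()

# A physical-level change made to improve performance.
conn.execute("CREATE INDEX idx_balance ON account (balance)")

after = conn.execute(QUERY).fetchall()
print(before == after, after)
```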

1.4 Overall System Structure


A database system is partitioned into modules that deal with each of the responsibilities
of the overall system. Some of the functions of the database system may be provided by the
computer's operating system. In most cases, the computer's operating system provides only
the most basic services, and the database system must build on that base. Thus, the design of
a database system must include consideration of the interface between the database system
and the operating system.
The functional components of a database system can be broadly divided into query
processor components and storage manager components. The query processor components
include
1. DML compiler, which translates DML statements in a query language into low-level
instructions that the query evaluation engine understands. In addition, the DML compiler
attempts to transform a user's request into an equivalent but more efficient form, thus finding
a good strategy for executing the query.
2. Embedded DML pre-compiler, which converts DML statements embedded in an application
program into normal procedure calls in the host language. The pre-compiler must interact with
the DML compiler to generate the appropriate code.
3. DDL interpreter, which interprets DDL statements and records them in a set of tables
containing metadata.
4. Query evaluation engine, which executes low-level instructions generated by the DML
compiler.



The storage manager components provide the interface between the low-level data
stored in the database and the application programs and queries submitted to the system. The
storage manager components include

1. Authorization and integrity manager, which tests for the satisfaction of integrity constraints
and checks the authority of users to access data.

2. Transaction manager, which ensures that the database remains in a consistent (correct)
state despite system failures, and that concurrent transaction executions proceed without
conflicting.

3. File manager, which manages the allocation of space on disk storage and the data
structures used to represent information stored on disk.

4. Buffer manager, which is responsible for fetching data from disk storage into main memory,
and deciding what data to cache in memory.
In addition, several data structures are required as part of the physical system
implementation:
1. Data files, which store the database itself.
2. Data dictionary, which stores metadata about the structure of the database. The data
dictionary is used heavily. Therefore, great emphasis should be placed on developing a good
design and efficient implementation of the dictionary.
3. Indices, which provide fast access to data items that hold particular values.
4. Statistical data, which store statistical information about the data in the database. This
information is used by the query processor to select efficient ways to execute a query.

Figure 1.2 shows these components and the connections among them.


Figure 1.2 System structure.
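Two of the components above, the DDL interpreter and the data dictionary, can be observed in a small system. In SQLite, DDL statements are recorded in a catalog table named sqlite_master, which can itself be queried. The sketch below uses Python's sqlite3 module; the table and index names are assumptions, and SQLite is only one example of how a system may store its metadata.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL statements: the system records them as metadata.
conn.execute("CREATE TABLE account (acct_no TEXT, balance INTEGER)")
conn.execute("CREATE INDEX idx_balance ON account (balance)")

# The data dictionary is itself stored as a table that can be queried.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY name"
).fetchall()

table_ddl = conn.execute(
    "SELECT sql FROM sqlite_master WHERE name = 'account'"
).fetchone()[0]

print(catalog)
print(table_ddl)
```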



1.5 Data Models
Underlying the structure of a database is the data model: a collection of conceptual
tools for describing data, data relationships, data semantics, and consistency constraints. The
various data models that have been proposed fall into three different groups: object-based
logical models, record-based logical models, and physical models.

1.5.1 Object-Based Logical Models

Object-based logical models are used in describing data at the logical and view levels.
They are characterized by the fact that they provide fairly flexible structuring capabilities and
allow data constraints to be specified explicitly. There are many different models, and more
are likely to come. Several of the more widely known ones are

• The entity-relationship model
• The object-oriented model
• The semantic data model
• The functional data model

1.5.2 Record-Based Logical Models

Record-based logical models are used in describing data at the logical and view levels. In
contrast to object-based data models, they are used both to specify the overall logical
structure of the database and to provide a higher-level description of the implementation.
The three most widely accepted record-based data models are the relational, network,
and hierarchical models.

1.5.2.1 Relational Model


The relational model uses a collection of tables to represent both data and the
relationships among those data. Each table has multiple columns, and each column has a
unique name. Figure 1.3 presents a sample relational database comprising two tables: one
shows bank customers, and the other shows the accounts that belong to those customers.

Cust_no   Name     City       Acct_no
101       Manoj    Chandwad   A-101
215       Deepak   Nashik     A-215
101       Manoj    Chandwad   A-305
401       Vinod    Nashik     A-201
402       Mahesh   Mumbai     A-305
402       Mahesh   Mumbai     A-207

Acct_no   Balance
A-101     500
A-215     700
A-305     400
A-201     900
A-207     750

Figure 1.3 A sample relational database.


It shows, for example, that customer Manoj lives in Chandwad and has two accounts:
A-101 with a balance of Rs 500 and A-305 with a balance of Rs 400. Note that customers
Manoj and Mahesh share account number A-305, which has a balance of Rs 400 (the two
may hold the account jointly).
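The two tables of Figure 1.3 can be loaded into a relational system and combined with a join. The sketch below uses Python's sqlite3 module; the rows are copied from the figure, while the SQL table names and the helper functions accounts_of and holders_of are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_no INTEGER, name TEXT, city TEXT, acct_no TEXT)")
conn.execute("CREATE TABLE account (acct_no TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?, ?)",
    [
        (101, "Manoj", "Chandwad", "A-101"),
        (215, "Deepak", "Nashik", "A-215"),
        (101, "Manoj", "Chandwad", "A-305"),
        (401, "Vinod", "Nashik", "A-201"),
        (402, "Mahesh", "Mumbai", "A-305"),
        (402, "Mahesh", "Mumbai", "A-207"),
    ],
)
conn.executemany(
    "INSERT INTO account VALUES (?, ?)",
    [("A-101", 500), ("A-215", 700), ("A-305", 400), ("A-201", 900), ("A-207", 750)],
)


def accounts_of(name):
    """Accounts and balances of one customer, found by joining the two tables."""
    return conn.execute(
        "SELECT a.acct_no, a.balance FROM customer c "
        "JOIN account a ON a.acct_no = c.acct_no "
        "WHERE c.name = ? ORDER BY a.acct_no",
        (name,),
    ).fetchall()


def holders_of(acct_no):
    """Names of all customers who hold the given account."""
    rows = conn.execute(
        "SELECT name FROM customer WHERE acct_no = ? ORDER BY name", (acct_no,)
    )
    return [name for (name,) in rows]


print(accounts_of("Manoj"))
print(holders_of("A-305"))
```

The relationship between a customer and an account is carried by the matching Acct_no values, not by pointers.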

1.5.2.2 Network Model


Data in the network model are represented by collections of records and relationships
among data are represented by links, which can be viewed as pointers. The records in the
database are organized as collections of arbitrary graphs. Figure 1.4 presents a sample
network database using the same information as in Figure 1.3.

[Figure: customer records (101 Manoj Chandwad; 215 Deepak Nashik; 401 Vinod Nashik; 402 Mahesh Mumbai) connected by links to account records (A-101 500; A-215 700; A-305 400; A-201 900; A-207 750).]

Figure 1.4 A sample network database.
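The idea of records connected by links can be sketched with Python object references standing in for pointers. This is only an illustration of the concept; the class names are assumptions, and the customer-to-account links follow the data of Figure 1.3.

```python
from dataclasses import dataclass, field


@dataclass
class AccountRecord:
    acct_no: str
    balance: int


@dataclass
class CustomerRecord:
    cust_no: int
    name: str
    city: str
    accounts: list = field(default_factory=list)  # links (pointers) to account records


a101 = AccountRecord("A-101", 500)
a305 = AccountRecord("A-305", 400)
a207 = AccountRecord("A-207", 750)

# Both customers link to the SAME A-305 record, so the structure is a graph.
manoj = CustomerRecord(101, "Manoj", "Chandwad", [a101, a305])
mahesh = CustomerRecord(402, "Mahesh", "Mumbai", [a305, a207])

a305.balance -= 100  # one update, visible through every link to the record

print(manoj.accounts[1].balance, mahesh.accounts[0].balance)
```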

1.5.2.3 Hierarchical Model


The hierarchical model is similar to the network model in the sense that data and
relationships among data are represented by records and links, respectively. It differs from the
network model in that the records are organized as collections of trees rather than arbitrary
graphs. Figure 1.5 presents a sample hierarchical database with the same information as in
Figure 1.3.


[Figure: trees with customer records (101 Manoj Chandwad; 215 Deepak Nashik; 401 Vinod Nashik; 402 Mahesh Mumbai) and account records (A-101 500; A-305 400; A-215 700; A-201 900; A-305 400; A-207 750) beneath them.]

Figure 1.5 A sample hierarchical database.
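A tree gives each record a single parent, so a record shared by two customers has to be stored once under each of them. The sketch below illustrates this with nested Python structures; placing accounts under customers follows the data of Figure 1.3, and the helper copies_of is an assumption for the example.

```python
# One tree per customer; each account record sits under exactly one parent.
forest = [
    {"customer": (101, "Manoj", "Chandwad"),
     "accounts": [{"acct_no": "A-101", "balance": 500},
                  {"acct_no": "A-305", "balance": 400}]},
    {"customer": (215, "Deepak", "Nashik"),
     "accounts": [{"acct_no": "A-215", "balance": 700}]},
    {"customer": (401, "Vinod", "Nashik"),
     "accounts": [{"acct_no": "A-201", "balance": 900}]},
    {"customer": (402, "Mahesh", "Mumbai"),
     "accounts": [{"acct_no": "A-305", "balance": 400},
                  {"acct_no": "A-207", "balance": 750}]},
]


def copies_of(acct_no):
    """Every stored copy of an account record, across all trees."""
    return [acct for tree in forest for acct in tree["accounts"]
            if acct["acct_no"] == acct_no]


copies = copies_of("A-305")
copies[0]["balance"] -= 100  # updating one copy does not touch the other
balances_after = [c["balance"] for c in copies]
print(len(copies), balances_after)
```

Compared with the network sketch of shared links, the duplicated record must be updated in every tree to stay consistent.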

1.5.3 Physical Data Models


Physical data models are used to describe data at the lowest level. In contrast to logical
data models, there are few physical data models in use. Two of the widely known ones are the
unifying model and the frame-memory model.

1.6 The Entity-Relationship Model

The entity-relationship (E-R) data model is based on a perception of a real world that
consists of a collection of basic objects, called entities, and of relationships among these
objects.

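[Figure: entity set Customer (attributes Cust_no, Name, City) and entity set Account (attributes Acct_no, Balance), connected by the relationship set Depositor.]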

Figure 1.6 A sample E-R diagram.



An entity is a "thing" or "object" in the real world that is distinguishable from other objects.
For example, each person is an entity, and bank accounts can be considered to be entities.
Entities are described in a database by a set of attributes. For example, the attributes
account-number and balance describe one particular account in a bank. A relationship is an
association among several entities. For example, a Depositor relationship associates a
customer with each account that she has. The set of all entities of the same type, and the set
of all relationships of the same type, are termed an entity set and relationship set,
respectively.
In addition to entities and relationships, the E-R model represents certain constraints to
which the contents of a database must conform. One important constraint is mapping
cardinalities, which express the number of entities to which another entity can be associated
via a relationship set.
The overall logical structure of a database can be expressed graphically by an E-R
diagram which is built up from the following components:

• Rectangles, which represent entity sets


• Ellipses, which represent attributes
• Diamonds, which represent relationships among entity sets
• Lines, which link attributes to entity sets and entity sets to relationships
Each component is labelled with the entity or relationship that it represents.

1.6.2 Weak Entity and Strong Entity:-

An entity set may not have sufficient attributes to form a primary key; such an entity set
is termed a weak entity set. An entity set that has a primary key is termed a strong entity set.

1.6.3 Types of Attributes:-

An entity is represented by a set of attributes. Attributes are the descriptive properties
possessed by each member of an entity set. For each attribute there is a set of permitted
values, called the domain of that attribute.
Following are the types of attributes:

1. Simple And Composite Attributes:-


Simple attributes are those attributes that are not divided into subparts. Composite
attributes on the other hand can be divided into subparts.
CUSTOMER (CUST_NO, CUST_NAME, ADDRESS)
Here CUST_NO is a simple attribute and ADDRESS is a composite attribute, which can be
divided into state, city, and country.



2. Single-valued & Multi-valued Attributes:-
A single-valued attribute has a single value for a particular entity.
e.g. the Cust_No attribute of a specific customer refers to only one person.
A customer may have more than one address, and hence address is a multi-valued
attribute. Thus multi-valued attributes are those attributes that can take multiple values.

3. Null Attributes:-
A null value is used when an entity does not have a value for an attribute. A null value may
also indicate an unknown value, which is either missing or not known.
e.g. the street attribute may have a null value.

4. Derived Attributes:-
The value for this type of attribute can be derived from the values of other related
attribute or entity.
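The four kinds of attribute can be sketched as fields of a Python record type. This is a minimal illustration: the Address and Customer classes are assumptions, and the date_of_birth field with a derived age is a hypothetical example of a derived attribute that does not appear in the CUSTOMER entity above.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Address:
    """Composite attribute: divided into subparts."""
    city: str
    state: str
    country: str
    street: Optional[str] = None  # null: the street may be missing or not known


@dataclass
class Customer:
    cust_no: int         # simple, single-valued attribute
    cust_name: str
    date_of_birth: date  # hypothetical stored attribute used for the derivation
    addresses: list = field(default_factory=list)  # multi-valued attribute

    def age_on(self, today):
        """Derived attribute: computed from date_of_birth, not stored."""
        had_birthday = (today.month, today.day) >= (
            self.date_of_birth.month, self.date_of_birth.day)
        return today.year - self.date_of_birth.year - (0 if had_birthday else 1)


customer = Customer(
    101, "Manoj", date(2000, 6, 15),
    [Address("Chandwad", "Maharashtra", "India"),
     Address("Nashik", "Maharashtra", "India", street="MG Road")],
)
print(len(customer.addresses), customer.addresses[0].street)
print(customer.age_on(date(2024, 6, 14)), customer.age_on(date(2024, 6, 15)))
```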
Relationship:- It is an association among several entities.
Relationship Set:- It is a set of relationships of the same type.

Entity-Relationship (E-R) diagram:- The overall logical structure of a database can be expressed
graphically by an E-R diagram, which is built from the following components.
1. Rectangles:- which represent entity sets.
2. Ellipses:- which represent attributes.
3. Diamonds:- which represent relationships among entity sets.
4. Lines:- which link attributes to entity sets and entity sets to relationship sets.
5. Dashed ellipses:- which represent derived attributes.
6. Double ellipses:- which represent multi-valued attributes.
7. Double lines:- which represent total participation of an entity in a relationship set.
8. Double rectangles:- which represent weak entity sets.


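[Figure: entity set Customer (attributes Cust_name, Cust_City) and entity set Loan (attributes Loan_no, Amount), connected by the relationship set Borrower.]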

Figure: 1.7- E-R diagram corresponding to customer and loan

Thus we have two entity sets,
CUSTOMER(CUST_NO, CUST_NAME, ADDRESS) and
LOAN(LOAN_NO, AMOUNT), and a relationship set borrower.
The relationship set borrower may be one to one, many to one, one to many or many to
many. To distinguish among these types, we draw either a directed line (an arrow) or an
undirected line between the relationship set and the entity set. A directed line from the
relationship set borrower to the entity set loan specifies that borrower is either a one to one
or many to one relationship from customer to loan as shown below.
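The four mapping cardinalities can be told apart by counting how many loans each customer has and how many customers each loan has. The sketch below classifies one given set of (customer, loan) pairs, read from customer to loan; the function name and the sample pairs are assumptions. Note that it inspects a single instance of the relationship set, whereas a mapping cardinality in an E-R design is a constraint on every instance.

```python
from collections import Counter


def mapping_cardinality(pairs):
    """Classify a borrower relationship set given as (customer, loan) pairs."""
    pairs = set(pairs)
    loans_per_customer = Counter(customer for customer, _ in pairs)
    customers_per_loan = Counter(loan for _, loan in pairs)

    customer_has_many = any(n > 1 for n in loans_per_customer.values())
    loan_has_many = any(n > 1 for n in customers_per_loan.values())

    if customer_has_many and loan_has_many:
        return "many to many"
    if customer_has_many:
        return "one to many"   # a customer may have several loans
    if loan_has_many:
        return "many to one"   # several customers may share one loan
    return "one to one"


print(mapping_cardinality([("Manoj", "L-1"), ("Deepak", "L-2")]))
print(mapping_cardinality([("Manoj", "L-1"), ("Manoj", "L-2")]))
print(mapping_cardinality([("Manoj", "L-1"), ("Deepak", "L-1")]))
print(mapping_cardinality([("Manoj", "L-1"), ("Manoj", "L-2"), ("Deepak", "L-1")]))
```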
