
MIS UNIT-II

Uploaded by

gadwalnisa

UNIT-2

DATA RESOURCE MANAGEMENT (DATA ADMINISTRATION)

1. INTRODUCTION:

DATA RESOURCE MANAGEMENT (DRM):

DRM involves the management of files and computer data for businesses and
companies.

➢ DRM, also known as data administration, draws on computer science and
information systems.
➢ Workers in this field help design, control, protect, store, administer and organize
saved data.
➢ Normally, this information is stored in a database managed by a Data Base
Management System (DBMS) or similar software.
➢ DRM is a managerial activity that applies IT and software tools to the task of
managing an organization's data resources.
➢ Earlier, organizations used the traditional file processing approach, which was too
difficult, costly and inflexible to supply the required information.
➢ Thus the DRM approach was developed to solve the problems of file processing
systems.
➢ Data is an important input in an IS (Information System).
➢ The DATA RESOURCE is also called the database.
➢ DATA BASE: Data is processed and converted into information to satisfy the
needs of the organization.
➢ Nowadays internal and external information is increasing rapidly, so a database
is necessary in any organization.
➢ The business environment has forced businesses to take quick and right
decisions, for which databases need to be queried frequently.
➢ QUERIES may be varied.

EXAMPLES
1. One manager may be interested to know the names of all those products for which
sales in the current year exceed that of the previous year.
2. One may require information on the total amount outstanding.
3. One may require the list of products having a market share greater than 30%, and
so on.
To correctly process varied types of queries and to ensure a fast response time, the use of
a computer-based IS has become a necessity for any business.
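
The first and third example queries above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table product_sales, its column names and all the figures are assumptions made up for the sketch, not data from these notes.

```python
import sqlite3

# Hypothetical table and figures, invented only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product_sales ("
    "product TEXT, sales_prev_year REAL, sales_curr_year REAL, market_share REAL)"
)
conn.executemany(
    "INSERT INTO product_sales VALUES (?, ?, ?, ?)",
    [
        ("Tea", 120.0, 150.0, 35.0),
        ("Rice", 300.0, 280.0, 42.0),
        ("Salt", 80.0, 95.0, 12.0),
    ],
)

# Query 1: products whose current-year sales exceed the previous year's.
growing = [
    row[0]
    for row in conn.execute(
        "SELECT product FROM product_sales "
        "WHERE sales_curr_year > sales_prev_year ORDER BY product"
    )
]

# Query 3: products with a market share greater than 30%.
leaders = [
    row[0]
    for row in conn.execute(
        "SELECT product FROM product_sales "
        "WHERE market_share > 30 ORDER BY product"
    )
]

print(growing)  # ['Salt', 'Tea']
print(leaders)  # ['Rice', 'Tea']
```

Both questions are answered from the same stored data; only the query text changes.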

1.1. DATA BASE CONCEPTS


Entity: A thing with a distinct and independent existence.

OR
Anything of interest to the user about which data is to be collected / stored is called
entity.

An entity may be:

Tangible object – an employee, a student, a spare part or a place
Non-tangible object – an event, a job title, a customer account

An entity can be described by its CHARACTERISTICS/FEATURES such as name, age,
designation etc.

Attributes:

• The characteristics/features of an entity are called attributes.

• Data is generally organized into characters, fields, records, files and databases,
which are called the Logical Data Elements.

Explanation:

1. CHARACTER:
➢ It consists of a single alphabetic, numeric, or other symbol, which is represented
by bits or a byte.
➢ The character is the most BASIC ELEMENT of data.
2. FIELD:
➢ A collection of characters is called a field.
➢ A field is a physical space on the storage device.

For example – the fields in an employee record may be employee name, sex, address etc.

Data Item – It is the data stored in the field.

Example – employee name and age are fields.

The values in these fields (Sandeep, 26 years) are data items.

Fig: Data Base Concept

Character    a single symbol
Field        Class field (name field)
Record       Name       Class    Course
             Vikky      MBA-I    MIS
File         Course File
             Name       Class    Course
             1. Vikky   MBA-I    MIS
             2. Rahul   MBA-I    ITM
Data Base    Course File, Faculty File, Administrative File

3. RECORD:

A collection of various related fields is called a record.

For example – a student's name, address, roll-no, marks etc. will form the record of
that student.

4. FILE:

A collection/ group of various records is known as a file.

OR

Any collection of related records in the form of rows and columns (tabular form) is
called a file.

For example – if there are many students in a class, then the group of their related
records would form the student file.
5. DATA BASE

A collection of various related files is known as a database.

OR
It is an organized collection of data, stored and accessed electronically.

An Information System (IS) application may have several related files and all related
files would constitute a database for that application.

For example – in a salary processing system, the files may be employee-file,
provident-fund-file, income-tax-file etc.

All these files, which are related to the application, are combined in a database.
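
The character, field, record, file and database levels can be sketched as nested Python types. The class names below (StudentRecord, DataFile, Database) are hypothetical and chosen only for this illustration; the sample values follow the course file shown in the figure.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StudentRecord:
    # Each attribute is a FIELD; the value stored in it is a DATA ITEM.
    name: str
    student_class: str
    course: str


@dataclass
class DataFile:
    # A FILE is a collection of related records.
    name: str
    records: List[StudentRecord] = field(default_factory=list)


@dataclass
class Database:
    # A DATABASE is a collection of related files.
    files: Dict[str, DataFile] = field(default_factory=dict)


course_file = DataFile(
    "course",
    [
        StudentRecord("Vikky", "MBA-I", "MIS"),
        StudentRecord("Rahul", "MBA-I", "ITM"),
    ],
)
db = Database({"course": course_file, "faculty": DataFile("faculty")})

first_record = db.files["course"].records[0]
characters = list(first_record.name)  # a field value is made of CHARACTERS
print(characters)  # ['V', 'i', 'k', 'k', 'y']
```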

1.2. THE TRADITIONAL APPROACHES:


Traditionally, data files were developed and maintained separately for individual
applications. Every functional unit like marketing, finance, production etc. used to
maintain its own set of application programs and data files.

Problems with Traditional File Processing:

The traditional approach was rendered inadequate, especially when organizations started
developing organization-wide integrated applications. Its main problems are:

1. Data duplication
2. Data inconsistency
3. Lack of data integration
4. Data dependence
5. Program dependence

1. DATA DUPLICATION:

Because each application has its own data file, the same data may have to be recorded
and stored in several files.

Example – a payroll application and a personnel application will both have data on
employee name, designation etc. This results in unnecessary duplication/redundancy of
common data items.
2. DATA INCONSISTENCY

➢ Data duplication leads to data inconsistency, especially when data is to be updated.

➢ It occurs because the same data items, which appear in more than one file, do not get
updated simultaneously in all the data files.
➢ For example – an employee's designation, which is immediately updated in the payroll
system, may not necessarily be updated in the personnel application.
➢ This results in two different designations for an employee at the same time.
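
A minimal sketch of how this inconsistency arises, assuming two independent "files" held as Python dictionaries. The employee id E101, the designations and the function names are hypothetical.

```python
# Two independent "files", as in a file processing system.
payroll_file = {"E101": {"name": "Sandeep", "designation": "Clerk"}}
personnel_file = {"E101": {"name": "Sandeep", "designation": "Clerk"}}


def promote_in_payroll(emp_id, new_designation):
    """Update only the payroll copy, as the payroll application would."""
    payroll_file[emp_id]["designation"] = new_designation


def find_inconsistencies():
    """Return employee ids whose designation differs between the two files."""
    return [
        emp_id
        for emp_id, rec in payroll_file.items()
        if personnel_file.get(emp_id, {}).get("designation") != rec["designation"]
    ]


promote_in_payroll("E101", "Officer")
conflicts = find_inconsistencies()
print(conflicts)  # ['E101']
```

The personnel copy still says "Clerk" because nothing forces the two files to be updated together.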

3. LACK OF DATA INTEGRATION


➢ Because of independent data files, users face difficulty in getting information on any
ad hoc query (a non-standard inquiry).
➢ Thus, either complicated programs have to be developed to retrieve data from each
independent data file or users have to manually collect the required information
from various outputs of separate applications.

4. DATA DEPENDENCE

The applications in file processing systems are data dependent.

For example – if an application needs a file of customer records sorted on last name,
then any customer's record has to be retrieved through his/her last name only.

5. PROGRAM DEPENDENCE

➢ The reports produced by a file processing system are program dependent, which
implies that if any change is made to the format/structure of the data and records in
a file, a corresponding change has to be made in the programs.
➢ Similarly, if any new report is to be produced, new programs will have to be
developed.
It is because of all these drawbacks of the traditional file approach to
organizing data that databases were developed.

2.3. THE MODERN APPROACHES (DATA BASE MANAGEMENT APPROACHES/SYSTEM) DBMS
A database is a collection of various related files.

In a database system, common data is shared by a number of applications, as it is data
and program independent.
Problems of the file processing system that are overcome in a DBMS:

1. Data duplication
2. Data inconsistency
3. Lack of data integration
4. Data dependence
5. Program dependence

Fig: Simplified view of a database system (application programs and end users
access the database)

Fig: Data base approach (one database shared by student administration, financial
management, course administration and faculty administration)

2.3.1 DBMS Definition:

The software that allows an organization to centralize data, manage it efficiently, and
provide access to the database for application programs is known as a DBMS.

• The DBMS thus solves the problems of the traditional file processing environment.
• The DBMS is the software that interacts with end users, applications and the
database itself to capture and analyze data.

2.3.2 Objectives of DBMS

1. Controlled data redundancy


2. Enhanced data consistency
3. Data independence
4. Ease of use
5. Economical
6. Application independence
7. Recovery from failure

2.3.4 Advantages of DBMS:

1. Redundancy control
2. Data consistency
3. Management queries
4. Data independence
5. Enforcement of standards
1. REDUNDANCY CONTROL
✓ In a file management system, each application has its own data, which causes
duplication of common data items in more than one file.
✓ This data duplication needs more storage space as well as multiple updates for a
single transaction.
✓ This problem is overcome in the database approach, where data is stored only once.
2. DATA CONSISTENCY

In the database approach, the problem of inconsistent data is automatically solved with
the control of redundancy.

3. MANAGEMENT QUERIES
The database approach, in most information systems (IS), pools the
organization-wide files at one place known as the CENTRAL DATABASE and is thus
capable of answering management queries relating to more than one functional area.
4. DATA INDEPENDENCE
✓ File management system – data dependent.
✓ Database approach – data independent.
✓ The database approach provides independence between file structure and program
structure.
✓ Such a system provides an interface between the programs and the database and
takes care of the storage, retrieval and update of data in the database.
✓ It allows applications to be written as general programs to operate on files whose
structures can be made available to the program.
✓ A DBMS is thus a generalized file processing system.
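
Data independence can be illustrated with a small sketch using Python's built-in sqlite3 module as the DBMS. The employee table and the function designation_of are assumptions made for this example. The program asks for data by name through the DBMS, so it keeps working when the structure of the stored data changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_no INTEGER, name TEXT, designation TEXT)")
conn.execute("INSERT INTO employee VALUES (1, 'Kiran', 'Finance manager')")


def designation_of(emp_no):
    """Application code: asks for data by name, not by storage position."""
    row = conn.execute(
        "SELECT designation FROM employee WHERE emp_no = ?", (emp_no,)
    ).fetchone()
    return row[0] if row else None


before = designation_of(1)

# The structure of the stored data changes: a new column is added.
conn.execute("ALTER TABLE employee ADD COLUMN dept TEXT")

after = designation_of(1)  # the program is unchanged and still works
print(before, "|", after)
```

In a file processing system, a program that read fixed positions in each record would typically have to be rewritten after such a change.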
5. ENFORCEMENT OF STANDARDS
✓ In the database approach, data being stored at one central place, standards can
easily be enforced.
✓ This ensures standardized data formats to facilitate data transfers between
systems.

2.3.5 Disadvantages of Data Base

1. Centralized database
2. More disk space
3. Operationality of the system
4. Security risk

1. CENTRALIZED DATABASE
• The data structure may become quite complex because the centralized database
supports many applications in an organization.
• This may lead to difficulties in its management and may require a professional/
experienced database designer and sometimes extensive training for users.
2. MORE DISK SPACE
The database approach generally requires more processing than a file management
system and thus needs more disk space for program storage.
3. OPERATIONALITY OF THE SYSTEM
Since the database is used by many users in the organization, any failure in it,
whether due to a system fault, database corruption etc., will affect the
operationality of the system, as it would render all users unable to access the
database.
4. SECURITY RISK
Being centralized, the database is more prone to security disasters.
2.3.6 Functions of DBMS

1. Data organization
2. Data integration
3. Physical/logical – level separation
4. Data control
5. Data protection

1. DATA ORGANIZATION
The DBMS organizes data items as per the specifications of the data definition
language.
The database administrator decides on the data specifications that are most suited
to each application.
2. DATA INTEGRATION
Data is inter-related at the element level and can be manipulated in
many combinations during execution of a particular application program.
The DBMS facilitates collection, combination and retrieval of the required data for
the user.
3. PHYSICAL/LOGICAL – LEVEL SEPARATION
It separates application programs from their associated data.
The DBMS separates the logical description and relationships of data from the way
in which the data is physically stored.
4. DATA CONTROL
The DBMS receives requests for storing data from different programs.
It controls how and where data is physically stored.
Similarly, it locates and returns requested data to the program.
5. DATA PROTECTION
The DBMS protects the data against access by unauthorized users, physical damage,
operating system failure etc.
The DBMS is equipped with a facility to back up data and restore it automatically in
the case of any system failure.
Other security features include password protection and sophisticated encryption
schemes.
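
Two of these functions, data organization through a data definition language and data protection through backup, can be sketched with Python's built-in sqlite3 module. The student table and its values are hypothetical, and the manual backup call here only stands in for the automatic facilities described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data organization: the data definition language (DDL) states what each
# data item looks like.
conn.execute(
    "CREATE TABLE student ("
    "roll_no INTEGER PRIMARY KEY, name TEXT NOT NULL, marks INTEGER)"
)
conn.execute("INSERT INTO student VALUES (1, 'Vikky', 78)")
conn.commit()

# Data protection (integrity): the DBMS rejects a record that breaks the
# definition, here a missing name.
try:
    conn.execute("INSERT INTO student VALUES (2, NULL, 65)")
    rejected = False
except sqlite3.IntegrityError:
    conn.rollback()
    rejected = True

# Data protection (backup): copy the whole database into a second one.
backup_conn = sqlite3.connect(":memory:")
conn.backup(backup_conn)
restored_rows = backup_conn.execute(
    "SELECT roll_no, name, marks FROM student"
).fetchall()

print(rejected)       # True
print(restored_rows)  # [(1, 'Vikky', 78)]
```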

2.4. DATA MODELS / DATA BASE STRUCTURES


✓ Several logical data models are used to build the conceptual structure.
✓ These data models describe the relationship among the many individual data
elements stored in databases.
The various data models are,

1. Hierarchical model / tree model


2. Network model
3. Relational model
4. Object-oriented model
5. Multi-dimensional model

1. HIERARCHICAL MODEL
❖ In the hierarchical structure, the relationships between records are stored in the
form of a hierarchy or a tree (an inverted tree, with the root at the top and branches
below).
❖ In this model, all records are dependent and arranged in a multi-level structure;
thus the root may have a number of branches and each branch may have a number
of sub-branches and so on.
❖ A lower-level record is known as the ‘child’ of the next higher-level record,
whereas the higher-level record is called the ‘parent’ of its child records.
❖ Thus in this approach, all the relationships among records are one-to-many.
❖ Early mainframe DBMS packages used the hierarchical model.
❖ A hierarchical approach is simple to understand and design but cannot represent
data items that may simultaneously appear at two different levels of the hierarchy.

Customer       PARENT (ROOT)
   |
Order          CHILD (Branches)
   |
Items

Fig: Hierarchical Data Model
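
The Customer, Order and Items hierarchy can be sketched as a tree in which every record has at most one parent. The class Node and the record names are hypothetical, used only for this illustration.

```python
class Node:
    """One record in a hierarchy; it has at most one parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def path(self):
        """Names from the root down to this record."""
        node, names = self, []
        while node is not None:
            names.append(node.name)
            node = node.parent
        return list(reversed(names))


customer = Node("Customer")
order_1 = Node("Order 1", customer)
order_2 = Node("Order 2", customer)
item = Node("Item A", order_1)

print(item.path())  # ['Customer', 'Order 1', 'Item A']
```

Because a node stores only one parent, "Item A" cannot also belong to "Order 2" without being duplicated, which is the limitation the network model removes.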

2. NETWORK MODEL
• The network model allows more complex 1:M (one-to-many) or M:M (many-to-many)
logical relationships among entities.
• The relationships are stored in the form of a linked list structure in which
subordinate records, called members, can be linked to more than one owner
(parent).
• This approach does not place any restrictions on the number of relationships.
• However, the network model is the most complicated one to design and
implement, and is used only in special types of applications.

Entities shown: Customer, Order, Item, Warehouse
Relationships shown: one warehouse – many items; many items – one warehouse;
many – many

a. The entity Item participates in more than one relationship (ONE - MANY and
MANY - MANY).

b. A member of a network database can have multiple owners.

FIG: Network Model
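
A minimal sketch of the owner/member idea, assuming the links are kept in two Python dictionaries. The function link and the record names are hypothetical. Unlike a tree node, a member here may have several owners.

```python
from collections import defaultdict

# Links are stored in both directions so either side can be looked up.
owner_to_members = defaultdict(set)
member_to_owners = defaultdict(set)


def link(owner, member):
    """Record that `member` belongs to `owner` (a member may have many owners)."""
    owner_to_members[owner].add(member)
    member_to_owners[member].add(owner)


link("Order 1", "Item A")
link("Order 2", "Item A")
link("Warehouse W1", "Item A")
link("Warehouse W1", "Item B")

owners_of_item_a = sorted(member_to_owners["Item A"])
items_in_w1 = sorted(owner_to_members["Warehouse W1"])

print(owners_of_item_a)  # ['Order 1', 'Order 2', 'Warehouse W1']
print(items_in_w1)       # ['Item A', 'Item B']
```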

3. RELATIONAL DATA MODEL (Proposed by Dr. E. F. Codd in 1970)

• In a relational structure, data is organized in two-dimensional tables, called
Relations, each of which is implemented as a File.

In the relational model:
ROW – Tuple, a set of data item values relating to one entity
COLUMN – Attribute, a set of values of one data item

Employee attributes (column title heads):

Employee no   Name      Date of Birth   DESG              DEPT             Salary
1             KIRAN     12/04/91        Finance manager   Finance          50,000
2             MANJULA   11/02/1985      Vice principal    Administration   30,000

Each row of values is a tuple; the values in one column form a domain.

FIGURE: RELATIONAL DATA MODEL

DOMAIN

• A column consisting of a set of values of one data item is called a domain.

• A relation consisting of two domains is called a relation of degree 2 (binary).
• Similarly, degree 3 is called ternary and degree N is called N-ary.
To avoid redundancy, the database is not designed as one big table, generally
called a flat file; rather, it is designed as many related tables.
C.NO     C TITLE                CREDITS   STD NO   T CODE
CS101    MIS                    6         25       07
CS201    SAD                    4         25       15
CS304    Software engineering   4         25       30
CS406    IT                     3         20       06

Table: data table course

a. Data elements for each course offered by a business school.

T-CODE   NAME        DEPT   DESIG       PHONE
07       Goyal       BM     Professor   9052694210
06       Sager       CS     Professor   7012351249
15       Sangeetha   BM     Professor   6214321510
30       Govind      CS     Professor   9440280570

Table: data table teacher

b. Data about the faculty of the school.
• The relational model is FLEXIBLE AND EASY TO USE.
• But in large-scale databases, because of many inter-related tables, the overall
design may get complicated, which may lead to slower searches and thus affect
the access time.
• However, such processing inefficiencies are continually being reduced through
database design and programming.
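
The two related tables above can be combined through their common T CODE column. The sketch below loads them into Python's built-in sqlite3 module and joins them; the SQL column names (c_no, t_code and so on) are assumed spellings of the table headings, and the PHONE column is left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE course ("
    "c_no TEXT, c_title TEXT, credits INTEGER, std_no INTEGER, t_code TEXT)"
)
conn.execute("CREATE TABLE teacher (t_code TEXT, name TEXT, dept TEXT, desig TEXT)")

conn.executemany(
    "INSERT INTO course VALUES (?, ?, ?, ?, ?)",
    [
        ("CS101", "MIS", 6, 25, "07"),
        ("CS201", "SAD", 4, 25, "15"),
        ("CS304", "Software engineering", 4, 25, "30"),
        ("CS406", "IT", 3, 20, "06"),
    ],
)
conn.executemany(
    "INSERT INTO teacher VALUES (?, ?, ?, ?)",
    [
        ("07", "Goyal", "BM", "Professor"),
        ("06", "Sager", "CS", "Professor"),
        ("15", "Sangeetha", "BM", "Professor"),
        ("30", "Govind", "CS", "Professor"),
    ],
)

# Join the two relations on the shared teacher code.
rows = conn.execute(
    "SELECT c.c_title, t.name FROM course AS c "
    "JOIN teacher AS t ON c.t_code = t.t_code ORDER BY c.c_no"
).fetchall()

for title, teacher_name in rows:
    print(title, "is taught by", teacher_name)
```

Teacher details are stored once in the teacher table and reached through the code, rather than being repeated on every course row.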

4. OBJECT-ORIENTED MODEL
❖ The object-oriented model is an approach to data management that stores both data
and the operations that can be performed upon the data as OBJECTS.
❖ While traditional DBMS are designed for HOMOGENEOUS DATA, object-oriented
databases are capable of manipulating HETEROGENEOUS DATA that
include drawings, images, photographs, voice and full-motion video.
❖ An object-oriented database stores the data and procedures as objects that can be
automatically retrieved and shared.
❖ These days, the object-oriented model is gaining popularity and many modern
database systems support this model.
5. MULTI-DIMENSIONAL MODEL
❖ This model is an extension of the relational model.
❖ In this model, data is organized using multi-dimensional structure.
❖ Multi-dimensional structure can be visualized as cubes of data and cubes within
cubes of data.
❖ Different sides of the cube are considered different dimensions of the data.
❖ This model enables a user to selectively extract and view data along one or more
dimensions, such as time, geographic region, product,
organizational department, customer, or other factors.
❖ This model has become the most popular data model for the analytical databases
that support OnLine Analytical Processing (OLAP) applications.
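
A data cube can be sketched as a Python dictionary whose keys are (time, region, product) and whose values are a measure such as sales. All figures and the function names roll_up and slice_cube are assumptions for illustration; real OLAP tools offer far richer operations.

```python
from collections import defaultdict

# Each cell of the cube is addressed by (time, region, product).
DIMENSIONS = ("time", "region", "product")
cube = {
    ("2017", "North", "Tea"): 100,
    ("2017", "South", "Tea"): 80,
    ("2018", "North", "Tea"): 120,
    ("2018", "North", "Rice"): 200,
    ("2018", "South", "Rice"): 150,
}


def roll_up(cells, dimension):
    """Total the measure for each member of one dimension."""
    axis = DIMENSIONS.index(dimension)
    totals = defaultdict(int)
    for key, value in cells.items():
        totals[key[axis]] += value
    return dict(totals)


def slice_cube(cells, dimension, member):
    """Keep only the cells where one dimension has the given member."""
    axis = DIMENSIONS.index(dimension)
    return {key: value for key, value in cells.items() if key[axis] == member}


by_time = roll_up(cube, "time")
by_product = roll_up(cube, "product")
north_only = slice_cube(cube, "region", "North")

print(by_time)     # {'2017': 180, '2018': 470}
print(by_product)  # {'Tea': 300, 'Rice': 350}
```

The same cells answer questions by year, by product or by region, depending on which side of the cube is viewed.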
2.5. DATA WAREHOUSING AND DATA MINING
2.5.1 DATA WAREHOUSE

✓ A data warehouse is a logical collection of information, gathered from many
different databases.
✓ Thus a data warehouse may be described as a large database containing historical
transactions and other data.
✓ For example – take a department store dealing in buying and selling grocery
items.
✓ The data warehouse would deal with granular data, i.e. information in its rawest
form; within the data warehouse, each transaction may be recorded.
✓ The PURPOSE OF A DATA WAREHOUSE is permanent storage of detailed
information.
✓ Data entered into a data warehouse needs to be processed to ensure that it is clean,
complete and in a proper format.
✓ Many times, a data warehouse is subdivided into smaller repositories called ‘Data
Marts.’
✓ A data mart is a subset of a data warehouse, in which only the required portion of
the data warehouse information is kept.
Key    Date      Customer ID   Name      City       Product      Item no   Quantity   Lot no   Price   Source
1001   12.5.18   FA 456        Karthik   Patiala    Sugar        E 019     5          AXY002   35.00   RAJ TRADING
1002   12.5.18   KF 459        Govind    Delhi      Dal          A 001     2          TBA047   98.50   SHIVAM
1003   12.5.18   FL 476        Swathi    Chennai    Dal          C 019     2          AB682    78.00   MITTAL & CO
1004   12.5.18   FA 457        Rani      Ajmer      Atta         B 008     10         SF431    25.00   RAJ TRADING
7340   12.5.18   C 241         Kumar     Banglore   Tea leaves   A 007     0.25       RT6940   290     RAJ TRADING
6234   12.5.18   SF 356        Sandya    Sager      Rice         D 129     10         SC321    500     RAJ PADMA
1007   12.5.18   AZ 123        Koushal   Kolkata    Salt         E 037     1          DO413    20      BALAJI TRADERS
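
One way to picture a data mart is as a filtered view over the warehouse table. The sketch below assumes this using Python's built-in sqlite3 module. The table warehouse_sales, the view dal_mart and the reduced set of columns are hypothetical, and the rows are illustrative values in the style of the grocery table above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE warehouse_sales ("
    "txn_key INTEGER, city TEXT, product TEXT, quantity REAL, price REAL, source TEXT)"
)
conn.executemany(
    "INSERT INTO warehouse_sales VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1001, "Patiala", "Sugar", 5, 35.0, "RAJ TRADING"),
        (1002, "Delhi", "Dal", 2, 98.5, "SHIVAM"),
        (1003, "Chennai", "Dal", 2, 78.0, "MITTAL & CO"),
        (1004, "Ajmer", "Atta", 10, 25.0, "RAJ TRADING"),
    ],
)

# The data mart keeps only the portion of the warehouse needed by one group
# of users, here everything about the product 'Dal'.
conn.execute(
    "CREATE VIEW dal_mart AS "
    "SELECT txn_key, city, quantity, price FROM warehouse_sales "
    "WHERE product = 'Dal'"
)

mart_rows = conn.execute(
    "SELECT txn_key, city FROM dal_mart ORDER BY txn_key"
).fetchall()
print(mart_rows)  # [(1002, 'Delhi'), (1003, 'Chennai')]
```

In practice a data mart is often a separately stored copy rather than a view; the view is used here only to keep the sketch short.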

IMPORTANT CHARACTERISTICS OF DATA WAREHOUSE

Subject-oriented, integrated, historical perspective, non-volatile.

1. SUBJECT-ORIENTED
• It focuses on modeling and analysis of data relating to a specific area.
• The data warehouse is organized around subjects such as product, customer, sales
etc.
2. INTEGRATED
It integrates data from various applications such as ERP systems,
CRM systems etc.
3. HISTORICAL PERSPECTIVE
A data warehouse is time-variant, i.e. it has a historical perspective in its approach,
for example – the past 5-10 years.
4. NON-VOLATILE
It means data is stored permanently, i.e. data once stored is not updated.

Data warehouses are capable of storing vast quantities of data, but implementing
data warehousing applications is a challenge.
For successful implementation, organizations need to be very careful about
data quality.
Missing and miscoded data has to be cleaned up, and variables often come in a
variety of types, such as nominal data with no numeric content, dates, counts,
averages etc.
Thus, organizations must ensure the data quality in a data warehouse.
To make data warehouses useful, organizations must use BI
(business intelligence) tools to process data into meaningful information.
These databases are used for data mining and online analytical processing (OLAP).
The organizations that develop business intelligence (BI) tools create interfaces
that help managers to quickly grasp business situations.
Such an interface is simple to understand, so interpretation by managers
becomes easy.
Example – one such interface is called a dashboard, because it looks similar to a car
dashboard: visual images such as speedometer-like indicators show periodic revenues,
profits, and other financial information; bar charts, line graphs, and other
graphical representations are also used in dashboards.

2.5.2 DATA MINING / KNOWLEDGE DISCOVERY IN DATA (KDD)


Definition

It is defined as a process used to extract usable data from a larger set of raw data.

It is the process of discovering or mining knowledge from a large amount of data.


It attempts to extract hidden patterns and trends from large databases.
It also supports automatic exploration of data.
Data mining queries are more advanced and sophisticated than traditional
queries.

For example – a typical traditional query may be “what is the relationship between the
amount of product A and the amount of product B that an organization sold over the past
week?”.

Whereas in data mining, the manager would be interested to know the products that
would be in demand on the coming weekend, and thus the data mining query
may be “find out the products most likely to have the maximum demand on the coming
weekend.”
The combination of data-warehousing techniques and data mining software makes it
easier to predict future outcomes based on patterns discovered within historical data.

Objectives of Data Mining

1. SEQUENCE / PATH ANALYSIS – finding patterns where one event leads to
another.
2. CLASSIFICATION – finding whether certain facts fall into predefined groups.
3. CLUSTERING – finding groups of related facts not previously known.
4. FORECASTING – discovering patterns in data that can lead to reasonable
predictions.

Sequence of Steps of Data Mining

1. DATA CLEANING – to remove noise and inconsistent data.
2. DATA INTEGRATION – where multiple data sources may be combined.
3. DATA SELECTION – data relevant to the analysis task are retrieved from the
database.
4. DATA TRANSFORMATION – data are transformed into forms appropriate for
mining by performing summary or aggregation operations.
5. DATA MINING – the process where intelligent methods are applied in order to
extract data patterns.
6. PATTERN EVALUATION – to identify the truly interesting patterns
representing knowledge, based on some interestingness measure.
7. KNOWLEDGE PRESENTATION – visualization and knowledge presentation
techniques are used to present the mined knowledge to the user.
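
The seven steps can be sketched as a small pipeline over shopping baskets. Everything here is an assumption for illustration: the two store data sets, the function names, and the choice of "pairs of items bought together" with a support threshold as the interestingness measure. Real data mining software uses far more sophisticated methods.

```python
from collections import Counter
from itertools import combinations

# Two hypothetical sources of raw baskets (None or [] marks missing data).
store_a = [["tea", "sugar"], ["tea", "sugar", "rice"], None]
store_b = [["Tea ", "sugar"], ["rice", "dal"], []]


def clean(baskets):
    """Step 1: drop missing/empty baskets and normalise item names."""
    return [[item.strip().lower() for item in b] for b in baskets if b]


def integrate(*sources):
    """Step 2: combine several cleaned sources into one list."""
    return [b for source in sources for b in source]


def select(baskets, min_items=2):
    """Step 3: keep only baskets relevant to pair analysis."""
    return [b for b in baskets if len(b) >= min_items]


def transform(baskets):
    """Step 4: turn each basket into its sorted item pairs."""
    return [list(combinations(sorted(set(b)), 2)) for b in baskets]


def mine(pair_lists):
    """Step 5: count how often each pair occurs."""
    return Counter(pair for pairs in pair_lists for pair in pairs)


def evaluate(counts, total, min_support=0.5):
    """Step 6: keep pairs whose support meets the threshold."""
    return {p: n / total for p, n in counts.items() if n / total >= min_support}


data = select(integrate(clean(store_a), clean(store_b)))
patterns = evaluate(mine(transform(data)), total=len(data))

# Step 7: present the knowledge to the user.
for pair, support in patterns.items():
    print(pair, f"support={support:.2f}")  # ('sugar', 'tea') support=0.75
```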

Applications of Data Mining

1. Retail or marketing
2. Banking
3. Insurance and health care
4. Transportation
5. Medicine
DIFFERENCE BETWEEN DATA WAREHOUSING AND DATA MINING
DATA WAREHOUSING                              DATA MINING
• Data warehousing is the process of          • Data mining is the process of extracting
  compiling and organizing data into a          meaningful data from that database.
  common database.
• Helps in identifying certain data in a      • Helps in figuring out a certain pattern
  collection of data.                           in the data.
• Data is stored periodically.                • Data is analyzed regularly.
• Stores a huge amount of data.               • Analyses a sample of data.
• Provides a mechanism to store a huge        • Discovers patterns in data for better
  amount of data.                               decision making.
DIFFERENCE BETWEEN DATABASES AND DATA WAREHOUSES
DATA BASE                                     DATA WAREHOUSE
• Collection of files.                        • Collection of databases in a
                                                qualitative way.
• An organized collection of data.            • A central repository of integrated
                                                data from one or more sources.
• Primarily insert/write data.                • Primarily read/retrieve data.
• Current/point-in-time data.                 • Historical data.
• Online transactional processing.            • Online analytical processing.
• Provides a detailed relational view.        • Provides a summarized
                                                multi-dimensional view.
• For many concurrent transactions.           • Not for a large amount of
                                                concurrent transactions.
DIFFERENCE BETWEEN TRADITIONAL APPROACHES/FILE SYSTEM AND
MODERN APPROACHES
TRADITIONAL APPROACHES/FILE SYSTEM            MODERN APPROACHES/DBMS
• Data redundancy/duplication.                • Controlled data redundancy.
• Data dependency.                            • Data independence.
• Program dependency.                         • Program independence.
• No security.                                • Has security.
• No access control.                          • Access control.
• Lack of integration.                        • Integrated system.
• Used in small systems, e.g. file            • Used in large systems, e.g. Oracle.
  handling in C++.
• Relatively cheap.                           • Expensive.
• Very simple structure.                      • Very complex structure.
• Requires very little design.                • Designing is important.
• Not secure.                                 • Secure.
• Used for a single user.                     • Multi-user.
• Isolated data.                              • Shared data.
• Very simple backup mechanism.               • Complex backup mechanism.
