
Decision support systems (DSS) are interactive software-based systems intended to help managers in decision-making by accessing large volumes of information generated from the various related information systems involved in organizational business processes, such as office automation systems, transaction processing systems, etc.

A DSS uses summary information, exceptions, patterns, and trends produced by analytical models. A decision support system helps in decision-making but does not necessarily give a decision itself. The decision makers compile useful information from raw data, documents, personal knowledge, and/or business models to identify and solve problems and make decisions.

Programmed and Non-programmed Decisions


There are two types of decisions - programmed and non-programmed
decisions.

Programmed decisions are basically automated processes or general routine work, where:

 These decisions have been taken several times.

 These decisions follow some guidelines or rules.

For example, selecting a reorder level for inventories is a programmed decision.

Non-programmed decisions occur in unusual and previously unaddressed situations, so:

 It would be a new decision.

 There will not be any rules to follow.

 These decisions are made based on the available information.

 These decisions are based on the manager's discretion, instinct, perception, and judgment.

For example, investing in a new technology is a non-programmed decision.

Decision support systems generally involve non-programmed decisions. Therefore, there will be no exact report, content, or format for these systems. Reports are generated on the fly.
Attributes of a DSS
 Adaptability and flexibility

 High level of Interactivity

 Ease of use

 Efficiency and effectiveness

 Complete control by decision-makers

 Ease of development

 Extendibility

 Support for modeling and analysis

 Support for data access

 Standalone, integrated, and Web-based

Characteristics of a DSS
 Support for decision-makers in semi-structured and unstructured problems.

 Support for managers at various managerial levels, ranging from top executive
to line managers.

 Support for individuals and groups. Less structured problems often require the involvement of several individuals from different departments and organizational levels.

 Support for interdependent or sequential decisions.

 Support for intelligence, design, choice, and implementation.

 Support for a variety of decision processes and styles.

 DSSs are adaptive over time.

Benefits of DSS
 Improves efficiency and speed of decision-making activities.

 Increases the control, competitiveness and capability of futuristic decision-making of the organization.

 Facilitates interpersonal communication.


 Encourages learning or training.

 Since it is mostly used in non-programmed decisions, it reveals new approaches and sets up new evidence for an unusual decision.

 Helps automate managerial processes.

Components of a DSS
Following are the components of the Decision Support System:

 Database Management System (DBMS): To solve a problem, the necessary data may come from internal or external databases. In an organization, internal data are generated by systems such as TPS and MIS. External data come from a variety of sources such as newspapers, online data services, and databases (financial, marketing, human resources).

 Model Management System: It stores and accesses the models that managers use to make decisions. Such models are used for designing a manufacturing facility, analyzing the financial health of an organization, forecasting demand for a product or service, etc.

 Support Tools: Support tools such as online help, pull-down menus, user interfaces, graphical analysis, and error correction mechanisms facilitate user interaction with the system.

Classification of DSS
There are several ways to classify DSS. Holsapple and Whinston classify DSS as follows:

 Text Oriented DSS: It contains textually represented information that could have a bearing on decisions. It allows documents to be electronically created, revised, and viewed as needed.

 Database Oriented DSS: The database plays a major role here; it contains organized and highly structured data.

 Spreadsheet Oriented DSS: It contains information in spreadsheets that allows the user to create, view, and modify procedural knowledge and also instructs the system to execute self-contained instructions. The most popular tools are Excel and Lotus 1-2-3.
 Solver Oriented DSS: It is based on a solver, which is an algorithm or procedure written for performing certain calculations for a particular program type.

 Rules Oriented DSS: It follows certain procedures adopted as rules. Expert systems are an example.

 Compound DSS: It is built by using two or more of the five structures explained above.

Types of DSS
Following are some typical DSSs:

 Status Inquiry System: It helps in taking operational or middle-level management decisions, for example, daily schedules of jobs to machines or machines to operators.

 Data Analysis System: It needs comparative analysis and makes use of a formula or an algorithm, for example, cash flow analysis, inventory analysis, etc.

 Information Analysis System: In this system the data is analyzed and an information report is generated, for example, sales analysis, accounts receivable systems, market analysis, etc.

 Accounting System: It keeps track of accounting and finance related information, for example, final accounts, accounts receivable, accounts payable, etc., which cover the major aspects of the business.

 Model Based System: Simulation models or optimization models used for decision-making are used infrequently and create general guidelines for operation or management.

Expert System
An expert system is a computer program that uses artificial
intelligence (AI) technologies to simulate the judgment and behavior of a
human or an organization that has expert knowledge and experience in a
particular field.

Typically, an expert system incorporates a knowledge base containing accumulated experience and an inference or rules engine -- a set of rules for applying the knowledge base to each particular situation that is described to the program. The system's capabilities can be enhanced with additions to the knowledge base or to the set of rules. Current systems may include machine learning capabilities that allow them to improve their performance based on experience, just as humans do.

The concept of expert systems was first developed in the 1970s by Edward
Feigenbaum, professor and founder of the Knowledge Systems Laboratory at
Stanford University. Feigenbaum explained that the world was moving from
data processing to "knowledge processing," a transition which was being
enabled by new processor technology and computer architectures.

Expert systems have played a large role in many industries including financial services, telecommunications, healthcare, customer service, transportation, video games, manufacturing, aviation and written communication. Two early expert systems broke ground in the healthcare space for medical diagnoses: Dendral, which helped chemists identify organic molecules, and MYCIN, which helped to identify the bacteria causing infections such as bacteremia and meningitis, and to recommend antibiotics and dosages.

Software architecture

An expert system is an example of a knowledge-based system. Expert systems were the first commercial systems to use a knowledge-based architecture. A knowledge-based system is essentially composed of two sub-systems: the knowledge base and the inference engine.[25]
The knowledge base represents facts about the world. In early expert systems
such as Mycin and Dendral, these facts were represented mainly as flat
assertions about variables. In later expert systems developed with commercial
shells, the knowledge base took on more structure and used concepts from
object-oriented programming. The world was represented as classes,
subclasses, and instances and assertions were replaced by values of object
instances. The rules worked by querying and asserting values of the objects.

The inference engine is an automated reasoning system that evaluates the current state of the knowledge base, applies relevant rules, and then asserts new knowledge into the knowledge base. The inference engine may also include abilities for explanation, so that it can explain to a user the chain of reasoning used to arrive at a particular conclusion by tracing back over the firing of rules that resulted in the assertion.

There are mainly two modes for an inference engine: forward chaining and backward chaining. The different approaches are dictated by whether the inference engine is being driven by the antecedent (left hand side) or the consequent (right hand side) of the rule. In forward chaining an antecedent fires and asserts the consequent. For example, consider the following rule:

R1: Man(x) => Mortal(x)

A simple example of forward chaining would be to assert Man(Socrates) to the system and then trigger the inference engine. It would match R1 and assert Mortal(Socrates) into the knowledge base.
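A rough sketch of this forward-chaining step in SQL, assuming two hypothetical fact tables man_facts and mortal_facts (these tables are illustrative and not part of the original example):

CREATE TABLE man_facts    (name VARCHAR(50) PRIMARY KEY);
CREATE TABLE mortal_facts (name VARCHAR(50) PRIMARY KEY);

INSERT INTO man_facts VALUES ('SOCRATES');   -- assert Man(Socrates)

-- Firing rule R1: for every known man not yet known to be mortal,
-- assert Mortal(x) into the knowledge base.
INSERT INTO mortal_facts (name)
SELECT name FROM man_facts
WHERE name NOT IN (SELECT name FROM mortal_facts);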

Backward chaining is a bit less straightforward. In backward chaining the system looks at possible conclusions and works backward to see if they might be true. So if the system was trying to determine if Mortal(Socrates) is true, it would find R1 and query the knowledge base to see if Man(Socrates) is true. One of the early innovations of expert system shells was to integrate inference engines with a user interface. This could be especially powerful with backward chaining. If the system needs to know a particular fact but doesn't, it can simply generate an input screen and ask the user if the information is known. So in this example, it could use R1 to ask the user if Socrates was a man and then use that new information accordingly.
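Backward chaining maps naturally onto a query over the same hypothetical tables: to decide whether Mortal(Socrates) holds, the engine checks the antecedent of R1.

-- Goal: is Mortal(Socrates) true?  Rule R1 reduces this to checking Man(Socrates).
SELECT COUNT(*) AS man_socrates
FROM man_facts
WHERE name = 'SOCRATES';   -- a non-zero count lets the engine conclude Mortal(Socrates)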

The use of rules to explicitly represent knowledge also enabled explanation abilities. In the simple example above, if the system had used R1 to assert that Socrates was mortal and a user wished to understand why Socrates was mortal, they could query the system and the system would look back at the rules which fired to cause the assertion and present those rules to the user as an explanation. In English, if the user asked "Why is Socrates Mortal?" the system would reply "Because all men are mortal and Socrates is a man". A significant area for research was the generation of explanations from the knowledge base in natural English rather than simply by showing the more formal but less intuitive rules.

As expert systems evolved, many new techniques were incorporated into various types of inference engines. Some of the most important of these were:

 Truth maintenance. These systems record the dependencies in a knowledge base so that when facts are altered, dependent knowledge can be altered accordingly. For example, if the system learns that Socrates is no longer known to be a man, it will revoke the assertion that Socrates is mortal.
 Hypothetical reasoning. In this, the knowledge base can be divided up
into many possible views, a.k.a. worlds. This allows the inference
engine to explore multiple possibilities in parallel. For example, the
system may want to explore the consequences of both assertions, what
will be true if Socrates is a Man and what will be true if he is not?
 Fuzzy logic. One of the first extensions of simply using rules to
represent knowledge was also to associate a probability with each rule.
So, not to assert that Socrates is mortal, but to assert Socrates may be
mortal with some probability value. Simple probabilities were extended
in some systems with sophisticated mechanisms for uncertain
reasoning and combination of probabilities.
 Ontology classification. With the addition of object classes to the
knowledge base, a new type of reasoning was possible. Along with
reasoning simply about object values, the system could also reason
about object structures. In this simple example, Man can represent an
object class and R1 can be redefined as a rule that defines the class of
all men. These types of special purpose inference engines are
termed classifiers. Although they were not highly used in expert
systems, classifiers are very powerful for unstructured volatile domains,
and are a key technology for the Internet and the emerging Semantic
Web.

Advantages

The goal of knowledge-based systems is to make the critical information required for the system to work explicit rather than implicit. In a traditional computer program the logic is embedded in code that can typically only be reviewed by an IT specialist. With an expert system the goal was to specify the rules in a format that was intuitive and easily understood, reviewed, and even edited by domain experts rather than IT experts. The benefits of this explicit knowledge representation were rapid development and ease of maintenance.

Ease of maintenance is the most obvious benefit. This was achieved in two
ways. First, by removing the need to write conventional code, many of the
normal problems that can be caused by even small changes to a system
could be avoided with expert systems. Essentially, the logical flow of the
program (at least at the highest level) was simply a given for the system,
simply invoke the inference engine. This also was a reason for the second
benefit: rapid prototyping. With an expert system shell it was possible to enter
a few rules and have a prototype developed in days rather than the months or
year typically associated with complex IT projects.

A claim for expert system shells that was often made was that they removed
the need for trained programmers and that experts could develop systems
themselves. In reality, this was seldom if ever true. While the rules for an
expert system were more comprehensible than typical computer code, they
still had a formal syntax where a misplaced comma or other character could
cause havoc as with any other computer language. Also, as expert systems
moved from prototypes in the lab to deployment in the business world, issues
of integration and maintenance became far more critical. Inevitably demands
to integrate with, and take advantage of, large legacy databases and systems
arose. To accomplish this, integration required the same skills as any other
type of system.

Disadvantages

The most common disadvantage cited for expert systems in the academic
literature is the knowledge acquisition problem. Obtaining the time of domain
experts for any software application is always difficult, but for expert systems it
was especially difficult because the experts were by definition highly valued
and in constant demand by the organization. As a result of this problem, a
great deal of research in the later years of expert systems was focused on
tools for knowledge acquisition, to help automate the process of designing,
debugging, and maintaining rules defined by experts. However, when looking
at the life-cycle of expert systems in actual use, other problems – essentially
the same problems as those of any other large system – seem at least as
critical as knowledge acquisition: integration, access to large databases, and
performance.

Performance was especially problematic because early expert systems were built using tools such as Lisp, which executed interpreted (rather than
compiled) code. Interpreting provided an extremely powerful development
environment but with the drawback that it was virtually impossible to match
the efficiency of the fastest compiled languages, such as C. System and
database integration were difficult for early expert systems because the tools
were mostly in languages and platforms that were neither familiar to nor
welcome in most corporate IT environments – programming languages such
as Lisp and Prolog, and hardware platforms such as Lisp machines and
personal computers. As a result, much effort in the later stages of expert
system tool development was focused on integrating with legacy
environments such as COBOL and large database systems, and on porting to
more standard platforms. These issues were resolved mainly by the client-
server paradigm shift, as PCs were gradually accepted in the IT environment
as a legitimate platform for serious business system development and as
affordable minicomputer servers provided the processing power needed for AI
applications.

Applications

Hayes-Roth divides expert systems applications into 10 categories illustrated in the following table. The example applications were not in the original
Hayes-Roth table, and some of them arose well afterward. Any application
that is not footnoted is described in the Hayes-Roth book. Also, while these
categories provide an intuitive framework to describe the space of expert
systems applications, they are not rigid categories, and in some cases an
application may show traits of more than one category.

Category − Problem addressed − Examples

 Interpretation − Inferring situation descriptions from sensor data − Hearsay (speech recognition), PROSPECTOR

 Prediction − Inferring likely consequences of given situations − Preterm Birth Risk Assessment

 Diagnosis − Inferring system malfunctions from observables − CADUCEUS, MYCIN, PUFF, Mistral, Eydenet, Kaleidos

 Design − Configuring objects under constraints − Dendral, Mortgage Loan Advisor, R1 (DEC VAX Configuration), SID (DEC VAX 9000 CPU)

 Planning − Designing actions − Mission Planning for Autonomous Underwater Vehicle

 Monitoring − Comparing observations to plan vulnerabilities − REACTOR

 Debugging − Providing incremental solutions for complex problems − SAINT, MATHLAB, MACSYMA

 Repair − Executing a plan to administer a prescribed remedy − Toxic Spill Crisis Management

 Instruction − Diagnosing, assessing, and repairing student behavior − SMH.PAL, Intelligent Clinical Training, STEAMER

 Control − Interpreting, predicting, repairing, and monitoring system behaviors − Real Time Process Control, Space Shuttle Mission Control
Components of a Database Management System
Organizations produce and gather data as they operate. Contained in a database, data is typically
organized to model relevant aspects of reality in a way that supports processes requiring this
information. Knowing how this can be managed effectively is vital to any organization.

What is a Database Management System (or DBMS)?

Organizations employ Database Management Systems (or DBMS) to help them effectively manage their
data and derive relevant information out of it. A DBMS is a technology tool that directly supports data
management. It is a package designed to define, manipulate, and manage data in a database.

Some general functions of a DBMS:

 Designed to allow the definition, creation, querying, update, and administration of databases

 Define rules to validate the data and relieve users of framing programs for data maintenance

 Convert an existing database, or archive a large and growing one


 Run business applications, which perform the tasks of managing business processes, interacting
with end-users and other applications, to capture and analyze data

Some well-known DBMSs are Microsoft SQL Server, Microsoft Access, Oracle, SAP, and others.

Components of DBMS

DBMS have several components, each performing very significant tasks in the database management
system environment. Below is a list of components within the database and its environment.

Software
This is the set of programs used to control and manage the overall database. This includes the DBMS
software itself, the Operating System, the network software being used to share the data among users,
and the application programs used to access data in the DBMS.

Hardware
Consisting of a set of physical electronic devices such as computers, I/O devices, and storage devices, the hardware provides the interface between the computers and the real-world systems.

Data
DBMS exists to collect, store, process and access data, the most important component. The database
contains both the actual or operational data and the metadata.

Procedures
These are the instructions and rules that assist on how to use the DBMS, and in designing and running
the database, using documented procedures, to guide the users that operate and manage it.

Database Access Language
This is used to access data in the database, to enter new data, update existing data, or retrieve required data. The user writes a set of appropriate commands in a database access language and submits these to the DBMS, which then processes the data and generates and displays a set of results in a user-readable form.

Query Processor
This transforms the user queries into a series of low level instructions. This reads the online user’s query
and translates it into an efficient series of operations in a form capable of being sent to the run time
data manager for execution.

Run Time Database Manager
Sometimes referred to as the database control system, this is the central software component of the DBMS that interfaces with user-submitted application programs and queries, and handles database access at run time. Its function is to convert the operations in users' queries. It provides control to maintain the consistency, integrity and security of the data.

Data Manager
Also called the cache manager, this is responsible for handling data in the database, providing recovery capabilities that allow the system to recover the data after a failure.

Database Engine
The core service for storing, processing, and securing data, this provides controlled access and rapid
transaction processing to address the requirements of the most demanding data consuming
applications. It is often used to create relational databases for online transaction processing or online
analytical processing data.

Data Dictionary
This is a reserved space within a database used to store information about the database itself. A data
dictionary is a set of read-only table and views, containing the different information about the data used
in the enterprise to ensure that database representation of the data follow one standard as defined in
the dictionary.
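In Oracle, for instance, the data dictionary is exposed as read-only views that can be queried like ordinary tables; a minimal illustration:

-- List the tables owned by the current user via the data dictionary view USER_TABLES.
SELECT table_name
FROM user_tables
ORDER BY table_name;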

Report Writer
Also referred to as the report generator, it is a program that extracts information from one or more files
and presents the information in a specified format. Most report writers allow the user to select records
that meet certain conditions and to display selected fields in rows and columns, or also format the data
into different charts.

Great Performance through Effective DBMS

A company's performance is greatly affected by how it manages its data, and one of the most basic tasks of data management is the effective management of its database. Understanding the different components of the DBMS, how they work, and how they relate to each other is the first step to employing an effective DBMS.

Constraints on Database Tables

Constraints are the rules enforced on the data columns of a table. These are used to limit the type of data that can go into a table. This ensures the accuracy and reliability of the data in the database.

Constraints can be either at the column level or the table level. Column level constraints are applied only to one column, whereas table level constraints are applied to the whole table.

Following are some of the most commonly used constraints available in SQL. These constraints have already been discussed in the SQL - RDBMS Concepts chapter, but it is worth revising them at this point.

 NOT NULL Constraint − Ensures that a column cannot have NULL value.

 DEFAULT Constraint − Provides a default value for a column when none is specified.

 UNIQUE Constraint − Ensures that all values in a column are different.

 PRIMARY Key − Uniquely identifies each row/record in a database table.


 FOREIGN Key − References a row/record that is uniquely identified in another database table.

 CHECK Constraint − The CHECK constraint ensures that all the values in a column satisfy certain conditions.

 INDEX − Used to create and retrieve data from the database very quickly.

Constraints can be specified when a table is created with the CREATE TABLE
statement or you can use the ALTER TABLE statement to create constraints
even after the table is created.
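As a rough sketch (the table and column names are illustrative, not taken from the text above), several of these constraints might be combined in one CREATE TABLE statement:

CREATE TABLE EMPLOYEES (
    EMPNO    INT           NOT NULL,                       -- NOT NULL constraint
    NAME     VARCHAR(50)   NOT NULL,
    EMAIL    VARCHAR(100)  UNIQUE,                         -- UNIQUE constraint
    SALARY   DECIMAL(10,2) DEFAULT 0 CHECK (SALARY >= 0),  -- DEFAULT and CHECK constraints
    DEPT_NO  INT,
    CONSTRAINT EMPLOYEES_PK PRIMARY KEY (EMPNO),           -- PRIMARY KEY constraint
    CONSTRAINT EMP_DEPT_FK  FOREIGN KEY (DEPT_NO)
        REFERENCES DEPARTMENT (DEPT_NO)                    -- FOREIGN KEY (assumes a DEPARTMENT table)
);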

Dropping Constraints
Any constraint that you have defined can be dropped using the ALTER
TABLE command with the DROP CONSTRAINT option.

For example, to drop the primary key constraint in the EMPLOYEES table,
you can use the following command.

ALTER TABLE EMPLOYEES DROP CONSTRAINT EMPLOYEES_PK;

Some implementations may provide shortcuts for dropping certain constraints. For example, to drop the primary key constraint for a table in Oracle, you can use the following command.

ALTER TABLE EMPLOYEES DROP PRIMARY KEY;

Some implementations allow you to disable constraints. Instead of permanently dropping a constraint from the database, you may want to temporarily disable the constraint and then enable it later.
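In Oracle, for example, a constraint can be disabled by name and re-enabled later (using the EMPLOYEES_PK constraint from the sketch above):

ALTER TABLE EMPLOYEES DISABLE CONSTRAINT EMPLOYEES_PK;
-- ... load or correct data while the constraint is not enforced ...
ALTER TABLE EMPLOYEES ENABLE CONSTRAINT EMPLOYEES_PK;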

Integrity Constraints
Integrity constraints are used to ensure accuracy and consistency of the
data in a relational database. Data integrity is handled in a relational
database through the concept of referential integrity.

There are many types of integrity constraints that play a role in Referential Integrity (RI). These constraints include Primary Key, Foreign Key, Unique Constraints and the other constraints mentioned above.
Data Mart

Data Mart and Types of Data Marts

A data mart can be defined as the subset of an organization's data warehouse that is limited to a specific business unit or group of users. It is a subject-oriented database and is also known as a High Performance Query Structure (HPQS).

Data marts are of two types – dependent and independent.

Dependent Data Mart – This data mart depends on the enterprise data warehouse and works in a top-down manner.

Independent Data Mart – This data mart does not depend on the enterprise data warehouse and works in a bottom-up manner.
Hybrid Data Marts

A hybrid data mart allows you to combine input from sources other than a data warehouse. This could be useful in many situations, especially when you need ad hoc integration, such as after a new group or product is added to the organization.

Benefits of Data Marts

 Allows the data to be accessed in less time

 Cost-efficient alternative to the bulky data warehouse

 Easy to use, as it is designed according to the needs of a specific user group

 Speeds up business processes

OLTP vs. OLAP


We can divide IT systems into transactional (OLTP) and analytical (OLAP). In general, we can assume that OLTP systems provide source data to data warehouses, whereas OLAP systems help to analyze it.

- OLTP (On-line Transaction Processing) is characterized by a large number of short on-line transactions (INSERT, UPDATE, DELETE). The main emphasis for OLTP systems is very fast query processing, maintaining data integrity in multi-access environments, and effectiveness measured by the number of transactions per second. An OLTP database contains detailed and current data, and the schema used to store transactional data is the entity model (usually 3NF).

- OLAP (On-line Analytical Processing) is characterized by a relatively low volume of transactions. Queries are often very complex and involve aggregations. For OLAP systems, response time is an effectiveness measure. OLAP applications are widely used in data mining. An OLAP database contains aggregated, historical data stored in multi-dimensional schemas (usually a star schema).

The following table summarizes the major differences between OLTP and OLAP system
design.

OLTP System (Online Transaction Processing – Operational System) vs. OLAP System (Online Analytical Processing – Data Warehouse):

 Source of data − OLTP: operational data; OLTPs are the original source of the data. OLAP: consolidation data; OLAP data comes from the various OLTP databases.

 Purpose of data − OLTP: to control and run fundamental business tasks. OLAP: to help with planning, problem solving, and decision support.

 What the data reveals − OLTP: a snapshot of ongoing business processes. OLAP: multi-dimensional views of various kinds of business activities.

 Inserts and updates − OLTP: short and fast inserts and updates initiated by end users. OLAP: periodic long-running batch jobs refresh the data.

 Queries − OLTP: relatively standardized and simple queries returning relatively few records. OLAP: often complex queries involving aggregations.

 Processing speed − OLTP: typically very fast. OLAP: depends on the amount of data involved; batch data refreshes and complex queries may take many hours; query speed can be improved by creating indexes.

 Space requirements − OLTP: can be relatively small if historical data is archived. OLAP: larger due to the existence of aggregation structures and history data; requires more indexes than OLTP.

 Database design − OLTP: highly normalized with many tables. OLAP: typically de-normalized with fewer tables; use of star and/or snowflake schemas.

 Backup and recovery − OLTP: backup religiously; operational data is critical to run the business, and data loss is likely to entail significant monetary loss and legal liability. OLAP: instead of regular backups, some environments may consider simply reloading the OLTP data as a recovery method.
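To make the contrast concrete, a hedged sketch of a typical query from each side, using purely illustrative table names:

-- OLTP: a short transaction touching one record.
SELECT order_id, customer_id, amount
FROM orders
WHERE order_id = 1001;

-- OLAP: an aggregation over historical data.
SELECT region, product, SUM(amount) AS total_sales
FROM sales_history
GROUP BY region, product;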

Data Warehousing
The term "Data Warehouse" was first coined by Bill Inmon in 1990.
According to Inmon, a data warehouse is a subject oriented, integrated,
time-variant, and non-volatile collection of data. This data helps analysts to
take informed decisions in an organization.

An operational database undergoes frequent changes on a daily basis on account of the transactions that take place. Suppose a business executive wants to analyze previous feedback on any data such as a product, a supplier, or any consumer data; then the executive will have no data available to analyze, because the previous data has been updated due to transactions.

A data warehouse provides generalized and consolidated data in a multidimensional view. Along with this generalized and consolidated view of data, a data warehouse also provides Online Analytical Processing (OLAP) tools. These tools help us in the interactive and effective analysis of data in a multidimensional space. This analysis results in data generalization and data mining.

Data mining functions such as association, clustering, classification, and prediction can be integrated with OLAP operations to enhance the interactive mining of knowledge at multiple levels of abstraction. That is why the data warehouse has now become an important platform for data analysis and online analytical processing.

Understanding a Data Warehouse


 A data warehouse is a database, which is kept separate from the organization's
operational database.

 There is no frequent updating done in a data warehouse.

 It possesses consolidated historical data, which helps the organization to analyze its business.

 A data warehouse helps executives to organize, understand, and use their data
to take strategic decisions.

 Data warehouse systems help in the integration of a diversity of application systems.

 A data warehouse system helps in consolidated historical data analysis.

Why a Data Warehouse is Separated from Operational Databases

A data warehouse is kept separate from operational databases due to the following reasons −

 An operational database is constructed for well-known tasks and workloads such as searching particular records, indexing, etc. In contrast, data warehouse queries are often complex and they present a general form of data.

 Operational databases support concurrent processing of multiple transactions. Concurrency control and recovery mechanisms are required for operational databases to ensure robustness and consistency of the database.
 An operational database query allows read and modify operations, while an OLAP query needs only read-only access to stored data.

 An operational database maintains current data. On the other hand, a data warehouse maintains historical data.

Data Warehouse Features


The key features of a data warehouse are discussed below −

 Subject Oriented − A data warehouse is subject oriented because it provides information around a subject rather than the organization's ongoing operations. These subjects can be products, customers, suppliers, sales, revenue, etc. A data warehouse does not focus on ongoing operations; rather, it focuses on modelling and analysis of data for decision making.

 Integrated − A data warehouse is constructed by integrating data from heterogeneous sources such as relational databases, flat files, etc. This integration enhances the effective analysis of data.

 Time Variant − The data collected in a data warehouse is identified with a particular time period. The data in a data warehouse provides information from a historical point of view.

 Non-volatile − Non-volatile means the previous data is not erased when new data is added to it. A data warehouse is kept separate from the operational database, and therefore frequent changes in the operational database are not reflected in the data warehouse.

Note − A data warehouse does not require transaction processing, recovery, and concurrency controls, because it is physically stored separately from the operational database.

Data Warehouse Applications

As discussed before, a data warehouse helps business executives to organize, analyze, and use their data for decision making. A data warehouse serves as a sole part of a plan-execute-assess "closed-loop" feedback system for enterprise management. Data warehouses are widely used in the following fields −
 Financial services

 Banking services

 Consumer goods

 Retail sectors

 Controlled manufacturing

Types of Data Warehouse

Information processing, analytical processing, and data mining are the three types of data warehouse applications that are discussed below −

 Information Processing − A data warehouse allows us to process the data stored in it. The data can be processed by means of querying, basic statistical analysis, and reporting using crosstabs, tables, charts, or graphs.

 Analytical Processing − A data warehouse supports analytical processing of the information stored in it. The data can be analyzed by means of basic OLAP operations, including slice-and-dice, drill down, drill up, and pivoting.

 Data Mining − Data mining supports knowledge discovery by finding hidden patterns and associations, constructing analytical models, and performing classification and prediction. These mining results can be presented using visualization tools.

Next, we will discuss the business analysis framework for data warehouse design and the architecture of a data warehouse.

Business Analysis Framework

Business analysts get information from the data warehouse to measure performance and make critical adjustments in order to win over other business stakeholders in the market. Having a data warehouse offers the following advantages −

 Since a data warehouse can gather information quickly and efficiently, it can
enhance business productivity.

 A data warehouse provides us a consistent view of customers and items; hence, it helps us manage customer relationships.
 A data warehouse also helps in bringing down the costs by tracking trends,
patterns over a long period in a consistent and reliable manner.

To design an effective and efficient data warehouse, we need to understand and analyze the business needs and construct a business analysis framework. Each person has different views regarding the design of a data warehouse. These views are as follows −

 The top-down view − This view allows the selection of relevant information
needed for a data warehouse.

 The data source view − This view presents the information being captured,
stored, and managed by the operational system.

 The data warehouse view − This view includes the fact tables and dimension
tables. It represents the information stored inside the data warehouse.

 The business query view − It is the view of the data from the viewpoint of
the end-user.

Three-Tier Data Warehouse Architecture

Generally, a data warehouse adopts a three-tier architecture. Following are the three tiers of the data warehouse architecture.

 Bottom Tier − The bottom tier of the architecture is the data warehouse
database server. It is the relational database system. We use the back end tools
and utilities to feed data into the bottom tier. These back end tools and utilities
perform the Extract, Clean, Load, and refresh functions.

 Middle Tier − In the middle tier, we have the OLAP Server that can be
implemented in either of the following ways.

o By Relational OLAP (ROLAP), which is an extended relational database management system. The ROLAP maps the operations on multidimensional data to standard relational operations.

o By Multidimensional OLAP (MOLAP) model, which directly implements the multidimensional data and operations.

 Top-Tier − This tier is the front-end client layer. This layer holds the query
tools and reporting tools, analysis tools and data mining tools.
The following diagram depicts the three-tier architecture of data warehouse
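Since the ROLAP middle tier maps multidimensional operations onto standard relational ones, here is a hedged sketch of the kind of star schema it might sit on top of (all names are illustrative):

-- Dimension tables describe the axes of analysis.
CREATE TABLE dim_date    (date_key INT PRIMARY KEY, calendar_date DATE, month_name VARCHAR(20), year_no INT);
CREATE TABLE dim_product (product_key INT PRIMARY KEY, product_name VARCHAR(50), category VARCHAR(30));

-- The fact table holds the measures, keyed by the dimensions.
CREATE TABLE fact_sales (
    date_key    INT REFERENCES dim_date (date_key),
    product_key INT REFERENCES dim_product (product_key),
    amount      DECIMAL(12,2)
);

-- A multidimensional "slice" expressed as a standard relational aggregation.
SELECT d.year_no, p.category, SUM(f.amount) AS total_sales
FROM fact_sales f
JOIN dim_date d    ON f.date_key = d.date_key
JOIN dim_product p ON f.product_key = p.product_key
GROUP BY d.year_no, p.category;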

Data Warehouse Models

From the perspective of data warehouse architecture, we have the following data warehouse models −

 Virtual Warehouse

 Data mart

 Enterprise Warehouse

Virtual Warehouse
The view over an operational data warehouse is known as a virtual
warehouse. It is easy to build a virtual warehouse. Building a virtual
warehouse requires excess capacity on operational database servers.
Data Mart
Data mart contains a subset of organization-wide data. This subset of data
is valuable to specific groups of an organization.

In other words, we can claim that data marts contain data specific to a
particular group. For example, the marketing data mart may contain data
related to items, customers, and sales. Data marts are confined to subjects.

Points to remember about data marts −

 Windows-based or Unix/Linux-based servers are used to implement data marts. They are implemented on low-cost servers.

 The implementation cycle of a data mart is measured in short periods of time, i.e., in weeks rather than months or years.

 The life cycle of a data mart may be complex in the long run, if its planning and design are not organization-wide.

 Data marts are small in size.

 Data marts are customized by department.

 The source of a data mart is a departmentally structured data warehouse.

 Data marts are flexible.

Enterprise Warehouse
 An enterprise warehouse collects all the information and the subjects spanning an entire organization.

 It provides us enterprise-wide data integration.

 The data is integrated from operational systems and external information providers.

 This information can vary from a few gigabytes to hundreds of gigabytes, terabytes or beyond.

Load Manager
This component performs the operations required for the extract and load process. The size and complexity of the load manager varies between specific solutions from one data warehouse to another.

Load Manager Architecture

The load manager performs the following functions −

 Extract the data from the source system.

 Fast-load the extracted data into a temporary data store.

 Perform simple transformations into a structure similar to the one in the data warehouse.

Extract Data from Source

The data is extracted from the operational databases or the external information providers. Gateways are the application programs that are used to extract data. A gateway is supported by the underlying DBMS and allows a client program to generate SQL to be executed at a server. Open Database Connectivity (ODBC) and Java Database Connectivity (JDBC) are examples of gateways.

Fast Load
 In order to minimize the total load window, the data needs to be loaded into the warehouse in the fastest possible time.

 The transformations affect the speed of data processing.

 It is more effective to load the data into a relational database prior to applying transformations and checks.

 Gateway technology proves not to be suitable, since gateways tend not to be performant when large data volumes are involved.

Simple Transformations
While loading, it may be required to perform simple transformations. After this has been completed, we are in a position to do the complex checks. Suppose we are loading EPOS sales transactions; we need to perform the following checks (a sketch follows the list):

 Strip out all the columns that are not required within the warehouse.

 Convert all the values to required data types.
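A rough illustration of such a load-time transformation, assuming a hypothetical raw staging table epos_raw and a target table sales_load (none of these names come from the text above):

-- Keep only the columns required in the warehouse and convert values to the required data types.
INSERT INTO sales_load (store_id, sale_date, amount)
SELECT store_id,
       CAST(sale_timestamp AS DATE),           -- convert to the warehouse date type
       CAST(amount_text AS DECIMAL(10,2))      -- convert textual amounts to numeric
FROM epos_raw;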

Warehouse Manager
A warehouse manager is responsible for the warehouse management
process. It consists of third-party system software, C programs, and shell
scripts.

The size and complexity of warehouse managers varies between specific solutions.

Warehouse Manager Architecture


A warehouse manager includes the following −

 The controlling process

 Stored procedures or C with SQL

 Backup/Recovery tool

 SQL Scripts
Operations Performed by Warehouse Manager
 A warehouse manager analyzes the data to perform consistency and referential
integrity checks.

 Creates indexes, business views, partition views against the base data.

 Generates new aggregations and updates existing aggregations. Generates normalizations.

 Transforms and merges the source data into the published data warehouse.

 Backs up the data in the data warehouse.

 Archives the data that has reached the end of its captured life.

Note − A warehouse manager also analyzes query profiles to determine whether indexes and aggregations are appropriate.

Query Manager
 Query manager is responsible for directing the queries to the suitable tables.

 By directing the queries to appropriate tables, the speed of querying and response generation can be increased.

 Query manager is responsible for scheduling the execution of the queries posed
by the user.
Query Manager Architecture
The architecture of a query manager includes the following:

 Query redirection via C tool or RDBMS

 Stored procedures

 Query management tool

 Query scheduling via C tool or RDBMS

 Query scheduling via third-party software

Detailed Information
Detailed information is not kept online, rather it is aggregated to the next
level of detail and then archived to tape. The detailed information part of
data warehouse keeps the detailed information in the starflake schema.
Detailed information is loaded into the data warehouse to supplement the
aggregated data.

The following diagram shows a pictorial impression of where detailed information is stored and how it is used.
Note − If detailed information is held offline to minimize disk storage, we
should make sure that the data has been extracted, cleaned up, and
transformed into starflake schema before it is archived.

Summary Information
Summary information is a part of the data warehouse that stores predefined aggregations. These aggregations are generated by the warehouse manager. Summary information must be treated as transient; it changes on the go in order to respond to changing query profiles. A sketch of one such aggregation follows the list below.

The points to note about summary information are as follows −

 Summary information speeds up the performance of common queries.

 It increases the operational cost.

 It needs to be updated whenever new data is loaded into the data warehouse.

 It may not have been backed up, since it can be generated fresh from the
detailed information.
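As a hedged sketch, the warehouse manager might materialize one predefined aggregation as a summary table built from the detailed data, reusing the illustrative star schema sketched earlier (names remain hypothetical):

-- Pre-aggregate detailed sales to the month/product level so common queries avoid scanning the detail.
CREATE TABLE monthly_sales_summary AS
SELECT f.product_key,
       d.year_no,
       d.month_name,
       SUM(f.amount) AS total_amount,
       COUNT(*)      AS sales_count
FROM fact_sales f
JOIN dim_date d ON f.date_key = d.date_key
GROUP BY f.product_key, d.year_no, d.month_name;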
Data Manipulation Language
Data manipulation is the process of changing data in an effort to make it easier to read or be
more organized. For example, a log of data could be organized in alphabetical order, making
individual entries easier to locate.

When we have to insert records into a table, get specific records from the table, change some records, delete some records, or perform any other actions on records in the database, we need some means to perform them. DML handles such user requests. It helps to insert, delete, update, and retrieve the data from the database. Let us see some of them.

Querying Tables

Inserting Data

Updating Data

Deleting Data

Committing/Rollbacking Changes
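Querying is listed first but not elaborated further in this document; as a minimal example, assuming the employee table used in the sections below:

SQL> SELECT empno, name FROM employee WHERE dept_no = 4;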

Inserting Data
Data is inserted with the syntax: INSERT INTO <table> [(<columns>, ...)] VALUES (value [, ...]);

Only one row is inserted into a table at a time using this syntax.

For example,

SQL> INSERT INTO employee (empno, name, dept_no)
     VALUES (1, 'TOM LEE', 4);

Assume the employee table has only three fields: empno, name and dept_no. Then the above statement can be rewritten in a simplified format (by omitting the column list) as:

SQL> INSERT INTO employee VALUES (1, 'TOM LEE', 4);


Let's assume the fourth column of the employee table is a NULL-able field called comment; then the SQL statement can be rewritten to explicitly specify the NULL value.

SQL> INSERT INTO employee VALUES (1, 'TOM LEE', 4, NULL);

Let's assume the fifth column of the employee table is a DATE field called hiredate,
then the SQL statement can be rewritten to include the SYSDATE function call.

SQL> INSERT INTO employee VALUES (1, 'TOM LEE', 4, NULL, SYSDATE);

To interactively prompt users to enter values for the fields at SQL*Plus terminal, use
the following syntax:

SQL> INSERT INTO employee VALUES (&employee_id, &name, &salary, &comment, SYSDATE);

The SQL*Plus terminal will then prompt users to enter values for the variables employee_id, name, salary, and comment.

To insert rows by copying from another table, use a subquery.

For example,

SQL> INSERT INTO managers (empno, name, salary)
     SELECT empno, name, salary FROM employee WHERE jobtitle = 'MANAGER';

Note that inserting a large amount of data stored in external files into a table requires a different approach (such as a bulk-loading utility), which is not covered here.

Updating Data
To change the data, use the syntax

UPDATE <table name>
SET <column name> = value [, <column name> = value ...]
WHERE <condition>;

The WHERE condition specifies which rows are to be updated. Without a WHERE condition, all table rows will be updated.

For example,

SQL> UPDATE employee
     SET sal = sal + 1000, comment = 'GOOD PERFORMANCE'
     WHERE empno = 53;

Updating from a subquery is possible. For example,

UPDATE employee
SET (jobtitle, sal) = (SELECT jobtitle, sal FROM employee WHERE empno = 87)
WHERE empno = 54;

Deleting Data
To delete a row from a table, use the syntax
DELETE FROM <table name> WHERE <condition>;

The FROM keyword is often omitted for simplicity.

For example,

DELETE FROM employee WHERE empno=8999;

It is illegal to delete a row whose primary key value is referenced as a foreign key in another table. Oracle will throw an execution error in this case.

For example, an attempt to delete a row in the department table with dept_no = 10 is not allowed if the employee table has a foreign key referential constraint on its dept_no column referencing the department table, and some employees in the employee table have dept_no = 10.
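A hedged sketch of that situation (the constraint name and table definitions are illustrative, not from the text above):

ALTER TABLE employee
  ADD CONSTRAINT emp_dept_fk FOREIGN KEY (dept_no) REFERENCES department (dept_no);

DELETE FROM department WHERE dept_no = 10;
-- Fails with a referential-integrity error (in Oracle, ORA-02292: child record found)
-- as long as any employee row still has dept_no = 10.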
Committing/Rollbacking Changes
Database transactions end with the following events:

 COMMIT or ROLLBACK is issued.

 A DDL command is executed and automatically committed.

 The user exits from the server connection.

 The system crashes.

With COMMIT/ROLLBACK, the server is able to achieve data consistency and allow users to preview the data changes before making the changes permanent.

The COMMIT statement ends the current transaction and makes permanent any changes made during that transaction. Until you commit the changes, other users cannot access the changed data; they see the data as it was before you made the changes. An automatic COMMIT is performed when a DDL statement is issued or on a normal exit from SQL*Plus without explicitly issuing COMMIT or ROLLBACK.

Consider a simple transaction that transfers money from one bank account to another.
The transaction requires two updates because it debits the first account, then credits
the second. In the example below, after crediting the second account, you issue a
commit, which makes the changes permanent. Only then do other users see the
changes.
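A minimal sketch of that transfer, assuming a hypothetical accounts table with columns acct_no and balance:

SQL> UPDATE accounts SET balance = balance - 500 WHERE acct_no = 101;  -- debit the first account

SQL> UPDATE accounts SET balance = balance + 500 WHERE acct_no = 202;  -- credit the second account

SQL> COMMIT;   -- only now do other users see both changes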

The ROLLBACK statement ends the current transaction and undoes any changes made during that transaction. Rolling back is useful for two reasons. First, if you make a mistake like deleting the wrong row from a table, a rollback restores the original data. Second, if you start a transaction that you cannot finish because an exception is raised or a SQL statement fails, a rollback lets you return to the starting point to take corrective action and perhaps try again. An automatic ROLLBACK is performed on abnormal termination of SQL*Plus or on system failure.

SAVEPOINT names and marks the current point in the processing of a transaction. Used
with the ROLLBACK TO statement, savepoints let you undo parts of a transaction instead
of the whole transaction. In the example below, you mark a savepoint before doing an
insert. If the INSERT statement tries to store a duplicate value in the empno column, the
predefined exception DUP_VAL_ON_INDEX is raised. In that case, you roll back to the
savepoint, undoing just the insert.
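A hedged PL/SQL sketch of that pattern (the savepoint name and column values are illustrative):

BEGIN
  SAVEPOINT before_insert;
  INSERT INTO employee (empno, name, dept_no) VALUES (1, 'TOM LEE', 4);
EXCEPTION
  WHEN DUP_VAL_ON_INDEX THEN
    ROLLBACK TO before_insert;   -- undo just the insert, not the whole transaction
END;
/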

The following commands are the sample use of COMMIT, SAVEPOINT, and
ROLLBACK commands.

SQL> UPDATE employee
     SET sal = sal + 1000, comment = 'GOOD PERFORMANCE'
     WHERE empno = 53;

1 row updated.

SQL> COMMIT;

Commit complete.

SQL> DELETE FROM employee;

49 rows deleted.

SQL> ROLLBACK;

Rollback complete.

SQL> UPDATE ...

SQL> SAVEPOINT UPDATE_pt1;

Savepoint created.

SQL> UPDATE ...

SQL> SAVEPOINT UPDATE_pt2;

Savepoint created.

SQL> ROLLBACK TO UPDATE_pt1;

Rollback complete.
