DBMS and Data Warehouse
A database is a collection of interrelated data that can be retrieved, inserted, and
deleted efficiently. It also organizes the data in the form of tables, schemas, views,
reports, etc.
For example, a college database organizes data about the admin, staff, students,
and faculty.
Using the database, you can easily retrieve, insert, and delete information.
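The three basic operations can be sketched with Python's built-in sqlite3 module. This is only an illustration: the students table, its columns, and the sample rows are assumptions made up for the college example, not part of any real schema.

```python
import sqlite3

# In-memory database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")

# Insert a few rows.
conn.executemany(
    "INSERT INTO students (id, name, dept) VALUES (?, ?, ?)",
    [(1, "Asha", "Physics"), (2, "Ravi", "Maths"), (3, "Meena", "Physics")],
)

# Retrieve the students of one department.
physics = conn.execute(
    "SELECT name FROM students WHERE dept = ? ORDER BY id", ("Physics",)
).fetchall()
print(physics)  # [('Asha',), ('Meena',)]

# Delete one row and count what is left.
conn.execute("DELETE FROM students WHERE id = ?", (2,))
remaining = conn.execute("SELECT COUNT(*) FROM students").fetchone()[0]
print(remaining)  # 2
```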
Advantages of DBMS
o Controls database redundancy: It can control data redundancy because all the data is
stored in one single database file.
o Data sharing: In a DBMS, the authorized users of an organization can share the data
among multiple users.
o Easy maintenance: It is easy to maintain due to the centralized nature of the
database system.
o Reduces time: It reduces development time and maintenance needs.
o Backup: It provides backup and recovery subsystems that automatically back up data
to guard against hardware and software failures and restore the data if required.
o Multiple user interfaces: It provides different types of user interfaces, such as
graphical user interfaces and application program interfaces.
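As a small illustration of the backup point above, the sketch below uses the backup method of Python's sqlite3 connections (available since Python 3.7). The staff table and its single row are assumptions for the example, and a production DBMS would normally schedule such backups automatically rather than rely on a manual call.

```python
import sqlite3

# Source database with one hypothetical table.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT)")
source.execute("INSERT INTO staff (name) VALUES ('Kiran')")
source.commit()

# Copy the whole database into a second connection that acts as the backup.
backup = sqlite3.connect(":memory:")
source.backup(backup)

# Simulate losing the original data, then read it back from the backup.
source.execute("DELETE FROM staff")
source.commit()
restored = backup.execute("SELECT name FROM staff").fetchall()
print(restored)  # [('Kiran',)]
```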
Disadvantages of DBMS
o Cost of hardware and software: It requires a high-speed processor and a large
memory size to run DBMS software.
o Size: It occupies a large amount of disk space and memory to run efficiently.
o Complexity: A database system creates additional complexity and requirements.
o Higher impact of failure: A failure has a high impact because, in most organizations,
all the data is stored in a single database; if that database is damaged by a power
failure or database corruption, the data may be lost forever.
MySQL
PostgreSQL
Microsoft Access
SQL Server
Oracle
The following are the differences between DBMS and file systems:
Basis | DBMS | File system
Sharing of data | Due to the centralized approach, data sharing is easy. | Data is distributed in many files, and it may be of different formats, so it isn't easy to share data.
Data Abstraction | DBMS gives an abstract view of data that hides the details. | The file system exposes the details of data representation and storage.
Security and Protection | DBMS provides a good protection mechanism. | It isn't easy to protect a file under the file system.
Recovery Mechanism | DBMS provides a crash recovery mechanism, i.e., DBMS protects the user from system failure. | The file system doesn't have a crash recovery mechanism, i.e., if the system crashes while entering some data, the content of the file will be lost.
Manipulation Techniques | DBMS contains a wide variety of sophisticated techniques to store and retrieve the data. | The file system can't efficiently store and retrieve the data.
Where to use | The database approach is used in large systems which interrelate many files. | The file system approach is used in small systems with few interrelated files.
Data Redundancy and Inconsistency | Due to the centralization of the database, the problems of data redundancy and inconsistency are controlled. | The files and application programs are created by different programmers, so there is a lot of duplication of data, which may lead to inconsistency.
Structure | The database structure is complex to design. | The file system approach has a simple structure.
Data Independence | Data independence exists, and it can be of two types: logical data independence and physical data independence. | In the file system approach, there is no data independence.
Data Models | In the database approach, 3 types of data models exist: hierarchical, network, and relational data models. | In the file system approach, there is no concept of data models.
Flexibility | Changes to the content of the data stored in any system are often a necessity, and these changes are made more easily with a database approach. | The flexibility of the system is less as compared to the DBMS approach.
1. Internal Level
o The internal level has an internal schema, which describes the physical storage
structure of the database.
o The internal schema is also known as the physical schema.
o It uses the physical data model. It defines how the data will be stored in
blocks.
o The physical level describes complex low-level data structures in detail.
2. Conceptual Level
3. External Level
o At the external level, a database contains several schemas, sometimes called
subschemas. A subschema describes a different view of the database.
o An external schema is also known as a view schema.
o Each view schema describes the part of the database that a particular user group is
interested in and hides the remaining database from that user group.
o The view schema describes the end user's interaction with the database system.
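One common way to realize an external (view) schema is an SQL view. The sketch below uses Python's sqlite3 module to illustrate the idea; the employees table, the staff_directory view, and the sample rows are assumptions for the example. The view exposes names and departments to one user group and hides the salary column from it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual-level table (hypothetical) holding every attribute.
conn.execute(
    "CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employees (name, dept, salary) VALUES (?, ?, ?)",
    [("Asha", "HR", 52000.0), ("Ravi", "IT", 61000.0)],
)

# External schema for a directory application: salary is left out of the view.
conn.execute("CREATE VIEW staff_directory AS SELECT name, dept FROM employees")

cursor = conn.execute("SELECT * FROM staff_directory ORDER BY name")
visible_columns = [col[0] for col in cursor.description]
rows = cursor.fetchall()
print(visible_columns)  # ['name', 'dept']
print(rows)             # [('Asha', 'HR'), ('Ravi', 'IT')]
```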
1. Subject-oriented
A specific business purpose can be analyzed with the data collected
from here.
2. Integrated
3. Time-variant
4. Non-volatile
If any modifications are made, they will affect the reports
and analysis.
Data Ownership
We know that data warehouses are software applications for
service. The main concern is the security of the data.
You have to be sure that the people who handle and
analyze the customer data are employees that your company
trusts.
Data Rigidity
The data imported into the data warehouse often consists of static
data sets with little flexibility, which limits their ability to
generate a particular solution.
But organizations usually do not correctly estimate the time required
for the ETL process. As a result, it leads to a backlog of work in the
organization.
Discuss ETL process:
First, we clean the data extracted from each source. Cleaning may
involve correcting misspellings, providing default values for
missing data elements, or eliminating duplicates when we bring in
the same data from various source systems.
Standardization of data components forms a large part of data
transformation. Data transformation also involves many forms of
combining pieces of data from different sources: we combine
data from a single source record or related data parts from many
source records.
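The cleaning and transformation steps described above can be sketched in plain Python. Everything in this example is hypothetical: the two source lists, the field names, the CITY_FIXES spelling table, and the "unknown" default are assumptions chosen only to show misspelling correction, default values, standardization, and duplicate elimination in one pass.

```python
# Hypothetical records extracted from two source systems.
crm_rows = [
    {"customer_id": 1, "name": "asha rao ", "city": "Bangalor", "phone": None},
    {"customer_id": 2, "name": "Ravi Kumar", "city": "Pune", "phone": "020-555"},
]
billing_rows = [
    {"customer_id": 2, "name": "RAVI KUMAR", "city": "pune", "phone": "020-555"},
    {"customer_id": 3, "name": "Meena Iyer", "city": "Chennai", "phone": None},
]

CITY_FIXES = {"bangalor": "Bangalore"}  # known misspellings (assumed)
DEFAULT_PHONE = "unknown"               # default for missing values (assumed)


def clean(row):
    """Correct misspellings, standardize casing, and fill in defaults."""
    city = row["city"].strip().lower()
    return {
        "customer_id": row["customer_id"],
        "name": row["name"].strip().title(),
        "city": CITY_FIXES.get(city, city.title()),
        "phone": row["phone"] or DEFAULT_PHONE,
    }


def transform(*sources):
    """Combine the sources, keeping the first record seen per customer_id."""
    merged = {}
    for source in sources:
        for row in source:
            cleaned = clean(row)
            merged.setdefault(cleaned["customer_id"], cleaned)
    return [merged[key] for key in sorted(merged)]


warehouse_rows = transform(crm_rows, billing_rows)
for row in warehouse_rows:
    print(row)
```

Keeping the first record per key is only one possible duplicate policy; a real ETL job might instead prefer the most recent record or merge fields from both sources.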
Other than these two categories, one more type exists that is
called "Hybrid Data Marts."
There are OLAP servers available for nearly all the major database
systems.
It helps to get all the data together to create accurate and quick
information about the business.
It helps to analyze the time series.
Provides a platform for all types of business, including planning,
budgeting, forecasting, financial reporting, data warehouse
reporting.
Allows users to do compatible calculations.
Allows users to slice and dice cube data by several
dimensions, measures, and filters.
It helps the end-users to analyze data in multiple dimensions so
that they can make better decisions in business.
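To make the cube operations above concrete, here is a minimal sketch in plain Python. The fact rows, the three dimensions, and the helper names roll_up, slice_cube, and dice are all assumptions for illustration; a real OLAP server precomputes and indexes such aggregates rather than scanning a list.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, region, product, sales).
facts = [
    (2023, "North", "Laptop", 120),
    (2023, "South", "Laptop", 80),
    (2023, "North", "Phone", 200),
    (2024, "North", "Laptop", 150),
    (2024, "South", "Phone", 90),
]
DIMENSIONS = ("year", "region", "product")


def roll_up(rows, by):
    """Sum the sales measure grouped by the named dimensions."""
    idx = [DIMENSIONS.index(d) for d in by]
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[i] for i in idx)] += row[3]
    return dict(totals)


def slice_cube(rows, dimension, value):
    """Slice: fix a single dimension to one value."""
    i = DIMENSIONS.index(dimension)
    return [row for row in rows if row[i] == value]


def dice(rows, **allowed):
    """Dice: keep rows whose dimensions fall inside the allowed value sets."""
    return [
        row
        for row in rows
        if all(row[DIMENSIONS.index(d)] in values for d, values in allowed.items())
    ]


sales_by_year = roll_up(facts, by=("year",))  # a simple time series
north_only = slice_cube(facts, "region", "North")
north_laptops = dice(facts, region={"North"}, product={"Laptop"})

print(sales_by_year)                          # {(2023,): 400, (2024,): 240}
print(roll_up(north_laptops, by=("year",)))   # {(2023,): 120, (2024,): 150}
```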
Disadvantages of Online Analytical Processing
Increasing of dimension
What is OLTP?
OLTP means online transaction processing. It is an operational
system used for handling recent operational data.
OLAP is the system used for data analysis, whereas OLTP is the
system used for data transactions.
OLAP is identified by a large amount of data, whereas OLTP is
identified by a large number of small transactions.
OLAP is large in size, typically ranging from 1 TB to 100 PB, whereas
OLTP is small in size, ranging from 1 MB to 10 GB.
OLAP operates with a data warehouse, whereas OLTP operates with a
traditional database management system.
Processing is slower in OLAP, whereas OLTP has a faster processing
speed.
OLAP response time is longer, usually seconds to minutes, whereas
OLTP responds quickly, in milliseconds.
OLTP needs both read and write operations, whereas OLAP mostly
needs read operations.
The objective of OLAP is to make decisions with the help of large
data sources. On the other hand, the objective of OLTP is to support
day-to-day operations.
Queries are complex in OLAP and simple in OLTP.
The number of users is low in OLAP. Its database allows only hundreds
of users, whereas the OLTP database allows thousands of users.
OLAP helps to improve the productivity of business analysts, whereas
OLTP helps to improve the productivity and self-service of users.
OLAP is created for business analysis, whereas OLTP is created for
real-time business operations.
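The difference in query style can be sketched with Python's sqlite3 module. This is a simplified illustration: the orders table and its rows are assumptions, and a single SQLite database stands in for both systems, whereas in practice the analytical query would usually run against a separate data warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, region TEXT, amount REAL)"
)

insert_sql = "INSERT INTO orders (customer, region, amount) VALUES (?, ?, ?)"

# OLTP-style work: short transactions that each read or write a few rows.
with conn:  # commits on success, rolls back on error
    conn.execute(insert_sql, ("Asha", "North", 250.0))
    conn.execute(insert_sql, ("Ravi", "South", 100.0))
with conn:
    conn.execute(insert_sql, ("Meena", "North", 50.0))
with conn:
    conn.execute("UPDATE orders SET amount = ? WHERE id = ?", (120.0, 2))

# OLAP-style work: a read-only query that scans and aggregates many rows.
summary = conn.execute(
    "SELECT region, COUNT(*), SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()
print(summary)  # [('North', 2, 300.0), ('South', 1, 120.0)]
```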