Reviewer Infoshit
Week 1 - Orientation
Week 2 – Intro to DB Systems
Data – a group of raw facts that represent the qualitative or quantitative attributes of a variable or set of variables.
Information – refined or processed data that has been transformed into a meaningful and useful form.
Databases & Data Modeling
Database – a collection of information that is organized so that it can be easily managed and updated.
Database Management System (DBMS) – a collection of interrelated data and a set of programs to access the data.
❖ Facilitates the process of defining, constructing, manipulating, and sharing databases among various users.
Examples of DBMS

Evolution of DB
1960 – File Processing Systems (batch processing, punch cards, magnetic tape)
File System Data Processing – updating, sorting, or validating a data file.
File – a chunk of data. File processing is anything you do to that data.
Network Database – this model solves the problem of data redundancy by representing relationships in terms of sets rather than a hierarchy.
Relational Database – uses relations, or two-dimensional tables, to store information.
Object Relational Database – a DBMS similar to a relational DB but with an object-oriented database model.

Data Modeling – the process of creating a simplified diagram of a software system and the data elements it contains.
❖ Provides a blueprint for designing a new database.
❖ Helps an organization use its data effectively to meet business needs for information.
❖ The data model lives on to become the documentation and justification for why the database exists and how it flows.
Why is Data Modeling done?
Why is Data Modeling important?
✔ Eliminates redundancy.
✔ Reduces storage requirements.
✔ Enables efficient retrieval.

Phases of Data Modeling
Conceptual Model – the typical starting point for data modeling, identifying the various data sets and data flow.
✔ Highly abstract.
✔ Easy to understand.
✔ Easy to enhance.
✔ Only entities are visible.
✔ No software tool is required.
Logical Model – describes the data flow and database content. It adds detail to the overall structure in the conceptual model.
Components:
a. Entities – each entity represents a set of things, persons, or concepts relevant to a business.
b. Relationships – every relationship represents an association between two entities.
c. Attributes – each attribute is a characteristic or any other piece of information that is useful to describe an entity.
✔ Presence of attributes for each entity.
✔ Key Attributes/Non-Key Attributes
✔ Primary Key – Foreign Key Relationships
✔ User-friendly attribute names
✔ More detailed than the Conceptual model.
✔ Database agnostic
✔ A bit difficult to enhance.
Physical Model – describes the specifics of how the logical model will be realized. It is specific to a designated database software system.
✔ Entity names are now table names.
✔ Attributes are now column names.
✔ Database compatible names.
✔ Data type for each column is specified.
✔ Difficult for users to understand.
Graph Modeling – often used to describe data sets that contain complex relationships.

Week 3 – Business Rules & Data Abstraction
Business Rule – a set of approved guidelines or frameworks within an organization. It is also a statement that imposes some form of constraint on a specific aspect of the database.
These rules will influence a wide variety of database issues:
✔ Data you collect and store.
✔ The way you define and establish relationships.
✔ Types of information that the database can provide.
✔ Security and confidentiality of the data itself.
Each organization has its own data and information requirements, and each has its own unique way of conducting its business; therefore, every organization needs its own specific set of business rules.
Types of Business Rules:
Database-oriented – impose constraints that you can establish within the logical design of the database. You implement a given constraint by modifying various field specification elements, relationship characteristics, or a combination of the two.
(For example, making fields such as address, name, or any other information required. This can only be configured in the database itself.)
Application-oriented – impose constraints that you cannot establish within the logical design of the database. They need to be established within the physical design of the database, where they will be more meaningful.
(These are the rules that show up automatically in the forms. For example, if user access is limited because of location issues, the system itself displays a warning or notice to the user.)
Data Abstraction – refers to the process of hiding irrelevant details from the user. There are three levels of data abstraction to achieve data independence.
Data Independence – users and data should not directly interact with each other. The user should be at a different level and the data should be present at some other level.
View Level – this level tells the application how the data should be shown to the user.
Conceptual Level/Logical Level – this level describes what data is stored in the database and how it is structured.
Physical Level or Internal Schema – this level describes how and where the data is physically stored.

Week 4 – Relational Database Model
Relational Model – represents the database as a collection of relations.
❖ The table name and column names are helpful to interpret the meaning of values in each row.
❖ Data are stored as tables.
Relational Model Concepts:
Tuple – a single row of a table, which contains a single record.
Relation Schema – represents the name of the relation with its attributes.
Degree – the total number of attributes in a relation.
Cardinality – the total number of rows present in the table.
Column – represents the set of values for a specific attribute.
Relation Instance – a finite set of tuples in the RDBMS. A relation instance never has duplicate tuples.
Relation Key – every row has one or more attributes, called the relation key, that identify it.
Attribute Domain – every attribute has a pre-defined set and scope of allowed values, which is known as the attribute domain.
Relational Integrity Constraints – conditions that must hold for a relation to be valid. They are derived from the rules in the mini-world that the database represents.
Types of Integrity Constraints:
Domain Constraints – violated if an attribute value does not appear in the corresponding domain. The value of each attribute must come from its domain and be of the appropriate data type.
Key Constraints – an attribute that can uniquely identify a tuple in a relation is called the key of the table. The value of this attribute must be unique across the tuples in the relation.
Referential Integrity Constraints – based on the concept of foreign keys. A foreign key is an attribute of a relation that refers to a key attribute of a different or the same relation. The referenced key value must exist in the referenced table.
Operations in Relational Model
Insert – used to insert data into the relation.
Delete – used to delete tuples from the table.
Modify – allows you to change the values of some attributes in existing tuples.
Select – allows you to choose a specific range of data.
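The four operations and the key and referential integrity constraints above can be tried out in a small sketch. It uses Python's built-in sqlite3 module; the `student` and `enrollment` tables, their columns, and the sample rows are made up for illustration and are not part of the lesson.

```python
import sqlite3

# Hypothetical schema for illustration: student (parent) and enrollment (child).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when this is on
conn.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE enrollment ("
    " student_id INTEGER NOT NULL REFERENCES student(student_id),"
    " course TEXT NOT NULL,"
    " PRIMARY KEY (student_id, course))"
)

# Insert: add tuples to the relations.
conn.executemany("INSERT INTO student VALUES (?, ?)", [(1, "Ana"), (2, "Ben"), (3, "Cara")])
conn.execute("INSERT INTO enrollment VALUES (1, 'DB Systems')")

# Modify: change an attribute value in an existing tuple.
conn.execute("UPDATE student SET name = 'Benjamin' WHERE student_id = 2")

# Delete: remove a tuple.
conn.execute("DELETE FROM student WHERE student_id = 3")

# Select: read back the remaining tuples.
rows = conn.execute("SELECT student_id, name FROM student ORDER BY student_id").fetchall()


def violates(sql):
    """Return True if the statement is rejected by an integrity constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# Key constraint: student_id 1 already exists.
key_violation = violates("INSERT INTO student VALUES (1, 'Duplicate')")
# Referential integrity: student 99 does not exist in the student table.
fk_violation = violates("INSERT INTO enrollment VALUES (99, 'DB Systems')")

print(rows, key_violation, fk_violation)
```

Both rejected inserts leave the tables unchanged, which is the point of the constraints: invalid tuples never enter the relation.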
Best Practices for Creating a Relational Model:
❖ Data are represented as a set of relations.
✔ Data need to be represented as a collection of relations.
✔ Each relation should be depicted clearly in the table.
✔ Rows should contain data about instances of an entity.
✔ Columns must contain data about attributes of the entity.
✔ Cells of the table should hold a single value.
✔ Each column should be given a unique name.
✔ No two rows should be identical.
✔ The values of an attribute should be from the same domain.
Advantages:
✔ Simplicity – simpler than the hierarchical and network models.
✔ Structural Independence – only concerned with data and not with a structure.
✔ Easy to use – quite natural and simple to understand.
✔ Query capability – makes it possible for a high-level query language like SQL to avoid complex database navigation.
✔ Data independence – the database structure can be changed without having to change any application.
✔ Scalable – the database can be enlarged to enhance usability.
Disadvantages:
▪ A few relational databases have limits on field length.
▪ Relational databases can become complex as the amount of data grows.
▪ Complex relational database systems may lead to isolated databases where info cannot be shared from one system to another.

Week 5 – Relational Database cont.
Key Constraints – There must be at least one minimal subset of attributes in the relation that can identify a tuple uniquely. If there is more than one, these are called candidate keys.
▪ No two tuples can have identical values for key attributes.
▪ A key attribute cannot have null values.
Data Redundancy – having multiple copies of the same data in the database.
Insertion Anomaly – This problem happens when the insertion of a data record is not possible without adding some additional unrelated data to the record.
Deletion Anomaly – This anomaly happens when deletion of a data record results in losing some unrelated information that was stored as part of the record that was deleted from a table.
Update Anomaly – This anomaly happens when the same data is stored in several places, so a change has to be made manually in every copy. This is time-consuming, and missing one copy leaves the data inconsistent.

Database Relationships
One-to-One Relationship (1:1) – each record in table A relates to only one record in table B, and vice versa.
One-to-Many Relationship (1:N) – each record in table A can relate to zero, one, or many records in table B, but each record in table B relates to only one record in table A.
Many-to-Many Relationship (M:N) – each record in table A can relate to many records in table B, and vice versa. For example, each project can involve more than one employee and each employee can be working on more than one project.

Codd’s Relational Database Rules
Dr. Edgar F. Codd – 12 rules a database must obey in order to be regarded as a true relational database.
Rule 1: Information Rule – all information in the database must be represented as values in tables. The ordering of the rows and columns (headers) should be flexible and should not have an impact on the meaning of the data.
Rule 2: Guaranteed Access Rule – Every single data element is guaranteed to be accessible logically with a combination of table-name, primary-key, and attribute-name. Pointers are not allowed.
Rule 3: Systematic Treatment of NULL values – NULL values in a database must be given a systematic treatment because a NULL can be interpreted as missing data, unknown data, or data that is not applicable.
Rule 4: Active Online Catalog – the structure description of the entire database must be stored in an online catalog, known as the data dictionary.
METADATA - data about the data in the database, also stored as rows and columns. The collection of these is stored in the data dictionary or system catalog.
SYSTEM CATALOG - accessed by the DBMS to perform various transactions. The data dictionary has the user-accessible views that are accessed by developers, designers, and users.
Rule 5: Comprehensive Data Sub-Language Rule - A database can only be accessed using a language having linear syntax that supports data definition, data manipulation, and transaction management operations.
Rule 6: View Updating Rule - All the views of a database that can theoretically be updated must also be updatable by the system.
TYPES OF VIEWS:
1. USER View - lists only those tables and views which are created by the current user/schema. It does not list the tables and views of other schemas nor the ones to which it has access.
2. ALL View - lists all the tables and views that are owned by the current user as well as those tables and views to which it has access.
3. DBA View - has access to all the tables and views of all users/schemas. These views are accessible only to those who have DBA privileges.
Rule 7: High-Level Insert, Update, and Delete Rule - a database must support high-level insertion, update, and deletion. This must not be limited to a single row; that is, it must also support union, intersection, and minus operations to yield sets of data records.
Relation Set Operators in DBMS
1. UNION - combines two different results obtained by a query into a single result in the form of a table. The two results must have a similar structure (the same number of columns with compatible data types) for union to be applied to them. If duplicate values are required in the resultant data, then UNION ALL is used.
2. INTERSECTION - gives the common data values between the two data sets that are intersected. Removes all duplicates before displaying the result.
3. SET DIFFERENCE - takes the two sets and returns the values that are in the first set but not the second set.
Rule 8: Physical Data Independence - the data stored in a database must be independent of the applications that access the database. If any change in the physical structure of the database occurs, it must not have any impact on how data is being accessed by external applications.
Rule 9: Logical Data Independence - the logical data in a database must be independent of its users' views (applications). This is one of the most difficult rules to apply. For example, if two tables are merged or one is split into two different tables, there should be no impact or change on the user application.
Rule 10: Integrity Independence - A database must be independent of the application that uses it. All its integrity constraints can be independently modified without the need of any change in the application.
Rule 11: Distribution Independence - The foundation of distributed database systems. The end-user must not be able to see that data is distributed over various locations. The user should always get the impression that the data is located at one site only.
Distributed Database System - A database that is not limited to one system; it is spread over different sites, like multiple computers or a network of computers. It is located on various sites that don’t share physical components.
Rule 12: Non-Subversion Rule - If a system has an interface that provides access to low-level records, then the interface must not be able to subvert the system and bypass security and integrity constraints.
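Returning to the relation set operators listed under Rule 7 (UNION, INTERSECTION, SET DIFFERENCE), the sketch below runs each of them on two tiny tables using Python's built-in sqlite3 module. The tables `set_a` and `set_b` and their values are assumptions made for this example. Note that SQLite spells the operators INTERSECT and EXCEPT; some other systems call the difference operator MINUS.

```python
import sqlite3

# Two hypothetical single-column tables with overlapping values.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE set_a (val INTEGER)")
conn.execute("CREATE TABLE set_b (val INTEGER)")
conn.executemany("INSERT INTO set_a VALUES (?)", [(1,), (2,), (2,), (3,)])
conn.executemany("INSERT INTO set_b VALUES (?)", [(2,), (3,), (4,)])


def run(operator):
    """Combine both tables with the given set operator and return the sorted values."""
    sql = f"SELECT val FROM set_a {operator} SELECT val FROM set_b ORDER BY val"
    return [row[0] for row in conn.execute(sql)]


union = run("UNION")              # duplicates removed
union_all = run("UNION ALL")      # duplicates kept
intersection = run("INTERSECT")   # values present in both tables
difference = run("EXCEPT")        # values in set_a that are not in set_b

print(union)         # [1, 2, 3, 4]
print(union_all)     # [1, 2, 2, 2, 3, 3, 4]
print(intersection)  # [2, 3]
print(difference)    # [1]
```

Both SELECT statements return one integer column, which is what makes the two results compatible for these operators.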
WEEK 6 - ENTITY RELATIONSHIP DIAGRAM
Entity Relationship Diagram (ERD) - also known as the entity relationship model, a graphical representation that depicts relationships among people, objects, places, etc., within an IT system. It also uses data modeling that can help define business processes and serve as the foundation for a relational database.
IMPORTANCE OF ERDS
★ Provide a visual starting point for database design that can also be used to help determine information system requirements in the organization.
★ After a relational database is rolled out, an ERD can still serve as a reference point for any debugging or re-engineering needed for later purposes.
3. Multivalued Attribute - an attribute that has multiple values for a single entity at a time.
4. Composite Attribute - an attribute that is made up of two or more other attributes.
5. Derived Attribute - as the name suggests, an attribute whose value can be calculated from another attribute.
6. Relationship - represented by a diamond-shaped box. All the entities participating in a relationship are connected to it by a line.
PARTICIPATION CONSTRAINTS
Category or Union - Relationship of one superclass or subclass with more than one superclass.

WEEK 10 - Database Normalization
Database Normalization - It is the process of organizing the data in a database. It helps in removing duplicate values in the database. Normalization divides the larger table into smaller tables and links them using relationships.
Functional Dependency
- An attribute is dependent on another attribute if the latter uniquely identifies it.
- It is denoted by A -> B, which means A determines B and B depends upon A.
- For example, Student name can be determined by
Data Anomalies
A data anomaly is an unexpected side effect of trying to insert, update, or delete a row. Essentially, more data must be provided to accomplish an operation than expected.
- Deletion anomaly occurs when the deletion of some data deletes other required data as well, resulting in unintended data loss. For example, what happens if we try to delete the item with item code I1106?
- The details of the retail outlet R1002 will also be deleted from the database, causing unintended and potentially critical data loss.
Update Anomaly
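As a rough illustration of the update and deletion anomalies, the sketch below keeps item details and outlet details together in one unnormalized table and then shows both side effects. It uses Python's built-in sqlite3 module. Only the codes I1106 and R1002 come from the notes; the table name `item_outlet`, its columns, and every other value are assumptions made for this example.

```python
import sqlite3

# Hypothetical unnormalized table: each row repeats the outlet details.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE item_outlet ("
    " item_code TEXT PRIMARY KEY,"
    " item_name TEXT,"
    " outlet_id TEXT,"
    " outlet_city TEXT)"
)
conn.executemany(
    "INSERT INTO item_outlet VALUES (?, ?, ?, ?)",
    [
        ("I1105", "Pen", "R1001", "Manila"),
        ("I1106", "Notebook", "R1002", "Cebu"),
        ("I1107", "Ruler", "R1001", "Manila"),
    ],
)

# Update anomaly: outlet R1001 is stored in two rows, but only one row is changed,
# so the same outlet now appears with two different cities.
conn.execute("UPDATE item_outlet SET outlet_city = 'Davao' WHERE item_code = 'I1105'")
cities_r1001 = sorted(
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT outlet_city FROM item_outlet WHERE outlet_id = 'R1001'"
    )
)

# Deletion anomaly: I1106 is the only row that mentions outlet R1002,
# so deleting the item also removes every trace of the outlet.
conn.execute("DELETE FROM item_outlet WHERE item_code = 'I1106'")
r1002_rows_left = conn.execute(
    "SELECT COUNT(*) FROM item_outlet WHERE outlet_id = 'R1002'"
).fetchone()[0]

print(cities_r1001)     # ['Davao', 'Manila']
print(r1002_rows_left)  # 0
```

Splitting the table into a separate item table and outlet table, linked by the outlet ID, is the kind of change normalization makes: each outlet's details would then be stored once, so neither side effect could occur.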
Parser
The parser starts by tokenizing, or replacing, some
of the words in the SQL statement with special
symbols. It then checks the statement for the
following:
Correctness – The parser verifies that the SQL
statement conforms to SQL semantics, or rules, that
ensure the correctness of the query statement. For
example, the parser checks if the SQL command
ends with a semi-colon. If the semi-colon is missing,
the parser returns an error.
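The correctness check can be observed with Python's built-in sqlite3 module, used here only as a convenient stand-in; other database systems may check different things. The `item` table is an assumption made for this example. `sqlite3.complete_statement` only reports whether a string ends in a complete, semicolon-terminated statement, while a misspelled keyword is rejected when the statement is actually executed.

```python
import sqlite3

# Hypothetical table so the valid query has something to read from.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (item_code TEXT PRIMARY KEY, price REAL)")

# Semicolon check: no parsing is done beyond looking for a complete statement.
has_semicolon = sqlite3.complete_statement("SELECT item_code FROM item;")
missing_semicolon = sqlite3.complete_statement("SELECT item_code FROM item")


def is_accepted(sql):
    """Return True if SQLite accepts the statement, False if it rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.OperationalError:
        return False


valid = is_accepted("SELECT item_code FROM item;")
invalid = is_accepted("SELEC item_code FROM item;")  # misspelled keyword

print(has_semicolon, missing_semicolon, valid, invalid)
```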
Authorization – The parser also validates that the
user running the query has the necessary
authorization to manipulate the respective data. For
example, only admin users might have the right to
delete data.
Relational Engine
The relational engine, or query processor, creates a
plan for retrieving, writing, or updating the
corresponding data in the most effective manner.
For example, it checks for similar queries, reuses
previous data manipulation methods, or creates a
new one.
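One way to look at the plan a query processor picks is SQLite's `EXPLAIN QUERY PLAN` statement, shown below through Python's built-in sqlite3 module. This is a sketch for SQLite only; the `item` table and the two queries are assumptions made for this example, and it does not show how a system reuses earlier plans.

```python
import sqlite3

# Hypothetical table: the primary key gives SQLite an index on item_code.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (item_code TEXT PRIMARY KEY, price REAL)")


def plan(sql):
    """Return the human-readable detail text of each step in SQLite's plan."""
    # The detail string is the last column of every EXPLAIN QUERY PLAN row.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# Filtering on the primary key lets SQLite search its index.
by_key = plan("SELECT price FROM item WHERE item_code = 'I1106'")
# Filtering on an unindexed column forces a scan of the whole table.
by_price = plan("SELECT item_code FROM item WHERE price > 10")

print(by_key)
print(by_price)
```

The exact wording of the plan text differs between SQLite versions, but the first query is reported as a SEARCH using an index and the second as a SCAN of the table.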