DBMS M3
Additional features allow users to specify more complex retrievals from the database. SQL uses NULL
to represent a missing value, which usually has one of three interpretations:
1. Unknown value. A person’s date of birth is not known, so it is represented by NULL in the
database.
2. Unavailable or withheld value. A person has a home phone but does not want it to be
listed, so it is withheld and represented as NULL in the database.
3. Not applicable attribute. An attribute CollegeDegree would be NULL for a person who has no
college degrees because it does not apply to that person.
Each individual NULL value is considered to be different from every other NULL value in the various
database records. When a NULL is involved in a comparison operation, the result is considered to
be UNKNOWN (it may be TRUE or it may be FALSE). Hence, SQL uses a three-valued logic with
values TRUE, FALSE, and UNKNOWN instead of the standard two-valued (Boolean) logic with
values TRUE or FALSE. It is therefore necessary to define the results (or truth values) of
three-valued logical expressions when the logical connectives AND, OR, and NOT are used.
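The truth values of the three connectives can be sketched in a few lines of Python, using None as a stand-in for UNKNOWN. The function names and_3vl, or_3vl, and not_3vl are made up for this illustration; the rules themselves are the usual SQL ones (FALSE dominates AND, TRUE dominates OR, NOT UNKNOWN stays UNKNOWN).

```python
# Three-valued logic sketch: True, False, and None (standing in for UNKNOWN).

def and_3vl(a, b):
    # FALSE AND anything is FALSE, even if the other operand is UNKNOWN.
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def or_3vl(a, b):
    # TRUE OR anything is TRUE, even if the other operand is UNKNOWN.
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def not_3vl(a):
    # NOT UNKNOWN is still UNKNOWN.
    return None if a is None else (not a)


values = [True, False, None]
and_table = {(a, b): and_3vl(a, b) for a in values for b in values}
or_table = {(a, b): or_3vl(a, b) for a in values for b in values}
```

Looking up and_table[(True, None)] gives None (UNKNOWN), while and_table[(False, None)] gives False, which is why a WHERE clause can sometimes be decided even when one operand is NULL.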
In select-project-join queries, the general rule is that only those combinations of tuples that evaluate
the logical expression in the WHERE clause of the query to TRUE are selected. Tuple combinations
that evaluate to FALSE or UNKNOWN are not selected.
SQL allows queries that check whether an attribute value is NULL using the comparison operators
IS or IS NOT.
Example: Retrieve the names of all employees who do not have supervisors.
SELECT Fname, Lname
FROM EMPLOYEE
WHERE Super_ssn IS NULL;
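A minimal sketch of why IS NULL is needed, run against an in-memory SQLite database through Python's sqlite3 module. The two sample rows are made up for illustration. A comparison with = NULL evaluates to UNKNOWN for every row, so it selects nothing, while IS NULL finds the employee without a supervisor.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE (Fname TEXT, Lname TEXT, Ssn TEXT, Super_ssn TEXT)"
)
# Illustrative rows: the first employee has no supervisor (NULL).
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?)",
    [
        ("James", "Borg", "888665555", None),
        ("Franklin", "Wong", "333445555", "888665555"),
    ],
)

# '= NULL' yields UNKNOWN for every row, so no row is selected.
eq_rows = conn.execute(
    "SELECT Fname, Lname FROM EMPLOYEE WHERE Super_ssn = NULL"
).fetchall()

# 'IS NULL' is the test that actually checks for a missing value.
is_rows = conn.execute(
    "SELECT Fname, Lname FROM EMPLOYEE WHERE Super_ssn IS NULL"
).fetchall()
```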
1.1.2 Nested Queries, Tuples, and Set/Multiset Comparisons
Some queries require that existing values in the database be fetched and then used in a
comparison condition. Such queries can be conveniently formulated by using nested queries,
which are complete select-from-where blocks within the WHERE clause of another query. That
other query is called the outer query.
Example 1: List the project numbers of projects that have an employee with last name ‘Smith’ as
manager.
SELECT DISTINCT Pnumber FROM PROJECT WHERE
Pnumber IN
(SELECT Pnumber FROM PROJECT, DEPARTMENT, EMPLOYEE
WHERE Dnum=Dnumber AND Mgr_ssn=Ssn AND Lname='Smith');
Example 2: List the project numbers of projects that have an employee with last name ‘Smith’ as
either manager or worker.
SELECT DISTINCT Pnumber FROM PROJECT WHERE
Pnumber IN
(SELECT Pnumber FROM PROJECT, DEPARTMENT, EMPLOYEE
WHERE Dnum=Dnumber AND Mgr_ssn=Ssn AND Lname='Smith')
OR
Pnumber IN
(SELECT Pno FROM WORKS_ON, EMPLOYEE WHERE Essn=Ssn AND
Lname='Smith');
We make use of comparison operator IN, which compares a value v with a set (or multiset) of
values V and evaluates to TRUE if v is one of the elements in V.
SQL allows the use of tuples of values in comparisons by placing them within parentheses. For
example, a query can select the Essns of all employees who work the same (project, hours)
combination on some project that employee ‘John Smith’ (whose Ssn = ‘123456789’) works on, by
testing whether (Pno, Hours) is IN the result of a nested query over that employee's WORKS_ON tuples.
In this example, the IN operator compares the subtuple of values in parentheses (Pno,Hours) within
each tuple in WORKS_ON with the set of type-compatible tuples produced by the nested query.
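The tuple comparison described above can be tried out with SQLite, which accepts a parenthesized pair on the left of IN. This is a sketch under stated assumptions: the WORKS_ON rows are invented sample data, and the query is one plausible way to write the (Pno, Hours) comparison the text describes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE WORKS_ON (Essn TEXT, Pno INTEGER, Hours REAL)")
# Illustrative data: only 666884444 shares a (Pno, Hours) pair with 123456789.
conn.executemany(
    "INSERT INTO WORKS_ON VALUES (?, ?, ?)",
    [
        ("123456789", 1, 32.5),
        ("123456789", 2, 7.5),
        ("666884444", 2, 7.5),
        ("453453453", 1, 20.0),
    ],
)

# The pair (Pno, Hours) of each outer tuple is compared with the set of
# pairs produced by the nested query.
rows = conn.execute(
    """
    SELECT DISTINCT Essn
    FROM WORKS_ON
    WHERE (Pno, Hours) IN (SELECT Pno, Hours
                           FROM WORKS_ON
                           WHERE Essn = '123456789')
    ORDER BY Essn
    """
).fetchall()
matching_essns = [r[0] for r in rows]
```

The employee 123456789 appears in the result as well, since that employee's own tuples trivially match the nested query.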
In general, EXISTS(Q) returns TRUE if there is at least one tuple in the result of the nested query Q,
and it returns FALSE otherwise.
For each EMPLOYEE tuple, the correlated nested query selects all DEPENDENT tuples whose
Essn value matches the EMPLOYEE Ssn; if the result is empty, no dependents are related to the
employee, so we select that EMPLOYEE tuple and retrieve its Fname and Lname.
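A small sketch of the correlated NOT EXISTS query described above, again using SQLite from Python. The table contents are made-up sample rows, and the DEPENDENT table is reduced to two columns for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE EMPLOYEE (Fname TEXT, Lname TEXT, Ssn TEXT PRIMARY KEY);
    CREATE TABLE DEPENDENT (Essn TEXT, Dependent_name TEXT);
    INSERT INTO EMPLOYEE VALUES ('John', 'Smith', '123456789');
    INSERT INTO EMPLOYEE VALUES ('Joyce', 'English', '453453453');
    INSERT INTO DEPENDENT VALUES ('123456789', 'Alice');
    """
)

# For each EMPLOYEE tuple the nested query is re-evaluated with that
# tuple's Ssn; the employee is kept only if the nested result is empty.
no_dependents = conn.execute(
    """
    SELECT Fname, Lname
    FROM EMPLOYEE
    WHERE NOT EXISTS (SELECT *
                      FROM DEPENDENT
                      WHERE Ssn = Essn)
    ORDER BY Lname
    """
).fetchall()
```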
Example: Retrieve the name of each employee who works on all the projects controlled
by department number 5
SELECT Fname, Lname
UNIQUE Functions
UNIQUE(Q) returns TRUE if there are no duplicate tuples in the result of query Q; otherwise, it
returns FALSE. This can be used to test whether the result of a nested query is a set or a multiset.
An SQL join clause combines records from two or more tables in a database. It creates a set that
can be saved as a table or used as is. A JOIN is a means for combining fields from two tables by
using values common to each. SQL specifies four types of JOIN:
1. INNER
2. OUTER
3. EQUIJOIN
4. NATURAL JOIN
INNER JOIN
An inner join is the most common join operation used in applications and can be regarded as the
default join-type. Inner join creates a new result table by combining column values of two tables (A
and B) based upon the join-predicate (the condition). The result of the join can be defined as the
outcome of first taking the Cartesian product (or cross join) of all records in the tables (combining
every record in table A with every record in table B) and then returning all records that satisfy the
join predicate.
Example: SELECT * FROM employee INNER JOIN department ON
employee.dno = department.dnumber;
NATURAL JOIN is a type of EQUIJOIN where the join predicate arises implicitly by comparing all
columns in both tables that have the same column-names in the joined tables. The resulting joined
table contains only one column for each pair of equally named columns.
CROSS JOIN returns the Cartesian product of rows from tables in the join. In other words, it will
produce rows which combine each row from the first table with each row from the second table.
OUTER JOIN
An outer join does not require each record in the two joined tables to have a matching record. The
joined table retains each record even if no other matching record exists. Outer joins subdivide
further into
• Left outer joins
• Right outer joins
• Full outer joins
No implicit join-notation for outer joins exists in standard SQL.
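The difference between an inner join and a left outer join can be seen on a tiny example. The sketch below uses SQLite from Python with invented rows: one employee has no department (Dno is NULL), so the inner join drops that employee while the left outer join keeps the row and pads the department columns with NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE DEPARTMENT (Dname TEXT, Dnumber INTEGER PRIMARY KEY);
    CREATE TABLE EMPLOYEE (Lname TEXT, Dno INTEGER);
    INSERT INTO DEPARTMENT VALUES ('Research', 5);
    INSERT INTO EMPLOYEE VALUES ('Smith', 5);
    INSERT INTO EMPLOYEE VALUES ('Borg', NULL);
    """
)

# Inner join: only employees with a matching department survive.
inner_rows = conn.execute(
    "SELECT Lname, Dname FROM EMPLOYEE "
    "INNER JOIN DEPARTMENT ON Dno = Dnumber ORDER BY Lname"
).fetchall()

# Left outer join: every EMPLOYEE row is retained; missing matches give NULL.
left_rows = conn.execute(
    "SELECT Lname, Dname FROM EMPLOYEE "
    "LEFT OUTER JOIN DEPARTMENT ON Dno = Dnumber ORDER BY Lname"
).fetchall()
```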
Examples
1. Find the sum of the salaries of all employees, the maximum salary, the minimum salary, and the
average salary.
SELECT SUM (Salary), MAX (Salary), MIN (Salary), AVG (Salary)
FROM EMPLOYEE;
2. Find the sum of the salaries of all employees of the ‘Research’ department, as well as the
maximum salary, the minimum salary, and the average salary in this department.
SELECT SUM (Salary), MAX (Salary), MIN (Salary), AVG (Salary)
FROM (EMPLOYEE JOIN DEPARTMENT ON Dno=Dnumber)
WHERE Dname='Research';
If NULLs exist in the grouping attribute, then a separate group is created for all tuples with a NULL
value in the grouping attribute. For example, if the EMPLOYEE table had some tuples that had
NULL for the grouping attribute Dno, there would be a separate group for those tuples in the result
of a query that groups by Dno.
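A short sketch of the NULL group, using SQLite from Python with made-up rows. Two employees have no department, and GROUP BY places them together in one group whose key is NULL (None in Python).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (Lname TEXT, Dno INTEGER)")
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?)",
    [("Smith", 5), ("Borg", None), ("Doe", None)],
)

# All tuples with a NULL Dno fall into a single separate group.
# SQLite sorts NULL before other values in ascending order.
groups = conn.execute(
    "SELECT Dno, COUNT(*) FROM EMPLOYEE GROUP BY Dno ORDER BY Dno"
).fetchall()
```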
HAVING provides a condition on the summary information regarding the group of tuples associated
with each value of the grouping attributes. Only the groups that satisfy the condition are retrieved in
the result of the query.
Example: For each project on which more than two employees work, retrieve the project number,
the project name, and the number of employees who work on the project.
SELECT Pnumber, Pname, COUNT (*)
FROM PROJECT, WORKS_ON
WHERE Pnumber=Pno
GROUP BY Pnumber, Pname
HAVING COUNT (*) > 2;
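The query above can be run as written against a small SQLite database. In this sketch the rows are invented: three employees work on ProductX and two on ProductY, so only ProductX passes the HAVING condition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE PROJECT (Pname TEXT, Pnumber INTEGER PRIMARY KEY);
    CREATE TABLE WORKS_ON (Essn TEXT, Pno INTEGER);
    INSERT INTO PROJECT VALUES ('ProductX', 1);
    INSERT INTO PROJECT VALUES ('ProductY', 2);
    INSERT INTO WORKS_ON VALUES ('111', 1);
    INSERT INTO WORKS_ON VALUES ('222', 1);
    INSERT INTO WORKS_ON VALUES ('333', 1);
    INSERT INTO WORKS_ON VALUES ('111', 2);
    INSERT INTO WORKS_ON VALUES ('222', 2);
    """
)

# WHERE filters individual tuples first; HAVING then filters whole groups.
busy_projects = conn.execute(
    """
    SELECT Pnumber, Pname, COUNT(*)
    FROM PROJECT, WORKS_ON
    WHERE Pnumber = Pno
    GROUP BY Pnumber, Pname
    HAVING COUNT(*) > 2
    """
).fetchall()
```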
Example: For each department that has more than five employees, retrieve the department number
and the number of its employees who are making more than $40,000.
SELECT Dnumber, COUNT (*)
FROM DEPARTMENT, EMPLOYEE
WHERE Dnumber=Dno AND Salary>40000 AND
Dno IN ( SELECT Dno
FROM EMPLOYEE
GROUP BY Dno
HAVING COUNT (*) > 5)
GROUP BY Dnumber;
A query is evaluated conceptually by first applying the FROM clause to identify all tables involved in
the query or to materialize any joined tables, followed by the WHERE clause to select and join
tuples, and then by GROUP BY and HAVING. ORDER BY is applied at the end to sort the query
result. Each DBMS has special query optimization routines to decide on an execution plan that is
efficient to execute.
In general, there are numerous ways to specify the same query in SQL. This flexibility in specifying
queries has advantages and disadvantages.
The main advantage is that users can choose the technique with which they are most
comfortable when specifying a query. For example, many queries may be specified with join
conditions in the WHERE clause, or by using joined relations in the FROM clause, or with
some form of nested queries and the IN comparison. From the programmer’s and the
system’s point of view regarding query optimization, it is generally preferable to write a query
with as little nesting and implied ordering as possible.
The disadvantage of having numerous ways of specifying the same query is that this may
confuse the user, who may not know which technique to use to specify particular types of
queries. Another problem is that it may be more efficient to execute a query specified in one
way than the same query specified in an alternative way.
General form :
CREATE ASSERTION <Name_of_assertion> CHECK (<cond>)
For the assertion to be satisfied, the condition specified after CHECK clause must return true.
For example, to specify the constraint that the salary of an employee must not be greater than the
salary of the manager of the department that the employee works for in SQL, we can write the
following assertion:
CREATE ASSERTION SALARY_CONSTRAINT
CHECK ( NOT EXISTS ( SELECT * FROM EMPLOYEE E, EMPLOYEE M,
DEPARTMENT D WHERE E.Salary>M.Salary AND
E.Dno=D.Dnumber AND D.Mgr_ssn=M.Ssn ) );
The constraint name SALARY_CONSTRAINT is followed by the keyword CHECK, which is followed
by a condition in parentheses that must hold true on every database state for the assertion to be
satisfied. The constraint name can be used later to refer to the constraint or to modify or drop it. Any
WHERE clause condition can be used, but many constraints can be specified using the EXISTS and
NOT EXISTS style of SQL conditions.
By including this query inside a NOT EXISTS clause, the assertion will specify that the result of this
query must be empty so that the condition will always be TRUE. Thus, the assertion is violated if the
result of the query is not empty.
Example: consider the bank database with the following tables
Does the trigger execute for each updated or deleted record, or once for the entire
statement? We define such granularity as follows:
• Statement-level: this trigger is activated once (per UPDATE statement) after all records are updated.
• Row-level: this trigger is activated before deleting each record.
Examples:
1) If the employee salary increased by more than 10%, then increment the rank field by 1.
2) Keep the bonus attribute in Employee table always 3% of the salary attribute
Suppose that the action to take would be to call an external stored procedure
SALARY_VIOLATION, which will notify the supervisor.
The trigger is given the name SALARY_VIOLATION, which can be used to remove or
deactivate the trigger later.
In this example the events are: inserting a new employee record, changing an employee’s
salary, or changing an employee’s supervisor.
The action is to execute the stored procedure INFORM_SUPERVISOR.
Triggers can be used in various applications, such as maintaining database consistency and
monitoring database updates.
All new customers opening an account must have opening balance >= $100. However, once the
account is opened their balance can fall below that amount.
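The opening-balance rule can be sketched as a row-level BEFORE INSERT trigger. The example below uses SQLite's trigger dialect from Python; the table ACCOUNT, its columns, and the trigger name MIN_OPENING_BALANCE are hypothetical, since the bank tables are not shown here, and other DBMSs use somewhat different trigger syntax. Because the trigger fires only on INSERT, a later UPDATE may still take the balance below 100.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE ACCOUNT (Acct_no INTEGER PRIMARY KEY, Balance REAL);

    -- Hypothetical trigger: reject new accounts that open below 100.
    CREATE TRIGGER MIN_OPENING_BALANCE
    BEFORE INSERT ON ACCOUNT
    FOR EACH ROW
    WHEN NEW.Balance < 100
    BEGIN
        SELECT RAISE(ABORT, 'opening balance must be at least 100');
    END;
    """
)

conn.execute("INSERT INTO ACCOUNT VALUES (1, 500)")  # accepted

try:
    conn.execute("INSERT INTO ACCOUNT VALUES (2, 50)")  # violates the rule
    rejected = False
except sqlite3.DatabaseError:
    rejected = True

# The trigger is defined on INSERT only, so this update is allowed.
conn.execute("UPDATE ACCOUNT SET Balance = 20 WHERE Acct_no = 1")
balances = conn.execute("SELECT Acct_no, Balance FROM ACCOUNT").fetchall()
```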
A view in SQL terminology is a single table that is derived from other tables. These other tables can be
base tables or previously defined views. A view does not necessarily exist in physical form; it is
considered to be a virtual table, in contrast to base tables, whose tuples are always physically
stored in the database. This limits the possible update operations that can be applied to views, but
it does not provide any limitations on querying a view. We can think of a view as a way of specifying
a table that we need to reference frequently, even though it may not exist physically.
For example, referring to the COMPANY database, we may frequently issue queries that retrieve
the employee name and the project names that the employee works on. Rather than having to
specify the join of the three tables EMPLOYEE, WORKS_ON, and PROJECT every time we issue
this query, we can define a view that is specified as the result of these joins. Then we can issue
queries on the view, which are specified as single table retrievals rather than as retrievals involving
two joins on three tables. We call the EMPLOYEE, WORKS_ON, and PROJECT tables the defining
tables of the view.
In SQL, the command to specify a view is CREATE VIEW. The view is given a (virtual) table name
(or view name), a list of attribute names, and a query to specify the contents of the view. If none of
the view attributes results from applying functions or arithmetic operations, we do not have to
specify new attribute names for the view, since they would be the same as the names of the
attributes of the defining tables in the default case.
Example 1:
In example 1, we did not specify any new attribute names for the view WORKS_ON1. In this
case, WORKS_ON1 inherits the names of the view attributes from the defining tables EMPLOYEE,
PROJECT, and WORKS_ON.
Example 2 explicitly specifies new attribute names for the view DEPT_INFO, using a one-to-one
correspondence between the attributes specified in the CREATE VIEW clause and those specified
in the SELECT clause of the query that defines the view.
We can now specify SQL queries on a view—or virtual table—in the same way we specify queries
involving base tables.
For example, to retrieve the last name and first name of all employees who work on the ‘ProductX’
project, we can utilize the WORKS_ON1 view and specify the query as a single-table retrieval on
WORKS_ON1 with the condition Pname = ‘ProductX’.
The same query would require the specification of two joins if specified on the base relations
directly. One of the main advantages of a view is to simplify the specification of certain queries.
Views are also used as a security and authorization mechanism.
A view is supposed to be always up-to-date; if we modify the tuples in the base tables on which the
view is defined, the view must automatically reflect these changes. Hence, the view is not realized
or materialized at the time of view definition but rather at the time when we specify a query on the
view. It is the responsibility of the DBMS and not the user to make sure that the view is kept
up-to-date.
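The following sketch defines a WORKS_ON1 view over the three defining tables and queries it like a base table, using SQLite from Python. The view definition is an assumption consistent with the description above (employee name, project name, and hours); the rows are invented. Inserting a new WORKS_ON tuple afterwards shows that the view reflects base-table changes without being redefined.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE EMPLOYEE (Fname TEXT, Lname TEXT, Ssn TEXT PRIMARY KEY);
    CREATE TABLE PROJECT  (Pname TEXT, Pnumber INTEGER PRIMARY KEY);
    CREATE TABLE WORKS_ON (Essn TEXT, Pno INTEGER, Hours REAL);
    INSERT INTO EMPLOYEE VALUES ('John', 'Smith', '123456789');
    INSERT INTO EMPLOYEE VALUES ('Joyce', 'English', '453453453');
    INSERT INTO PROJECT  VALUES ('ProductX', 1);
    INSERT INTO PROJECT  VALUES ('ProductY', 2);
    INSERT INTO WORKS_ON VALUES ('123456789', 1, 32.5);

    -- Assumed definition: attribute names are inherited from the base tables.
    CREATE VIEW WORKS_ON1 AS
    SELECT Fname, Lname, Pname, Hours
    FROM EMPLOYEE, PROJECT, WORKS_ON
    WHERE Ssn = Essn AND Pno = Pnumber;
    """
)

# A single-table retrieval on the view replaces two explicit joins.
query = (
    "SELECT Fname, Lname FROM WORKS_ON1 "
    "WHERE Pname = 'ProductX' ORDER BY Lname"
)
before = conn.execute(query).fetchall()

# Changing a base table is immediately visible through the view.
conn.execute("INSERT INTO WORKS_ON VALUES ('453453453', 1, 20.0)")
after = conn.execute(query).fetchall()
```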
If we do not need a view any more, we can use the DROP VIEW command to dispose of it. For
example: DROP VIEW WORKS_ON1;
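Two main strategies exist for implementing a view efficiently for querying. The first strategy, called
query modification, transforms a query on the view into a query on the underlying base tables.
The disadvantage of this approach is that it is inefficient for views defined via complex queries that
are time-consuming to execute, especially if multiple queries are going to be applied to the same
view within a short period of time.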
The second strategy, called view materialization, involves physically creating a temporary view
table when the view is first queried and keeping that table on the assumption that other queries
on the view will follow. In this case, an efficient strategy for automatically updating the view table
when the base tables are updated must be developed in order to keep the view up-to-date.
The view is generally kept as a materialized (physically stored) table as long as it is being queried. If
the view is not queried for a certain period of time, the system may then automatically remove the
physical table and recompute it from scratch when future queries reference the view.
Updating of views is complicated and can be ambiguous. In general, an update on a view defined
on a single table without any aggregate functions can be mapped to an update on the underlying
base table under certain conditions. For a view involving joins, an update operation may be mapped
to update operations on the underlying base relations in multiple ways. Hence, it is often not
possible for the DBMS to determine which of the updates is intended.
To illustrate potential problems with updating a view defined on multiple tables, consider the
WORKS_ON1 view, and suppose that we issue the command to update the PNAME attribute of
‘John Smith’ from ‘ProductX’ to ‘ProductY’. This view update is shown in UV1:
UV1: UPDATE WORKS_ON1
SET Pname = 'ProductY'
WHERE Lname='Smith' AND Fname='John'
AND Pname='ProductX';
This query can be mapped into several updates on the base relations to give the desired update
effect on the view. In addition, some of these updates will create additional side effects that affect
the result of other queries.
For example, here are two possible updates, (a) and (b), on the base relations corresponding to the
view update operation in UV1:
(a): UPDATE WORKS_ON
SET Pno= (SELECT Pnumber
FROM PROJECT
WHERE Pname='ProductY' )
WHERE Essn IN ( SELECT Ssn
FROM EMPLOYEE
WHERE Lname='Smith' AND Fname='John' )
AND
Pno= (SELECT Pnumber
FROM PROJECT
WHERE Pname='ProductX' );
(b): UPDATE PROJECT
SET Pname = 'ProductY'
WHERE Pname = 'ProductX';
Update (a) relates ‘John Smith’ to the ‘ProductY’ PROJECT tuple instead of the ‘ProductX’
PROJECT tuple and is the most likely desired update. However, (b) would also give the desired
update effect on the view, but it accomplishes this by changing the name of the ‘ProductX’ tuple in
the PROJECT relation to ‘ProductY’.
It is quite unlikely that the user who specified the view update UV1 wants the update to be
interpreted as in (b), since it also has the side effect of changing all the view tuples with Pname =
‘ProductX’.
Some view updates may not make much sense; for example, modifying the Total_sal attribute of the
DEPT_INFO view does not make sense because Total_sal is defined to be the sum of the individual
employee salaries. This request is shown as UV2:
UV2: UPDATE DEPT_INFO
SET Total_sal=100000
WHERE Dname='Research';
A large number of updates on the underlying base relations can satisfy this view update.
Generally, a view update is feasible when only one possible update on the base relations can
accomplish the desired update effect on the view. Whenever an update on the view can be mapped
to more than one update on the underlying base relations, we must have a certain procedure for
choosing one of the possible updates as the most likely one.
If a base relation within a schema is no longer needed, the relation and its definition can be deleted
by using the DROP TABLE command. For example, if we no longer wish to keep track of
dependents of employees in the COMPANY database, we can get rid of the DEPENDENT relation
by issuing the following command:
DROP TABLE DEPENDENT CASCADE;
If the RESTRICT option is chosen instead of CASCADE, a table is dropped only if it is not
referenced in any constraints (for example, by foreign key definitions in another relation) or views
or by any other elements. With the CASCADE option, all such constraints, views, and other
elements that reference the table being dropped are also dropped automatically from the schema,
along with the table itself.
The DROP TABLE command not only deletes all the records in the table if successful, but also
removes the table definition from the catalog. If it is desired to delete only the records but to leave
the table definition for future use, then the DELETE command should be used instead of DROP
TABLE.
The DROP command can also be used to drop other types of named schema elements, such as
constraints or domains.
For example, to add an attribute for keeping track of jobs of employees to the EMPLOYEE base
relation in the COMPANY schema, we can use the command:
ALTER TABLE COMPANY.EMPLOYEE ADD COLUMN Job VARCHAR(12);
We must still enter a value for the new attribute Job for each individual EMPLOYEE tuple. This can
be done either by specifying a default clause or by using the UPDATE command individually on
each tuple. If no default clause is specified, the new attribute will have NULLs in all the tuples of the
relation immediately after the command is executed; hence, the NOT NULL constraint is not allowed
in this case.
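The behaviour described above can be observed in SQLite, shown here as a sketch from Python. SQLite has no COMPANY schema prefix, so the table is simply EMPLOYEE, and the rows and the extra column name Grade are made up. After ADD COLUMN every existing tuple holds NULL in Job, and an attempt to add a NOT NULL column without a default is refused.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (Lname TEXT, Ssn TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?)",
    [("Smith", "123456789"), ("Wong", "333445555")],
)

# Existing tuples receive NULL for the new attribute.
conn.execute("ALTER TABLE EMPLOYEE ADD COLUMN Job VARCHAR(12)")
jobs_after_alter = conn.execute("SELECT Job FROM EMPLOYEE").fetchall()

# Values are then supplied tuple by tuple with UPDATE.
conn.execute("UPDATE EMPLOYEE SET Job = 'Engineer' WHERE Ssn = '123456789'")
jobs_after_update = conn.execute(
    "SELECT Lname, Job FROM EMPLOYEE ORDER BY Lname"
).fetchall()

# NOT NULL without a default cannot be satisfied by the existing tuples.
try:
    conn.execute("ALTER TABLE EMPLOYEE ADD COLUMN Grade INTEGER NOT NULL")
    not_null_rejected = False
except sqlite3.Error:
    not_null_rejected = True
```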
To drop a column, we must choose either CASCADE or RESTRICT for drop behavior. If CASCADE
is chosen, all constraints and views that reference the column are dropped automatically from the
schema, along with the column. If RESTRICT is chosen, the command is successful only if no views
or constraints (or other schema elements) reference the column.
For example, the following command removes the attribute Address from the EMPLOYEE base
table:
ALTER TABLE COMPANY.EMPLOYEE DROP COLUMN Address CASCADE;
It is also possible to alter a column definition by dropping an existing default clause or by defining a
new default clause. The following examples illustrate this clause:
ALTER TABLE COMPANY.DEPARTMENT ALTER COLUMN Mgr_ssn DROP DEFAULT;
ALTER TABLE COMPANY.DEPARTMENT ALTER COLUMN Mgr_ssn SET DEFAULT '333445555';
For example, we can change the data type of the column named "DateOfBirth" from date to year in
the "Persons" table using the following SQL statement:
ALTER TABLE Persons
ALTER COLUMN DateOfBirth year;
2.1 Introduction
We often encounter situations in which we need the greater flexibility of a general-purpose
programming language in addition to the data manipulation facilities provided by SQL. For example,
we may want to integrate a database application with a GUI, or we may want to integrate it with other
existing applications.
The use of SQL commands within a host language is called Embedded SQL. Conceptually,
embedding SQL commands in a host language program is straightforward. SQL statements can be
used wherever a statement in the host language is allowed. SQL statements must be clearly
marked so that a preprocessor can deal with them before invoking the compiler for the host
language. Any host language variable used to pass arguments into an SQL command must be
declared in SQL.
There are two complications:
1. Data types recognized by SQL may not be recognized by the host language and vice versa
- This mismatch is addressed by casting data values appropriately before passing them to or
from SQL commands.
2. SQL is set-oriented
- Addressed using cursors
Host variables are declared between the commands EXEC SQL BEGIN DECLARE SECTION and
EXEC SQL END DECLARE SECTION:
EXEC SQL BEGIN DECLARE SECTION
char c_sname[20];
long c_sid;
short c_rating;
float c_age;
EXEC SQL END DECLARE SECTION
The first question that arises is which SQL types correspond to the various C types, since we have
just declared a collection of C variables whose values are intended to be read (and possibly set) in
an SQL run-time environment when an SQL statement that refers to them is executed. The SQL-92
standard defines such a correspondence between the host language types and SQL types for a
number of host languages. In our example, c_sname has the type CHARACTER(20) when referred
to in an SQL statement, c_sid has the type INTEGER, c_rating has the type SMALLINT, and c_age
has the type REAL.
We also need some way for SQL to report what went wrong if an error condition arises when
executing an SQL statement. The SQL-92 standard recognizes two special variables for reporting
errors, SQLCODE and SQLSTATE.
SQLCODE is the older of the two and is defined to return some negative value when an
error condition arises, without specifying further just what error a particular negative
integer denotes.
SQLSTATE, introduced in the SQL-92 standard for the first time, associates predefined
values with several common error conditions, thereby introducing some uniformity to how
errors are reported.
One of these two variables must be declared. The appropriate C type for SQLCODE is long and the
appropriate C type for SQLSTATE is char[6], that is, a character string five characters long.
2.2.2 Cursors
A major problem in embedding SQL statements in a host language like C is that an impedance
mismatch occurs because SQL operates on sets of records, whereas languages like C do not
cleanly support a set-of-records abstraction. The solution is to essentially provide a mechanism that
allows us to retrieve rows one at a time from a relation; this mechanism is called a cursor.
We can declare a cursor on any relation or on any SQL query. Once a cursor is declared, we can
• Open it (this positions the cursor just before the first row)
• Fetch the next row
• Move the cursor (to the next row, to the row after the next n, to the first row, or to the previous
row, by specifying additional parameters for the fetch command)
• Close the cursor
A cursor allows us to retrieve the rows in a table by positioning the cursor at a particular row and
reading its contents.
Basic Cursor Definition and Usage
Cursors enable us to examine, in the host language program, a collection of rows computed by an
Embedded SQL statement:
• We usually need to open a cursor if the embedded statement is a SELECT. We can avoid
opening a cursor if the answer contains a single row.
• INSERT, DELETE, and UPDATE statements typically require no cursor, although some variants
of DELETE and UPDATE use a cursor.
Examples:
i) Find the name and age of a sailor, specified by assigning a value to the host variable c_sid,
declared earlier
EXEC SQL SELECT s.sname, s.age
INTO :c_sname, :c_age
FROM Sailors s
WHERE s.sid = :c_sid;
ii) Compute the name and ages of all sailors with a rating greater than the current value of the host
variable c_minrating
SELECT s.sname,s.age
FROM sailors s WHERE s.rating>:c_minrating;
The query returns a collection of rows. The INTO clause is inadequate. The solution is to use a
cursor:
DECLARE sinfo CURSOR FOR
SELECT s.sname,s.age
FROM sailors s
WHERE s.rating>:c_minrating;
This code can be included in a C program and once it is executed, the cursor sinfo is defined.
We can open the cursor by using the syntax:
OPEN sinfo;
A cursor can be thought of as ‘pointing’ to a row in the collection of answers to the query associated
with it. When the cursor is opened, it is positioned just before the first row.
We can use the FETCH command to read the first row of cursor sinfo into host language variables:
FETCH sinfo INTO :c_sname, :c_age;
When the FETCH statement is executed, the cursor is positioned to point at the next row and the
column values in the row are copied into the corresponding host variables. By repeatedly executing
this FETCH statement, we can read all the rows computed by the query, one row at a time.
When we are done with a cursor, we can close it:
CLOSE sinfo;
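The declare, open, fetch, close cycle has a close analogue in Python's sqlite3 module, sketched below. This is an analogy rather than Embedded SQL: there is no preprocessor, execute() plays the role of both DECLARE and OPEN, and fetchone() plays the role of FETCH. The Sailors rows are invented sample data, and c_minrating is an ordinary Python variable standing in for the host variable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, "
    "rating INTEGER, age REAL)"
)
conn.executemany(
    "INSERT INTO Sailors VALUES (?, ?, ?, ?)",
    [
        (22, "Dustin", 7, 45.0),
        (31, "Lubber", 8, 55.5),
        (58, "Rusty", 10, 35.0),
    ],
)

c_minrating = 7  # stands in for the host variable :c_minrating

# "DECLARE" and "OPEN": the cursor is positioned before the first row.
sinfo = conn.cursor()
sinfo.execute(
    "SELECT s.sname, s.age FROM Sailors s "
    "WHERE s.rating > ? ORDER BY s.sname",
    (c_minrating,),
)

# "FETCH": each call advances the cursor and copies one row out.
fetched = []
row = sinfo.fetchone()
while row is not None:
    c_sname, c_age = row
    fetched.append((c_sname, c_age))
    row = sinfo.fetchone()

# "CLOSE": release the cursor when done.
sinfo.close()
```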
iii) To retrieve the name, address and salary of an employee specified by the variable ssn
If the keyword SCROLL is specified, the cursor is scrollable, which means that variants of the
FETCH command can be used to position the cursor in very flexible ways; otherwise, only the basic
FETCH command, which retrieves the next row, is allowed
If the keyword INSENSITIVE is specified, the cursor behaves as if it is ranging over a private copy
of the collection of answer rows. Otherwise, and by default, other actions of some transaction could
modify these rows, creating unpredictable behavior.
A holdable cursor is specified using the WITH HOLD clause, and is not closed when the transaction
is committed.
An optional ORDER BY clause can be used to specify a sort order. The order-item-list is a list of
order-items. An order-item is a column name, optionally followed by one of the keywords ASC or DESC.
Every column mentioned in the ORDER BY clause must also appear in the select-list of the query
associated with the cursor; otherwise it is not clear what columns we should sort on.
The answer is sorted first in ascending order by minage, and if several rows have the same minage
value, these rows are sorted further in descending order by rating.
rating  minage
8       25.5
3       25.5
7       35.0
Dynamic SQL
Dynamic SQL allows SQL statements to be constructed on the fly. Consider an application such as a
spreadsheet or a graphical front-end that needs to access data from a DBMS. Such an application
must accept commands from a user and, based on what the user needs, generate appropriate SQL
statements to retrieve the necessary data. In such situations, we may not be able to predict in
advance just what SQL statements need to be executed. SQL provides some facilities to deal with
such situations; these are referred to as Dynamic SQL.
Example:
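The idea of building a statement at run time can be sketched in Python with sqlite3. This is an analogy to Dynamic SQL rather than the Embedded SQL PREPARE and EXECUTE commands themselves. The function build_query, the ALLOWED_COLUMNS whitelist, and the sample rows are all made up for the illustration; the whitelist is there because identifiers cannot be passed as parameters and should never be copied unchecked from user input.

```python
import sqlite3

# Hypothetical whitelist of columns the user may sort by.
ALLOWED_COLUMNS = {"sname", "rating", "age"}


def build_query(sort_column, descending=False):
    """Assemble the SQL text at run time from the user's choices."""
    if sort_column not in ALLOWED_COLUMNS:
        raise ValueError("unknown column: " + sort_column)
    direction = "DESC" if descending else "ASC"
    # Data values still travel as '?' parameters, not as spliced text.
    return (
        "SELECT sname FROM Sailors WHERE rating > ? "
        "ORDER BY " + sort_column + " " + direction
    )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, "
    "rating INTEGER, age REAL)"
)
conn.executemany(
    "INSERT INTO Sailors VALUES (?, ?, ?, ?)",
    [
        (22, "Dustin", 7, 45.0),
        (29, "Brutus", 1, 33.0),
        (31, "Lubber", 8, 55.5),
        (58, "Rusty", 10, 35.0),
    ],
)

# The statement is not known until this point in the program.
sql_text = build_query("age", descending=True)
names = [row[0] for row in conn.execute(sql_text, (5,))]
```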
In contrast to Embedded SQL, ODBC and JDBC allow a single executable to access
different DBMSs without recompilation.
A driver is a software program that translates the ODBC or JDBC calls into DBMS-specific calls.
Drivers are loaded dynamically on demand since the DBMSs the application is going to access
are known only at run-time. Available drivers are registered with a driver manager. A driver does
not necessarily need to interact with a DBMS that understands SQL. It is sufficient that the
driver translates the SQL commands from the application into equivalent commands that the
DBMS understands.
An application that interacts with a data source through ODBC or JDBC selects a data source,
dynamically loads the corresponding driver, and establishes a connection with the data source.
There is no limit on the number of open connections. An application can have several open
connections to different data sources. Each connection has transaction semantics; that is,
changes from one connection are visible to other connections only after the connection has
committed its changes. While a connection is open, transactions are executed by submitting
SQL statements, retrieving results, processing errors, and finally committing or rolling back. The
application disconnects from the data source to terminate the interaction.
2.3.1 Architecture
The architecture of JDBC has four main components:
• Application
• Driver manager
• Drivers
• Data sources
Driver manager
• Loads JDBC drivers and passes JDBC function calls from the application to the correct driver
• Handles JDBC initialization and information calls from the applications and can log all
function calls
• Performs some rudimentary error checking
Drivers
Data sources
Drivers in JDBC are classified into four types depending on the architectural relationship between
the application and the data source:
Type I Bridges:
This type of driver translates JDBC function calls into function calls of another API that is not
native to the DBMS.
An example is a JDBC-ODBC bridge; an application can use JDBC calls to access an
ODBC compliant data source. The application loads only one driver, the bridge.
Advantage:
It is easy to piggyback the application onto an existing installation, and no new
drivers have to be installed.
Drawbacks:
The increased number of layers between data source and application affects
performance.
The user is limited to the functionality that the ODBC driver supports.
2.4.2 Connections
A session with a data source is started through creation of a Connection object; Connections are
specified through a JDBC URL, a URL that uses the jdbc protocol. Such a URL has the form
jdbc:<subprotocol>:<otherParameters>
The SQL query specifies the query string, but uses '?' for the values of the parameters, which are
set later using methods setString, setFloat, and setInt. The '?' placeholders can be used anywhere
in SQL statements where they can be replaced with a value. Examples of places where they can
appear include the WHERE clause (e.g., 'WHERE author=?'), or in SQL UPDATE and INSERT
statements. The method setString is one way to set a parameter value; analogous methods are
available for int, float, and date. It is good style to always use clearParameters() before setting
parameter values in order to remove any old data.
There are different ways of submitting the query string to the data source. In the example, we used
the executeUpdate command, which is used if we know that the SQL statement does not return
any records (SQL UPDATE, INSERT, ALTER, and DELETE statements). The executeUpdate
method returns
- an integer indicating the number of rows the SQL statement modified;
- 0 for successful execution without modifying any rows.
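The same '?' placeholder idea exists in Python's sqlite3 module, which makes for a compact sketch of parameterized statements and of the modified-row count. This is an analogy to the JDBC PreparedStatement and executeUpdate, not JDBC itself: parameters are passed as a tuple instead of through setString or setFloat, and the count is read from the cursor's rowcount attribute. The Books table layout and its rows are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Books (isbn TEXT PRIMARY KEY, title TEXT, "
    "author TEXT, price REAL)"
)
cur = conn.cursor()

# One statement text, reused with different parameter values.
insert_sql = "INSERT INTO Books (isbn, title, author, price) VALUES (?, ?, ?, ?)"
cur.execute(insert_sql, ("111", "First Title", "A. Author", 10.0))
inserted = cur.rowcount  # rows modified by the INSERT
cur.execute(insert_sql, ("222", "Second Title", "A. Author", 15.0))

# Placeholders work in UPDATE and in the WHERE clause as well.
update_sql = "UPDATE Books SET price = ? WHERE author = ?"
cur.execute(update_sql, (12.5, "A. Author"))
updated = cur.rowcount

# A successful statement that matches no rows reports 0.
cur.execute(update_sql, (1.0, "Nobody"))
updated_none = cur.rowcount
```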
The executeQuery method is used if the SQL statement returns data, such as in a regular SELECT
query. JDBC has its own cursor mechanism in the form of a ResultSet object.
With these accessor methods, we can retrieve values from the current row of the query result
referenced by the ResultSet object. There are two forms for each accessor method. One method
retrieves values by column index, starting at one, and the other retrieves values by column name.
String sqlQuery;
ResultSet rs = stmt.executeQuery(sqlQuery);
while (rs.next())
{
isbn = rs.getString(1);
title = rs.getString("TITLE");
// process isbn and title
}
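Access by column position and by column name can also be sketched with Python's sqlite3 module, using sqlite3.Row as the row type. This is an analogy to the ResultSet accessors, with one difference worth noting: Python indexes start at 0, whereas JDBC column indexes start at 1. The Books rows are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows support both index and name access
conn.execute("CREATE TABLE Books (isbn TEXT PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO Books VALUES (?, ?)",
    [("111", "First Title"), ("222", "Second Title")],
)

pairs = []
for row in conn.execute("SELECT isbn, title FROM Books ORDER BY isbn"):
    isbn = row[0]         # by position (0-based here, 1-based in JDBC)
    title = row["title"]  # by column name
    pairs.append((isbn, title))
```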
An SQLWarning is a subclass of SQLException. Warnings are not as severe as errors and the
program can usually proceed without special handling of warnings. Warnings are not thrown like
other exceptions, and they are not caught as part of the try-catch block around a java.sql statement.
We need to specifically test whether warnings exist. Connection, Statement, and ResultSet
objects all have a getWarnings() method with which we can retrieve SQL warnings if they exist.
Duplicate retrieval of warnings can be avoided through clearWarnings(). Statement objects clear
warnings automatically on execution of the next statement; ResultSet objects clear warnings every
time a new tuple is accessed.
Typical code for obtaining SQLWarnings looks similar to the code shown below:
try
{
    stmt = con.createStatement();
    warning = con.getWarnings();
    while (warning != null)
    {
        // handleSQLWarnings: code to process warning
        warning = warning.getNextWarning(); // get next warning
    }
    con.clearWarnings();
    stmt.executeUpdate(queryString);
    warning = stmt.getWarnings();
    while (warning != null)
    {
        // handleSQLWarnings: code to process warning
        warning = warning.getNextWarning(); // get next warning
    }
} // end try
catch (SQLException SQLe)
{
    // code to handle exception
} // end catch
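The warning loop appears twice in the code above, so it could be factored into one helper. The sketch below is one possible shape for such a helper; the class name WarningChainSketch is an assumption, and the chain is built by hand so that the example runs without a database.

```java
import java.sql.SQLWarning;
import java.util.ArrayList;
import java.util.List;

final class WarningChainSketch {
    /** Collects the message of a warning and of every warning chained after it. */
    static List<String> messages(SQLWarning first) {
        List<String> out = new ArrayList<>();
        for (SQLWarning w = first; w != null; w = w.getNextWarning()) {
            out.add(w.getMessage());
        }
        return out;
    }

    public static void main(String[] args) {
        // Hand-built chain standing in for con.getWarnings() or stmt.getWarnings().
        SQLWarning first = new SQLWarning("first warning");
        first.setNextWarning(new SQLWarning("second warning"));
        System.out.println(messages(first));
    }
}
```

With a live connection, the same helper could be called as messages(con.getWarnings()) or messages(stmt.getWarnings()); a null argument yields an empty list.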
ResultSet rs=st.executeQuery(query);
String sname=rs.getString(2);
System.out.println(sname);
con.close();
import java.sql.*;
("jdbc:oracle:thin:@localhost:1521:xesid","system","ambika");
Statement st=con.createStatement();
ResultSet rs=st.executeQuery(query);
String s=rs.getString(1);
System.out.println(s);
con.close();
catch(Exception e)
2.12 SQLJ
Books books;
#sql books = {
};
while (books.next()) {
books.close() ;
All SQLJ statements have the special prefix #sql. In SQLJ, we retrieve the results of SQL queries
with iterator objects, which are basically cursors. An iterator is an instance of an iterator class.
Usage of an iterator in SQLJ goes through five steps:
1. Declare the Iterator Class: In the preceding code, this happened through the statement
#sql iterator Books (String title, Float price);
This statement creates a new Java class that we can use to instantiate objects.
2. Instantiate an Iterator Object from the New Iterator Class:
while (true)
if (books.endFetch())
{ break; }
Benefits of stored procedures:
• They reduce the amount of information transferred between the client and the database server.
• The compilation step is required only once, when the stored procedure is created. After that,
the procedure does not need to be recompiled before execution unless it is modified, and it
reuses the same execution plan. An SQL statement sent by the client, in contrast, has to be
compiled each time it is submitted for execution, even if the same statement is sent every time.
• They promote reuse of SQL code: multiple users and multiple clients can simply call the
stored procedure instead of writing the same SQL statement every time, which reduces
development time.
Is/As
<declaration>
Begin
<SQL Statement>
Exception
-----
-----
End procedurename;
Student(usn:string,sname:string)
Let us now write a stored procedure to retrieve the count of students with sname ‘Akshay’
is
stu_cnt int;
begin
end ss;
Stored procedures can also have parameters. These parameters have to be valid SQL types, and
have one of three different modes: IN, OUT, or INOUT.
• IN parameters are arguments passed to the stored procedure.
• OUT parameters are returned from the stored procedure; the procedure assigns values to all
OUT parameters, which the caller can then process.
• INOUT parameters combine the properties of IN and OUT parameters: they contain values
to be passed to the stored procedure, and the stored procedure can set their values as
return values.
IN book_isbn CHAR(10),
IN addedQty INTEGER)
In Embedded SQL, the arguments to a stored procedure are usually variables in the host language.
For example, the stored procedure AddInventory would be called as follows:
EXEC SQL BEGIN DECLARE SECTION
char isbn[10];
long qty;
EXEC SQL END DECLARE SECTION
// set isbn and qty to some values
EXEC SQL CALL AddInventory(:isbn,:qty);
Stored procedures enforce strict type conformance: If a parameter is of type INTEGER, it cannot be
called with an argument of type VARCHAR.
Procedures without parameters are called static procedures, and those with parameters are called
dynamic procedures.
as
eName varchar(20);
begin
select fname into eName from employee where ssn=Essn and dno=5;
end emp;
Stored procedures can be called in interactive SQL with the CALL statement:
CustomerInfo customerinfo;
while (customerinfo.next())
System.out.println(customerinfo.cid() + "," +
customerinfo.count()) ;
2.6.3 SQL/PSM
SQL/Persistent Stored Modules (SQL/PSM) is an ISO standard that mainly defines an extension of
SQL with a procedural language for use in stored procedures.
procedure code;
We can declare a function similarly as follows:
RETURNS sqlDataType
function code;
Example:
RETURNS INTEGER
ELSE rating=0;
END IF;
RETURN rating;
We can declare local variables using the DECLARE statement. In our example, we declare two
local variables: 'rating', and 'numOrders'.
SQL/PSM functions return values via the RETURN statement. In our example, we return the
value of the local variable 'rating'.
We can assign values to variables with the SET statement. In our example, we assigned the
return value of a query to the variable 'numOrders'.
SQL/PSM has branches and loops. Branches have the following form:
IF (condition) THEN statements;
ELSEIF statements;
ELSEIF statements;
ELSE statements;
END IF
Loops have the following form:
LOOP
statements;
END LOOP
Queries can be used as part of expressions in branches; queries that return a single value can be
assigned to variables. We can use the same cursor statements as in Embedded SQL (OPEN,
FETCH, CLOSE), but we do not need the EXEC SQL constructs, and variables do not have to be
prefixed by a colon ':'.
3.1 Introduction
The term data-intensive describes applications that need to process large volumes of data.
The volume of data processed can range from terabytes to petabytes, and data at this scale
is also referred to as big data. Data-intensive computing is used in many applications,
ranging from social networking to computational science, where a large amount of data needs to
be accessed, stored, indexed, and analyzed. The task becomes more challenging because the amount
of data keeps accumulating over time and the rate at which data is generated also increases.
3.2.1 Single-Tier
Initially, data-intensive applications were combined into a single tier, including the DBMS,
application logic, and user interface. The application typically ran on a mainframe, and users
accessed it through dumb terminals that could perform only data input and display.
At the presentation layer, we need to provide forms through which the user can issue requests, and
display responses that the middle tier generates. It is important that this layer of code be easy to
adapt to different display devices and formats; for example, regular desktops versus handheld
devices versus cell phones. This adaptivity can be achieved either at the middle tier through
generation of different pages for different types of client, or directly at the client through style sheets
that specify how the data should be presented. The hypertext markup language (HTML) is the basic
data presentation language.
HTML Forms
HTML forms are a common way of communicating data from the client tier to the middle tier.
The general format of a form :
<FORM ACTION="page.jsp" METHOD="GET" NAME="LoginForm">
------
</FORM>
• ACTION: Specifies the URI of the page to which the form contents are submitted. If the
ACTION attribute is absent, then the URI of the current page is used
• METHOD: The HTTP/1.0 method used to submit the user input from the filled-out form to
the webserver. There are two choices: GET and POST
• NAME: This attribute gives the form a name
A single HTML document can contain more than one form. Inside an HTML form, we can have any
HTML tags except another FORM element
There are two different ways to submit HTML Form data to the webserver. If the method GET is
used, then the contents of the form are assembled into a query URI (as discussed next) and sent to
the server. If the method POST is used, then the contents of the form are encoded as in the GET
method, but the contents are sent in a separate data block instead of appending them directly to the
URI. Thus, in the GET method the form contents are directly visible to the user as the constructed
URI, whereas in the POST method, the form contents are sent inside the HTTP request message
body and are not visible to the user.
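To make the GET case concrete, the sketch below assembles form fields into a query URI using the standard URL encoding of names and values. The field names userid and password and the class name FormQuerySketch are hypothetical; page.jsp is the action from the form example above. With POST, the same encoded string would travel in the request body rather than after the '?'.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

final class FormQuerySketch {
    /** Encodes fields as name=value pairs joined by '&'. */
    static String encode(Map<String, String> fields) {
        StringJoiner joiner = new StringJoiner("&");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            joiner.add(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return joiner.toString();
    }

    /** Builds the URI a browser would request for a form submitted with GET. */
    static String getUri(String action, Map<String, String> fields) {
        return action + "?" + encode(fields);
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("userid", "john doe");   // hypothetical form fields
        fields.put("password", "a&b");
        System.out.println(getUri("page.jsp", fields));
    }
}
```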
JavaScript is a scripting language at the client tier with which we can add programs to webpages
that run directly at the client. JavaScript is often used for the following types of computation at the
client:
• Browser Detection: JavaScript can be used to detect the browser type and load a
browser-specific page.
• Form Validation: JavaScript is used to perform simple consistency checks on form fields.
• Browser Control: This includes opening pages in customized windows; examples include
the annoying pop-up advertisements that you see at many websites, which are programmed
using JavaScript.
JavaScript is usually embedded into an HTML document with a special tag, the SCRIPT tag
The SCRIPT tag has the attribute LANGUAGE, which indicates the language in which the script is
written. For JavaScript, we set the language attribute to JavaScript. Another attribute of the SCRIPT
tag is the SRC attribute, which specifies an external file with JavaScript code that is automatically
embedded into the HTML document. Usually JavaScript source code files use a '.js' extension.
Style Sheets
A style sheet is a method to adapt the same document contents to different presentation formats. A
style sheet contains instructions that tell a Web browser how to translate the data of a document
into a presentation that is suitable for the client's display. The use of style sheets has many advantages:
• we can reuse the same document many times and display it differently depending on the
context
• we can tailor the display to the reader's preference such as font size, color style, and even
level of detail.
• we can deal with different output formats, such as different output devices (laptops versus
cell phones), different display sizes (letter versus legal paper), and different display media
(paper versus digital display)
• we can standardize the display format within a corporation and thus apply style sheet
conventions to documents at any time.
XSL
CSS
• CSS was created for HTML with the goal of separating the display characteristics of different
formatting tags from the tags themselves
• CSS defines how to display HTML elements.
• Styles are normally stored in style sheets, which are files that contain style definitions.
• Many different HTML documents, such as all documents in a website, can refer to the same
CSS.
• Thus, we can change the format of a website by changing a single file.
• Each line in a CSS sheet consists of three parts: a selector, a property, and a value. They are
syntactically arranged in the following way:
selector {property: value}
• The selector is the element or tag whose format we are defining.
• The property indicates the tag's attribute whose value we want to set in the style sheet
• Example: BODY {BACKGROUND-COLOR: yellow}
P {MARGIN-LEFT: 50px; COLOR: red}
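The selector {property: value} layout can be illustrated by splitting a rule into its parts. The sketch below is a deliberately simplified reader for single rules like the two examples above; it is not a real CSS parser, and the class name CssRuleSketch is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class CssRuleSketch {
    /** Returns the text before '{', i.e. the element or tag being styled. */
    static String selector(String rule) {
        return rule.substring(0, rule.indexOf('{')).trim();
    }

    /** Returns the property/value pairs between '{' and '}', in source order. */
    static Map<String, String> declarations(String rule) {
        String body = rule.substring(rule.indexOf('{') + 1, rule.lastIndexOf('}'));
        Map<String, String> out = new LinkedHashMap<>();
        for (String decl : body.split(";")) {
            int colon = decl.indexOf(':');
            if (colon < 0) {
                continue; // skip empty or malformed pieces
            }
            out.put(decl.substring(0, colon).trim(), decl.substring(colon + 1).trim());
        }
        return out;
    }

    public static void main(String[] args) {
        String rule = "P {MARGIN-LEFT: 50px; COLOR: red}";
        System.out.println(selector(rule) + " -> " + declarations(rule));
    }
}
```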
XSL
The middle layer runs code that implements the business logic of the application. The middle tier
code is responsible for supporting all the different roles involved in the application. For example, in
an Internet shopping site implementation, we would like
• customers to be able to browse the catalog and make purchases,
• administrators to be able to inspect current inventory, and
• data analysts to ask summary queries about purchase histories.
Each of these roles can require support for several complex actions.
The first generation of middle-tier applications was stand-alone programs written in a general-
purpose programming language such as C, C++, and Perl. Programmers quickly realized that
Program fragment: A Sample Web Page Where Form Input Is Sent to a CGI Script
Application Servers
Application logic can be enforced through server-side programs that are invoked using the CGI
protocol. However, since each page request results in the creation of a new process, this solution
does not scale well to a large number of simultaneous requests. An application server maintains a
pool of threads or processes and uses these to execute requests. Thus, it avoids the startup cost of
creating a new process for each request. Application servers also facilitate concurrent access to
several heterogeneous data sources (e.g., by providing JDBC drivers), and provide session
management services.
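The pooling idea can be sketched with a fixed-size thread pool from java.util.concurrent: a small set of worker threads is created once and reused for every request, instead of starting a new process per request. The class name, the handle method, and the string "requests" are placeholders for illustration; a real application server does far more than this.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class WorkerPoolSketch {
    /** Placeholder for the application logic run for one request. */
    static String handle(String request) {
        return "handled " + request;
    }

    /** Runs all requests on a pool of poolSize reusable worker threads. */
    static List<String> handleAll(List<String> requests, int poolSize)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (String request : requests) {
                Callable<String> task = () -> handle(request);
                pending.add(pool.submit(task)); // queued until a worker is free
            }
            List<String> responses = new ArrayList<>();
            for (Future<String> f : pending) {
                responses.add(f.get());         // responses in request order
            }
            return responses;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(handleAll(List.of("req-1", "req-2", "req-3"), 2));
    }
}
```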
JavaServer Pages
Java Server Pages (JSP) is a server-side programming technology for building dynamic,
platform-independent Web-based applications. JSPs have access to
the entire family of Java APIs, including the JDBC API to access enterprise databases.
JavaServer pages (JSPs) interchange the roles of output and application logic. JavaServer pages
are written in HTML with servlet-like code embedded in special HTML tags. Thus, in comparison to
servlets, JavaServer pages are better suited to quickly building interfaces that have some logic
inside, whereas servlets are better suited for complex application logic.
Maintaining State
There is a need to maintain a user's state across different pages. As an example, consider a user
who wants to make a purchase at the Barnes and Noble website. The user must first add items
into her shopping basket, which persists while she navigates through the site. Thus, we use the
notion of state mainly to remember information as the user navigates through the site.
The HTTP protocol is stateless. We call an interaction with a webserver stateless if no information
is retained from one request to the next request. We call an interaction with a webserver stateful, or
we say that state is maintained, if some memory is stored between requests to the server, and
different actions are taken depending on the contents stored.
Since we cannot maintain state in the HTTP protocol, where should we maintain state? There are
basically two choices:
• We can maintain state in the middle tier, by storing information in the local main memory of
the application logic, or even in a database system.
• Alternatively, we can maintain state on the client side by storing data in the form of a cookie.
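A minimal sketch of the first choice, state kept in the main memory of the middle tier, is given below. Each user receives a session id, and the shopping basket is looked up by that id on every request; in practice the id itself would be handed back by the client, for example in a cookie. The class name, method names, and item strings are assumptions for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

final class SessionStoreSketch {
    // session id -> shopping basket, held in the middle tier's memory
    private final Map<String, List<String>> baskets = new ConcurrentHashMap<>();

    /** Creates an empty basket and returns the id the client must send back. */
    String startSession() {
        String id = UUID.randomUUID().toString();
        baskets.put(id, new CopyOnWriteArrayList<>());
        return id;
    }

    /** Adds an item to the basket that belongs to the given session. */
    void addItem(String sessionId, String item) {
        basket(sessionId).add(item);
    }

    /** Returns the basket remembered for this session across requests. */
    List<String> basket(String sessionId) {
        List<String> items = baskets.get(sessionId);
        if (items == null) {
            throw new IllegalArgumentException("unknown session: " + sessionId);
        }
        return items;
    }

    public static void main(String[] args) {
        SessionStoreSketch store = new SessionStoreSketch();
        String id = store.startSession();
        store.addItem(id, "book-1");   // first request
        store.addItem(id, "book-2");   // a later request with the same id
        System.out.println(store.basket(id));
    }
}
```

State held this way is lost if the middle-tier process restarts, which is one reason the text also mentions storing it in a database system.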
1. Discuss how NULLs are treated in comparison operators in SQL. How are NULLs treated when
aggregate functions are applied in an SQL query? How are NULLs treated if they exist in
grouping attributes?
2. Describe the six clauses in the syntax of an SQL retrieval query. Show what type of constructs
can be specified in each of the six clauses. Which of the six clauses are required and which are
optional?
3. Describe conceptually how an SQL retrieval query will be executed by specifying the conceptual
order of executing each of the six clauses.
4. Explain how the GROUP BY clause works. What is the difference between the WHERE and
HAVING clause?
5. Explain insert, delete and update statements in SQL and give example for each.
6. Write a note on:
i) Views in SQL
ii) Aggregate functions in SQL
7. Explain DROP command with an example.
8. How is view created and dropped? What problems are associated with updating views?
9. How are triggers and assertions defined in SQL? Explain.
10. Consider the following schema for a COMPANY database:
EMPLOYEE (Fname, Lname, Ssn, Address, Super-ssn, Salary, Dno)
DEPARTMENT (Dname, Dnumber, Mgr-ssn, Mgr-start-date)
DEPT-LOCATIONS (Dnumber, Dlocation)
PROJECT (Pname, Pnumber, Plocation, Dnum)
WORKS-ON (Essn, Pno, Hours)
DEPENDENT (Essn, Dependent-name, Sex, Bdate, Relationship)
Write the SQL queries for the following:
i) List the names of managers who have at least one dependent.
ii) Retrieve the list of employees and the projects they are working on, ordered by department and,
within each department, ordered alphabetically by last name, first name.
iii) For each project, retrieve the project number, the project name, and the number of
employees who work on that project.
iv) For each project on which more than two employees work, retrieve the project
number, the project name, and the number of employees who work on the project.
v) For each project, retrieve the project number, the project name, and the number of
employees from department 4 who work on the project.