RDBMS Chapter 3
3. Database Integrity & Security Concepts
Domain Constraints
1. A domain of possible values should be associated with every attribute. These domain
constraints are the most basic form of integrity constraint.
They are easy to test for when data is entered.
2. Domain types
1. Attributes may have the same domain, e.g. cname and employee-name.
2. It is not as clear whether bname and cname domains ought to be distinct.
3. At the implementation level, they are both character strings.
4. At the conceptual level, we do not expect customers to have the same names
as branches, in general.
5. Strong typing of domains allows us to test for values inserted, and whether
queries make sense. Newer systems, particularly object-oriented database
systems, offer a rich set of domain types that can be extended easily.
3. The check clause in SQL-92 permits domains to be restricted in powerful ways that
most programming language type systems do not permit.
1. The check clause permits the schema designer to specify a predicate that must be
satisfied by any value assigned to a variable whose type is the domain.
2. Example:
create domain hourly-wage numeric(5,2)
constraint wage-value-test check (value >= 4.00)
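As a sketch of the effect of such a constraint, the wage rule above can be tried out in SQLite (which has no CREATE DOMAIN, so the predicate is attached directly to the column as a CHECK clause instead; table and column names are illustrative):

```python
import sqlite3

# Column-level CHECK standing in for the wage-value-test domain constraint.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE payroll (
        emp_name    TEXT,
        hourly_wage NUMERIC CHECK (hourly_wage >= 4.00)
    )
""")
conn.execute("INSERT INTO payroll VALUES ('Jones', 12.50)")  # accepted

try:
    conn.execute("INSERT INTO payroll VALUES ('Smith', 2.00)")  # violates CHECK
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```

The violating row is rejected at insertion time, which is exactly the "easy to test for when data is entered" property noted above.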
Referential integrity
Referential integrity is a database constraint that ensures that references between data are
indeed valid and intact. Referential integrity is a property of data which, when satisfied,
requires every value of one attribute (column) of a relation (table) to exist as a value of
another attribute (column) in a different (or the same) relation (table).
Referential integrity in a relational database is consistency between coupled tables.
Referential integrity is usually enforced by the combination of a primary key and a foreign
key. For referential integrity to hold, any field in a table that is declared a foreign key can
contain only values from a parent table's primary key field.
Referential integrity is a feature provided by relational database management systems
(RDBMSs) that prevents users or applications from entering inconsistent data. Most
RDBMSs have various referential integrity rules that you can apply when you create a
relationship between two tables.
Referential integrity is a database management safeguard that ensures every foreign key
matches a primary key. For example, customer numbers in a customer file are the primary
keys, and customer numbers in the order file are the foreign keys. If a customer record is
deleted, the order records must also be deleted; otherwise they are left without a primary
reference. If the DBMS does not test for this, it must be programmed into the applications.
The SQL syntax for defining referential integrity looks essentially like the following. The
words in capital letters denote keywords, and the brackets indicate optional parameters. The
foreign key columns are in table1 and the primary key (or another unique combination of
columns) is in table2:
CREATE TABLE table1 (
... ,
FOREIGN KEY (column-1 [, column-2 ...])
REFERENCES table2 [(column-a [, column-b ...])]
[ON DELETE {CASCADE | NO ACTION | SET NULL | SET DEFAULT}]
)
In SQL a foreign key can refer to any unique combination of columns in the referenced table.
If the referenced column list is omitted, the foreign key refers to the primary key. SQL
provides the following referential integrity actions for deletions.
• Cascade. The deletion of a record may cause the deletion of corresponding foreign-key
records. For example, if you delete a company, you might also want to delete the company's
history of addresses.
• No action. Alternatively, you may forbid the deletion of a record if there are dependent
foreign-key records. For example, if you have sold products to a company, you might want to
prevent deletion of the company record.
• Set null. The deletion of a record may cause the corresponding foreign keys to be set to
null. For example, if there is an aircraft substitution on a flight, you may want to nullify some
seat assignments. (These passengers must then request other seat assignments.)
• Set default. You may set a foreign key to a default value instead of to null upon deletion of
a record.
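The Cascade action can be sketched with SQLite (foreign-key enforcement must be switched on per connection; table and column names echo the customer/order example above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite requires this per connection
conn.execute("CREATE TABLE customer (cust_no INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""
    CREATE TABLE orders (
        order_no INTEGER PRIMARY KEY,
        cust_no  INTEGER REFERENCES customer(cust_no) ON DELETE CASCADE
    )
""")
conn.execute("INSERT INTO customer VALUES (1, 'Acme')")
conn.execute("INSERT INTO orders VALUES (100, 1)")
conn.execute("INSERT INTO orders VALUES (101, 1)")

# Deleting the parent customer cascades to its dependent orders.
conn.execute("DELETE FROM customer WHERE cust_no = 1")
remaining = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(remaining)  # 0
```

Replacing ON DELETE CASCADE with SET NULL would instead leave the order rows in place with a null cust_no.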
Database Security
Security protects data from intentional or accidental misuse or destruction, by
controlling access to the data.
Database security is concerned with the ability of the system to enforce a security policy
governing the disclosure, modification or destruction of information.
Issues
• Confidentiality
– information is only disclosed to authorized users
• Integrity
– information is only modified by authorized users
• Availability
– information is accessible by authorized users
Types of Security
• Authorization Policies
– Disclosure and modification of data
• Data Consistency Policies
– Consistency and correctness of data
• Availability Policies
– Availability of information to users
• Identification/Authentication/Audit Policies
– Identifying and authenticating users, and auditing their access to data
Authentication vs Authorization
Authentication is the process by which a system securely identifies its users.
Authentication establishes the identity of the user and verifies that the user is the person
he/she claims to be. Determining the level of access (what resources are made
accessible to the user) of an authenticated user is done by authorization.
What is Authentication?
Authentication is used to establish the identity of a user who is trying to use a system.
Establishing the identity is done by testing a unique piece of information that is known only
by the user being authenticated and the authentication system. This unique piece of
information could be a password, or a physical property that is unique to the user such as a
fingerprint or another biometric. Authentication systems work by challenging the user to
provide the unique piece of information, and if the system can verify that information the
user is considered authenticated. Authentication systems range from simple password-
challenge systems to complicated systems such as Kerberos. Local authentication methods
are the simplest and most common authentication systems used. In this kind of system, the
usernames and passwords of authenticated users are stored on the local server system. When a
user wants to log in, he/she sends his/her username and password in plaintext to the server. It
compares the received information with the database and if it is a match, the user is
authenticated. Advanced authentication systems like Kerberos use trusted authentication
servers to provide authentication services.
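A minimal sketch of local authentication follows (the user store, names, and iteration count are made up; the text above describes plaintext password storage, but the sketch stores a salted hash instead, which is how real systems do it):

```python
import hashlib
import hmac
import os

def make_record(password: str):
    # Store a per-user random salt and a salted, slow hash of the password.
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest

def authenticate(store: dict, username: str, password: str) -> bool:
    if username not in store:
        return False
    salt, digest = store[username]
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    # Constant-time comparison to avoid timing side channels.
    return hmac.compare_digest(candidate, digest)

users = {"alice": make_record("s3cret")}
print(authenticate(users, "alice", "s3cret"))  # True
print(authenticate(users, "alice", "wrong"))   # False
```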
What is Authorization?
The method that is used to determine the resources that are accessible to an authenticated
user is called authorization. For example, in a database, some users can update or modify
the database, while other users can only read the data. So, when a user logs in to the
database, the authorization scheme determines whether that user should be given the ability to
modify the database or just the ability to read the data. In general, an authorization
scheme determines whether an authenticated user should be able to perform a particular
operation on a particular resource. In addition, authorization schemes can use factors like the
time of day, physical location, and number of accesses to the system when authorizing users
to access resources in the system.
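The core authorization decision can be sketched as a lookup in an access matrix (the users, operations, and resource names here are hypothetical):

```python
# Access matrix: (user, resource) -> set of permitted operations.
PERMISSIONS = {
    ("alice", "db"): {"read", "write"},
    ("bob",   "db"): {"read"},
}

def authorized(user: str, operation: str, resource: str) -> bool:
    # Default-deny: anything not explicitly granted is refused.
    return operation in PERMISSIONS.get((user, resource), set())

print(authorized("alice", "write", "db"))  # True
print(authorized("bob", "write", "db"))    # False
print(authorized("eve", "read", "db"))     # False
```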
Authentication is the process of verifying the identity of a user who is trying to gain access to
a system, whereas authorization is a method that is used to determine the resources that are
accessible to an authenticated user. Even though authentication and authorization perform
two different tasks, they are closely related. In fact, in most host-based and client/
server systems, these two mechanisms are implemented using the same hardware/software
systems. The authorization scheme depends on the authentication scheme to ensure the
identities of the users who enter into the system and get access to the resources.
Controlling Access:
Discretionary Access Control :
The typical method of enforcing discretionary access control in a database system is based
on the granting and revoking of privileges. Let us consider privileges in the context of a
relational DBMS. We will discuss a system of privileges somewhat like the one originally
developed for the SQL language. Many current relational DBMSs use some variation of this
technique. The main idea is to include statements in the query language that allow the DBA
and selected users to grant and revoke privileges.
1. Types of Discretionary Privileges
In SQL the concept of an authorization identifier is used to refer, roughly speaking,
to a user account (or group of user accounts). For simplicity, we will use the words
user or account interchangeably in place of authorization identifier. The DBMS must
provide selective access to each relation in the database based on specific accounts.
Operations may also be controlled; thus, having an account does not necessarily
entitle the account holder to all the functionality provided by the DBMS. Informally,
there are two levels for assigning privileges to use the database system:
The account level. At this level, the DBA specifies the particular privileges that each account
holds independently of the relations in the database.
The relation (or table) level. At this level, the DBA can control the privilege to access each
individual relation or view in the database.
At the relation level, the privileges include the SELECT privilege on R (to retrieve tuples
from R), the modification privileges on R (to INSERT, DELETE, and UPDATE tuples of R),
and the references privilege on R. The references privilege gives the account the capability
to reference (or refer to) a relation R when specifying integrity constraints. This privilege can
also be restricted to specific attributes of R.
2. Specifying Privileges through Views
Notice that to create a view, the account must have the SELECT privilege on all relations
involved in the view definition in order to specify the query that corresponds to the view.
3. Revoking of Privileges
In some cases it is desirable to grant a privilege to a user temporarily. For example, the owner
of a relation may want to grant the SELECT privilege to a user for a specific task and then
revoke that privilege once the task is completed. Hence, a mechanism for revoking privileges
is needed. In SQL a REVOKE command is included for the purpose of canceling privileges.
Suppose that the DBA creates four accounts—A1, A2, A3, and A4—and wants only A1 to be
able to create base relations. To do this, the DBA must issue the following GRANT
command in SQL:
GRANT CREATETAB TO A1;
The CREATETAB (create table) privilege gives account A1 the capability to create new
database tables (base relations) and is hence an account privilege. This privilege was part of
earlier versions of SQL but is now left to each individual system implementation to define.
In SQL2 the same effect can be accomplished by having the DBA issue a
CREATE SCHEMA command, as follows:
CREATE SCHEMA EXAMPLE AUTHORIZATION A1;
User account A1 can now create tables under the schema called EXAMPLE. To continue our
example, suppose that A1 creates the two base relations EMPLOYEE and DEPARTMENT
shown in the following figure; A1 is then the owner of these two relations and hence has all
the relation privileges on each of them.
Next, suppose that account A1 wants to grant to account A2 the privilege to insert and delete
tuples in both relations. However, A1 does not want A2 to be able to propagate these
privileges to additional accounts. A1 can issue the following command:
GRANT INSERT, DELETE ON EMPLOYEE, DEPARTMENT TO A2;
Notice that the owner account A1 of a relation automatically has the GRANT OPTION,
allowing it to grant privileges on the relation to other accounts. However, account A2 cannot
grant INSERT and DELETE privileges on the EMPLOYEE and DEPARTMENT tables
because A2 was not given the GRANT OPTION in the preceding command.
Next, suppose that A1 wants to allow account A3 to retrieve information from either of the
two tables and to be able to propagate the SELECT privilege to other accounts. A1 can issue
the following command:
GRANT SELECT ON EMPLOYEE, DEPARTMENT TO A3 WITH GRANT OPTION;
Suppose that A3 in turn grants the SELECT privilege on EMPLOYEE to A4:
GRANT SELECT ON EMPLOYEE TO A4;
Notice that A4 cannot propagate the SELECT privilege to other accounts because the
GRANT OPTION was not given to A4.
Now suppose that A1 decides to revoke the SELECT privilege on the EMPLOYEE relation
from A3; A1 then can issue this command:
REVOKE SELECT ON EMPLOYEE FROM A3;
The DBMS must now revoke the SELECT privilege on EMPLOYEE from A3, and it must
also automatically revoke the SELECT privilege on EMPLOYEE from A4. This is because
A3 granted that privilege to A4, but A3 does not have the privilege any more.
Next, suppose that A1 wants to give back to A3 a limited capability to SELECT from the
EMPLOYEE relation and wants to allow A3 to be able to propagate the privilege. The
limitation is to retrieve only the Name, Bdate, and Address attributes and only for the tuples
with Dno = 5. A1 then can create the following view:
CREATE VIEW A3EMPLOYEE AS
SELECT Name, Bdate, Address FROM EMPLOYEE WHERE Dno = 5;
After the view is created, A1 can grant SELECT on the view A3EMPLOYEE to A3 as
follows:
GRANT SELECT ON A3EMPLOYEE TO A3 WITH GRANT OPTION;
Finally, suppose that A1 wants to allow A4 to update only the Salary attribute of
EMPLOYEE; A1 can then issue the following command:
GRANT UPDATE ON EMPLOYEE (Salary) TO A4;
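The cascading behaviour of REVOKE described above can be sketched as a toy model (not SQL; account names mirror the A1/A3/A4 chain): revoking a privilege from an account also withdraws every grant that account passed on, once it no longer holds the privilege from any source.

```python
grants = []  # list of (grantor, grantee, privilege) triples

def grant(grantor: str, grantee: str, privilege: str) -> None:
    grants.append((grantor, grantee, privilege))

def revoke(grantor: str, grantee: str, privilege: str) -> None:
    grants.remove((grantor, grantee, privilege))
    # If the grantee no longer holds the privilege from anyone, cascade:
    # withdraw everything the grantee had granted onward.
    if not any(ge == grantee and p == privilege for _, ge, p in grants):
        for _, onward, _ in [t for t in grants
                             if t[0] == grantee and t[2] == privilege]:
            revoke(grantee, onward, privilege)

grant("A1", "A3", "SELECT")   # A1 grants SELECT to A3 with grant option
grant("A3", "A4", "SELECT")   # A3 passes SELECT on to A4
revoke("A1", "A3", "SELECT")  # revoking from A3 also removes A4's privilege
print(grants)  # []
```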
Techniques to limit the propagation of privileges have been developed, although they have
not yet been implemented in most DBMSs and are not a part of SQL. Limiting horizontal
propagation to an integer number i means that an account B given the GRANT OPTION can
grant the privilege to at most i other accounts.
Vertical propagation is more complicated; it limits the depth of the granting of privileges.
Granting a privilege with a vertical propagation of zero is equivalent to granting the privilege
with no GRANT OPTION. If account A grants a privilege to account B with the vertical
propagation set to an integer number j > 0, this means that account B has the GRANT
OPTION on that privilege, but B can grant the privilege to other accounts only with a vertical
propagation less than j. In effect, vertical propagation limits the sequence of GRANT
OPTIONs that can be given from one account to the next based on a single original grant of
the privilege.
Mandatory Access Control :
Typical security classes are top secret (TS), secret (S), confidential (C), and unclassified (U),
where TS is the highest level and U the lowest. Other more complex security classification
schemes exist, in which the security classes are organized in a lattice. For simplicity, we will
use the system with four security classification levels, where TS ≥ S ≥ C ≥ U, to illustrate our
discussion. The commonly used model for multilevel security, known as the Bell-LaPadula
model, classifies each subject (user, account, program) and object (relation, tuple, column,
view, operation) into one of the security classifications TS, S, C, or U. We will refer to the
clearance (classification) of a subject S as class(S) and to the classification of an object O as
class(O). Two restrictions are enforced on data access based on the subject/object
classifications:
1. A subject S is not allowed read access to an object O unless class(S) ≥ class(O). This
is known as the simple security property.
2. A subject S is not allowed to write an object O unless class(S) ≤ class(O). This is
known as the star property (or *-property).
The first restriction is intuitive and enforces the obvious rule that no subject can read an
object whose security classification is higher than the subject’s security clearance. The
second restriction is less intuitive. It prohibits a subject from writing an object at a lower
security classification than the subject’s security clearance. Violation of this rule would allow
information to flow from higher to lower classifications, which violates a basic tenet of
multilevel security. For example, a user (subject) with TS clearance may make a copy of an
object with classification TS and then write it back as a new object with classification U, thus
making it visible throughout the system.
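The two Bell-LaPadula restrictions can be sketched directly for the four-level ordering TS ≥ S ≥ C ≥ U used above:

```python
# Numeric ranks encoding the ordering TS >= S >= C >= U.
LEVEL = {"U": 0, "C": 1, "S": 2, "TS": 3}

def can_read(subject_class: str, object_class: str) -> bool:
    # Simple security property: no read up.
    return LEVEL[subject_class] >= LEVEL[object_class]

def can_write(subject_class: str, object_class: str) -> bool:
    # Star property (*-property): no write down.
    return LEVEL[subject_class] <= LEVEL[object_class]

print(can_read("S", "C"))    # True  - reading down is allowed
print(can_read("C", "TS"))   # False - no read up
print(can_write("TS", "U"))  # False - no write down (prevents leaking TS data)
```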
To incorporate multilevel security notions into the relational database model, it is common to
consider attribute values and tuples as data objects. Hence, each attribute A is associated with
a classification attributeC in the schema, and each attribute value in a tuple is associated
with a corresponding security classification. In addition, in some models, a tuple
classification attribute TC is added to the relation attributes to provide a classification for
each tuple as a whole. The model we describe here is known as the multilevel model, because
it allows classifications at multiple security levels. A multilevel relation schema R with n
attributes would be represented as:
R(A1, C1, A2, C2, ..., An, Cn, TC)
where each Ci represents the classification attribute associated with attribute Ai.
The value of the tuple classification attribute TC in each tuple t is the highest of all
attribute classification values Ci within t; it provides a general classification for the tuple
itself, whereas each attribute classification Ci provides a finer security classification for each
attribute value within the tuple.
The apparent key of a multilevel relation is the set of attributes that would have formed the
primary key in a regular (single-level) relation. A multilevel relation will appear to contain
different data to subjects (users) with different clearance levels. In some cases, it is possible
to store a single tuple in the relation at a higher classification level and produce the
corresponding tuples at a lower-level classification through a process known as filtering. In
other cases, it is necessary to store two or more tuples at different classification levels with
the same value for the apparent key.
This leads to the concept of polyinstantiation, where several tuples can have the same
apparent key value but have different attribute values for users at different clearance levels.
We illustrate these concepts with the simple example of a multilevel relation shown in Figure
24.2(a), where we display the classification attribute values next to each attribute’s value.
Assume that the Name attribute is the apparent key, and consider the query
SELECT * FROM EMPLOYEE. A user with security clearance S would see the same relation
shown in Figure 24.2(a), since all tuple classifications are less than or equal to S. However, a
user with security clearance C would not be allowed to see the values for Salary of ‘Brown’
and Job_performance of ‘Smith’, since they have a higher classification. The tuples would be
filtered to appear as shown in Figure 24.2(b), with Salary and Job_performance appearing as
null. For a user with security clearance U, the filtering allows only the Name attribute of
‘Smith’ to appear, with all the other attributes appearing as null (Figure 24.2(c)). Thus,
filtering introduces null values for attribute values whose security classification is higher
than the user’s security clearance.
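The filtering just described can be sketched as follows (the data mirrors the text: Smith's Job_performance and Brown's Salary are classified S; the specific values are illustrative stand-ins for Figure 24.2):

```python
ORDER = {"U": 0, "C": 1, "S": 2, "TS": 3}

# Each attribute is stored as a (value, classification) pair.
EMPLOYEE = [
    {"Name": ("Smith", "U"), "Salary": (40000, "C"),
     "Job_performance": ("Fair", "S")},
    {"Name": ("Brown", "C"), "Salary": (80000, "S"),
     "Job_performance": ("Good", "C")},
]

def filter_for(clearance: str, relation):
    visible = []
    for t in relation:
        # Hide the whole tuple if the apparent key (Name) is not visible.
        if ORDER[t["Name"][1]] > ORDER[clearance]:
            continue
        row = {}
        for attr, (value, cls) in t.items():
            # Values classified above the clearance appear as null (None).
            row[attr] = value if ORDER[cls] <= ORDER[clearance] else None
        visible.append(row)
    return visible

for row in filter_for("C", EMPLOYEE):
    print(row)  # Smith's Job_performance and Brown's Salary appear as None
```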
In general, the entity integrity rule for multilevel relations states that all attributes that are
members of the apparent key must not be null and must have the same security classification
within each individual tuple. Additionally, all other attribute values in the tuple must have a
security classification greater than or equal to that of the apparent key. This constraint
ensures that a user can see the key if the user is permitted to see any part of the tuple. Other
integrity rules, called null integrity and interinstance integrity, informally ensure that if a
tuple value at some security level can be filtered (derived) from a higher-classified tuple, then
it is sufficient to store the higher-classified tuple in the multilevel relation.
To illustrate polyinstantiation further, suppose that a user with security clearance C tries to
update the value of Job_performance of ‘Smith’ in Figure 24.2 to ‘Excellent’; this
corresponds to the following SQL update being submitted by that user:
UPDATE EMPLOYEE
SET Job_performance = 'Excellent'
WHERE Name = 'Smith';
Since the view provided to users with security clearance C (see Figure 24.2(b)) permits such
an update, the system should not reject it; otherwise, the user could infer that some nonnull
value exists for the Job_performance attribute of ‘Smith’ rather than the null value that
appears. This is an example of inferring information through what is known as a covert
channel, which should not be permitted in highly secure systems (see Section 24.6.1).
However, the user should not be allowed to overwrite the existing value of Job_performance
at the higher classification level. The solution is to create a polyinstantiation for the ‘Smith’
tuple at the lower classification level C, as shown in Figure 24.2(d). This is necessary since
the new tuple cannot be filtered from the existing tuple at classification S.
Comparing Discretionary and Mandatory Access Control :
Discretionary access control (DAC) policies are characterized by a high degree of flexibility,
which makes them suitable for a large variety of application domains. The main drawback of
DAC models is their vulnerability to malicious attacks, such as Trojan horses embedded in
application programs. The reason is that discretionary authorization models do not impose
any control on how information is propagated and used once it has been accessed by users
authorized to do so. By contrast, mandatory policies ensure a high degree of protection—in a
way, they prevent any illegal flow of information. Therefore, they are suitable for military
and high security types of applications, which require more protection. However, mandatory
policies have the drawback of being too rigid in that they require a strict classification of
subjects and objects into security levels, and therefore they are applicable to few
environments. In many practical situations, discretionary policies are preferred because they
offer a better tradeoff between security and applicability.
Role-based access control (RBAC) emerged rapidly in the 1990s as a proven technology for
managing and enforcing security in large-scale enterprise-wide systems. Its basic notion is
that privileges and other permissions are associated with organizational roles, rather than
individual users. Individual users are then assigned to appropriate roles. Roles can be created
using the CREATE ROLE and DESTROY ROLE commands. The GRANT and REVOKE
commands discussed in Section 24.2 can then be used to assign and revoke privileges from
roles, as well as for individual users when needed. For example, a company may have roles
such as sales account manager, purchasing agent, mailroom clerk, department manager, and
so on. Multiple individuals can be assigned to each role. Security privileges that are common
to a role are granted to the role name, and any individual assigned to this role would
automatically have those privileges granted.
RBAC can be used with traditional discretionary and mandatory access controls; it ensures
that only authorized users in their specified roles are given access to certain data or resources.
Users create sessions during which they may activate a subset of roles to which they belong.
Each session can be assigned to several roles, but it maps to one user or a single subject only.
Many DBMSs have allowed the concept of roles, where privileges can be assigned to roles.
The role hierarchy in RBAC is a natural way to organize roles to reflect the organization’s
lines of authority and responsibility. By convention, junior roles at the bottom are connected
to progressively senior roles as one moves up the hierarchy. The hierarchic diagrams are
partial orders, so they are reflexive, transitive, and antisymmetric. In other words, if a user
has one role, the user automatically has roles lower in the hierarchy. Defining a role hierarchy
involves choosing the type of hierarchy and the roles, and then implementing the hierarchy
by granting roles to other roles. A role hierarchy can be implemented in the following manner:
GRANT ROLE full_time TO employee_type1;
GRANT ROLE intern TO employee_type2;
The above are examples of granting the roles full_time and intern to two types of employees.
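The inheritance property of a role hierarchy (a senior role automatically holds the privileges of the junior roles below it) can be sketched as follows; the role and privilege names are made up for illustration:

```python
# junior -> senior edges: each role lists the junior roles it inherits from.
ROLE_JUNIORS = {
    "employee": [],
    "engineer": ["employee"],
    "manager":  ["engineer"],
}
# Privileges granted directly to each role name.
ROLE_PRIVS = {
    "employee": {"read_handbook"},
    "engineer": {"commit_code"},
    "manager":  {"approve_budget"},
}

def privileges(role: str) -> set:
    # A role holds its own privileges plus those of every junior role.
    privs = set(ROLE_PRIVS[role])
    for junior in ROLE_JUNIORS[role]:
        privs |= privileges(junior)
    return privs

print(sorted(privileges("manager")))
# ['approve_budget', 'commit_code', 'read_handbook']
```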
Another issue related to security is identity management. Identity refers to a unique name of
an individual person. Since the legal names of persons are not necessarily unique, the identity
of a person must include sufficient additional information to make the complete name unique.
Authorizing this identity and managing the schema of these identities is called Identity
Management. Identity Management addresses how organizations can effectively
authenticate people and manage their access to confidential information. It has become more
visible as a business requirement across all industries affecting organizations of all sizes.
Identity Management administrators constantly need to satisfy application owners while
keeping expenditures under control and increasing IT efficiency.
Another important consideration in RBAC systems is the possible temporal constraints that
may exist on roles, such as the time and duration of role activations, and timed triggering of a
role by an activation of another role. Using an RBAC model is a highly desirable goal for
addressing the key security requirements of Web-based applications. Roles can be assigned to
workflow tasks so that a user with any of the roles related to a task may be authorized to
execute it and may play a certain role only for a certain duration.
RBAC models have several desirable features, such as flexibility, policy neutrality, better
support for security management and administration, and other aspects that make them
attractive candidates for developing secure Web-based applications. These features are
lacking in DAC and MAC models. In addition, RBAC models include the capabilities
available in traditional DAC and MAC policies. Furthermore, an RBAC model provides
mechanisms for addressing the security issues related to the execution of tasks and
workflows, and for specifying user-defined and organization-specific policies. Easier
deployment over the Internet has been another reason for the success of RBAC models.
Database encryption
Database encryption can generally be defined as a process that uses an algorithm to
transform data stored in a database into "cipher text" that is incomprehensible without first
being decrypted. It can therefore be said that the purpose of database encryption is to protect
the data stored in a database from being accessed by individuals with potentially "malicious"
intentions. The act of encrypting a database also reduces the incentive for individuals to hack
the aforementioned database as "meaningless" encrypted data is of little to no use for hackers.
There are multiple techniques and technologies available for database encryption.
DES is an implementation of a Feistel cipher. It uses a 16-round Feistel structure. The block
size is 64 bits. Although the key length is 64 bits, DES has an effective key length of 56 bits,
since 8 of the 64 bits of the key are not used by the encryption algorithm (they function as
check bits only). The general structure of DES is depicted in the following illustration −
Since DES is based on the Feistel Cipher, all that is required to specify DES is −
Round function
Key schedule
Any additional processing − Initial and final permutation
The initial and final permutations are straight permutation boxes (P-boxes) that are inverses
of each other. They have no cryptographic significance in DES. The initial and final
permutations are shown as follows −
Round Function
The heart of this cipher is the DES function, f. The DES function applies a 48-bit key to the
rightmost 32 bits to produce a 32-bit output.
Expansion Permutation Box − Since the right input is 32 bits and the round key is 48 bits,
we first need to expand the right input to 48 bits. The permutation logic is graphically
depicted in the following illustration −
XOR (Whitener) − After the expansion permutation, DES performs an XOR operation on
the expanded right section and the round key. The round key is used only in this
operation.
Substitution Boxes − The S-boxes carry out the real mixing (confusion). DES uses 8
S-boxes, each with a 6-bit input and a 4-bit output. Refer to the following illustration −
The S-box rule is illustrated below −
There are a total of eight S-box tables. The output of all eight S-boxes is then
combined into a 32-bit section.
Straight Permutation − The 32-bit output of the S-boxes is then subjected to the straight
permutation with the rule shown in the following illustration:
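The S-box rule can be sketched concretely with S-box S1 from the DES standard: the outer two bits of the 6-bit input select the row, the inner four bits select the column, and the table entry is the 4-bit output.

```python
# S-box S1 of DES (from the published standard): 4 rows x 16 columns.
S1 = [
    [14,  4, 13,  1,  2, 15, 11,  8,  3, 10,  6, 12,  5,  9,  0,  7],
    [ 0, 15,  7,  4, 14,  2, 13,  1, 10,  6, 12, 11,  9,  5,  3,  8],
    [ 4,  1, 14,  8, 13,  6,  2, 11, 15, 12,  9,  7,  3, 10,  5,  0],
    [15, 12,  8,  2,  4,  9,  1,  7,  5, 11,  3, 14, 10,  0,  6, 13],
]

def sbox1(six_bits: int) -> int:
    # Row = outer bits (b1, b6); column = inner bits (b2..b5).
    row = ((six_bits >> 5) & 1) << 1 | (six_bits & 1)
    col = (six_bits >> 1) & 0b1111
    return S1[row][col]

print(sbox1(0b011011))  # 5, i.e. output 0101 - the classic worked example
```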
Key Generation
The round-key generator creates sixteen 48-bit keys out of a 56-bit cipher key. The process of
key generation is depicted in the following illustration −
The logic for Parity drop, shifting, and Compression P-box is given in the DES description.
DES Analysis
DES satisfies both of the desired properties of a block cipher. These two properties make the
cipher very strong.
Avalanche effect − A small change in the plaintext results in a very great change in the
ciphertext.
Completeness − Each bit of the ciphertext depends on many bits of the plaintext.
During the last few years, cryptanalysts have found some weaknesses in DES when the
selected keys are weak keys. Such keys should be avoided.
DES has proved to be a very well designed block cipher. There have been no significant
cryptanalytic attacks on DES other than exhaustive key search.
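The avalanche effect can be demonstrated numerically; a full DES implementation is too long to show here, so this sketch uses SHA-256, which is designed to have the same property: flipping a single input bit changes roughly half of the 256 output bits.

```python
import hashlib

def as_int(digest: bytes) -> int:
    return int.from_bytes(digest, "big")

# Two inputs differing in exactly one bit ('e' = 0x65, 'd' = 0x64).
d1 = hashlib.sha256(b"avalanche").digest()
d2 = hashlib.sha256(b"avalanchd").digest()

# Count the output bits that differ.
diff = bin(as_int(d1) ^ as_int(d2)).count("1")
print(diff)  # typically close to 128 out of 256 bits
```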
Digital signatures
Digital signatures are the public-key primitives of message authentication. In the physical
world, it is common to use handwritten signatures on handwritten or typed messages. They
are used to bind the signatory to the message.
Similarly, a digital signature is a technique that binds a person/entity to the digital data. This
binding can be independently verified by receiver as well as any third party.
A digital signature is a cryptographic value that is calculated from the data and a secret key
known only to the signer.
In the real world, the receiver of a message needs assurance that the message belongs to the
sender and that the sender should not be able to repudiate the origination of that message.
This requirement is very crucial in business applications, since the likelihood of a dispute
over exchanged data is very high.
As mentioned earlier, the digital signature scheme is based on public key cryptography. The
model of digital signature scheme is depicted in the following illustration −
It should be noted that instead of signing the data directly with the signing algorithm, usually
a hash of the data is created. Since the hash of the data is a unique representation of the data,
it is sufficient to sign the hash in place of the data. The most important reason for using the
hash instead of the data directly for signing is the efficiency of the scheme.
Let us assume RSA is used as the signing algorithm. As discussed in public key encryption
chapter, the encryption/signing process using RSA involves modular exponentiation.
Signing large data through modular exponentiation is computationally expensive and time
consuming. The hash of the data is a relatively small digest of the data, hence signing a hash
is more efficient than signing the entire data.
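The hash-then-sign flow can be sketched with deliberately tiny, insecure RSA parameters (the primes, exponent, and message here are toy values chosen only to show the mechanics):

```python
import hashlib

# Toy RSA key: p, q far too small for real use.
p, q = 61, 53
n = p * q                           # 3233
e = 17                              # public exponent
d = pow(e, -1, (p - 1) * (q - 1))   # private exponent (2753)

def h(message: bytes) -> int:
    # Reduce the SHA-256 digest mod n so it fits the toy modulus.
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n

def sign(message: bytes) -> int:
    # Sign the hash of the data, not the data itself.
    return pow(h(message), d, n)

def verify(message: bytes, signature: int) -> bool:
    return pow(signature, e, n) == h(message)

sig = sign(b"pay 100 to Bob")
print(verify(b"pay 100 to Bob", sig))  # True
print(verify(b"pay 900 to Bob", sig))  # almost surely False (tampering detected)
```

Signing one small hash value replaces one modular exponentiation per block of data with a single exponentiation per message, which is the efficiency argument made above.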
Out of all cryptographic primitives, the digital signature using public key cryptography is
considered a very important and useful tool to achieve information security.
Apart from the ability to provide non-repudiation of a message, the digital signature also
provides message authentication and data integrity. Let us briefly see how this is achieved by
the digital signature −
Message authentication − When the verifier validates the digital signature using the
public key of a sender, he is assured that the signature has been created only by the sender
who possesses the corresponding secret private key and no one else.
Data integrity − In case an attacker has access to the data and modifies it, the digital
signature verification at the receiver end fails. The hash of the modified data and the output
provided by the verification algorithm will not match. Hence, the receiver can safely reject
the message, assuming that data integrity has been breached.
Non-repudiation − Since it is assumed that only the signer has knowledge of the
signature key, only the signer can create a unique signature on given data. Thus the receiver
can present the data and the digital signature to a third party as evidence if any dispute
arises in the future.
This makes it essential for users employing PKC for encryption to seek digital signatures along with the encrypted data in order to be assured of message authentication and non-repudiation. This can be achieved by combining digital signatures with an encryption scheme. Let us briefly discuss how to achieve this requirement. There are two possibilities: sign-then-encrypt and encrypt-then-sign.
However, a cryptosystem based on sign-then-encrypt can be exploited by the receiver to spoof the identity of the sender and send that data to a third party. Hence, this method is not preferred. The process of encrypt-then-sign is more reliable and widely adopted.
After receiving the encrypted data and the signature on it, the receiver first verifies the signature using the sender's public key. After ensuring the validity of the signature, he then retrieves the data by decrypting it with his own private key.
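The encrypt-then-sign flow can be sketched with two toy RSA key pairs, one for the sender and one for the receiver. All numbers below are small illustrative values, not real keys:

```python
# Toy key pairs (illustrative only; real keys are far larger).
# Receiver: p = 61, q = 53 -> n = 3233, phi = 3120, e = 17, d = 2753
recv_n, recv_e, recv_d = 3233, 17, 2753
# Sender:   p = 89, q = 97 -> n = 8633, phi = 8448, e = 5,  d = 5069
send_n, send_e, send_d = 8633, 5, 5069

message = 42

# Encrypt-then-sign: encrypt with the receiver's PUBLIC key first...
ciphertext = pow(message, recv_e, recv_n)
# ...then sign the ciphertext with the sender's PRIVATE key.
signature = pow(ciphertext, send_d, send_n)

# Receiver side: verify the signature with the sender's public key...
assert pow(signature, send_e, send_n) == ciphertext
# ...and only then decrypt with the receiver's own private key.
recovered = pow(ciphertext, recv_d, recv_n)
assert recovered == message
```

The sender's modulus is chosen larger than the receiver's so the ciphertext is always a valid number to sign; padding schemes handle this in real systems.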
Symmetric cryptography was well suited for organizations such as governments, the military, and big financial corporations that were involved in classified communication.
With the spread of insecure computer networks over the last few decades, a genuine need was felt to use cryptography on a larger scale. Symmetric-key cryptography proved impractical at that scale because of the challenges of key management. This gave rise to public-key cryptosystems.
Different keys are used for encryption and decryption. This property sets the scheme apart from symmetric encryption.
Each receiver possesses a unique decryption key, generally referred to as his private
key.
The receiver needs to publish an encryption key, referred to as his public key.
Some assurance of the authenticity of a public key is needed in this scheme to prevent an adversary from impersonating the receiver. Generally, this type of cryptosystem involves a trusted third party which certifies that a public key belongs to a specific person or entity only.
The encryption algorithm is complex enough to prevent an attacker from deducing the plaintext from the ciphertext and the encryption (public) key.
Though the private and public keys are related mathematically, it is not feasible to calculate the private key from the public key. In fact, the clever part of any public-key cryptosystem is in designing the relationship between the two keys.
RSA Public Key Encryption Algorithm
This cryptosystem was one of the first public-key systems, and it remains the most widely employed cryptosystem even today. The system was invented by three scholars, Ron Rivest, Adi Shamir, and Len Adleman, and hence is termed the RSA cryptosystem.
We will see two aspects of the RSA cryptosystem: first, the generation of the key pair, and second, the encryption-decryption algorithms.
Each person or party who desires to participate in communication using encryption needs to generate a pair of keys, namely a public key and a private key. The process followed in the generation of keys is described below −
1. Select two large primes, p and q, and compute n = p × q.
2. Choose an encryption exponent e that is coprime to (p − 1)(q − 1).
3. Compute the decryption exponent d such that
e × d ≡ 1 (mod (p − 1)(q − 1))
The Extended Euclidean Algorithm takes p, q, and e as input and gives d as output.
Example
An example of generating an RSA key pair is given below. (For ease of understanding, the primes p and q taken here are small values. In practice, these values are very large.)
Take p = 7 and q = 13, so that n = p × q = 91 and (p − 1)(q − 1) = 72. Choose e = 5, which is coprime to 72. The Extended Euclidean Algorithm then gives the private exponent d = 29, since
d × e = 29 × 5 = 145 ≡ 1 (mod 72)
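The computation of d can be sketched with the extended Euclidean algorithm in Python, using the small example values above (the function names are mine, for illustration):

```python
def extended_gcd(a: int, b: int):
    # Returns (g, x, y) such that a*x + b*y == g == gcd(a, b).
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    return g, y, x - (a // b) * y

def rsa_private_exponent(p: int, q: int, e: int) -> int:
    # d is the modular inverse of e modulo (p-1)(q-1).
    phi = (p - 1) * (q - 1)
    g, x, _ = extended_gcd(e, phi)
    assert g == 1, "e must be coprime to (p-1)(q-1)"
    return x % phi

d = rsa_private_exponent(7, 13, 5)   # the example values above
print(d)                             # -> 29, since 29 * 5 = 145 = 2 * 72 + 1
```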
Once the key pair has been generated, the process of encryption and decryption are relatively
straightforward and computationally easy.
Interestingly, RSA does not directly operate on strings of bits, as is the case with symmetric-key encryption. It operates on numbers modulo n. Hence, it is necessary to represent the plaintext as a series of numbers less than n.
RSA Encryption
Suppose the sender wishes to send a text message to someone whose public key is
(n, e).
The sender then represents the plaintext as a series of numbers less than n.
To encrypt the first plaintext block P, which is a number modulo n, the encryption process is a simple mathematical step:
C = P^e mod n
In other words, the ciphertext C is equal to the plaintext P multiplied by itself e times
and then reduced modulo n. This means that C is also a number less than n.
Returning to our key generation example, with plaintext P = 10 we get the ciphertext C −
C = 10^5 mod 91 = 82
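This computation can be checked directly with Python's built-in three-argument pow, which performs modular exponentiation:

```python
# Encrypt P = 10 with the example public key (n, e) = (91, 5).
P, n, e = 10, 91, 5
C = pow(P, e, n)
print(C)   # -> 82
```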
RSA Decryption
The decryption process for RSA is also very straightforward. Suppose the owner of the public key (n, e) has received a ciphertext C.
The receiver raises C to the power of his private key d. The result modulo n will be the plaintext P:
P = C^d mod n
Returning again to our numerical example, the ciphertext C = 82 is decrypted to the number 10 using the private key d = 29 −
P = 82^29 mod 91 = 10
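The decryption step can likewise be checked with Python's modular exponentiation:

```python
# Decrypt C = 82 with the private key d = 29 modulo n = 91.
C, n, d = 82, 91, 29
P = pow(C, d, n)
print(P)   # -> 10, recovering the original plaintext
```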
Symmetric encryption in the context of database encryption involves a single secret key being applied to data that is stored in and retrieved from a database. This key alters the data in a way that renders it unreadable until it is decrypted. Data is encrypted when saved and decrypted when opened, provided that the user knows the secret key. Thus, if data is to be shared through a database, the receiving individual must have a copy of the secret key used by the sender in order to decrypt and view the data. A clear disadvantage of symmetric encryption is that sensitive data can be leaked if the secret key is disclosed to individuals who should not have access to the data. However, given that only one key is involved in the encryption process, speed is generally an advantage of symmetric encryption.
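The same-key-to-encrypt-and-decrypt property can be illustrated with a toy XOR stream construction built from Python's standard library. This is NOT a real cipher and must never be used for actual data protection; it only shows that the holder of the shared secret key can reverse the transformation:

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    # Toy keystream: hash key || counter repeatedly (illustration only).
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def xor_crypt(key: bytes, data: bytes) -> bytes:
    # XOR is its own inverse, so the same function encrypts and decrypts.
    ks = keystream(key, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))

secret = b"shared secret key"          # both parties must hold this
record = b"salary=52000"               # hypothetical database field
ciphertext = xor_crypt(secret, record)     # stored form
plaintext = xor_crypt(secret, ciphertext)  # same key recovers the data
assert plaintext == record
```

Anyone who obtains `secret` can read every record, which is exactly the key-leakage disadvantage described above.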
Cryptosystems
A cryptosystem is an implementation of cryptographic techniques and their accompanying
infrastructure to provide information security services. A cryptosystem is also referred to as a
cipher system.
The objective of this simple cryptosystem is that at the end of the process, only the sender
and the receiver will know the plaintext.
Components of a Cryptosystem
For a given cryptosystem, a collection of all possible decryption keys is called a key space.
An interceptor (an attacker) is an unauthorized entity who attempts to determine the
plaintext. He can see the ciphertext and may know the decryption algorithm. He, however,
must never know the decryption key.
Statistical database
A statistical database is a database used for statistical analysis purposes. It is an OLAP (online analytical processing) system rather than an OLTP (online transaction processing) system. Both modern decision-support databases and classical statistical databases are often closer to the relational model than to the multidimensional model commonly used in OLAP systems today.
Statistical databases typically contain parameter data and the measured data for these
parameters. For example, parameter data consists of the different values for varying
conditions in an experiment (e.g., temperature, time). The measured data (or variables) are
the measurements taken in the experiment under these varying conditions.
Statistical databases often incorporate support for advanced statistical analysis techniques,
such as correlations, which go beyond SQL. They also pose unique security concerns, which
were the focus of much research, particularly in the late 1970s and early to mid-1980s.
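The classic security concern in statistical databases is the inference problem: individually harmless aggregate queries can be combined to reveal a single person's value. A minimal sketch, with an invented table and figures:

```python
# Hypothetical salary table; names and values are invented for illustration.
salaries = {"alice": 52000, "bob": 48000, "carol": 61000}

# Two seemingly innocent aggregate queries over the statistical database...
total_all = sum(salaries.values())
total_without_carol = sum(v for name, v in salaries.items() if name != "carol")

# ...difference them to expose an individual record.
carol_salary = total_all - total_without_carol
print(carol_salary)   # -> 61000, Carol's supposedly private salary
```

Defenses studied in that research include query-set-size restrictions and adding statistical noise to answers.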