Storage and File Structure
Storage and File Structure
So far we have studied the DBMS at level of the logical model. The logical model of a
database system is the correct level for the database users to focus on. The goal of a
database system is to simplify and facilitate access to data. As members of the
development staff and as potential Database Administrators, we need to understand
the physical level better than a typical user.
1. Cache is the fastest and the most costly for of storage. The type of cache
referred to here is the type that is typically built into the CPU chip and is
256KB, 512KB, or 1MB. Thus, cache is used by the operating system and has
no application to database, per se.
2. Main memory is the volatile memory in the computer system that is used to
hold programs and data. While prices have been dropping at a staggering rate,
the increases in the demand for memory have been increasing faster. Today's
32-bit computers have a limitation of 4GB of memory. This may not be
sufficient to hold the entire database and all the associated programs, but the
more memory available will increase the response time of the DBMS. There are
attempts underway to create a system with the most memory that is cost
effective, and to reduce the functionality of the operating system so that only
the DBMS is supported, so that system response can be increased. However,
the contents of main memory are lost if a power failure or system crash occurs.
6. Tape storage is used for backup and archival data. It is cheaper and slower
than all of the other forms, but it does have the feature that there is no limit on
the amount of data that can be stored, since more tapes can be purchased. As
the tapes get increased capacity, however, restoration of data takes longer and
longer, especially when only a small amount of data is to be restored. This is
because the retrieval is sequential, the slowest possible method.
Magnetic Disks
A typical large commercial database may require hundreds of disks!
The surface is divided into a set of tracks (circles). These tracks are divided into a set
of sectors, which is the smallest unit of data that can be written or read at one time.
Sectors can range in size from 31 bytes to 4096 bytes, with 512 bytes being the most
common. A collection of a specific track from both surfaces and from all of the
platters is called a cylinder.
Platters can range in size from 1.8 inches to 14 inches. Today, 5 1/4 inches and 3 1/2
inches are the most common, because they have the highest seek times and lowest
cost.
A disk controller interfaces the computer system and the actual hardware of the disk
drive. The controller accepts high-level command to read or write sectors. The
controller then converts the commands in the necessary specific low-level commands.
The controller will also attempt to protect the integrity of the data by computing and
using checksums for each sector. When attempting to read the data back, the controller
recalculates the checksum and makes several attempts to correctly read the data and
get matching checksums. If the controller is unsuccessful, it will notify the operating
system of the failure.
The controller can also handle the problem of eliminating bad sectors. Should a sector
go bad, the controller logically remaps the sector to one of the extra unused sectors
that disk vendors provide, so that the reliability of the disk system is higher. It is
cheaper to produce disks with a greater amount of sectors than advertised and then
map out bad sectors than it is to produce disks with no bad sectors or with extremely
limited possibility of sectors going bad.
There are many different types of disk controllers, but the most common ones today
are SCSI, IDE, and EIDE.
1. Seek time is the time to reposition the head and increases with the distance that
the head must move. Seek times can range from 2 to 30 milliseconds. Average
seek time is the average of all seek times and is normally one-third of the
worst-case seek time.
2. Rotational latency time is the time from when the head is over the correct track
until the data rotates around and is under the head and can be read. When the
rotation is 120 rotations per second, the rotation time is 8.35 milliseconds.
Normally, the average rotational latency time is one-half of the rotation time.
3. Access time is the time from when a read or write request is issued to when the
data transfer begins. It is the sum of the seek time and latency time.
4. Data-transfer rate is the rate at which data can be retrieved from the disk and
sent to the controller. This will be measured as megabytes per second.
5. Mean time to failure is the number of hours (on average) until a disk fails.
Typical times today range from 30,000 to 800,000 hours (or 3.4 to 91 years).
Requests for disk I/O are generated by both the file system and by the virtual memory
manager found in most systems. Each request specifies the address on the disk to be
referenced; that address specifies is in the form of a block number. Each block is a
contiguous sequence of sectors from a single track of one platter and ranges from 512
bytes to several kilobytes of data. The lower level file manager must convert block
addresses into the hardware-level cylinder, surface, and sector number.
Since access to data on disk is several orders of magnitude slower is access to data in
main memory; much attention has been paid to improving the speed of access to
blocks on the disk. This is also where more main memory can speed up the response
time, by making sure that the data needed is in memory when it is needed.
This is the same problem that is addressed in designing operating systems, to insure
the best response time from the file system manager and the virtual memory manager.
Log disk. You can use a disk for writing a sequential log.
Buffering. The more information you have in buffers in main memory, the
more likely you are to not have to get the information from the disk. However it
is more likely that more of the memory will be wasted with information not
necessary.
RAID
RAIDs are Redundant Arrays of Inexpensive Disks. There are six levels of organizing
these disks:
0 -- Non-redundant Striping
1 -- Mirrored Disks
6 -- P + Q Redundancy
Tertiary Storage
This is commonly optical disks and magnetic tapes.
Storage Access
A database is mapped into a number of different files, which are maintained by the
underlying operating system. Files are organized into block and a block may contain
one or more data item.
A major goal of the DBMS is to minimize the number of block transfers between the
disk and memory. Since it is not possible to keep all blocks in main memory, we need
to manage the allocation of the space available for the storage of blocks. This is also
similar to the problems encountered by the operating system, and can be in conflict
with the operating system, since the OS is concerned with processes and the DBMS is
concerned with only one family of processes.
Buffer Manager
Programs in a DBMS make requests (that is, calls) on the buffer manager when they
need a block from a disk. If the block is already in the buffer, the requester is passed
the address of the block in main memory. If the block in not in the buffer, the buffer
manager first allocates space in the buffer for the block, through out some other block,
if required, to make space for the new block. If the block that is to be thrown out has
been modified, it must first be written back to the disk. The internal actions of the
buffer manager are transparent to the programs that issue disk-block requests.
Replacement strategy. When there is no room left in the buffer, a block must be
removed from the buffer before a new one can be read in. Typically, operating
systems use a least recently use (LRU) scheme. There is also a Most Recent
Used (MRU) that can be more optimal for DBMSs.
Pinned blocks. A block that is not allowed to be written back to disk is said to
be pinned. This could be used to store data that has not been committed yet.
File Organization
Fixed-Length Records
If each character occupies 1 byte and a real occupies 8 bytes, then this record
occupies 40 bytes. If the first record occupies the first 40 bytes and the second
record occupies the second 40 bytes, etc. we have some problems.
Another solution would be to have two sets of pointers. One that would link the
current record to the next logical record (linked list) plus a free list (a list of free
slots.) This increases the size the file.
Variable-Length Records
Record types that allow variable lengths for one or more fields
It could be solved:
Any record can be placed anywhere in the file. There is no ordering of records and
there is a single file for each relation.
Any record can be placed anywhere in the file. A hashing function is computed on
some attribute of each record. The function specifies in which block the record should
be placed.
Several different relations can be stored in the same file. Related records of the
different relations can be stored in the same block.
Integrity constraints