
Disk Storage, Basic File Structures, and Hashing: Dr. Hasnaa Raafat Dr. Nora Zakie

This document discusses disk storage and basic file structures. It begins by introducing primary and secondary storage, with magnetic disks being the most common secondary storage. Data is stored on disks in tracks and sectors, with blocks being the basic unit of transfer between disk and memory. Reading and writing blocks has overhead from seek time, rotational latency, and transfer time. Files are broken into fixed-size pages or blocks for storage across disk blocks. The cost of a page input/output operation depends on these factors.

Uploaded by

Hasnaa Adel
Copyright
© All Rights Reserved

 

Chapter 16

Disk Storage,
Basic File Structures, and
Hashing
Dr. Hasnaa Raafat Dr. Nora Zakie

 1 
 Introduction 

 In a computerized database, the data is stored on computer storage media, which include:

 Primary Storage
 can be processed directly by the CPU
 e.g., the main memory, cache
 fast, expensive, but of limited capacity

 Secondary Storage
 cannot be processed directly by the CPU
 e.g., magnetic disks, optical disks, tapes
 slower and cheaper, but of large capacity
 2 
 Storage Hierarchy 

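[Figure: storage hierarchy, from top to bottom]
Primary storage (volatile): Cache, Main Memory
Secondary storage (non-volatile): Flash Memory, Magnetic Disk
Tertiary storage (non-volatile): Optical Disk, Magnetic Tape
Unit price and speed both decrease from the top of the hierarchy to the bottom.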

 3 
 Storage of Databases 

 For the following reasons, most databases are stored permanently on secondary storage:

 They are too large to fit entirely in main memory

 They must persist over long periods of time, but main memory is volatile storage

 Secondary storage costs less

 4 
 Secondary Storage 

 Magnetic disk: data on it cannot be processed directly by the CPU; it must be brought into main memory first.

 Data is stored on a spinning disk and read/written magnetically

 Primary medium for the long-term storage of data; typically stores the entire database.

 Non-volatile

 Slow access to data

 Large storage capacity (on the order of gigabytes)


 5 
 Disk Storage Devices 

 Preferred secondary storage device for high storage capacity and low cost.
 Data is stored as magnetized areas on magnetic disk surfaces.
 A disk pack contains several magnetic disks connected to a rotating spindle.
 Each disk surface is divided into concentric circular tracks.
 Track capacities typically vary from 4 to 50 Kbytes or more

 6 
 Disk Storage Devices (contd.) 

 A track is divided into smaller blocks or sectors
 because it usually contains a large amount of information
 The division of a track into sectors is hard-coded on the disk surface and cannot be changed.
 In one type of sector organization, a sector is the portion of a track that subtends a fixed angle at the center.
 A track is also divided into blocks.
 The block size B is fixed for each system.
 Typical block sizes range from B = 512 bytes to B = 4096 bytes.
 Whole blocks are transferred between disk and main memory for processing.

 7 
 Disk Storage Devices (contd.) 
 A read-write head moves to the track that contains the block to be transferred.
 Disk rotation moves the block under the read-write head for reading or writing.
 A physical disk block (hardware) address consists of:
 a cylinder number (a cylinder is the imaginary collection of tracks of the same radius from all recorded surfaces)
 the track number or surface number (within the cylinder)
 the block number (within the track).
 Reading or writing a disk block is time consuming because of the seek time s and the rotational delay (latency) rd.
 Double buffering can be used to speed up the transfer of contiguous disk blocks.

 8 
 Physical Characteristics of Disks 

 9 
 Magnetic Hard Disk Mechanism 

NOTE: Diagram is schematic, and simplifies the structure of actual disk drives

 10 
 Components of a Disk 
 The platters spin (say, at 90 rps).
 The arm assembly is moved in or out to position a head on a desired track.
 Read-write head
 Positioned very close to the platter surface (almost touching it)
 Reads or writes magnetically encoded information.
 Only one head reads/writes at any one time.

 The surface of each platter is divided into circular tracks

 11 
 Physical Characteristics of Disks 
 Track
 an information storage circle on the surface of a disk.
 Over 16,000 tracks per platter
 each track can store between 4 KB and 50 KB of data.
 Each track is divided into sectors.
 The tracks currently under the heads make up a cylinder (imaginary!)

 Cylinder
 the tracks with the same diameter on all surfaces of a disk pack.
 Cylinder i consists of the i-th track of all the platters
 12 
 Physical Characteristics of Disks 

 Sector
 a part of a track with fixed size
 separated by fixed-size interblock gaps
 Typical sectors per track: 200 (on inner tracks) to 400 (on outer tracks)

 13 
 Sectors 

 14 
 

 15 
 Pages and Blocks 

 Data files are decomposed into pages (blocks)
 a page is a fixed-size piece of contiguous information in the file
 sizes range from 512 bytes to several kilobytes

 A block is the smallest unit for transferring data between main memory and the disk.

 Address of a page (block):
 (cylinder#, track# (within cylinder), sector# (within track))

 16 
 
Pages and Blocks

[Figure: one track laid out as sectors 1, 2, 3, 4, ... separated by gaps; 1 page/block = 4 sectors]

 17 
 
Page I/O
 Page I/O: one page I/O is the cost (or time needed) to transfer one page of data between memory and the disk.

 The cost of a (random) page I/O = seek time + rotational delay + block transfer time

 Seek time
 time needed to position the read/write head on the correct track.
 Rotational delay (latency)
 time needed for the beginning of the page to rotate under the read/write head.
 Block transfer time
 time needed to transfer the data in the page/block.

 18 
 Page I/O 
 Average rotational delay (rd)
 rd = ½ * (1/p) min = (60*1000)/(2*p) msec
 or, equivalently,
 rd = ½ * time of 1 revolution = ½ * (60*1000/p) msec
 where p is the speed of disk rotation in revolutions per minute (rpm)
 Example
 Speed of disk rotation is p = 3600 rpm = 60 revolutions/sec
 1 revolution = 16.66 msec
 rd = 8.33 msec
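
The rotational-delay formula above can be checked with a few lines of Python. This is only a minimal sketch: the function name `average_rotational_delay_ms` is chosen here for illustration and is not part of the slides.

```python
def average_rotational_delay_ms(p_rpm: float) -> float:
    """Average rotational delay rd in msec for a disk spinning at p_rpm."""
    revolution_ms = 60 * 1000 / p_rpm  # time of one full revolution in msec
    return revolution_ms / 2           # on average, half a revolution is needed


rd = average_rotational_delay_ms(3600)
print(f"rd = {rd:.2f} msec")  # rd = 8.33 msec
```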

 19 
 
Page I/O
 Transfer rate (tr)
 tr = track size / time of one revolution
 = track size / (60*1000/p) bytes/msec

 Bulk transfer rate (btr)
 btr = (B/(B+G)) * tr bytes/msec
 where B is the block size in bytes
 and G is the interblock gap size in bytes

 Block transfer time (btt)
 btt = B / tr msec, not taking G into account
 btt = B / btr msec, taking G into account
 20 
 Page I/O 

 Example:
 Track size = 50 KB = 50,000 bytes and p = 3600 rpm
 Block size B = 3 KB = 3000 bytes

 tr = (50*1000)/(60*1000/3600) = 3000 bytes/msec

 btt = B / tr = 3000/3000 = 1 msec
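
As a rough sketch, the transfer-rate formulas can be written as small Python functions and applied to this example. The function names are illustrative, and the gap size G = 128 bytes is a hypothetical value used only to show the effect of the bulk transfer rate; the slides do not give a gap size.

```python
def transfer_rate(track_size_bytes: float, p_rpm: float) -> float:
    """tr in bytes/msec: one full track passes under the head per revolution."""
    revolution_ms = 60 * 1000 / p_rpm
    return track_size_bytes / revolution_ms


def bulk_transfer_rate(tr: float, block_size: float, gap_size: float) -> float:
    """btr in bytes/msec: only B out of every B + G bytes is useful data."""
    return (block_size / (block_size + gap_size)) * tr


def block_transfer_time(block_size: float, rate: float) -> float:
    """btt in msec for one block at the given rate (tr or btr)."""
    return block_size / rate


B = 3000                                  # block size in bytes (from the example)
tr = transfer_rate(50 * 1000, 3600)       # 50 KB track, 3600 rpm
btt = block_transfer_time(B, tr)
print(f"tr = {tr:.0f} bytes/msec, btt = {btt:.2f} msec")

G = 128                                   # hypothetical gap size, not from the slides
btr = bulk_transfer_rate(tr, B, G)
print(f"btr = {btr:.1f} bytes/msec, btt with gap = {block_transfer_time(B, btr):.3f} msec")
```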

 21 
 Page I/O 

 Cost of one page I/O = s + rd + btt, where s is the seek time

 Average time for reading/writing n consecutive pages that are in the same track or cylinder = s + rd + n * btt

 Average time for reading/writing n noncontiguous pages/blocks that are in the same cylinder = s + n * (rd + btt)
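
The difference between the two formulas is easiest to see with numbers. The sketch below assumes illustrative values (s = 25 msec, rd = 8.33 msec, btt = 1 msec, n = 10 pages); the function names are not from the slides.

```python
def contiguous_cost_ms(s: float, rd: float, btt: float, n: int) -> float:
    """n consecutive pages: one seek, one rotational delay, n block transfers."""
    return s + rd + n * btt


def noncontiguous_cost_ms(s: float, rd: float, btt: float, n: int) -> float:
    """n scattered pages in one cylinder: one seek, then a rotational delay per page."""
    return s + n * (rd + btt)


s, rd, btt, n = 25.0, 8.33, 1.0, 10  # illustrative values
print(f"contiguous:    {contiguous_cost_ms(s, rd, btt, n):.2f} msec")     # 43.33 msec
print(f"noncontiguous: {noncontiguous_cost_ms(s, rd, btt, n):.2f} msec")  # 118.30 msec
```

With these values the rotational delay dominates the noncontiguous case, which is why placing related pages contiguously matters.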

 22 
 
An Example
 Hard disk specifications:
 4 platters, 8 surfaces, 3.5 inch diameter
 2^13 = 8192 tracks/surface
 2^8 = 256 sectors/track
 2^9 = 512 bytes/sector
 Average seek time s = 25 ms
 Rotation rate p = 3600 rpm = 60 rps
 1 rev. = 16.66 msec
 Transfer rate
 tr = 1 KB in 0.117 ms
 tr = 1 KB in 0.130 ms with gap

 23 
 An Example 
 What is the total capacity of this disk?
 8 GB (8 * 2^13 * 2^8 * 2^9 = 2^33 bytes)

 How many bytes does one track hold?
 256 sectors/track * 512 bytes/sector = 128 KB

 How many blocks per track?
 one block = 4096 bytes = 8 sectors (4096/512)
 256/8 = 32 blocks/track

 24 
 
An Example
 How long does it take to access one block?

 One block = 4096 bytes
 = 8 sectors (4096/512)
 Time of one revolution r
 1 rev. = 16.66 msec
 Time to access 1 sector = s + r/2 + tr/(sectors per KB), where tr is the time to transfer 1 KB and there are 2 sectors per KB
 25 + (16.66/2) + 0.117/2 = 33.3885 ms

 Time to access 1 block
 = time to access the first sector of the block + time to access the subsequent 7 sectors.
 25 
 
An Example
 T = 25 + (16.66/2) + (0.117/2) * 1 + (0.13/2) * 7 = 33.3885 + 0.455 ms = 33.8435 ms

 Compare to the one-sector access time: 33.3885 ms

[Figure: one block drawn as 8 consecutive sectors, numbered 1, 2, 3, ..., 8]
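
The block access time in this example can be recomputed as follows. This is a minimal sketch that follows the slide's accounting (first sector at the rate without gap, the remaining sectors at the rate with gap); the function and parameter names are illustrative.

```python
def sector_access_ms(s, r, kb_ms, sectors_per_kb=2):
    """Seek + half a revolution + transfer of one sector (no gap)."""
    return s + r / 2 + kb_ms / sectors_per_kb


def block_access_ms(s, r, kb_ms, kb_gap_ms, sectors_per_block, sectors_per_kb=2):
    """First sector as above, then the remaining sectors including gaps."""
    first = sector_access_ms(s, r, kb_ms, sectors_per_kb)
    rest = (sectors_per_block - 1) * kb_gap_ms / sectors_per_kb
    return first + rest


s = 25.0        # average seek time in msec
r = 16.66       # time of one revolution in msec
kb_ms = 0.117   # msec to transfer 1 KB without gap
kb_gap = 0.13   # msec to transfer 1 KB with gap

print(f"{sector_access_ms(s, r, kb_ms):.4f} ms")            # 33.3885 ms
print(f"{block_access_ms(s, r, kb_ms, kb_gap, 8):.4f} ms")  # 33.8435 ms
```

Reading 8 sectors costs barely more than reading 1, because the seek time and rotational delay dominate the transfer time.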

 26 
 
Buffering
 A buffer
 is a contiguous reserved area in main memory available for storing copies of disk blocks.
 is used to speed up processing.

 For a read command
 the block from disk is copied into the buffer.

 For a write command
 the contents of the buffer are copied to the disk.

 27 
 Accessing Data Through RAM Buffer 

[Figure: a block is transferred from disk into a buffer (page frames) in RAM; records are then transferred between the buffer and the application]

 28 
 Buffer Manager 
 A software component of a DBMS that responds to requests for data and decides which buffer to use and which pages to replace in the buffer to accommodate the newly requested blocks.
 Programs call on the buffer manager when they need a block from disk.
 If the block is already in the buffer,
 the requesting program is given the address of the block in main memory.
 If the block is not in the buffer,
 the buffer manager allocates space in the buffer for the block, replacing (throwing out) some other block, if required, to make space for the new block.
 The block that is thrown out is written back to disk only if it was modified since the most recent time that it was written to/fetched from the disk.
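
The behaviour described above can be sketched as a toy buffer manager. This is a sketch under stated assumptions, not how any particular DBMS implements it: the class and method names are hypothetical, the "disk" is a plain dictionary, the block contents are returned instead of a memory address, and the least recently used block is chosen as the victim (the slide does not fix a policy at this point).

```python
from collections import OrderedDict


class BufferManager:
    """Toy buffer manager; the disk is simulated by a dict (assumption)."""

    def __init__(self, disk, capacity):
        self.disk = disk              # block id -> contents, stand-in for the disk
        self.capacity = capacity      # number of blocks the buffer can hold
        self.frames = OrderedDict()   # block id -> contents, least recently used first
        self.dirty = set()            # ids of blocks modified since they were fetched
        self.reads = 0                # disk reads performed
        self.writes = 0               # disk writes performed

    def get_block(self, block_id):
        if block_id in self.frames:               # already in the buffer: no I/O
            self.frames.move_to_end(block_id)
            return self.frames[block_id]
        if len(self.frames) >= self.capacity:     # make space if required
            self._evict()
        self.frames[block_id] = self.disk[block_id]
        self.reads += 1
        return self.frames[block_id]

    def update_block(self, block_id, contents):
        self.get_block(block_id)                  # make sure the block is buffered
        self.frames[block_id] = contents
        self.dirty.add(block_id)

    def _evict(self):
        victim, contents = self.frames.popitem(last=False)
        if victim in self.dirty:                  # write back only if modified
            self.disk[victim] = contents
            self.writes += 1
            self.dirty.discard(victim)


disk = {1: "a", 2: "b", 3: "c"}
bm = BufferManager(disk, capacity=2)
bm.get_block(1)
bm.update_block(2, "B")
bm.get_block(3)   # throws out block 1, which is clean: no write
bm.get_block(1)   # throws out block 2, which is dirty: written back
print(bm.reads, bm.writes, disk[2])  # 4 1 B
```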

 29 
 Buffer Manager 

 Once space is allocated in the buffer, the buffer manager reads the block from the disk into the buffer and passes the address of the block in main memory to the requester.
 Buffer pool:
 The available main memory storage as viewed by the buffer manager,
 which contains a collection of pages.
 The buffer manager keeps two pieces of information about each page in the buffer pool:
 Pin count: the number of times the page has been requested but not yet released; when it drops to zero, the page is unpinned
 Dirty bit: initially 0 for every page; set to 1 when the page is updated by any application program

 30 
 Buffer Replacement Policy: 

 A frame is chosen for replacement by a replacement policy:
 Least recently used (LRU): throw out the page that has not been used (read or written) for the longest time.
 Most recently used (MRU): throw out the most recently used block; useful when a block that was just used is not needed again until all the remaining blocks in the relation are processed.
 First-in-first-out (FIFO), etc.

 The policy can have a big impact on the number of I/Os; the impact depends on the access pattern.
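
To illustrate how much the policy matters, the following sketch counts page I/Os (buffer misses) for a repeated sequential scan. The function `count_misses` and the access pattern (4 pages scanned three times with 3 frames) are assumptions made for this illustration.

```python
def count_misses(accesses, frames, policy):
    """Count page I/Os for an access sequence under LRU, MRU or FIFO."""
    buffer = []   # LRU/MRU: least to most recently used; FIFO: arrival order
    misses = 0
    for page in accesses:
        if page in buffer:
            if policy in ("LRU", "MRU"):      # a hit refreshes recency
                buffer.remove(page)
                buffer.append(page)
            continue
        misses += 1                           # page must be read from disk
        if len(buffer) == frames:
            if policy == "MRU":
                buffer.pop()                  # throw out the most recently used
            else:
                buffer.pop(0)                 # LRU: least recent; FIFO: oldest arrival
        buffer.append(page)
    return misses


scan = [1, 2, 3, 4] * 3   # scan a 4-page relation three times
for policy in ("LRU", "FIFO", "MRU"):
    print(policy, count_misses(scan, frames=3, policy=policy))
# LRU 12
# FIFO 12
# MRU 6
```

With one frame fewer than the relation needs, LRU and FIFO miss on every access of the repeated scan, while MRU keeps most of the relation in the buffer.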

 31 
