The document discusses different methods of organizing database records on disk for storage and indexing, including file organization, primary and secondary indexing, dense and sparse indexing, and clustering indexing.
RDBMS – AI2101 (AY : 2023-24)
Week 11 Storage Structure and Indexing
Outlines
• File Organization
• Indexing
• Classification of Indexing
10/05/2024 SCSE, Manipal University Jaipur 3
File Organization
• Although the database presents data to us as relations, this is only a logical representation.
• Ultimately, all the data must be stored on the hard disk as files.
• For performance, this organization should allow the database software to access data quickly and efficiently.
Files on Disk
• How should records be stored on the hard disk?
• One option is to rely on the operating system, but the way ordinary files are accessed is very different from the way database data is accessed.
• In a database we rarely need the whole file; often only one or two records are required. So we let the DBMS use its own organization.
• Database ====> Files / Blocks ====> Records ====> Fields
• Each table is divided into blocks, which are then stored on the hard disk.
• If the block size is 1024 bytes and each record is 4 bytes, how many records can be saved in one block?
• 1024 / 4 = 256, so the blocking factor is 256. The blocking factor is the average number of records per block.
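The blocking-factor arithmetic above can be sketched in a few lines of Python. This is a minimal illustration that assumes fixed-length records that are never split across blocks; the function names `blocking_factor` and `blocks_needed` are hypothetical and not part of any DBMS.

```python
import math


def blocking_factor(block_size: int, record_size: int) -> int:
    """Number of whole fixed-length records that fit in one block."""
    if block_size <= 0 or record_size <= 0:
        raise ValueError("sizes must be positive")
    return block_size // record_size


def blocks_needed(num_records: int, block_size: int, record_size: int) -> int:
    """Blocks required to hold num_records when records are not split."""
    bfr = blocking_factor(block_size, record_size)
    if bfr == 0:
        raise ValueError("record does not fit in a single block")
    return math.ceil(num_records / bfr)


# The slide's example: 1024-byte blocks and 4-byte records.
bfr = blocking_factor(1024, 4)
```

With the slide's numbers, `bfr` is 256, and a table of 1000 such records would occupy `blocks_needed(1000, 1024, 4)` blocks.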
Fixed Length v/s Variable Length Records
• Consider a list of mobile numbers where each number is exactly 10 digits long.
• Retrieving a number is simple here because we can read 10 characters at a time.
• But consider a list of names. How do we know the size of each name? Names vary in length.
• To make matters more complex, a record can contain some fixed-length fields and some variable-length fields.
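The contrast between the two cases can be sketched as follows. For fixed-length records the i-th record starts at byte offset i × record size. For variable-length records the sketch uses a one-byte length prefix, which is only one common approach and is an assumption here, since the slides do not say how variable-length records are stored. All function names are hypothetical.

```python
RECORD_SIZE = 10  # each mobile number is exactly 10 digits


def pack_numbers(numbers):
    """Concatenate 10-digit numbers into one byte string."""
    for n in numbers:
        if len(n) != RECORD_SIZE or not n.isdigit():
            raise ValueError(f"not a 10-digit number: {n!r}")
    return "".join(numbers).encode("ascii")


def read_number(data: bytes, i: int) -> str:
    """Jump straight to record i: its offset is i * RECORD_SIZE."""
    start = i * RECORD_SIZE
    if i < 0 or start + RECORD_SIZE > len(data):
        raise IndexError("record index out of range")
    return data[start:start + RECORD_SIZE].decode("ascii")


def pack_names(names):
    """Assumed scheme: store each name as 1 length byte + the name bytes."""
    out = bytearray()
    for name in names:
        raw = name.encode("utf-8")
        if len(raw) > 255:
            raise ValueError("name too long for a 1-byte length prefix")
        out.append(len(raw))
        out += raw
    return bytes(out)


def read_name(data: bytes, i: int) -> str:
    """No direct offset exists: skip over the first i names one by one."""
    pos = 0
    for _ in range(i):
        pos += 1 + data[pos]
    length = data[pos]
    return data[pos + 1:pos + 1 + length].decode("utf-8")


numbers = pack_numbers(["9876543210", "9123456780", "9000000001"])
names = pack_names(["Al", "Priyanka", "Bo"])
```

Here `read_number(numbers, 1)` needs a single slice, while `read_name(names, 2)` has to walk past the two earlier names first.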
Records on Disk in Blocks
• Spanned: part of a record may be stored in one block and the rest in the next block.
(Figure: R1 R2 R3 R4 | R4 R5 R6 R7, with R4 split across two blocks)
• Advantage: no wastage of memory.
• Disadvantage: the number of block accesses needed to read a record increases.
• Unspanned: no record can be stored in more than one block.
(Figure: R1 R2 R3 R4 R5 R6 R7 ||||, with |||| marking unused space in a block)
• Advantage: fewer block accesses are needed to read a record.
• Disadvantage: wastage of memory, because leftover space that is too small for a whole record stays unused.
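The trade-off between the two layouts can be sketched numerically. This is a simplified model that assumes fixed-length records and ignores any per-block headers or pointers a real DBMS would add; the function names and the sizes used (100-byte blocks, 40-byte records) are illustrative assumptions.

```python
import math


def unspanned_layout(block_size, record_size, num_records):
    """Return (blocks used, unused bytes) when records are never split."""
    if record_size > block_size:
        raise ValueError("record does not fit in one block")
    per_block = block_size // record_size
    blocks = math.ceil(num_records / per_block)
    unused = blocks * block_size - num_records * record_size
    return blocks, unused


def spanned_layout(block_size, record_size, num_records):
    """Return (blocks used, unused bytes) when records are packed back to back."""
    total = record_size * num_records
    blocks = math.ceil(total / block_size)
    unused = blocks * block_size - total
    return blocks, unused


def blocks_touched_spanned(block_size, record_size, i):
    """Block numbers that must be read to fetch record i in a spanned file."""
    first_byte = i * record_size
    last_byte = first_byte + record_size - 1
    return list(range(first_byte // block_size, last_byte // block_size + 1))
```

For 7 records of 40 bytes in 100-byte blocks, the unspanned layout uses 4 blocks with 120 unused bytes, while the spanned layout uses 3 blocks with 20 unused bytes. The price is that record 2 (bytes 80 to 119) lies in two blocks, so reading it takes two block accesses.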
Records on Disk in Files
• Ordered File Organization: all records in a file are ordered on some search-key value.
• Searching can be done with binary search.
• Advantage: searching is efficient, but only when we search on the search key. If we search on any other attribute, there is no advantage.
• Disadvantage: insertion is expensive because the entire file may need to be reorganized.
• Unordered File Organization: records are inserted wherever space is available (usually at the end of the file).
• Searching is done with linear search.
• Advantage: insertion of a record is efficient.
• Disadvantage: searching is very inefficient.
Indexing
• Indexing mechanisms are used to speed up access to desired data, e.g., the author catalog in a library.
• Search key: an attribute or set of attributes used to look up records in a file.
• An index file consists of records (called index entries) of the form (search-key, block-pointer).
• Index files are typically much smaller than the original file.
• Two basic kinds of indices:
• Ordered indices: search keys are stored in sorted order.
• Hash indices: search keys are distributed uniformly across “buckets” using a “hash function”.
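The two kinds of indices can be sketched over the same set of index entries. This is a toy in-memory illustration, not how a DBMS lays out index blocks: the entries are made-up (search-key, block-pointer) pairs, keys are assumed to be integers, and `key % num_buckets` stands in for the hash function.

```python
from bisect import bisect_left

# Hypothetical index entries: (search_key, block_pointer)
entries = [(17, 0), (42, 1), (8, 0), (99, 2), (63, 1)]


def build_ordered_index(entries):
    """Ordered index: entries kept sorted by search key."""
    return sorted(entries)


def ordered_lookup(index, key):
    """Binary search over the sorted keys; returns a block pointer or None."""
    keys = [k for k, _ in index]
    pos = bisect_left(keys, key)
    if pos < len(keys) and keys[pos] == key:
        return index[pos][1]
    return None


def build_hash_index(entries, num_buckets=4):
    """Hash index: each entry goes to the bucket chosen by key % num_buckets."""
    buckets = [[] for _ in range(num_buckets)]
    for key, block in entries:
        buckets[key % num_buckets].append((key, block))
    return buckets


def hash_lookup(buckets, key):
    """Only the one bucket selected by the hash function is scanned."""
    for k, block in buckets[key % len(buckets)]:
        if k == key:
            return block
    return None


ordered = build_ordered_index(entries)
hashed = build_hash_index(entries)
```

Both `ordered_lookup(ordered, 63)` and `hash_lookup(hashed, 63)` return block 1; they differ only in how the entry is located.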
Primary Indexing
• The data file is ordered on the primary key, and the index is built on the primary key.
• A primary index is an ordered file whose records are of fixed length with two fields. The first field holds a primary-key value of the data file, and the second field is a pointer to the data block that contains that key.
• An index entry is created for the first record of each block; these first records are known as block anchors.
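A primary index built from block anchors can be sketched as follows. The data blocks are modeled as small Python lists of primary-key values, which is an assumption made for illustration; `build_primary_index` and `primary_lookup` are hypothetical names.

```python
from bisect import bisect_right

# Hypothetical data file: blocks of primary-key values, ordered on the key.
blocks = [
    [1, 4, 7],
    [10, 13, 16],
    [20, 25, 30],
]


def build_primary_index(blocks):
    """One entry per block: (block anchor, block number)."""
    return [(block[0], block_no) for block_no, block in enumerate(blocks) if block]


def primary_lookup(blocks, index, key):
    """Find the last anchor <= key, then search only that one block."""
    anchors = [anchor for anchor, _ in index]
    pos = bisect_right(anchors, key) - 1
    if pos < 0:
        return None  # key is smaller than every anchor
    block_no = index[pos][1]
    return block_no if key in blocks[block_no] else None


index = build_primary_index(blocks)
```

The index holds three entries for nine records. Looking up key 13 binary-searches the anchors 1, 10, 20, picks the block anchored at 10, and reads only that block.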
Dense Indexing
• Dense index: an index record appears for every search-key value in the file.
Sparse Indexing
• Sparse index: contains index records for only some search-key values.
• Applicable when records are sequentially ordered on the search key.
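The difference between dense and sparse indices can be sketched on a small sorted file. The records are modeled as a sorted list of search-key values, and the sparse index keeps every third key; both the data and the `stride` parameter are assumptions chosen for illustration.

```python
from bisect import bisect_right

# Hypothetical file: search-key values in sequential (sorted) order.
records = [3, 8, 12, 15, 21, 27, 33, 40, 46]


def build_dense_index(records):
    """Dense: one (key, position) entry for every search-key value."""
    return [(key, pos) for pos, key in enumerate(records)]


def build_sparse_index(records, stride=3):
    """Sparse: an entry only for every stride-th record."""
    return [(records[pos], pos) for pos in range(0, len(records), stride)]


def sparse_find(records, sparse, key):
    """Locate the last indexed key <= key, then scan the file sequentially."""
    keys = [k for k, _ in sparse]
    i = bisect_right(keys, key) - 1
    if i < 0:
        return None
    pos = sparse[i][1]
    while pos < len(records) and records[pos] <= key:
        if records[pos] == key:
            return pos
        pos += 1
    return None


dense = build_dense_index(records)
sparse = build_sparse_index(records)
```

The dense index has nine entries and the sparse index three. The sequential scan in `sparse_find` is correct only because the records are ordered on the search key, which is why a sparse index requires an ordered file.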
Secondary Indexing
• A secondary index provides a secondary means of accessing a file for which a primary access path already exists.
• It is a dense index, i.e., an index entry is created for every record in the file.
• A secondary index has no impact on how the rows are actually organized in data blocks.
• The rows can be in any order. The only ordering is with respect to the index key in the index blocks.
Clustering Indexing
• A clustering index is used when multiple related records are found in one place.
• It is defined on ordered data. The important point is that a clustering index is built on a non-key attribute, whose values may or may not be unique.
• To achieve faster retrieval, we group records having similar characteristics.
• The index is created on these groups, and this is known as a clustering index.