Data Representation
  A tuple is essentially a sequence of bytes. It is the job of the DBMS to interpret those bytes as attribute types and values. The DBMS's catalogs contain the schema information about tables, which the system uses to determine the tuple's layout.
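To make the idea of interpreting raw bytes through a schema concrete, here is a minimal Python sketch. The schema, the attribute names, and the little-endian unpadded layout are all assumptions chosen for illustration; a real DBMS reads its layout from the catalog and may use a different encoding.

```python
import struct

# Hypothetical schema, as a catalog might record it: (attribute name, struct format code).
# "i" = 32-bit integer, "q" = 64-bit integer, "?" = boolean.
SCHEMA = [("id", "i"), ("balance", "q"), ("active", "?")]

# "<" selects little-endian with no padding, so the byte layout follows from the schema alone.
FMT = "<" + "".join(code for _, code in SCHEMA)


def encode_tuple(values):
    """Serialize attribute values into the raw bytes stored for one tuple."""
    return struct.pack(FMT, *values)


def decode_tuple(raw):
    """Interpret raw bytes as named attribute values using the schema."""
    fields = struct.unpack(FMT, raw)
    return {name: value for (name, _), value in zip(SCHEMA, fields)}


raw = encode_tuple((42, 1_000_000, True))
row = decode_tuple(raw)
print(len(raw), row)
```

Without the schema, the 13 bytes in `raw` carry no meaning; the same bytes decoded with a different format string would yield different values.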
Data Representation
  INTEGER/BIGINT/SMALLINT/TINYINT: C/C++ representation.
  FLOAT/REAL vs. NUMERIC/DECIMAL: IEEE-754 standard vs. fixed-point decimals.
  VARCHAR/VARBINARY/TEXT/BLOB: header with length, followed by data bytes.
  TIME/DATE/TIMESTAMP: 32/64-bit integer of (micro)seconds since the Unix epoch.
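The two less obvious encodings above can be sketched in Python as follows. The 4-byte little-endian length header and the signed 64-bit microsecond count are assumptions made for this example; actual systems differ in header width and time resolution.

```python
import struct
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def encode_varchar(text):
    """Length header (assumed 4 bytes, little-endian) followed by the data bytes."""
    data = text.encode("utf-8")
    return struct.pack("<I", len(data)) + data


def decode_varchar(buf, offset=0):
    """Return the decoded string and the offset of the first byte after it."""
    (length,) = struct.unpack_from("<I", buf, offset)
    start = offset + 4
    return buf[start:start + length].decode("utf-8"), start + length


def encode_timestamp(dt):
    """Microseconds since the Unix epoch, stored as a signed 64-bit integer."""
    micros = (dt - EPOCH) // timedelta(microseconds=1)
    return struct.pack("<q", micros)


def decode_timestamp(raw):
    (micros,) = struct.unpack("<q", raw)
    return EPOCH + timedelta(microseconds=micros)


encoded = encode_varchar("abc")
text, next_offset = decode_varchar(encoded)
```

The length header is what lets the DBMS find where a variable-length value ends, which is why `decode_varchar` also returns the offset of the next attribute.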
Data Representation
  Variable-Precision Numbers
    Inexact, variable-precision numeric types that use the "native" C/C++ types.
    Faster than fixed-precision numbers.
    Examples: FLOAT, REAL/DOUBLE
  Fixed-Precision Numbers
    Numeric data types with arbitrary precision and scale.
    Used when rounding errors are unacceptable.
    Examples: NUMERIC, DECIMAL
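The rounding behavior that separates the two families can be seen in a few lines of Python. Python's `float` is an IEEE-754 double, and the standard `decimal.Decimal` type is used here only as a stand-in for a DBMS's NUMERIC/DECIMAL implementation, which it does not replicate.

```python
from decimal import Decimal

# Variable precision: 0.1 and 0.2 have no exact binary representation,
# so the sum differs slightly from 0.3.
float_sum = 0.1 + 0.2
float_matches = float_sum == 0.3

# Fixed precision: decimal digits are stored exactly, so the sum is exact.
decimal_sum = Decimal("0.1") + Decimal("0.2")
decimal_matches = decimal_sum == Decimal("0.3")

print(float_matches, decimal_matches)  # False True
```

The exact result comes at a cost: decimal arithmetic is done in software rather than by a single hardware instruction, which is why the variable-precision types are faster.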
System Catalogs
  A DBMS stores metadata about databases in its internal catalogs:
    Tables, columns, indexes, views
    Users, permissions
    Internal statistics
  You can query the DBMS's internal catalog to get information about the database.
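As one example of querying a catalog, the sketch below uses SQLite through Python's standard `sqlite3` module. SQLite is an assumption chosen because it needs no server; the `account` table and its index are hypothetical. Other systems expose their catalogs differently, for instance through INFORMATION_SCHEMA views.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, name TEXT, balance REAL)")
conn.execute("CREATE INDEX idx_account_name ON account (name)")

# sqlite_master is SQLite's catalog table: one row per table, index, view, or trigger.
objects = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY name"
).fetchall()

# PRAGMA table_info returns one row per column: (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(account)")]

conn.close()
print(objects)
print(columns)
```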
Storage Model
  The relational model does not specify that we have to store all of a tuple's attributes together in a single page. This may not actually be the best layout for some workloads.
OLAP
  On-line Analytical Processing: complex queries that read large portions of the database spanning multiple entities. You execute these workloads on the data you have collected from your OLTP application(s).
Storage Models
  The DBMS can store tuples in different ways that are better for either OLTP or OLAP workloads. We have been assuming the row storage model.
Row Storage Model
  The DBMS stores all attributes for a single tuple contiguously in a page. Ideal for OLTP workloads, where queries tend to operate on a single entity, and for insert-heavy workloads.
Row Storage Model
  Advantages
    Fast inserts, updates, and deletes.
    Good for queries that need the entire tuple.
  Disadvantages
    Not good for scanning large portions of the table and/or a subset of the attributes.
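A minimal Python sketch of a row layout illustrates both sides of this trade-off. The three-attribute schema (id, age, balance), the fixed 16-byte row, and the single in-memory `page` are assumptions for illustration; real pages also carry headers and slot arrays.

```python
import struct

# Hypothetical fixed-width row: id (int32), age (int32), balance (int64) = 16 bytes.
ROW = struct.Struct("<iiq")

page = bytearray()


def insert(row):
    """Append all attributes of one tuple contiguously: a single write."""
    page.extend(ROW.pack(*row))
    return len(page) // ROW.size - 1  # slot number of the new tuple


def fetch(slot):
    """Point lookup: one offset computation yields the whole tuple."""
    return ROW.unpack_from(page, slot * ROW.size)


def sum_balance():
    """Scan one attribute; every full row must still be read."""
    total = 0
    bytes_read = 0
    for slot in range(len(page) // ROW.size):
        _, _, balance = fetch(slot)
        total += balance
        bytes_read += ROW.size
    return total, bytes_read


for r in [(1, 30, 100), (2, 41, 250), (3, 25, 50)]:
    insert(r)

print(fetch(1))       # (2, 41, 250)
print(sum_balance())  # (400, 48)
```

Summing `balance` reads 48 bytes even though only 24 of them belong to that attribute, which is the wasted I/O that makes this layout a poor fit for scans over a subset of attributes.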
Column Storage Model
  The DBMS stores the values of a single attribute for all tuples contiguously in a page. Also known as a "column store". Ideal for OLAP workloads where read-only queries perform large scans over a subset of the table's attributes.
Column Storage Model
  Advantages
    Reduces the amount of wasted I/O because the DBMS only reads the data that it needs.
    Better query processing and data compression.
  Disadvantages
    Slow for point queries, inserts, updates, and deletes because of tuple splitting/stitching.
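A minimal Python sketch of a column layout shows where the savings and the costs come from. The three-attribute schema (id, age, balance) and the one-buffer-per-attribute `store` are assumptions for illustration, not a description of any particular system.

```python
import struct

# Hypothetical schema: one fixed-width format per attribute.
COLUMNS = {
    "id": struct.Struct("<i"),
    "age": struct.Struct("<i"),
    "balance": struct.Struct("<q"),
}

# One contiguous buffer per attribute instead of one buffer of whole rows.
store = {name: bytearray() for name in COLUMNS}


def insert(row):
    """Tuple splitting: one logical insert becomes one append per column."""
    for (name, fmt), value in zip(COLUMNS.items(), row):
        store[name].extend(fmt.pack(value))


def sum_column(name):
    """Scan a single attribute; only that column's bytes are read."""
    data = store[name]
    total = sum(value for (value,) in COLUMNS[name].iter_unpack(data))
    return total, len(data)


def fetch(slot):
    """Tuple stitching: a point query must visit every column buffer."""
    return tuple(
        fmt.unpack_from(store[name], slot * fmt.size)[0]
        for name, fmt in COLUMNS.items()
    )


for r in [(1, 30, 100), (2, 41, 250), (3, 25, 50)]:
    insert(r)

print(sum_column("balance"))  # (400, 24)
print(fetch(1))               # (2, 41, 250)
```

The scan touches only the 24 bytes of the `balance` column, while `insert` and `fetch` each have to touch three separate buffers. Storing values of one type next to each other is also what makes the column amenable to compression.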