CS143 Final Cheatsheet
Data Modification
Authorization Graph
Nodes: users
Edges: granted privileges
Revoking Privileges
Rotational Delay
Transfer Time
primary key
Dense
(key, pointer) pair for every record
Why Use Dense Index?
100 mil records (900B/rec), 4B search key, 4B
ptr, 4KB block, unspanned
For table:
floor(4096/900) = 4 records/blk
100 mil tuples/(4 records/blk) = 25 mil blocks
25 mil blocks * 4KB/block = 100 GB
For index:
8 bytes/entry, 4096/8 = 512 entr/block
ceil(100M/512) = 195313 blks
195313*4KB = 781 MB
Can store index into RAM for no disk IO
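The sizing arithmetic above can be re-derived with a short Python sketch. The variable names are illustrative, and the totals assume the same rounding as the figures above: 4 KB per block, with 1 MB = 1000 KB and 1 GB = 1000 MB.

```python
import math

BLOCK_BYTES = 4096
RECORD_BYTES = 900
KEY_BYTES = 4
PTR_BYTES = 4
NUM_RECORDS = 100_000_000

# Unspanned: a record may not cross a block boundary, so round down.
records_per_block = BLOCK_BYTES // RECORD_BYTES
table_blocks = math.ceil(NUM_RECORDS / records_per_block)

# Dense index: one (key, pointer) entry per record.
entries_per_block = BLOCK_BYTES // (KEY_BYTES + PTR_BYTES)
index_blocks = math.ceil(NUM_RECORDS / entries_per_block)

table_gb = table_blocks * 4 / 1_000_000  # blocks * 4 KB, expressed in GB
index_mb = index_blocks * 4 / 1_000      # blocks * 4 KB, expressed in MB

print(records_per_block, table_blocks, table_gb)
print(entries_per_block, index_blocks, round(index_mb))
```

The index is roughly 1/128 the size of the table, which is why it can plausibly be kept in memory.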
Sparse, Primary Index
(key, pointer) pair for every block
Points to first record in block
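A sparse index lookup can be sketched as a binary search over the first key of each block, followed by a scan inside that block. This is a minimal in-memory illustration, assuming the table is sorted on the search key; `blocks`, `sparse_index`, and `lookup` are hypothetical names.

```python
from bisect import bisect_right

# Hypothetical sorted table: each inner list is one block of search keys.
blocks = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]

# One (first key, block number) entry per block.
sparse_index = [(blk[0], i) for i, blk in enumerate(blocks)]
index_keys = [k for k, _ in sparse_index]

def lookup(key):
    """Return the block number holding key, or None if it is absent."""
    # Find the last index entry whose key is <= the search key.
    pos = bisect_right(index_keys, key) - 1
    if pos < 0:
        return None  # key is smaller than every key in the table
    block_no = sparse_index[pos][1]
    # The index only narrows the search to one block; scan that block.
    return block_no if key in blocks[block_no] else None

print(lookup(50), lookup(70), lookup(55), lookup(5))
```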
Multi-level Index
Sparse (2nd lvl) -> 1st lvl -> sequential
Secondary (non-clustering) Index
Tuples in table not ordered by index search key
1st level always dense, sparse from 2nd level
Insertion
Overflow (a new bucket)
Redistribute
Traditional Index
Pros: simple, sequential blocks
Cons: bad for updates, ugly over time
B+ Tree
Pros: suitable for updates, balanced, min space
usage guarantee
Cons: non-sequential index blocks
Leaf Node
Left of # pts to tuples, right pts to next leaf
Non-leaf Node
Left of # points to lower level (right side is
Insertion
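Simple: leaf has room, insert the key in sorted position
Leaf overflow: split the leaf, insert the new key, then copy the first key of the new right leaf up to the parent
Non-leaf overflow: after a leaf split, if the parent non-leaf node overflows, split it too and move its middle key up (repeat toward the root as needed)
New root: when the old root splits, its middle key becomes the new root, with the two halves as its children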
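The difference between the two kinds of split is that a leaf split copies a key up (the key stays in the leaf), while a non-leaf split moves a key up (the key leaves the node). A minimal sketch on plain key lists follows. The function names are hypothetical, and which half receives the extra key on an uneven split is a convention that varies between textbooks.

```python
def split_leaf(keys):
    """Split an overflowing leaf. The first key of the right half is
    copied up, so it still appears in the right leaf."""
    mid = len(keys) // 2
    left, right = keys[:mid], keys[mid:]
    return left, right, right[0]

def split_nonleaf(keys):
    """Split an overflowing non-leaf node. The middle key is moved up,
    so it appears in neither half."""
    mid = len(keys) // 2
    return keys[:mid], keys[mid + 1:], keys[mid]

print(split_leaf([10, 20, 30, 40]))
print(split_nonleaf([10, 20, 30, 40, 50]))
```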
Number of Ptrs/Key for B+ Tree
Common Queries
Check Constraint
Q: Check that CS class has to be >3 units
A: CHECK(dept <> 'CS' OR unit > 3)
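The constraint reads as "if dept is CS then unit > 3", written as the equivalent "not CS, or unit > 3". A small sketch using Python's built-in `sqlite3` module shows it rejecting a bad row. The `Class` table and its columns are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Class (
        dept TEXT,
        cnum INTEGER,
        unit INTEGER,
        CHECK (dept <> 'CS' OR unit > 3)
    )
""")
conn.execute("INSERT INTO Class VALUES ('CS', 143, 4)")  # CS with > 3 units: accepted
conn.execute("INSERT INTO Class VALUES ('EE', 10, 2)")   # not CS: accepted

try:
    conn.execute("INSERT INTO Class VALUES ('CS', 1, 1)")  # CS with 1 unit
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Class").fetchone()[0]
print(rejected, row_count)
```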
Cross product itself
Q: Sensor (date, time, temp, humidity) stores
temp every few hours. Get highest
temperature of each day
A: SELECT date, MAX(temp) FROM Sensor
GROUP BY date;
$(\rho_{R_1}(Sensor) \times \rho_{R_2}(Sensor))$
Complicated Queries
Q: Find names of such companies that all
employees have salaries > $100000
A: SELECT company-name FROM Company C
WHERE 100000 < ALL (SELECT salary
FROM Work W WHERE C.company-name
= W.company-name);
$\pi_{companyname}(Company) - \pi_{companyname}(\sigma_{salary \le 100000}(Work))$
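"All employees earn more than 100000" is the same as "no employee earns 100000 or less", so the answer is every company minus the companies that have at least one such employee. The sketch below runs that set-difference form with `EXCEPT` in SQLite, which does not support `> ALL`. The sample rows are made up, and `company_name` replaces `company-name` only because a hyphen is not valid in an unquoted SQL identifier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Company (company_name TEXT);
    CREATE TABLE Work (person_name TEXT, company_name TEXT, salary INTEGER);
    INSERT INTO Company VALUES ('Acme'), ('Globex');
    INSERT INTO Work VALUES
        ('Ann', 'Acme', 150000),
        ('Bob', 'Acme', 120000),
        ('Cat', 'Globex', 200000),
        ('Dan', 'Globex', 90000);
""")

# All companies, minus companies with some employee at or below 100000.
rows = conn.execute("""
    SELECT company_name FROM Company
    EXCEPT
    SELECT company_name FROM Work WHERE salary <= 100000
""").fetchall()
result = [r[0] for r in rows]
print(result)
```

Like the `> ALL` version, this also returns a company that has no employees at all, since the condition holds vacuously.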
Q: Find names of employees whose total salary
is higher than those of all employees living
in Los Angeles
A: SELECT person-name FROM Work
GROUP BY person-name
HAVING SUM(salary) > ALL
(SELECT SUM(salary) FROM Work,
Employee WHERE Work.person-name =
Employee.person-name AND city = 'Los
Angeles' GROUP BY Work.person-name)
$b_R + |R| \cdot b_S$
$b_R + b_R \cdot b_S$
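These two expressions can be compared numerically. The sketch assumes that b_R and b_S are the block counts of R and S, that |R| is the tuple count of R, and that the expressions are the I/O costs of nested-loop join with S rescanned once per tuple of R and once per block of R, respectively. Function names and sample sizes are hypothetical.

```python
def per_tuple_rescan_cost(b_r, n_r, b_s):
    """Read R once, and scan all of S for every tuple of R."""
    return b_r + n_r * b_s

def per_block_rescan_cost(b_r, b_s):
    """Read R once, and scan all of S for every block of R."""
    return b_r + b_r * b_s

# Hypothetical sizes: R has 1000 tuples in 100 blocks, S has 50 blocks.
b_r, n_r, b_s = 100, 1000, 50
print(per_tuple_rescan_cost(b_r, n_r, b_s))
print(per_block_rescan_cost(b_r, b_s))
```

With 10 tuples per block, rescanning per block costs roughly 10 times fewer block reads here.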
Join Stage:
Sequentially read R and S blocks one at a time
Uses one block for output buffer
$2(b_R + b_S) \left\lceil \log_{M-1} \frac{b_R}{M-2} \right\rceil$
$b_R + |R| \cdot (C + J)$
DB2
o RUNSTATS ON TABLE <userid>.<table> AND INDEXES ALL
Oracle
o ANALYZE TABLE <table> COMPUTE STATISTICS
o ANALYZE TABLE <table> ESTIMATE STATISTICS (cheaper)
Irrelevant in MySQL; only rule-based optimization
Uses 400 + 6000 = 6400
Hash join is usually the best choice for equi-joins
Merge Stage:
Sequentially read R and S blocks one at a time
$b_R + b_S$
$b_R + b_S$
Bucket size: $\frac{b_R}{M-1}$
E/R Model
Entity: thing or object
Rectangle
Attribute: property of entities
Ellipse
Key: set of attributes to uniquely identify entity
i.e. Student
o One table for every subtree (including the
root) with all its attributes plus all
inherited attributes
Transactions
ACID
Atomicity: all-or-nothing; either every operation in the transaction takes effect or none does
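Atomicity can be observed with a small `sqlite3` sketch: a transfer whose second statement fails leaves no trace of its first statement. The `Account` table, the balances, and the `transfer` helper are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Account (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO Account VALUES ('A', 100), ('B', 0)")
conn.commit()

def transfer(amount):
    """Move amount from A to B as one transaction; return True on commit."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE Account SET balance = balance + ? WHERE name = 'B'",
                (amount,),
            )
            # Fails the CHECK constraint when A would go negative.
            conn.execute(
                "UPDATE Account SET balance = balance - ? WHERE name = 'A'",
                (amount,),
            )
        return True
    except sqlite3.IntegrityError:
        return False

failed_ok = transfer(500)  # credit to B is undone when the debit fails
after_failed = dict(conn.execute("SELECT name, balance FROM Account"))
good_ok = transfer(30)
after_good = dict(conn.execute("SELECT name, balance FROM Account"))
print(failed_ok, after_failed, good_ok, after_good)
```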