Data Bases Cheatsheet
Key: Minimal set of attributes that uniquely identifies the entity
Superkey: A set of attributes that uniquely identifies its tuples (a key is a minimal superkey)
Candidate Key: A relation can have multiple keys; each is called a candidate key
Primary Key: The candidate key chosen to identify tuples
Cardinality
Degree

Entity-Relationship Model
Weak Entities
The identifying relationship must exist and be unique for each entity in the set. Weak entities can only be defined for a participation constrained by (1,1) cardinality.
Hierarchies:
- Subclass and Superclass, e.g. Student is a subclass of Person
- Inheritance: subclass inherits attributes of superclass
- Specialization: subclass has its own attributes
Recursive Relationship Sets and Roles in Relationships

The Relational Model
No. of columns = Degree
No. of rows = Cardinality
A subset of the attributes of relation A is a foreign key if it is the primary key of relation B.
Options if tuple t in Courses is to be deleted:
- Disallow deletion if some rows in Enrolls refer to t
- Delete all rows in Enrolls that refer to t
- For each row in Enrolls that refers to t, replace the cid value with DEFAULT
- For each row in Enrolls that refers to t, replace the cid value with NULL, provided cid is not a primary key attribute in Enrolls

Structured Query Language
DDL – Data Definition Language
CREATE TABLE: CREATE TABLE name ( … ). Creates the table name.
DEFAULT: name CHAR(24) DEFAULT 'some name'. Sets a default value.
PRIMARY KEY: PRIMARY KEY (name, name) OR name CHAR(24) PRIMARY KEY
REFERENCES / FOREIGN KEY: FOREIGN KEY (attribute) REFERENCES Staff(name). States where to get the attribute.
DROP TABLE: DROP TABLE name. Deletes the table.
ALTER TABLE: ALTER TABLE name ADD attribute domain OR ALTER TABLE name DROP attribute. Edits columns.
UNIQUE: name CHAR(24) UNIQUE. For candidate keys.
NOT NULL: name CHAR(24) NOT NULL
CHECK: age NUMERIC CHECK (age op value), where op is a comparison such as >, < or =. Checks the value.
CREATE VIEW: CREATE VIEW NewName AS (some query you want in NewName)

DML – Data Manipulation Language
INSERT INTO… VALUES…: INSERT INTO name (relation_attributes) VALUES ('value1', 'value2', …)
DELETE FROM: DELETE FROM name [WHERE …]
UPDATE… SET…: UPDATE name SET attribute = value [WHERE …]
ORDER BY: ORDER BY name DESC OR ORDER BY name ASC
SELECT… AS…: SELECT name AS newName. Renames the column.
GROUP BY: Usually models the 'each' noun in the question (e.g. each director) and is similar to what you are SELECT-ing.
HAVING: SELECT a.Actor FROM Acts a GROUP BY a.Actor HAVING COUNT (*) > ( SELECT COUNT (m.title) FROM Movies m WHERE qualifier )
A column appearing in the HAVING clause must either be in the GROUP BY clause or be an argument of an aggregation operator.
Arithmetic: WHERE assets * 1.7 < 17 OR SELECT (rating + 0.2) * 10
Logical Connectors (AND, OR, NOT): WHERE qualifier1 AND qualifier2
LIKE: WHERE title LIKE 'W_%S'
- Symbol '_' stands for a single arbitrary character
- Symbol '%' stands for 0 or more arbitrary characters
UNION / INTERSECT / EXCEPT: SELECT a FROM Acts a WHERE a.b > 0 UNION / INTERSECT / EXCEPT SELECT b FROM Acts b WHERE b.a > 0
Aggregation operators:
- COUNT ([DISTINCT] A): Number of [unique] values in the A column
- COUNT (*): Number of rows
- SUM ([DISTINCT] A): Sum of all [unique] values in the A column
- AVG ([DISTINCT] A): Average of all [unique] values in the A column
- SUM, AVG, MIN, MAX often appear in the SELECT clause
Set comparison in WHERE clause:
- v IN Q is true iff value v is in the set returned by Q
- v NOT IN Q is true iff value v is not in the set returned by Q
- EXISTS Q is true iff the result of Q is non-empty
- NOT EXISTS Q is true iff the result of Q is empty
- UNIQUE Q is true iff the result of Q has no duplicates
- v op ANY Q is true iff there exists some v′ in the result of Q s.t. v op v′ is true
- v op ALL Q is true iff for each v′ in the result of Q, v op v′ is true
- op ∈ { = , <> , < , <= , > , >= }

Relational Algebra
σ_c(R): Select tuples of relation R that satisfy condition c
π_L(R): List attributes L of relation R
ρ(R′(N1 → N1′, …), R): Rename. Can also do with ρ(R′, R)
R ∪ S / R ∩ S / R − S
R1 × R2 / R ⋈_c S
R/S: R/S contains all A tuples s.t. for every B tuple in S there is an AB tuple in R

Functional Dependencies
Define X → Y: ∀t1, t2 ∈ R: t1.X = t2.X ⟹ t1.Y = t2.Y
Trivial FD: RHS is a subset of LHS, e.g. sid, sname → sid
Non-trivial:
- Completely non-trivial: LHS and RHS do NOT share any attributes
- Not completely non-trivial: LHS and RHS share SOME attributes

Armstrong Axioms
Reflexivity: if Y ⊆ X, then X → Y
Augmentation: if X → Y, then XZ → YZ
Transitivity: if X → Y and Y → Z, then X → Z
Derived rules:
Union: if X → Y and X → Z, then X → YZ
Decomposition: if X → YZ, then X → Y and X → Z

Minimal Cover
Given a relation R(A,B,C,D,E,F). The following set F of FDs holds for this table.
F = {AB → CD, C → CE, C → F, F → E, CDF → E, DFE → A}
Step 1: Decompose FDs
F = {AB → C, AB → D, C → C, C → E, C → F, F → E, CDF → E, DFE → A}
Step 2: Eliminate redundant attributes from LHS of FDs: CHECK CLOSURE
In DFE → A, E is redundant: DF⁺ = {D, F, E, A} already contains A, so replace DFE → A with DF → A. Dropping the trivial C → C as well:
F = {AB → C, AB → D, C → E, C → F, F → E, CDF → E, DF → A}
Step 3: Eliminate redundant FDs
Remove C → E: without it, C⁺ = {C, F, E} still contains E. Hence, we are left with:
F = {AB → C, AB → D, C → F, F → E, CDF → E, DF → A}
Remove CDF → E: without it, CDF⁺ = {C, D, F, E, A} still contains E. So, we can remove this FD.
F = {AB → C, AB → D, C → F, F → E, DF → A}

Decompositions
Decompositions have to be:
- Lossless: when you join (⋈) them back, you get exactly the original relation. The original is always a subset of the result; a lossy decomposition adds spurious tuples.
  To be lossless, the attributes common to the two relations must functionally determine all attributes in ONE of the two relations.
- Dependency preserving: e.g. {A → B, B → C, A → C} is equivalent to {A → B, B → C}, since A → C follows by transitivity

Computing FD Projections
For F_XY (the projection of F onto the attributes XY):
- Compute X⁺; we have X → (X⁺ ∩ XY)
- Compute Y⁺; we have Y → (Y⁺ ∩ XY)
- So, F_XY = {X → xyz, Y → xyz}
- xyz stands for the attributes in the corresponding intersection with XY

Normal Forms
Every FD must satisfy at least one of the listed conditions.
3NF:
1. Trivial
2. LHS is a superkey
3. RHS is a prime attribute (appears in at least one key)
BCNF:
1. Trivial
2. LHS is a superkey

Decomposition into BCNF (only guarantees lossless)
- Let X → A be an FD in F that violates BCNF
- Decompose R into
  R1 = XA
  R2 = R − A
- If R1 or R2 is not in BCNF, decompose further

Decomposition into 3NF (lossless and dependency preserving)
- Compute minimal cover of R
- Create a schema for each FD in the minimal cover
- Choose a key and create an (i + 1)th schema for it
- Remove a redundant schema if it is a subset of another
E.g. the minimal cover is F = {AC → E, E → D, A → B} and the key is AC
Schemas created: R1(A, C, E), R2(E, D), R3(A, B), R4(A, C)
3NF decomposition is R1(A, C, E), R2(E, D), R3(A, B)
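
The closure checks used in the Minimal Cover steps can be tried out in code. The following is a minimal Python sketch, not a reference implementation: it assumes single-letter attribute names, and the function names closure and minimal_cover are illustrative choices, not part of any library. It runs the three steps (split right-hand sides, shrink left-hand sides, drop implied FDs) on the example set F from the Minimal Cover section.

```python
def closure(attrs, fds):
    """Return the closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def minimal_cover(fds):
    """Compute one minimal cover; attributes are assumed to be single letters."""
    # Step 1: split right-hand sides and drop trivial FDs such as C -> C.
    work = []
    for lhs, rhs in fds:
        for attr in rhs:
            fd = (frozenset(lhs), attr)
            if attr not in lhs and fd not in work:
                work.append(fd)

    # Step 2: drop a left-hand attribute when the rest still determines the RHS.
    reduced = []
    for lhs, attr in work:
        kept = set(lhs)
        for b in sorted(lhs):
            if len(kept) > 1 and attr in closure(kept - {b}, work):
                kept.discard(b)
        fd = (frozenset(kept), attr)
        if fd not in reduced:
            reduced.append(fd)

    # Step 3: drop an FD when the remaining FDs already imply it.
    cover = list(reduced)
    for fd in reduced:
        others = [g for g in cover if g != fd]
        if fd[1] in closure(fd[0], others):
            cover = others
    return [("".join(sorted(lhs)), attr) for lhs, attr in cover]


# The example set F from the Minimal Cover section.
F = [("AB", "CD"), ("C", "CE"), ("C", "F"), ("F", "E"), ("CDF", "E"), ("DFE", "A")]
print(sorted(closure("DF", F)))
print(minimal_cover(F))
```

A minimal cover is not unique in general, so a different processing order may give a different but equivalent result. On this example the sketch returns AB → C, AB → D, C → F, F → E, DF → A.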
SQL

Find names of suppliers that supply at least two parts, and the average cost of those parts
SELECT P.sname, AVG(P.cost)
FROM Part P
GROUP BY P.sname
HAVING COUNT(*) > 1

Find names of suppliers who supply at least 5 parts with price > 1000
SELECT S.name
FROM Part P, Supplier S, PartSupp PS
WHERE P.price > 1000 AND P.partkey = PS.partkey AND PS.suppkey = S.suppkey
GROUP BY S.name
HAVING COUNT (DISTINCT P.partkey) > 4

Find Dept_no where the avg salary of emp in that dept is > the avg salary of all emp
SELECT E.dept_no
FROM Employee E
GROUP BY E.dept_no
HAVING AVG(E.salary) > ( SELECT AVG(T1.salary) FROM Employee T1 )

Find names of required courses for 'CS' curriculum that 'Smith' did not take
SELECT C.course_name
FROM Course C, Required R
WHERE R.curriculum = 'CS' AND R.CID = C.CID
AND C.CID NOT IN ( SELECT T.CID
FROM Student S, Take T
WHERE S.student_name = 'Smith' AND S.SID = T.SID )

Find identifier of all students who never took the course 101 offered by Dept 11
SELECT S.SID
FROM Student S
WHERE NOT EXISTS ( SELECT *
FROM Transcript T, Section SE
WHERE SE.dept_id = 11 AND SE.course_no = 101
AND S.SID = T.SID AND T.SEID = SE.SEID )

Find course number and dept_id of all courses where no student ever got an 'F'
SELECT C.course_no, C.dept_id
FROM Course C
WHERE NOT EXISTS ( SELECT *
FROM Transcript T, Section S
WHERE T.grade = 'F' AND T.SEID = S.SEID
AND C.course_no = S.course_no AND C.dept_id = S.dept_id )

Find names of all students who are enrolled in two classes that meet at the same time
SELECT DISTINCT S.name
FROM Student S
WHERE S.snum IN ( SELECT E1.snum
FROM Enrolled E1, Enrolled E2, Class C1, Class C2
WHERE E1.snum = E2.snum AND E1.cname <> E2.cname
AND E1.cname = C1.cname AND E2.cname = C2.cname
AND C1.meets_at = C2.meets_at )

TRC / DRC

Find the names of pizzas that come in a 10 inch or a 12 inch size.
TRC: {T | ∃P ∈ Pizza ((P.size = 10 ∨ P.size = 12) ∧ T.name = P.name)}
DRC: {<N> | ∃C, S (<C, N, S> ∈ Pizza ∧ (S = 10 ∨ S = 12))}

Find codes of the most expensive pizzas
TRC: {T | ∃P1 ∈ Pizza (∀P2 ∈ Pizza (P1.price ≥ P2.price) ∧ T.code = P1.code)}
DRC: {<C1> | ∃P1 (<C1, P1> ∈ Pizza ∧ ∀C2, P2 (<C2, P2> ∈ Pizza → P1 ≥ P2))}

Find sids of Suppliers who supply every red part
TRC: {T | ∃C ∈ Catalog (T.sid = C.sid ∧ ∀P ∈ Parts (P.color = red → ∃C2 ∈ Catalog (C2.sid = C.sid ∧ C2.pid = P.pid)))}
DRC: {<X> | <X, Y, Z> ∈ Catalog ∧ ∀<A, B, C> ∈ Parts (C ≠ red ∨ ∃<P, Q, R> ∈ Catalog (Q = A ∧ P = X))}

Find sids of Suppliers who supply some red part
TRC: {T | ∃C ∈ Catalog ∃P ∈ Parts (C.pid = P.pid ∧ P.color = red ∧ T.sid = C.sid)}
DRC: {<X> | <X, Y, Z> ∈ Catalog ∧ ∃P, Q, R (<P, Q, R> ∈ Parts ∧ Y = P ∧ R = red)}

Find the pids of parts supplied by at least two different suppliers
TRC: {T | ∃C1 ∈ Catalog ∃C2 ∈ Catalog (C1.sid <> C2.sid ∧ C1.pid = C2.pid ∧ C1.pid = T.pid)}
DRC: {<Y> | <X, Y, Z> ∈ Catalog ∧ ∃A, B, C (<A, B, C> ∈ Catalog ∧ A <> X ∧ Y = B)}

Relational Algebra

Find sids of Suppliers who supply every red part
(π_{sid, pid} Catalog) / (π_{pid} σ_{color = red} Parts)

Find sids of Suppliers who supply some red part
π_{sid} (Catalog ⋈_{pid = pid} (σ_{color = red} Parts))

List names of suppliers who supply at least two parts
ρ(T1, Part)
ρ(T2, Part)
π_{T1.sname} (σ_{T1.pno <> T2.pno ∧ T1.sname = T2.sname} (T1 × T2))

List names of suppliers who supply ALL complex parts whose labor cost is > 100
ρ(T1, π_{pno} (σ_{labor > 100} (ComplexPart)))
(π_{sname, pno} Part) / T1

Find the employee numbers of pilots who can fly ALL MD planes
ρ(B, π_{ModelNo} (σ_{Maker = MD} (Plane)))
ρ(A, Can_Fly)
π_{Emp_No} (A) − π_{Emp_No} ((π_{Emp_No} (A) × B) − A)
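
SQL has no division operator, so "every red part" queries are usually written with a double NOT EXISTS. The following is a minimal sketch using Python's built-in sqlite3 module. The column layout Catalog(sid, pid, cost) and Parts(pid, pname, color) and the sample rows are assumptions made for illustration; the sheet only names the sid, pid and color attributes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Parts (pid INTEGER PRIMARY KEY, pname TEXT, color TEXT);
CREATE TABLE Catalog (sid INTEGER, pid INTEGER REFERENCES Parts(pid), cost REAL,
                      PRIMARY KEY (sid, pid));
-- Hypothetical sample data: parts 1 and 2 are red, part 3 is green.
INSERT INTO Parts VALUES (1, 'bolt', 'red'), (2, 'nut', 'red'), (3, 'gear', 'green');
INSERT INTO Catalog VALUES (10, 1, 5.0), (10, 2, 6.0), (20, 1, 4.0),
                           (20, 3, 9.0), (30, 3, 2.0);
""")

# Division as double negation: suppliers for whom no red part exists
# that they do not supply.
every_red = """
SELECT DISTINCT C.sid
FROM Catalog C
WHERE NOT EXISTS (
    SELECT * FROM Parts P
    WHERE P.color = 'red'
      AND NOT EXISTS (
          SELECT * FROM Catalog C2
          WHERE C2.sid = C.sid AND C2.pid = P.pid))
ORDER BY C.sid
"""

# The "some red part" variant is a plain join plus selection.
some_red = """
SELECT DISTINCT C.sid
FROM Catalog C JOIN Parts P ON C.pid = P.pid
WHERE P.color = 'red'
ORDER BY C.sid
"""

every_red_sids = [row[0] for row in conn.execute(every_red)]
some_red_sids = [row[0] for row in conn.execute(some_red)]
print(every_red_sids, some_red_sids)
```

With this sample data only supplier 10 supplies both red parts, while suppliers 10 and 20 each supply at least one.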