MSIT630 Assignment2
1. Consider the relation, r, shown below. Give the result of the following query: (4 points)
building  room_number  time_slot_id  course_id  sec_id
Garfield  359          A             BIO-101    1
Garfield  359          B             BIO-101    2
Saucon    651          A             CS-101     2
Saucon    550          C             CS-319     1
Painter   705          A             BIO-305    1
Painter   705          D             MU-199     1
Painter   705          B             CS-101     1
Painter   403          D             FIN-201    1
In MySQL, use:
SELECT building, room_number, time_slot_id, COUNT(*)
FROM r
GROUP BY building, room_number, time_slot_id WITH ROLLUP;
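As a reminder of what WITH ROLLUP computes, the sketch below emulates it in Python on a small made-up relation (not the relation r above). The function name rollup_counts and the sample rows are assumptions for illustration only. Each row is counted once for its full grouping key and once for every shorter prefix of that key, with the dropped columns shown as None (NULL in MySQL).

```python
from collections import Counter

def rollup_counts(rows, n_keys):
    """Emulate COUNT(*) with GROUP BY k1, ..., kn WITH ROLLUP.

    Hypothetical helper: returns a Counter keyed by tuples of length n_keys,
    where rolled-up columns are replaced by None.
    """
    counts = Counter()
    for row in rows:
        key = tuple(row[:n_keys])
        # Count the row for the full key and for each shorter prefix,
        # down to the empty prefix (the grand total).
        for depth in range(n_keys, -1, -1):
            rolled = key[:depth] + (None,) * (n_keys - depth)
            counts[rolled] += 1
    return counts

# Made-up sample data, unrelated to the assignment's relation r.
sample = [
    ("X", 1, "A"),
    ("X", 1, "B"),
    ("Y", 2, "A"),
]
result = rollup_counts(sample, 3)
for key, n in result.items():
    print(key, n)
```

The order in which MySQL returns the super-aggregate rows is not modeled here; only the set of groups and their counts is.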
2. Write the following queries in relational algebra, using the university schema. (Appendix A,
page 1271) (16 points, 4 points each)
a. Find the names of all students who have taken at least one Elec. Eng. course.
b. Find the IDs and names of all students who have not taken any course offering before
2010.
c. For each department, find the average salary of instructors in that department. You
may assume that every department has at least one instructor.
d. Find the lowest, across all departments, of the per-department average salary computed
by the preceding query.
3. Construct an E-R diagram for a hospital with a set of patients and a set of medical doctors.
Associate with each patient a log of the various tests and examinations conducted by the
doctors. (6 points)
4. Explain the distinction between disjoint and overlapping constraints. Provide an example for
each constraint. (3 points)
5. Explain the distinction between total and partial constraints. Provide an example for each
constraint. (3 points)
6. Consider the following set F of functional dependencies on the relation schema
r(A,B,C,D,E,F): (9 points, 3 points each)
A→BCD
BC→DE
B→D
D→A
a. Compute B+.
b. Compute D+.
c. Prove (using Armstrong’s axioms) that AF is a superkey.
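For checking closure computations of this kind, the standard attribute-closure procedure can be sketched as below. This is a minimal sketch: the function name closure and the example schema R(P,Q,S,T) with P→Q and Q→S are made up for illustration and are not the set F above. A closure check can confirm that a set of attributes is a superkey, but it does not replace the step-by-step derivation with Armstrong's axioms that part c asks for.

```python
def closure(attrs, fds):
    """Return the closure of attrs under the functional dependencies fds.

    fds is a list of (lhs, rhs) pairs; each side is a string of
    single-letter attribute names (an assumption of this sketch).
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply lhs -> rhs whenever lhs is already in the closure
            # and rhs would add something new.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# Hypothetical schema R(P, Q, S, T) with P -> Q and Q -> S.
example_fds = [("P", "Q"), ("Q", "S")]
p_plus = closure("P", example_fds)
pt_plus = closure("PT", example_fds)
print(sorted(p_plus))   # attributes determined by P
print(sorted(pt_plus))  # equals all of R, so PT is a superkey of R
```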
7. What are the components of a data warehouse? What are the issues to be addressed in building
a warehouse? (4 points)
8. The large volume of data generated on the Internet, such as social-media data, requires a very
high degree of parallelism in both data storage and processing. Explain the MapReduce
paradigm for parallel processing. (5 points)
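As a point of reference, the usual word-count illustration of the map, shuffle, and reduce phases can be sketched in a single Python process as below. All function names here are hypothetical, and the sketch runs sequentially; in a real MapReduce system the map and reduce calls would be distributed across many machines, which this example does not attempt to model.

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input document."""
    for word in document.lower().split():
        yield word, 1

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)

def word_count(documents):
    # Each document could be mapped independently (in parallel in a real system).
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(pairs)
    # Each key could likewise be reduced independently.
    return dict(reduce_phase(key, values) for key, values in grouped.items())

counts = word_count(["a b a", "b c"])
print(counts)
```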