IM Ch05 Advanced Data Modeling Ed9
Discussion Focus
Your discussion can be divided into three parts to reflect the chapter coverage.

The first part of the discussion covers the Extended Entity Relationship Model. Start by exploring the use of entity supertypes and subtypes. Use the specialization hierarchy example in Figure 5.2 to illustrate the main constructs. Illustrate the benefits of attribute inheritance and relationship inheritance. Remember that an entity supertype and an entity subtype are related in a 1:1 relationship. Emphasize the use of the subtype discriminator and then explain the concept of overlapping and disjoint constraints in relation to entity subtypes. The completeness constraint indicates whether every entity supertype occurrence must also be a member of at least one subtype. Explore the specialization and generalization hierarchies. Finally, explain the use of entity clusters as an alternative method to simplify crowded data models.

The second part of the discussion covers the importance of proper primary key selection. Start by clearly stating the function of a PK (identification) and how that function differs from the descriptive nature of the other attributes in an entity. Explain the use of PKs to uniquely identify each entity instance. Discuss natural keys, primary keys, and surrogate keys. Examine the primary key guidelines that specify the PK characteristics: PKs must be unique and nonintelligent, they do not change over time, they are ideally composed of a single attribute, they are preferably numeric, and they are security compliant. Finally, contrast the use of surrogate and composite primary keys. Remind students that composite primary keys are useful in composite entities where each primary key combination is allowed only once in the M:N relationship.

The third part of the discussion covers four special design cases: implementing 1:1 relationships, maintaining the history of time-variant data, fan traps, and redundant relationships.
An entity subtype is a more specific entity type that is related to an entity supertype, where the entity supertype contains the common characteristics and the entity subtypes contain the unique characteristics of each entity subtype. The entity subtype will store the data that is specific to the entity; that is, attributes that are unique to the subtype.

What is a specialization hierarchy? A specialization hierarchy depicts the arrangement of higher-level entity supertypes (parent entities) and lower-level entity subtypes (child entities). To answer the question precisely, we have used the text's Figure 5.2. (We have reproduced the figure on the next page for your convenience.) Figure 5.2 shows the specialization hierarchy formed by an EMPLOYEE supertype and three entity subtypes: PILOT, MECHANIC, and ACCOUNTANT.
Recall that the subtype inherits all of the attributes and relationships of the supertype. Therefore, all of the attributes of a subtype include the common attributes from the supertype plus the unique (unique to that subtype) attributes from the subtype. All of the attributes of a movie would be: Prod_Num, Prod_Title, Prod_ReleaseDate, Prod_Price, Prod_Type, Movie_Rating, and Movie_Director.
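As a minimal sketch of attribute inheritance, assume the PRODUCT supertype and the MOVIE subtype are implemented as two tables that share the Prod_Num key. Joining them recovers the full attribute list of a movie. The column types and the `movie_attributes` helper are assumptions made for illustration; only the attribute names come from the data model.

```python
import sqlite3


def movie_attributes():
    """Return a movie's column names: supertype columns, then subtype columns."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE PRODUCT (
            Prod_Num         INTEGER PRIMARY KEY,
            Prod_Title       TEXT NOT NULL,
            Prod_ReleaseDate TEXT,
            Prod_Price       REAL,
            Prod_Type        TEXT NOT NULL      -- subtype discriminator
        );
        CREATE TABLE MOVIE (
            Prod_Num       INTEGER PRIMARY KEY REFERENCES PRODUCT (Prod_Num),
            Movie_Rating   TEXT,
            Movie_Director TEXT
        );
    """)
    # USING merges the shared key so Prod_Num appears only once in the result.
    cur = conn.execute("SELECT * FROM PRODUCT JOIN MOVIE USING (Prod_Num)")
    names = [col[0] for col in cur.description]
    conn.close()
    return names


print(movie_attributes())
```

The join is what "inheritance" amounts to at the table level: the MOVIE table stores only the two movie-specific columns, and everything else is read from PRODUCT.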
According to the data model, is it required that every entity instance in the PRODUCT table be associated with an entity instance in the CD table? Why or why not? No. The completeness constraint for the data model shows a total completeness constraint from PRODUCT to the subtypes. However, the total completeness constraint indicates that every instance in the supertype (PRODUCT) must be associated with one row in some subtype, not in all subtypes. Since the subtypes are designated as disjoint, or exclusive, every row in the supertype is associated with a row in only one subtype. For some products that subtype will be CD, but for other products the subtype will be either Movie or Book.
Is it possible for a book to appear in the BOOK table without appearing in the PRODUCT table? Why or why not? No. Subtypes can only exist within the context of a supertype.
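One way to sketch these two rules in a relational implementation is shown next: a foreign key keeps a BOOK row from existing without its PRODUCT row, and carrying the discriminator into the subtype keeps a product of another type from also appearing as a book (the disjoint constraint). The discriminator codes 'B', 'C', and 'M', the sample titles, and the `try_insert` helper are assumptions for illustration. Book-specific attributes are omitted because they are not listed here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
    CREATE TABLE PRODUCT (
        Prod_Num   INTEGER PRIMARY KEY,
        Prod_Title TEXT NOT NULL,
        Prod_Type  TEXT NOT NULL CHECK (Prod_Type IN ('B', 'C', 'M')),
        UNIQUE (Prod_Num, Prod_Type)          -- target of the composite FK below
    );
    CREATE TABLE BOOK (
        Prod_Num  INTEGER PRIMARY KEY,
        Prod_Type TEXT NOT NULL DEFAULT 'B' CHECK (Prod_Type = 'B'),
        FOREIGN KEY (Prod_Num, Prod_Type) REFERENCES PRODUCT (Prod_Num, Prod_Type)
    );
""")


def try_insert(sql, params=()):
    """Run one INSERT; return True if accepted, False if a constraint rejects it."""
    try:
        with conn:
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


# A book with no matching supertype row is rejected.
orphan_book = try_insert("INSERT INTO BOOK (Prod_Num) VALUES (?)", (1,))
try_insert("INSERT INTO PRODUCT VALUES (?, ?, ?)", (1, "Sample Book", "B"))
real_book = try_insert("INSERT INTO BOOK (Prod_Num) VALUES (?)", (1,))
# A product typed as a movie cannot also be stored as a book.
try_insert("INSERT INTO PRODUCT VALUES (?, ?, ?)", (2, "Sample Movie", "M"))
movie_as_book = try_insert("INSERT INTO BOOK (Prod_Num) VALUES (?)", (2,))

print(orphan_book, real_book, movie_as_book)  # False True False
```

Note that this sketch does not enforce total completeness: nothing here forces every PRODUCT row to have a matching subtype row. That rule would typically need application logic or triggers.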
What is an entity cluster, and what advantages are derived from its use? An entity cluster is a virtual entity type used to represent multiple entities and relationships in the ERD. An entity cluster is formed by combining multiple interrelated entities into a single abstract entity object. An entity cluster is considered virtual or abstract in the sense that it is not actually an entity in the final ERD, but rather a temporary entity used to represent multiple entities and relationships with the purpose of simplifying the ERD and thus enhancing its readability.
What primary key characteristics are considered desirable? Explain why each characteristic is considered desirable. Desirable PK characteristics are summarized in the text's Table 5.3, reproduced below for your convenience. The table also includes the reason why each characteristic is desirable. (See the Rationale column.)
PK Characteristic: Rationale

Unique values: The PK must uniquely identify each entity instance. A primary key must be able to guarantee unique values. It cannot contain nulls.

Nonintelligent: The PK should not have embedded semantic meaning. An attribute with embedded semantic meaning is probably better used as a descriptive characteristic of the entity rather than as an identifier. In other words, a student ID of 650973 would be preferred over Smith, Martha L. as a primary key identifier.

No change over time: If an attribute has semantic meaning, it may be subject to updates. This is why names do not make good primary keys. If you have Vickie Smith as the primary key, what happens when she gets married? If a primary key is subject to change, the foreign key values must be updated, thus adding to the database work load. Furthermore, changing a primary key value means that you are basically changing the identity of an entity.

Preferably single-attribute: A primary key should have the minimum number of attributes possible. Single-attribute primary keys are desirable but not required. Single-attribute primary keys simplify the implementation of foreign keys. Having multiple-attribute primary keys can cause primary keys of related entities to grow through the possible addition of many attributes, thus adding to the database work load and making (application) coding more cumbersome.

Preferably numeric: Unique values can be better managed when they are numeric because the database can use internal routines to implement a counter-style attribute that automatically increments values with the addition of each new row. In fact, most database systems include the ability to use special constructs, such as Autonumber in MS Access, to support self-incrementing primary key attributes.

Security compliant: The selected primary key must not be composed of any attribute(s) that might be considered a security risk or violation. For example, using a Social Security number as a PK in an EMPLOYEE table is not a good idea.
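The "preferably numeric" and "no change over time" rows can be illustrated with a short sketch. It assumes SQLite, where an INTEGER PRIMARY KEY column receives the next integer automatically when no value is supplied (a rough counterpart of Autonumber). The STUDENT table, its column names, and the changed name "Jones, Vickie" are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE STUDENT (
        Stu_Num  INTEGER PRIMARY KEY,   -- SQLite assigns the next integer when omitted
        Stu_Name TEXT NOT NULL          -- descriptive attribute, free to change
    )
""")

insert_sql = "INSERT INTO STUDENT (Stu_Name) VALUES (?)"
first = conn.execute(insert_sql, ("Smith, Martha L.",)).lastrowid
second = conn.execute(insert_sql, ("Smith, Vickie",)).lastrowid

# The name changes; the identifier (and any FK that points at it) does not.
conn.execute("UPDATE STUDENT SET Stu_Name = ? WHERE Stu_Num = ?",
             ("Jones, Vickie", second))

rows = conn.execute(
    "SELECT Stu_Num, Stu_Name FROM STUDENT ORDER BY Stu_Num").fetchall()
print(first, second, rows)
```

Because the key carries no meaning, the name update touches one row in one table; had the name been the primary key, every foreign key referencing it would also need to change.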
Composite primary keys are particularly useful in two cases: as identifiers of composite entities, where each primary key combination is allowed only once in the M:N relationship, and as identifiers of weak entities, where the weak entity has a strong identifying relationship with the parent entity. To illustrate the first case, assume that you have a STUDENT entity set and a CLASS entity set. In addition, assume that those two sets are related in a M:N relationship via an ENROLL entity set in which each student/class combination may appear only once in the composite entity. The text's Figure 5.6 (reproduced here for your convenience) shows the ERD to represent such a relationship.
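The first case can be sketched as follows, assuming SQLite. The composite primary key on ENROLL is what rejects a repeated student/class combination. The column names (Stu_Num, Class_Code, Enroll_Grade), the sample key values, and the `enroll` helper are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
    CREATE TABLE STUDENT (Stu_Num    INTEGER PRIMARY KEY);
    CREATE TABLE CLASS   (Class_Code INTEGER PRIMARY KEY);
    CREATE TABLE ENROLL (
        Stu_Num      INTEGER NOT NULL REFERENCES STUDENT (Stu_Num),
        Class_Code   INTEGER NOT NULL REFERENCES CLASS (Class_Code),
        Enroll_Grade TEXT,
        PRIMARY KEY (Stu_Num, Class_Code)   -- each combination allowed only once
    );
    INSERT INTO STUDENT VALUES (321452);
    INSERT INTO CLASS VALUES (10014), (10018);
""")


def enroll(stu_num, class_code):
    """Try to enroll a student in a class; return False if a constraint rejects it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO ENROLL (Stu_Num, Class_Code) VALUES (?, ?)",
                (stu_num, class_code))
        return True
    except sqlite3.IntegrityError:
        return False


# Same student in two classes is fine; the repeated pair is rejected.
results = [enroll(321452, 10014), enroll(321452, 10018), enroll(321452, 10014)]
print(results)  # [True, True, False]
```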
Case / ER Relationship Constraints / Action

Case I. Constraints: One side is mandatory and the other side is optional. Action: Place the PK of the entity on the mandatory side in the entity on the optional side as a FK and make the FK mandatory.

Case II. Constraints: Both sides are optional. Action: Select the FK that causes the fewest number of nulls or place the FK in the entity in which the (relationship) role is played.

Case III. Constraints: Both sides are mandatory. Action: See Case II or consider revising your model to ensure that the two entities do not belong together in a single entity.
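Case I can be sketched with the "professor chairs department" rule, assuming SQLite. PROFESSOR is the mandatory side (every department must have a chair) and DEPARTMENT is the optional side (not every professor chairs one), so the professor's key is placed in DEPARTMENT as a mandatory FK. The UNIQUE constraint is what keeps the relationship 1:1. Table layouts, key values, and the `add_department` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
    CREATE TABLE PROFESSOR (Emp_Num INTEGER PRIMARY KEY);
    CREATE TABLE DEPARTMENT (
        Dept_Code TEXT PRIMARY KEY,
        -- FK to the mandatory side: NOT NULL makes it mandatory, UNIQUE keeps it 1:1
        Emp_Num   INTEGER NOT NULL UNIQUE REFERENCES PROFESSOR (Emp_Num)
    );
    INSERT INTO PROFESSOR VALUES (101), (102), (103);
""")


def add_department(dept_code, chair_emp_num):
    """Try to add a department; return False if a constraint rejects it."""
    try:
        with conn:
            conn.execute("INSERT INTO DEPARTMENT VALUES (?, ?)",
                         (dept_code, chair_emp_num))
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    add_department("ACCT", None),  # no chair: rejected (FK is mandatory)
    add_department("ACCT", 101),   # accepted
    add_department("MKTG", 101),   # professor 101 already chairs ACCT: rejected
    add_department("MKTG", 102),   # accepted; professor 103 chairs nothing, which is allowed
]
print(results)  # [False, True, False, True]
```

Placing the FK in DEPARTMENT avoids nulls entirely; placing it in PROFESSOR instead would leave a null for every professor who is not a chair.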
Problem Solutions
1. Given the following business scenario, create a Crow's Foot ERD using a specialization hierarchy if appropriate. Two-Bit Drilling Company keeps information on employees and their insurance dependents. Each employee has an employee number, name, date of hire, and title. If an employee is an inspector, then the date of certification and the renewal date for that certification should also be recorded in the system. For all employees, the Social Security number and dependent names should be kept. All dependents must be associated with one and only one employee. Some employees will not have dependents, while others will have many dependents. The data model for this solution is shown in Figure P5.1 below.
In this scenario, a specialization hierarchy is appropriate because there is an identifiable type or kind of employee (Inspectors), and additional attributes are recorded that are specific to just that kind or type. It is worth noting that if there is only a single subtype, the disjoint/overlapping designation may be omitted: if there is only one subtype, then there is no other subtype to overlap or be disjoint from. Also, when there is only a single subtype, the completeness constraint is always partial completeness. If the completeness constraint were identified as total completeness, that would mean that every employee must be an inspector, in which case inspector would be a synonym for employee, not a kind of employee.
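A table-level sketch of this solution, assuming SQLite, shows what partial completeness looks like in practice: an employee without an INSPECTOR row is perfectly valid. All column names and sample rows are hypothetical; only the entities and the attributes they hold come from the scenario.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
    CREATE TABLE EMPLOYEE (
        Emp_Num      INTEGER PRIMARY KEY,
        Emp_Name     TEXT NOT NULL,
        Emp_HireDate TEXT,
        Emp_Title    TEXT,
        Emp_SSN      TEXT          -- stored as an attribute, not used as the PK
    );
    CREATE TABLE INSPECTOR (       -- single subtype, partial completeness
        Emp_Num        INTEGER PRIMARY KEY REFERENCES EMPLOYEE (Emp_Num),
        Insp_CertDate  TEXT,
        Insp_RenewDate TEXT
    );
    CREATE TABLE DEPENDENT (       -- each dependent belongs to exactly one employee
        Emp_Num  INTEGER NOT NULL REFERENCES EMPLOYEE (Emp_Num),
        Dep_Num  INTEGER NOT NULL,
        Dep_Name TEXT NOT NULL,
        PRIMARY KEY (Emp_Num, Dep_Num)
    );
    INSERT INTO EMPLOYEE (Emp_Num, Emp_Name) VALUES (1, 'Ann Lee'), (2, 'Bob Ray');
    INSERT INTO INSPECTOR (Emp_Num) VALUES (1);
    INSERT INTO DEPENDENT VALUES (2, 1, 'Cara Ray'), (2, 2, 'Dan Ray');
""")

# For each employee: is there a subtype row, and how many dependents?
summary = conn.execute("""
    SELECT e.Emp_Name,
           i.Emp_Num IS NOT NULL AS Is_Inspector,
           COUNT(d.Dep_Num)      AS Dep_Count
    FROM EMPLOYEE AS e
    LEFT JOIN INSPECTOR AS i ON i.Emp_Num = e.Emp_Num
    LEFT JOIN DEPENDENT AS d ON d.Emp_Num = e.Emp_Num
    GROUP BY e.Emp_Num
    ORDER BY e.Emp_Num
""").fetchall()
print(summary)
```

The LEFT JOINs matter: an inner join to INSPECTOR would silently drop every employee who is not an inspector, which is exactly the population that partial completeness allows.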
2. Given the following business scenario, create a Crow's Foot ERD using a specialization hierarchy if appropriate. Tiny Hospital keeps information on patients and hospital rooms. The system assigns each patient a patient ID number. In addition, the patient's name and date of birth are recorded. Some patients are resident patients (they spend at least one night in the hospital) and others are outpatients (they are treated and released). Resident patients are assigned to a room. Each room is identified by a room number. The system also stores the room type (private or semiprivate) and room fee. Over time, each room will have many patients that stay in it. Each resident patient will stay in only one room. Every room must have had a patient, and every resident patient must have a room. The data model for this scenario is given in Figure P5.2 below.
Note that in this scenario, a specialization hierarchy is not appropriate. While resident patients are an identifiable kind or type of patient instance, there are no additional attributes that are unique to only that kind or type of patient. Participation in a relationship that is unique to a particular kind or type of instance is not sufficient justification for a specialization hierarchy. Indicating that only some instances will participate in a relationship is addressed by the optional participation designation. In this scenario, all resident patients must have a room; however, not all patients are resident patients, so ROOM is optional to PATIENT. If students ask about the need for an attribute to distinguish between outpatients and resident patients, remind them that in this limited scenario the only distinction between outpatients and resident patients is whether or not they are associated with a room. Therefore, the Room_Num foreign key in the PATIENT table can serve in that capacity.

3. Given the following business scenario, create a Crow's Foot ERD using a specialization hierarchy if appropriate. Granite Sales Company keeps information on employees and the departments that they work in. For each department, the department name, internal mail box number, and office phone extension are kept. A department can have many assigned employees, and each employee is assigned to only one department. Employees can be salaried employees, hourly employees, or contract employees. All employees are assigned an employee number. This is kept along with the employee's name and address. For hourly employees, hourly wage and target weekly work hours are stored (e.g., the company may target 40 hours/week for some, 32 hours/week for others, and 20 hours/week for others). Some salaried employees are salespeople that can earn a commission in addition to their base salary. For all salaried employees, the yearly salary amount is recorded in the system. For salespeople, their commission percentage on sales and commission percentage on profit are stored in the system. For example, John is a salesperson with a base salary of $50,000 per year plus a 2-percent commission on the sales price for all sales he makes plus another 5 percent of the profit on each of those sales. For contract employees, the beginning and end dates of their contract are stored along with the billing rate for their hours. The data model for this scenario is given in Figure P5.3 below.
4. In Chapter 4, you saw the creation of the Tiny College database design. That design reflected such business rules as "a professor may advise many students" and "a professor may chair one department." Modify the design shown in Figure 4.36 to include these business rules:
An employee could be staff or a professor or an administrator.
A professor may also be an administrator.
Staff employees have a work level classification, such as Level I and Level II.
Only professors can chair a department. A department is chaired by only one professor.
Only professors can serve as the dean of a college. Each of the university's colleges is served by one dean.
A professor can teach many classes.
Administrators have a position title.
Given that information, create the complete ERD containing all primary keys, foreign keys, and main attributes. The solution is shown in Figure P5.4 below.
6. Some Tiny College staff employees are information technology (IT) personnel. Some IT personnel provide technology support for academic programs. Some IT personnel provide technology infrastructure support. Some IT personnel provide technology support for academic programs and technology infrastructure support. IT personnel are not professors. IT personnel are required to take periodic training to retain their technical expertise. Tiny College tracks all IT personnel training by date, type, and results (completed vs. not completed). Given that information, create the complete ERD containing all primary keys, foreign keys, and main attributes.

This problem provides an opportunity to reinforce the idea that to qualify as a subtype, the identifiable kind or type of instance must include additional attributes; being an identifiable kind or type of entity instance is necessary but not sufficient to justify the creation of subtypes. Given the minimal attributes specified in the problem, the solution would be as shown in Figure 5.6a.
If, as is often the case in the problems included in a textbook, we assume that the attributes specified are just a subset of the complete attribute requirements for each entity, we can consider what the data model would be given that additional attributes unique to the described kinds of entity instances will exist. In that case, the expanded solution including subtypes for the described kinds of staff members is shown in Figure 5.6b.
Note that in the specification of ITSTAFF as a subtype of STAFF, there is no disjoint/overlapping designation for the subtype. When there is only one subtype, there is nothing to be disjoint from or to overlap with; therefore, the designation may be safely omitted.

7. The FlyRight Aircraft Maintenance (FRAM) division of the FlyRight Company (FRC) performs all maintenance for FRC's aircraft. Produce a data model segment that reflects the following business rules:
All mechanics are FRC employees. Not all employees are mechanics.
Some mechanics are specialized in engine (EN) maintenance. Some mechanics are specialized in airframe (AF) maintenance. Some mechanics are specialized in avionics (AV) maintenance. (Avionics are the electronic components of an aircraft that are used in communication and navigation.)
All mechanics take periodic refresher courses to stay current in their areas of expertise. FRC tracks all courses taken by each mechanic: date, course type, certification (Y/N), and performance.
FRC keeps a history of the employment of all mechanics. The history includes the date hired, date promoted, date terminated, and so on. (Note: The "and so on" component is, of course, not a real-world requirement. Instead, it has been used here to limit the number of attributes you will show in your design.)
Given those requirements, create the Crow's Foot ERD segment. The solution is shown in the following figure:
8. Martial Arts R Us (MARU) needs a database. MARU is a martial arts school with hundreds of students. It is necessary to keep track of all the different classes that are being offered, who is assigned to teach each class, and which students attend each class. Also, it is important to track the progress of each student as they advance. Create a complete Crow's Foot ERD for these requirements:
Students are given a student number when they join the school. This is stored along with their name, date of birth, and the date they joined the school.
All instructors are also students, but clearly, not all students are instructors. In addition to the normal student information, for each instructor, the date that they start working as an instructor must be recorded, along with their instructor status (compensated or volunteer).
An instructor may be assigned to teach any number of classes, but each class has one and only one assigned instructor. Some instructors, especially volunteer instructors, may not be assigned to any class.
A class is offered for a specific level at a specific time, day of the week, and location. For example, one class taught on Mondays at 5:00 pm in Room #1 is an intermediate-level class. Another class taught on Mondays at 6:00 pm in Room #1 is a beginner-level class. A third class taught on Tuesdays at 5:00 pm in Room #2 is an advanced-level class.
Students may attend any class of the appropriate level during each week, so there is no expectation that any particular student will attend any particular class session. Therefore, the actual attendance of students at each individual class meeting must be tracked.
A student will attend many different class meetings, and each class meeting is normally attended by many students. Some class meetings may have no students show up for that meeting. New students may not have attended any class meetings yet.
At any given meeting of a class, instructors other than the assigned instructor may show up to help.
Therefore, a given class meeting may have several instructors (a head instructor and many assistant instructors), but it will always have at least the one instructor that is assigned to that class. For each class meeting, the date that the class was taught and the instructors' roles (head instructor or assistant instructor) need to be recorded. For example, Mr. Jones is assigned to teach the Monday, 5:00 pm, intermediate class in Room #1. During one particular meeting of that class, Mr. Jones was present as the head instructor and Ms. Chen came to help as an assistant instructor.
Each student holds a rank in the martial arts. The rank name, belt color, and rank requirements are stored. Each rank will have numerous rank requirements. Each requirement is considered a requirement just for the rank at which the requirement is introduced. Every requirement is associated with a particular rank. All ranks except white belt have at least one requirement.
A given rank may be held by many students. While it is customary to think of a student as having a single rank, it is necessary to track each student's progress through the ranks. Therefore, every rank that a student attains is kept in the system. New students joining the school are automatically given a white belt rank. The date that a student is awarded each rank should be kept in the system. All ranks have at least one student that has achieved that rank at some time.
The solution for this case is shown in Figure P5.8 below.
Notice that the figure includes surrogate keys for RANK, REQUIREMENT, and MEETING because the natural keys did not meet the requirements for a good primary key.

The most common areas of confusion among students on this particular case surround attendance in the class meetings. Students tend to think of the relationship between CLASS and STUDENT as similar to the M:N enroll relationship that they have seen throughout the textbook. In this case, however, the relationship is not an enrollment relationship; instead, it is an attendance relationship. As described in the case, students do not enroll in any particular class. What must be tracked is the attendance for each individual class meeting. Therefore, the M:N relationship in this scenario is actually between STUDENT and the individual class MEETING.

The case also provides an opportunity to reinforce the fact that subtypes inherit not only the attributes of the supertype but also the relationships. One requirement of the case is that the system must be able to track which instructors actually taught each class meeting. There is already a M:N relationship between STUDENT and MEETING that can be implemented with the ATTENDANCE bridge entity using only the Stu_Num and Meet_Num attributes. Students should consider that because INSTRUCTOR is a subtype of STUDENT, instructors are already associated in a M:N relationship with MEETING through that same bridge. By adding the Attend_Role attribute to ATTENDANCE, the bridge entity can properly track all students in a given class meeting and record what role they played in that meeting (e.g., student, assistant instructor, or head instructor).

Finally, it is worth pointing out to the students that requirements are described as being an attribute of a rank. Some students will immediately consider requirements to be an entity, while others will model requirement as an attribute of the RANK entity.
Considering rank requirements to be an attribute of RANK is perfectly acceptable; however, it must be noted that, as such, rank requirements would be a multi-valued attribute. Therefore, the preferred implementation of a multi-valued attribute (creating a new entity for the multi-valued attribute) would result in the creation of the REQUIREMENT table anyway. So whichever way the student approaches the problem, it will eventually lead to the solution shown above.
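The attendance discussion can be made concrete with a small sketch, assuming SQLite. Because INSTRUCTOR is a subtype of STUDENT, instructors reach MEETING through the same ATTENDANCE bridge, and Attend_Role records the part each person played. Stu_Num, Meet_Num, and Attend_Role come from the solution; the other columns, the third student, the meeting number, and the date are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
    CREATE TABLE STUDENT    (Stu_Num INTEGER PRIMARY KEY, Stu_Name TEXT NOT NULL);
    CREATE TABLE INSTRUCTOR (Stu_Num INTEGER PRIMARY KEY REFERENCES STUDENT (Stu_Num));
    CREATE TABLE MEETING    (Meet_Num INTEGER PRIMARY KEY, Meet_Date TEXT NOT NULL);
    CREATE TABLE ATTENDANCE (
        Meet_Num    INTEGER NOT NULL REFERENCES MEETING (Meet_Num),
        Stu_Num     INTEGER NOT NULL REFERENCES STUDENT (Stu_Num),
        Attend_Role TEXT NOT NULL CHECK (Attend_Role IN
                    ('student', 'assistant instructor', 'head instructor')),
        PRIMARY KEY (Meet_Num, Stu_Num)
    );
    INSERT INTO STUDENT VALUES (1, 'Jones'), (2, 'Chen'), (3, 'Lee');
    INSERT INTO INSTRUCTOR VALUES (1), (2);      -- instructors are also students
    INSERT INTO MEETING VALUES (500, '2010-03-01');
    INSERT INTO ATTENDANCE VALUES
        (500, 1, 'head instructor'),
        (500, 2, 'assistant instructor'),
        (500, 3, 'student');
""")

# Everyone present at one meeting, with the role each played.
roster = conn.execute("""
    SELECT s.Stu_Name, a.Attend_Role
    FROM ATTENDANCE AS a
    JOIN STUDENT AS s ON s.Stu_Num = a.Stu_Num
    WHERE a.Meet_Num = ?
    ORDER BY s.Stu_Num
""", (500,)).fetchall()
print(roster)
```

This sketch does not check that a row with an instructor role belongs to someone who actually has an INSTRUCTOR row; that rule would need an additional constraint or application logic.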
9. The Journal of E-commerce Research Knowledge is a prestigious information systems research journal. It uses a peer-review process to select manuscripts for publication. Only about 10 percent of the manuscripts submitted to the journal are accepted for publication. A new issue of the journal is published each quarter. Create a complete ERD to support the business needs described below.

Unsolicited manuscripts are submitted by authors. When a manuscript is received, the editor will assign the manuscript a number and record some basic information about it in the system. The title of the manuscript, the date it was received, and a manuscript status of "received" are entered. Information about the author(s) is also recorded. For each author, the author's name, mailing address, e-mail address, and affiliation (school or company for which the author works) are recorded. Every manuscript must have an author. Only authors that have submitted manuscripts are kept in the system. It is typical for a manuscript to have several authors. A single author may have submitted many different manuscripts to the journal. Additionally, when a manuscript has multiple authors, it is important to record the order in which the authors are listed in the manuscript credits.

At her earliest convenience, the editor will briefly review the topic of the manuscript to ensure that the manuscript's contents fall within the scope of the journal. If the content is not within the scope of the journal, the manuscript's status is changed to "rejected" and the author is notified via e-mail. If the content is within the scope of the journal, then the editor selects three or more reviewers to review the manuscript. Reviewers work for other companies or universities and read manuscripts to ensure the scientific validity of the manuscripts. For each reviewer, the system records a reviewer number, reviewer name, reviewer e-mail address, affiliation, and areas of interest. Areas of interest are predefined areas of expertise that the reviewer has specified. An area of interest is identified by an IS code and includes a description (e.g., IS2003 is the code for "database modeling"). A reviewer can have many areas of interest, and an area of interest can be associated with many reviewers. All reviewers must specify at least one area of interest. It is unusual, but it is possible to have an area of interest for which the journal has no reviewers. The editor will change the status of the manuscript to "under review" and record which reviewers the manuscript was sent to and the date on which it was sent to each reviewer. A reviewer will typically receive several manuscripts to review each year, although new reviewers may not have received any manuscripts yet.

The reviewers will read the manuscript at their earliest convenience and provide feedback to the editor regarding the manuscript. The feedback from each reviewer includes rating the manuscript on a 10-point scale for appropriateness, clarity, methodology, and contribution to the field, as well as a recommendation for publication (accept or reject). The editor will record all of this information in the system for each review received from each reviewer, along with the date that the feedback was received. Once all of the reviewers have provided their evaluation of the manuscript, the editor will decide whether or not to publish the manuscript. If the editor decides to publish the manuscript, the manuscript's status is changed to "accepted" and the date of acceptance for the manuscript is recorded. If the manuscript is not to be published, the status is changed to "rejected."

Once a manuscript has been accepted for publication, it must be scheduled. For each issue of the journal, the publication period (Fall, Winter, Spring, or Summer), publication year, volume, and number are recorded. An issue will contain many manuscripts, although the issue may be created in the system before it is known which manuscripts will go in that issue. An accepted manuscript appears in only one issue of the journal. Each manuscript goes through a typesetting process that formats the content (font, font size, line spacing, justification, etc.). Once the manuscript has been typeset, the number of pages that the manuscript will occupy is recorded in the system. The editor will then make decisions about which issue each accepted manuscript will appear in and the order of manuscripts within each issue. The order and the beginning page number for each manuscript must be stored in the system. Once the manuscript has been scheduled for an issue, the status of the manuscript is changed to "scheduled." Once an issue is published, the print date for the issue is recorded, and the statuses of all of the manuscripts in that issue are changed to "published."

The solution for this case is shown in Figure P5.9 below.
It is not uncommon for students to want to make a separate subtype for each value that the manuscript status attribute can have. Students will often, rightly, point out that there are new attributes that come into play with different manuscript statuses. What the students are missing is that there is no described mechanism by which a manuscript that has been accepted can fail to be published. Therefore, once a manuscript is accepted, it does have all of the attributes in the ACCEPTED subtype; the user just doesn't have a value for all of them yet.

10. Global Computer Solutions (GCS) is an information technology consulting company with many offices located throughout the United States. The company's success is based on its ability to maximize its resources; that is, its ability to match highly skilled employees with projects according to region. To better manage its projects, GCS has contacted you to design a database so that GCS managers can keep track of their customers, employees, projects, project schedules, assignments, and invoices. The GCS database must support all of GCS's operations and information requirements. A basic description of the main entities follows:

The employees working for GCS have an employee ID, an employee last name, a middle initial, a first name, a region, and a date of hire. Valid regions are as follows: Northwest (NW), Southwest (SW), Midwest North (MN), Midwest South (MS), Northeast (NE), and Southeast (SE).

Each employee has many skills, and many employees have the same skill. Each skill has a skill ID, description, and rate of pay. Valid skills are as follows: data entry I, data entry II, systems analyst I, systems analyst II, database designer I, database designer II, Cobol I, Cobol II, C++ I, C++ II, VB I, VB II, ColdFusion I, ColdFusion II, ASP I, ASP II, Oracle DBA, MS SQL Server DBA, network engineer I, network engineer II, web administrator, technical writer, and project manager. Table P5.9a shows an example of the Skills Inventory.
Skill: Employee
Data Entry I: Seaton Amy; Williams Josh; Underwood Trish
Data Entry II: Williams Josh; Seaton Amy
Systems Analyst I: Craig Brett; Sewell Beth; Robbins Erin; Bush Emily; Zebras Steve
Systems Analyst II: Chandler Joseph; Burklow Shane; Robbins Erin
DB Designer I: Yarbrough Peter; Smith Mary
DB Designer II: Yarbrough Peter; Pascoe Jonathan
Cobol I: Kattan Chris; Epahnor Victor; Summers Anna; Ellis Maria
Cobol II: Kattan Chris; Epahnor Victor; Batts Melissa
C++ I: Smith Jose; Rogers Adam; Cope Leslie
C++ II: Rogers Adam; Bible Hanah
VB I: Zebras Steve; Ellis Maria
VB II: Zebras Steve; Newton Christopher
ColdFusion I: Duarte Miriam; Bush Emily
ColdFusion II: Bush Emily; Newton Christopher
ASP I: Duarte Miriam; Bush Emily
ASP II: Duarte Miriam; Newton Christopher
Oracle DBA: Smith Jose; Pascoe Jonathan
SQL Server DBA: Yarbrough Peter; Smith Jose
Network Engineer I: Bush Emily; Smith Mary
Network Engineer II: Bush Emily; Smith Mary
Web Administrator: Bush Emily; Smith Mary; Newton Christopher
Technical Writer: Kilby Surgena; Bender Larry
Project Manager: Paine Brad; Mudd Roger; Kenyon Tiffany; Connor Sean
[Project schedule form. Project ID: 1; Company: See Rocks; Start Date: 3/1/2010. Tasks listed: Initial Interview, Database Design, System Design, Database Implementation, System Coding & Testing, System Documentation, and Final Evaluation, each with its dates, the skills required, and the Quantity Required.]

[Project assignment form for the same project, listing each task with its dates, the skill assigned, and the employee assigned. Assignment numbers and employees: 102 Burklow S.; 103 Smith M.; 104 Smith M.; 105 Burklow S.; 106 Bush E.; 107 Zebras S.; 108 Smith J.; 109 Summers A.; 110 Ellis M.; 111 Ephanor V.; 112 Smith J.; 113 Kilby S.]
Employee Name   Week Ending   Assignment Number   Hours Worked   Bill Number
Burklow S.      3/1/10        1-102               4              xxx
Connor S.       3/1/10        1-101               4              xxx
Smith M.        3/1/10        1-103               4              xxx
Burklow S.      3/8/10        1-102               24             xxx
Connor S.       3/8/10        1-101               24             xxx
Smith M.        3/8/10        1-103               24             xxx
Burklow S.      3/15/10       1-105               40             xxx
Bush E.         3/15/10       1-106               40             xxx
Smith J.        3/15/10       1-108               6              xxx
Smith M.        3/15/10       1-104               32             xxx
Zebras S.       3/15/10       1-107               35             xxx
Burklow S.      3/22/10       1-105               40
Bush E.         3/22/10       1-106               40
Ellis M.        3/22/10       1-110               12
Ephanor V.      3/22/10       1-111               12
Smith J.        3/22/10       1-108               12
Smith J.        3/22/10       1-112               12
Summers A.      3/22/10       1-109               12
Zebras S.       3/22/10       1-107               35
Burklow S.      3/29/10       1-105               40
Bush E.         3/29/10       1-106               40
Ellis M.        3/29/10       1-110               35
Ephanor V.      3/29/10       1-111               35
Kilby S.        3/29/10       1-113               40
Smith J.        3/29/10       1-112               35
Summers A.      3/29/10       1-109               35
Zebras S.       3/29/10       1-107               35

Note: xxx represents the bill ID. Use the one that matches the bill number in your database.
Your assignment is to create a database that will fulfill the operations described in this problem. The minimum required entities are employee, skill, customer, region, project, project schedule, assignment, work log, and bill. (There are additional required entities that are not listed.) Create all of the required tables and all of the required relationships. Create the required indexes to maintain entity integrity when using surrogate primary keys. Populate the tables as needed (as indicated in the sample data and forms).
This is a complex database design case that requires the identification of many business rules, the organization of those business rules, and the development of a complete database model. Note that this database design case has three primary objectives:
- Evaluation of primary keys and surrogate keys. (When should each one be used?)
- Evaluation of the use of indexes on candidate keys to avoid duplicate entries when using surrogate keys.
- Evaluation of the use of redundant relationships. In some cases, it is better to have the foreign key attribute added to an entity, instead of using multiple join operations.
We recommend that you use this problem as the basis for a two-part case project. One way to work with this database case is to form small groups of two or three students and then let each group work the problem independently. The following bullet list provides a sample scenario:
- Divide the class into groups of three students per group.
- Distribute the GCS database case to all students.
- Assign a deadline for the groups to submit an initial design ERD with written explanations of the ERD components and features. This deadline should be two weeks from the assignment date. (While the groups are working on the design phase, students will be learning to use SQL to generate information.) The initial ERD must include:
  - All the main entities with all primary/foreign keys clearly labeled.
  - The identification of all relevant dependent attributes.
  - For each table, the identification of all possible required indexes.
- Meet with each group and evaluate each design, paying close attention to:
  - The propagation of primary/foreign keys and how surrogate keys would be useful to simplify the design.
  - The use of indexes to minimize the occurrence of duplicate entries.
- By this time, students should be familiar with SQL. Ask questions about how a query would be written to generate information. You can use the sample queries provided in the GCSdata-sol.mdb teacher solution file. (This database is located on your Instructor's CD.)
Please note that there are two database files available:
- The GCSdata.mdb database is located in the Student subfolder on the Instructor's CD. This MS Access database contains the sample CUSTOMER, EMPLOYEE, REGION, and SKILL tables. You can either distribute this file to your students by copying it to a common drive in your lab or you can ask your students to download this file from the Course Technology website for this book.
- The GCSdata-sol.mdb database is located in the Teacher subfolder on the Instructor's CD. This MS Access database contains the complete set of populated tables. In addition, the solution database contains some sample queries. You can use the sample queries as the basis for the second part of this case, which may be used to complement the SQL coverage in Chapters 7 and 8.
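The trade-off named in the third objective (a redundant foreign key versus multiple joins) can be shown to students with a small runnable sketch. The one below uses Python's built-in sqlite3 module as a stand-in for MS Access. The table layout is a simplified assumption rather than the solution design: the emp_name, wl_hours, and redundant prj_id columns are hypothetical, the task-to-assignment pairings are invented for illustration, and project 2 with its rows is made up so that the grouping has more than one result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE project (prj_id INTEGER PRIMARY KEY, prj_description TEXT);
CREATE TABLE task    (task_id INTEGER PRIMARY KEY,
                      prj_id INTEGER REFERENCES project(prj_id),
                      task_descript TEXT);
CREATE TABLE assign  (asn_id INTEGER PRIMARY KEY,
                      task_id INTEGER REFERENCES task(task_id),
                      emp_name TEXT);
-- worklog.prj_id is the redundant foreign key (hypothetical column):
-- the project can already be reached through assign and task.
CREATE TABLE worklog (wl_id INTEGER PRIMARY KEY,
                      asn_id INTEGER REFERENCES assign(asn_id),
                      prj_id INTEGER REFERENCES project(prj_id),
                      wl_date TEXT,
                      wl_hours INTEGER);

INSERT INTO project VALUES (1, 'Sample project 1'), (2, 'Sample project 2');
INSERT INTO task VALUES (1, 1, 'Initial Interview'),
                        (2, 1, 'System Design'),
                        (3, 2, 'Initial Interview');
INSERT INTO assign VALUES (102, 1, 'Burklow S.'),
                          (105, 2, 'Burklow S.'),
                          (106, 2, 'Bush E.'),
                          (201, 3, 'Invented Employee');
INSERT INTO worklog (asn_id, prj_id, wl_date, wl_hours) VALUES
    (102, 1, '2010-03-01', 4),
    (102, 1, '2010-03-08', 24),
    (105, 1, '2010-03-15', 40),
    (106, 1, '2010-03-15', 40),
    (201, 2, '2010-03-15', 8);
""")

# Hours per project by walking the full chain: worklog -> assign -> task.
hours_via_joins = conn.execute("""
    SELECT t.prj_id, SUM(w.wl_hours)
    FROM worklog AS w
    JOIN assign AS a ON a.asn_id = w.asn_id
    JOIN task   AS t ON t.task_id = a.task_id
    GROUP BY t.prj_id
    ORDER BY t.prj_id
""").fetchall()

# Same question answered from the redundant foreign key, with no joins.
hours_via_redundant_fk = conn.execute("""
    SELECT prj_id, SUM(wl_hours)
    FROM worklog
    GROUP BY prj_id
    ORDER BY prj_id
""").fetchall()

print(hours_via_joins)
print(hours_via_redundant_fk)
```

Both queries return the same totals only as long as the redundant prj_id agrees with the project reached through the join path. That consistency requirement is the cost of the shortcut and makes a useful discussion point when evaluating each group's design.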
Figure P5-10a shows the sample tables in the GCSdata.mdb student database.
To help your students understand the ERD, use Table P5.10 to describe the main tables and the main indexes that are appropriate for this design implementation.
Table                Primary key             Unique index
Region               region_id (surrogate)   unique(region_name)
Employee             emp_id (surrogate)
Skill                skill_id (surrogate)    unique(skill_description)
EmpSkill
Project              prj_id (surrogate)      unique(cus_id, prj_description)
                     task_id (surrogate)     unique(prj_id, task_descript)
TS (task schedule)   ts_id (surrogate)       unique(task_id, skill_id)
Assign               asn_id (surrogate)
Worklog              wl_id (surrogate)       unique(asn_id, wl_date)
Bill                 bill_id (surrogate)

The unique index on task_id and skill_id prevents duplicate listings for a single skill within a single task for a single project. The unique index on ps_id, emp_id, and ts_id ensures that an employee cannot be assigned twice to perform the same skill on the same task for a given project. The unique index on asn_id and wl_date ensures that no duplicate work log entries exist (for an employee) on a given date.
It is important to point out to your students that the surrogate primary keys are usually not shown in the graphical user interfaces that are available to the end users. The only function of the surrogate primary key is to provide a single-attribute identifier for each row in the table. The completed ERD for the GCS database is shown in Figure P5-9C.
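Because a surrogate key says nothing about the row it identifies, the unique index on the candidate key is what actually blocks duplicate entries. A minimal sketch of that idea for the Worklog row of Table P5.10 follows, again using Python's sqlite3 module as a stand-in for MS Access. The wl_hours column and the index name are assumptions added for illustration; wl_id, asn_id, and wl_date come from the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE worklog (
    wl_id    INTEGER PRIMARY KEY,  -- surrogate key, generated by the DBMS
    asn_id   INTEGER NOT NULL,
    wl_date  TEXT    NOT NULL,
    wl_hours INTEGER NOT NULL      -- hypothetical descriptive attribute
);
-- Unique index on the candidate key (asn_id, wl_date).
CREATE UNIQUE INDEX worklog_asn_date ON worklog (asn_id, wl_date);
""")

insert_sql = "INSERT INTO worklog (asn_id, wl_date, wl_hours) VALUES (?, ?, ?)"
entry = (102, "2010-03-01", 4)

conn.execute(insert_sql, entry)  # first entry is accepted

# The same assignment and date again would receive a fresh wl_id, so the
# primary key alone cannot stop it; the unique index raises an error instead.
try:
    conn.execute(insert_sql, entry)
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM worklog").fetchone()[0]
print(duplicate_rejected, row_count)
```

Dropping the CREATE UNIQUE INDEX statement and rerunning the sketch lets the duplicate through with a new wl_id, which makes the purpose of the index easy to demonstrate in class.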