PhysicalDesign1 PDF
Physical Database Design (Defined): Process of producing a description of the implementation of the database
on secondary storage; it describes the base relations, file organizations, and indexes used to achieve efficient
access to the data, and any associated integrity constraints and security measures.
What does that mean for us? We will describe the plan for how to build the tables, including appropriate data
types, field sizes, attribute domains, and indexes. The plan should have enough detail that if someone else were
to use the plan to build a database, the database they build is the same as the one you are intending to create.
The conceptual design and logical design were independent of physical considerations. Now, we not only know
that we want a relational model, we have selected a database management system (DBMS) such as Access or
Oracle, and we focus on those physical considerations.
Logical vs. Physical Design: Logical database design is concerned with what to store; physical database
design is concerned with how to store it.
Although we have discussed user requirements, conceptual design, logical design, and physical design as if they
are sequential activities, they do not occur in isolation. In reality there is often feedback between physical
design, conceptual/logical design, and application design, all of which are aimed at meeting the user
requirements. For example, decisions made during physical design to improve performance may include
merging tables together, which in turn should be reflected in the logical model and will have an effect on the
application design.
Meeting the needs of the users is the gold standard against which we measure our success in creating a
database.
Underlying Concepts
Because physical design is related to how data are physically stored, we need to consider a few underlying
concepts about physical storage. One goal of physical design is optimal performance and storage space
utilization. Physical design includes data structures and file organization, keeping in mind that the database
software will communicate with your computer's operating system. Typical concerns include:
Storage allocations for data and indexes
Record descriptions and stored sizes of the actual data
Record placement
Data compression, encryption
* Not really design; this is done after the system is built. But typically a PLAN for this is developed.
Now we'll add additional information. For each attribute, we will specify:
domain: consisting of data type, length, and constraints on the domain (we'll talk more about constraints
soon)
whether the attribute can hold nulls
default value for the attribute (optional)
any derived attributes and how they should be computed
We will create a data dictionary for the final project. More about this later in the module, but a large
component of the data dictionary is the table design. I like to organize this into Word or Excel tables, as
shown below for the Demog table in the SNDB, built in MS Access:
Demog Table
Attribute Name | Description         | Data Type | Data Length | Domain                             | Allow Null?
pt_id          | Patient identifier. | Text      | 4           | Any positive integer               | No (PK)
name           | Patient name        | Text      | 20          | Formatted as last name, first name | Yes
zip            |                     | Text      | 5           | 99999                              | Yes
Gender         |                     | Text      | 1           | Coded in the listgender table      | Yes
Race           | Patient race        | Text      | 1           | Coded.                             | Yes
Primary Key: pt_id
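To see how a data dictionary entry turns into table rules, here is a minimal sketch of the Demog design. It assumes SQLite (through Python's built-in sqlite3 module) as a stand-in for Access or Oracle, so the syntax is not what you would type in either product. The listgender codes M and F and the helper names build_demog and try_insert are made up for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for Access/Oracle. SQLite does not enforce
# declared text lengths, so each length from the table is written as a CHECK.
DEMOG_DDL = """
CREATE TABLE listgender (
    code    TEXT NOT NULL PRIMARY KEY,
    meaning TEXT NOT NULL
);
CREATE TABLE demog (
    pt_id  TEXT NOT NULL PRIMARY KEY            -- length 4, any positive integer
           CHECK (length(pt_id) <= 4
                  AND pt_id NOT GLOB '*[^0-9]*'
                  AND CAST(pt_id AS INTEGER) > 0),
    name   TEXT CHECK (length(name) <= 20),     -- last name, first name
    zip    TEXT CHECK (zip GLOB '[0-9][0-9][0-9][0-9][0-9]'),  -- 99999
    gender TEXT REFERENCES listgender (code),   -- coded in listgender
    race   TEXT CHECK (length(race) <= 1)       -- coded
);
"""


def build_demog():
    """Create an in-memory database holding the two tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce FKs
    conn.executescript(DEMOG_DDL)
    # Hypothetical gender codes, only so the lookup has something in it.
    conn.executemany("INSERT INTO listgender VALUES (?, ?)",
                     [("M", "Male"), ("F", "Female")])
    return conn


def try_insert(conn, row):
    """Return True if the row satisfies every rule, False if it is rejected."""
    try:
        conn.execute(
            "INSERT INTO demog (pt_id, name, zip, gender, race) "
            "VALUES (?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    db = build_demog()
    print(try_insert(db, ("1001", "Smith, Jane", "02115", "F", "1")))
    print(try_insert(db, ("1002", "Doe, John", "2115", "M", "1")))  # bad zip
```

Columns that allow nulls (everything except pt_id) accept a missing value, because a check on a null value is not treated as a failure.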
Access uses number types, though: Integer or Long Integer for whole numbers, and Single or Double for
numbers with a decimal component. Access also has a Currency data type, a fixed-point number stored
with four decimal places (displayed with two by default), which avoids rounding errors when computing with money.
Access has an AutoNumber data type, which is a LONG INTEGER that is automatically generated by
the DB. You can have the system increment (count up sequentially) or generate a pseudo-random
number. I say pseudo-random because it's really based on a computation that is applied to the time
on the system clock.
Oracle is trickier for auto-generated numbers. The traditional approach is to create a separate counter
(in Oracle, a sequence object) and retrieve the number from it with a trigger that pulls the number and
then increments the counter (triggers are code that run automatically). Oracle 12c and later can also
generate the number directly with an identity column.
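The counter-plus-trigger idea can be sketched as follows. This is an illustration only: it assumes SQLite (through Python's sqlite3 module) rather than Oracle, and the names id_counter, patient, patient_assign_id, and insert_patient are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for Oracle; table and trigger names are made up.
COUNTER_DDL = """
CREATE TABLE id_counter (next_id INTEGER NOT NULL);
INSERT INTO id_counter (next_id) VALUES (1);

CREATE TABLE patient (
    pt_id INTEGER UNIQUE,
    name  TEXT
);

-- Runs automatically after each insert that did not supply a pt_id:
-- pull the current number, then increment the counter.
CREATE TRIGGER patient_assign_id
AFTER INSERT ON patient
WHEN NEW.pt_id IS NULL
BEGIN
    UPDATE patient
       SET pt_id = (SELECT next_id FROM id_counter)
     WHERE rowid = NEW.rowid;
    UPDATE id_counter SET next_id = next_id + 1;
END;
"""


def build():
    conn = sqlite3.connect(":memory:")
    conn.executescript(COUNTER_DDL)
    return conn


def insert_patient(conn, name):
    """Insert a patient without an id and return the id the trigger assigned."""
    cur = conn.execute("INSERT INTO patient (name) VALUES (?)", (name,))
    row = conn.execute("SELECT pt_id FROM patient WHERE rowid = ?",
                       (cur.lastrowid,)).fetchone()
    return row[0]


if __name__ == "__main__":
    db = build()
    print(insert_patient(db, "Smith, Jane"))
    print(insert_patient(db, "Doe, John"))
```

The caller never supplies the identifier; the trigger fills it in, which is the behavior AutoNumber gives you for free in Access.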
For a nice description of Oracle data types, including the size associated with different data types, see
http://www.ss64.com/orasyntax/datatypes.html
For a nice description of MS Access data types, see
http://www.databasedev.co.uk/fields_datatypes.html
Not Null: The field is never allowed to be empty. Data must be entered at the time the row is created.
This is not the same as data that are needed eventually--requiring that data might be a policy measure.
Keep in mind that if you specify not null as a constraint, you cannot save the row of data if that
column is blank.
Unique: No duplicate data when you look across rows. A unique constraint can be used for candidate
(alternate) keys (Access calls this the "no duplicates" property).
Primary Key: Adding the primary key constraint automatically includes not null constraint + unique
constraint (+ an index).
Foreign Key: Using the foreign key constraint enforces referential integrity. Every foreign key must
match a primary key or a unique constraint on another table. The sequence of data entry is affected here:
you must enter data into the main table before it can be used as a foreign key. (Access uses
properties within the relationships window to enforce referential integrity. Oracle uses a REFERENCES
clause within the table constraints.)
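The four constraints above can be tried out in a small sketch. It assumes SQLite through Python's sqlite3 module instead of Access or Oracle. The demog and adm_dx table names come from the SNDB examples in these notes, but the mrn column (an alternate key) and the helper names are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for Access/Oracle; mrn is a made-up alternate key.
CONSTRAINT_DDL = """
CREATE TABLE demog (
    pt_id TEXT NOT NULL PRIMARY KEY,   -- primary key: not null + unique
    name  TEXT,
    mrn   TEXT UNIQUE                  -- unique: no duplicates across rows
);
CREATE TABLE adm_dx (
    pt_id     TEXT NOT NULL REFERENCES demog (pt_id),  -- foreign key
    diagnosis TEXT NOT NULL,                           -- not null
    PRIMARY KEY (pt_id, diagnosis)
);
"""


def build():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only if asked
    conn.executescript(CONSTRAINT_DDL)
    return conn


def attempt(conn, sql, params=()):
    """Run one statement; return 'ok' or the text of the constraint error."""
    try:
        conn.execute(sql, params)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return str(exc)


if __name__ == "__main__":
    db = build()
    new_dx = "INSERT INTO adm_dx (pt_id, diagnosis) VALUES (?, ?)"
    new_pt = "INSERT INTO demog (pt_id, name, mrn) VALUES (?, ?, ?)"
    # Child row first: rejected, because the patient does not exist yet.
    print(attempt(db, new_dx, ("1001", "asthma")))
    # Parent row, then child row: both accepted.
    print(attempt(db, new_pt, ("1001", "Smith, Jane", "A1")))
    print(attempt(db, new_dx, ("1001", "asthma")))
    # Blank diagnosis, duplicate mrn, duplicate pt_id: all rejected.
    print(attempt(db, new_dx, ("1001", None)))
    print(attempt(db, new_pt, ("1002", "Doe, John", "A1")))
    print(attempt(db, new_pt, ("1001", "Roe, Ann", "B2")))
```

The first two inserts show the data entry sequence issue: the adm_dx row is refused until the matching demog row exists.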
Oracle example: when you create the table, you could list the check constraint in the create table statement:
CREATE TABLE vitals ([list of attributes and data types],
SBP NUMBER(3) CHECK (SBP BETWEEN 0 AND 350),
In Access you would create the SBP column in table design view, then in the properties for this column, create
the validation rule (for example, Between 0 And 350).
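Some DBMS provide more facilities than others for defining enterprise constraints. An example of a complex
constraint, written in standard SQL, that would not work in Access (note that Oracle does not accept a
subquery inside a CHECK constraint either, so there a rule like this is typically enforced with a trigger):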
CONSTRAINT StaffNotHandlingTooMuch
CHECK (NOT EXISTS (SELECT staffNo
FROM PropertyForRent
GROUP BY staffNo
HAVING COUNT(*) > 100))
Triggers: These are stored procedures (code) that fire automatically when data are manipulated, that is, when
there is an insert, update, or delete statement. Triggers can call (use) other code modules. Constraints are
faster than triggers, but triggers can express more complex rules.
In Oracle, triggers are created within PL/SQL (the internal Oracle programming language).
Access does not allow triggers on tables. If you use forms, you can emulate the action of triggers by writing
code inside the form.
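As a rough illustration of a trigger enforcing a rule that is too complex for a simple column check, the sketch below applies the "no staff member handles more than 100 properties" rule. It assumes SQLite through Python's sqlite3 module, not Oracle PL/SQL, so the trigger syntax differs from what Oracle expects. The propertyNo column and the assign helper are hypothetical, and only inserts are covered; an update that changes staffNo would need a second trigger.

```python
import sqlite3

# Assumption: SQLite stands in for Oracle; propertyNo is a made-up key column.
TRIGGER_DDL = """
CREATE TABLE PropertyForRent (
    propertyNo TEXT NOT NULL PRIMARY KEY,
    staffNo    TEXT
);

-- Fires before every insert and aborts it if the staff member
-- already has 100 properties.
CREATE TRIGGER StaffNotHandlingTooMuch
BEFORE INSERT ON PropertyForRent
BEGIN
    SELECT RAISE(ABORT, 'staff member already handles 100 properties')
     WHERE (SELECT COUNT(*)
              FROM PropertyForRent
             WHERE staffNo = NEW.staffNo) >= 100;
END;
"""


def build():
    conn = sqlite3.connect(":memory:")
    conn.executescript(TRIGGER_DDL)
    return conn


def assign(conn, property_no, staff_no):
    """Try to give a property to a staff member; False if the trigger refuses."""
    try:
        conn.execute(
            "INSERT INTO PropertyForRent (propertyNo, staffNo) VALUES (?, ?)",
            (property_no, staff_no))
        return True
    except sqlite3.DatabaseError:
        return False


if __name__ == "__main__":
    db = build()
    accepted = sum(assign(db, f"P{i}", "S1") for i in range(101))
    print(accepted, "of 101 assignments accepted for S1")
```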
More about domain constraints
Domain constraints are limits on the data values. What business rules can we implement in some physical
manner? For example, suppose our business rule says that a valid SBP must be between 0 and 350. We can do
this with a check constraint (Oracle) or validation rule (Access) as discussed above.
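The SBP rule can be tried out with a small sketch. It assumes SQLite through Python's sqlite3 module as a stand-in for Oracle or Access; the pt_id column and the helper name sbp_accepted are added only for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for Oracle/Access; only the SBP rule is modeled.
VITALS_DDL = """
CREATE TABLE vitals (
    pt_id TEXT NOT NULL,
    sbp   INTEGER CHECK (sbp BETWEEN 0 AND 350)
);
"""


def build():
    conn = sqlite3.connect(":memory:")
    conn.executescript(VITALS_DDL)
    return conn


def sbp_accepted(conn, sbp):
    """Return True if the database stores the reading, False if it rejects it."""
    try:
        conn.execute("INSERT INTO vitals (pt_id, sbp) VALUES (?, ?)",
                     ("1001", sbp))
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    db = build()
    for reading in (120, 350, 351, -1, None):
        print(reading, sbp_accepted(db, reading))
```

A missing reading (null) is accepted here, because a check constraint only rejects values it can evaluate as false. If SBP must always be present, that is a separate not null decision.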
You might enforce business rules in the user interface instead of at the table level: e.g., use a picklist of values
and restrict users to only choosing something from the picklist. This would work nicely when the list of choices
is known. For very small lists, you might consider coding the data numerically, and using radio buttons, like:
Gender:
Male Female
*Radio buttons can only store NUMERIC data. Each choice is given a number code, and the user is allowed to
choose only ONE of the choices, like the channel selection buttons on a car radio.
Check boxes only store true/false or yes/no values (sometimes called Boolean data). In Access there is a yes/no
data type. Yes (or true) is stored as -1 and no (or false) is stored as 0. Most versions of Oracle do not allow
a Boolean data type for a column; you would use a number field of size 1 and store the values, but you'd have
to know what the numbers mean. A few DBMS allow a three-valued Boolean-like field, equating to yes, no, and unknown.
So radio buttons are an option for "pick ONE of the following." Check boxes typically imply that any or all of
the boxes may be chosen; you store a yes or no for each choice.
See how the application interface design might influence some of your physical design decisions?
Some business rules can only be enforced through policy/procedure. Consider the rule "FirstName must be a
valid first name." This must be enforced through policy/procedure. With physical constraints, we can only
specify that this is text, and the length of the text, but we can't determine if the text is a valid first name.
Derived Data
Examine the user requirements, ERD, logical data model and data dictionary, and produce a list of derived
attributes. A derived attribute can be stored in the database or calculated every time it is needed. Document what
is to be done! Your choice may be based on the space to store the derived data and the effort to keep it
consistent with the data from which it is derived, versus the cost (query time) to calculate the value each time
it is required. The less expensive option may be chosen, subject to performance constraints.
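The trade-off between storing and calculating a derived attribute can be seen in a small sketch. It assumes SQLite through Python's sqlite3 module, and the derived attribute itself is hypothetical: pulse pressure, computed as SBP minus a made-up DBP column. One copy is stored in the table and one is calculated by a view each time it is requested.

```python
import sqlite3

# Assumption: SQLite stands in for Access/Oracle. dbp and pulse pressure are
# hypothetical; they are not part of the SNDB design described in these notes.
DERIVED_DDL = """
CREATE TABLE vitals (
    pt_id                 TEXT NOT NULL,
    sbp                   INTEGER,
    dbp                   INTEGER,
    pulse_pressure_stored INTEGER      -- option 1: store the derived value
);

-- Option 2: calculate the derived value every time it is needed.
CREATE VIEW vitals_derived AS
SELECT pt_id, sbp, dbp, sbp - dbp AS pulse_pressure
  FROM vitals;
"""


def stored_versus_calculated():
    """Return (stored, calculated) pulse pressure after SBP is corrected."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(DERIVED_DDL)
    conn.execute("INSERT INTO vitals VALUES (?, ?, ?, ?)",
                 ("1001", 120, 80, 120 - 80))
    # Someone corrects the SBP but forgets to refresh the stored value.
    conn.execute("UPDATE vitals SET sbp = 140 WHERE pt_id = ?", ("1001",))
    stored = conn.execute(
        "SELECT pulse_pressure_stored FROM vitals WHERE pt_id = ?",
        ("1001",)).fetchone()[0]
    calculated = conn.execute(
        "SELECT pulse_pressure FROM vitals_derived WHERE pt_id = ?",
        ("1001",)).fetchone()[0]
    return stored, calculated


if __name__ == "__main__":
    print(stored_versus_calculated())
```

The stored copy is out of date after the correction, while the view is always current but does the subtraction on every query. That is the consistency effort versus query cost choice you need to document.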
Code tables list a code and the meaning of the code, like:
ShipCode | Shipper
1        | UPS
2        | FedEx
3        | USPS ground
In a main data table, you would store the CODE. Then you can look up the value.
The SNDB listGender table is another example.
When we build the interface for our DB, we can use code tables or look up tables (single column list of
choices) to populate list boxes and combo boxes. This can simplify data entry for your users: those interface
boxes can be set to display the meaning to the user, but store the code in the table.
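A code table and its lookup can be sketched like this. The example assumes SQLite through Python's sqlite3 module; the ShipCode and Shipper values come from the list above, while the table names shipper_codes and orders and the helper functions are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for Access/Oracle; the orders table is made up.
CODE_TABLE_DDL = """
CREATE TABLE shipper_codes (
    ShipCode INTEGER NOT NULL PRIMARY KEY,
    Shipper  TEXT NOT NULL
);
INSERT INTO shipper_codes (ShipCode, Shipper)
VALUES (1, 'UPS'), (2, 'FedEx'), (3, 'USPS ground');

-- The main data table stores only the CODE.
CREATE TABLE orders (
    order_id INTEGER NOT NULL PRIMARY KEY,
    ShipCode INTEGER REFERENCES shipper_codes (ShipCode)
);
"""


def build():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(CODE_TABLE_DDL)
    return conn


def shipper_choices(conn):
    """(code, meaning) pairs, as a combo box would need them."""
    return conn.execute(
        "SELECT ShipCode, Shipper FROM shipper_codes ORDER BY ShipCode"
    ).fetchall()


def shipper_for_order(conn, order_id):
    """Look up the meaning of the code stored on one order."""
    row = conn.execute(
        "SELECT s.Shipper FROM orders AS o "
        "JOIN shipper_codes AS s ON s.ShipCode = o.ShipCode "
        "WHERE o.order_id = ?", (order_id,)).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    db = build()
    db.execute("INSERT INTO orders (order_id, ShipCode) VALUES (?, ?)", (1, 2))
    print(shipper_choices(db))
    print(shipper_for_order(db, 1))
```

The interface would display the second item of each pair to the user and write the first item to the orders table.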
In your final project I ask you to create a code table for ONE column somewhere in the DB. Include that code
table in your data dictionary, and list it in the domain column of where it will be used. See the gender
description in the Demog table on the next page.
At this point, you can BEGIN your final project data dictionary. For each table, include
Table name
Attribute names
Description of the attribute if not apparent from the name
Data type: must be consistent with the DBMS you will use to build the database
Length of the field. Use the data type links above to see the choices for length
Indicate the Primary Key
If appropriate, also list alt key(s) and foreign key(s)
Derived attributes and how to compute
You can do this in Word, with a separate Word table for each DB table, like the example below, or in a similar
appropriate format.
Demog Table
Attribute Name | Description         | Data Type | Data Length | Domain                             | Allow Null?
pt_id          | Patient identifier. | Text      | 4           | Any positive integer               | No (PK)
name           | Patient name        | Text      | 20          | Formatted as last name, first name | Yes
zip            |                     | Text      | 5           | 99999                              | Yes
Gender         |                     | Text      | 1           | Coded in the listgender table      | Yes
Race           | Patient race        | Text      | 1           | Coded.                             | Yes
Primary Key: pt_id
Demog
Attribute Name | Description         | Data Type | Data Length | Domain                             | Allow Null? | Key/Index
pt_id          | Patient identifier. | Text      | 4           | Any positive integer               | No          | PK
name           | Patient name        | Text      | 20          | Formatted as last name, first name | Yes         |
zip            |                     | Text      | 5           | 99999                              | Yes         |
Gender         |                     | Text      | 1           | Coded in the listgender table      | Yes         |
Race           | Patient race        | Text      | 1           | Coded.                             | Yes         |

Adm_dx
Attribute Name | Description         | Data Type | Data Length | Domain              | Allow Null? | Key/Index
pt_id          |                     | Text      |             | matches demog table | No          | PK
diagnosis      | admitting diagnosis | Text      | 60          | medical diagnosis   | No          | PK