Design is concerned mainly with base relations. What are the criteria for "good" base relations?
- 2NF (Second Normal Form) - 3NF (Third Normal Form) - BCNF (Boyce-Codd Normal Form)
Informal Guidelines
Semantics of the attributes. Reducing the redundant values in tuples. Reducing the null values in tuples. Disallowing the possibility of generating spurious tuples.
Bottom Line: Design a schema that can be explained easily relation by relation. The semantics of attributes should be easy to interpret.
Mixing attributes of multiple entities may cause problems. Information is stored redundantly, wasting storage. Problems with update anomalies:
- Insertion anomalies - Deletion anomalies - Modification anomalies
Update Anomaly: Changing the name of project number P1 from Billing to CustomerAccounting may cause this update to be made for all 100 employees working on project P1.
Insert Anomaly: Cannot insert a project unless an employee is assigned to it.
Delete Anomaly: When a project is deleted, it will result in deleting all the employees who work on that project. Alternately, if an employee is the sole employee on a project, deleting that employee would result in deleting the corresponding project.
Spurious Tuples
Bad designs for a relational database may result in erroneous results for certain JOIN operations. The "lossless join" property is used to guarantee meaningful results for join operations.
GUIDELINE 4: The relations should be designed to satisfy the lossless join condition. No spurious tuples should be generated by doing a natural-join of any relations.
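To make the spurious-tuple problem concrete, here is a minimal Python sketch. The relation WORKS_ON(ssn, pnumber, plocation) and its two rows are hypothetical data made up for illustration, and natural_join and project are small helper functions written here, not part of any library. Splitting the relation on plocation, which is not a key of either piece, produces tuples on rejoin that were never in the original.

```python
def project(relation, attrs):
    """Project a relation (list of dicts) onto attrs, dropping duplicates."""
    out = []
    for t in relation:
        row = {a: t[a] for a in attrs}
        if row not in out:
            out.append(row)
    return out


def natural_join(r, s):
    """Natural join: combine tuples that agree on all shared attributes."""
    out = []
    for t in r:
        for u in s:
            common = set(t) & set(u)
            if all(t[a] == u[a] for a in common):
                merged = {**t, **u}
                if merged not in out:
                    out.append(merged)
    return out


# Hypothetical instance: two employees on two projects at the same location.
works_on = [
    {"ssn": "111", "pnumber": "P1", "plocation": "Houston"},
    {"ssn": "222", "pnumber": "P2", "plocation": "Houston"},
]

# Lossy decomposition: the shared attribute plocation is a key of neither piece.
r1 = project(works_on, ["ssn", "plocation"])
r2 = project(works_on, ["pnumber", "plocation"])
rejoined = natural_join(r1, r2)
spurious = [t for t in rejoined if t not in works_on]

# Decomposition that shares pnumber, which determines plocation in this instance.
g1 = project(works_on, ["ssn", "pnumber"])
g2 = project(works_on, ["pnumber", "plocation"])
rejoined_ok = natural_join(g1, g2)

print(len(rejoined), len(spurious), len(rejoined_ok))
```

In this instance the first decomposition rejoins to four tuples, two of them spurious (employee 111 on P2 and employee 222 on P1), while the second rejoins to exactly the original two tuples.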
FD Constraints
An FD is a property of the attributes in the schema R. The constraint must hold on every relation instance r(R). If K is a key of R, then K functionally determines all attributes in R (since we never have two distinct tuples with t1[K]=t2[K]).
Definitions: Prime attribute - an attribute that is a member of the primary key K. Full functional dependency - an FD Y -> Z where removal of any attribute from Y means the FD does not hold any more.
Examples: - {SSN, PNUMBER} -> HOURS is a full FD since neither SSN -> HOURS nor PNUMBER -> HOURS hold - {SSN, PNUMBER} -> ENAME is not a full FD (it is called a partial dependency ) since SSN -> ENAME also holds
A relation schema R is in second normal form (2NF) if every non-prime attribute A in R is fully functionally dependent on the primary key. R can be decomposed into 2NF relations via the process of 2NF normalization.
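The 2NF condition can be checked mechanically. The following Python sketch, under the simplifying assumption that the relation has a single key (the primary key), lists every non-prime attribute that is determined by a proper subset of that key. The function names closure and partial_dependencies are illustrative, and the FDs are the ones from the examples above ({SSN, PNUMBER} -> HOURS and SSN -> ENAME).

```python
from itertools import combinations


def closure(attrs, fds):
    """Attributes functionally determined by attrs under fds (list of (lhs, rhs) sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def partial_dependencies(key, fds, attributes):
    """Return (subset_of_key, attribute) pairs that violate 2NF.

    Assumes `key` is the only candidate key, so every attribute
    outside it is treated as non-prime.
    """
    key = set(key)
    non_prime = set(attributes) - key
    found = []
    for size in range(1, len(key)):  # proper, non-empty subsets only
        for subset in combinations(sorted(key), size):
            for attr in sorted(closure(subset, fds) & non_prime):
                found.append((set(subset), attr))
    return found


attributes = {"SSN", "PNUMBER", "HOURS", "ENAME"}
key = {"SSN", "PNUMBER"}
fds = [
    ({"SSN", "PNUMBER"}, {"HOURS"}),
    ({"SSN"}, {"ENAME"}),
]

violations = partial_dependencies(key, fds, attributes)
print(violations)
```

Here the only violation reported is that ENAME depends on SSN alone, so a relation holding these four attributes is not in 2NF; HOURS depends on the full key and is not flagged.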
EMPLOYEE DEPARTMENT
DEPARTMENT
LOCATION
COURSE: Cosc 250, Cosc 250, Cosc 250, Cosc 250, Cosc 480
COURSE NAME: Computer Science for Business, Computer Science for Business, Computer Science for Business, Computer Science for Business, Programming Languages
JOB CODE: 1, 1, 2, 3
JOB DESCRIPTION: Write computer code. write computer code. Perform cost analysis. Design graphics.
SUPPLIER (supplier_no, status, city)
FDs: supplier_no -> status, supplier_no -> city, city -> status
FD: supplier_no -> supplier_name
supplier_no  supplier_name  part_no  quantity
S1           Bill           P1       100
S1           Bill           P2       200
S1           Bill           P3       250
S2           Jones          P1       300
Anomalies:
INSERT: We cannot record the name of a supplier until that supplier supplies at least one part.
DELETE: If a supplier temporarily stops supplying and we delete the last row for that supplier, we lose the supplier's name.
UPDATE: If a supplier changes name, that change will have to be made to multiple rows (wasting resources and risking loss of consistency).
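One way to see how decomposition removes these anomalies is to split the table along the FD supplier_no -> supplier_name. The Python sketch below uses the four rows above; the two-relation split, the assumed key (supplier_no, part_no), and the names "William" and "Smith" are illustrative assumptions, not taken from the source.

```python
# Rows of the original single table: (supplier_no, supplier_name, part_no, quantity)
rows = [
    ("S1", "Bill", "P1", 100),
    ("S1", "Bill", "P2", 200),
    ("S1", "Bill", "P3", 250),
    ("S2", "Jones", "P1", 300),
]

# Assumed decomposition:
#   suppliers(supplier_no -> supplier_name)
#   supplies(supplier_no, part_no, quantity)
suppliers = {s_no: name for s_no, name, _, _ in rows}
supplies = [(s_no, p_no, qty) for s_no, _, p_no, qty in rows]

# UPDATE: a name change now touches exactly one entry (hypothetical new name).
suppliers["S1"] = "William"

# INSERT: a supplier can be recorded before supplying any part (hypothetical supplier).
suppliers["S3"] = "Smith"

# DELETE: removing S2's last supply row no longer loses the supplier's name.
supplies = [row for row in supplies if row[0] != "S2"]

# The original shape can still be recovered by joining on supplier_no.
rejoined = [(s_no, suppliers[s_no], p_no, qty) for s_no, p_no, qty in supplies]
print(rejoined)
print(suppliers["S2"])
```

After these three operations the join shows the new name on all of S1's rows, and S2's name is still available even though S2 currently supplies nothing.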
There exist relations that are in 3NF but not in BCNF. The goal is to have each relation in BCNF (or 3NF).
INSTRUCTOR
COURSE
IR3. (Transitivity) If X -> Y and Y -> Z, then X -> Z
IR1 (reflexivity), IR2 (augmentation), and IR3 (transitivity) form a sound and complete set of inference rules.
IR6. (Pseudotransitivity) If X -> Y and WY -> Z, then WX -> Z
Example of Closure of FD
Find the closure of the following set of FDs: A -> B, A -> C, CG -> H, CG -> I, B -> H
A -> H, as we saw by the transitivity rule (A -> B and B -> H).
CG -> HI, by the union rule (CG -> H and CG -> I).
AG -> I, by several steps: Note that A -> C holds. Then AG -> CG, by the augmentation rule. Now by transitivity (AG -> CG and CG -> I), AG -> I.
Closure of FD
Computing the closure of a set of FDs can be expensive. (Size of closure is exponential in # attrs!) Typically, we just want to check if a given FD X -> Y is in the closure of a set of FDs F. An efficient check:
Compute attribute closure of X (denoted by X+) wrt F:
X+ = set of all attributes A such that X -> A is in F+. There is a linear time algorithm to compute this.
Check if Y is in X+.
Does F = {A -> B, B -> C, CD -> E} imply A -> E?
i.e., is A -> E in the closure F+? Equivalently, is E in A+?
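The question can be answered by computing A+ directly. The Python sketch below uses a simple fixed-point loop, which is easy to read but is not the linear-time algorithm mentioned above. The names attribute_closure and implies are illustrative, and each attribute is assumed to be a single character so that a string like "CD" stands for the set {C, D}.

```python
def attribute_closure(attrs, fds):
    """Compute X+ for attrs under fds, where fds is a list of (lhs, rhs) strings.

    Simple fixed-point iteration: keep applying FDs whose left side is
    already covered until nothing new is added.
    """
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


def implies(fds, lhs, rhs):
    """True if lhs -> rhs is in F+, i.e. rhs is contained in lhs+."""
    return set(rhs) <= attribute_closure(lhs, fds)


# F = {A -> B, B -> C, CD -> E}
F = [("A", "B"), ("B", "C"), ("CD", "E")]

a_plus = attribute_closure("A", F)
print(sorted(a_plus))
print(implies(F, "A", "E"))
print(implies(F, "AD", "E"))
```

A+ comes out as {A, B, C}: CD -> E never fires because D is not reachable from A. So E is not in A+ and F does not imply A -> E, whereas AD -> E does follow.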