Baixe o app para aproveitar ainda mais
Prévia do material em texto
Slide 1INF1383 BD1 INF1383 -Bancos de Dados Prof. Sérgio Lifschitz DI PUC-Rio Eng. Computação, Sistemas de Informação e Ciência da Computação Projeto de BD e Formas Normais Alguns slides são baseados ou modificados dos originais dos livros Elmasri and Navathe, Fundamentals of Database Systems © Pearson Education, Inc. e Database System Concepts, McGraw Hill © Silberschatz, Korth and Sudarshan Slide 2INF1383 BD1 Relational Database Design � Finding database schemes with good properties � Example: � Database for information on suppliers, parts supplied, and shipments � Information consists of: � S#: supplier number � SNAME: supplier name � SCITY: supplier city � P#: part number � PNAME: part name � PCITY: city where part is stored � QTY: quantity of shipment Slide 3INF1383 BD1 Relational Database Design � One possible database schema � BAD [S#, SNAME, SCITY, P#, PNAME, PCITY, QTY] � Why BAD is bad: � redundancy � SCITY is determined by S# However, the same SCITY could appear many times with the same S# in this relation � Functional dependency: S# → SCITY � update anomalies � SCITY could be changed in one place but not in another, for the same S#, resulting in inconsistency Slide 4INF1383 BD1 Relational Database Design � Why BAD is bad: � insertion anomalies � cannot record supplier info unless that supplier supplies a part � deletion anomalies � by deleting parts we can lose supplier info � Solution � New schema: S[S#, SNAME, SCITY] Key: S# P[P#, PNAME, PCITY] Key: P# SP[S#, P#, QTY] Key: S# P# � No other functional dependencies besides key dependencies � Normal forms: “nice” forms for schemas Slide 5INF1383 BD1 Decomposing Relation Schemes � Let R be a relation scheme � A decomposition of R is a set ρ={R1, …, Rk} of relation schemes such that: att(R) = ∪∪∪∪ki = 1 att(Ri) � Example � {S, P, SP} is a decomposition of the relation schema BAD (see earlier example) Slide 6INF1383 BD1 Conditions for a “Reasonable” Decomposition � Do S, P, SP contain the same info as BAD? BAD = piS(BAD) >< piP(BAD) >< piSP(BAD) ? � Dependencies for BAD can be enforced by “local” dependencies on S, P, SP? � Let F be the set of FDs for BAD � F+ denotes the set of FDs implied by F � piX(F+) = {V→W ∈F+ | VW ⊂ X} Is piS(F+) ∪ piP(F+) ∪ piSP(F+) equivalent to F ? Slide 7INF1383 BD1 Functional-Dependency Theory � We now consider the formal theory that tells us which functional dependencies are implied logically by a given set of functional dependencies. � We then develop algorithms to generate lossless decompositions into BCNF and 3NF � We then develop algorithms to test if a decomposition is dependency-preserving Slide 8INF1383 BD1 8 Functional Dependencies � Functional dependency over R: � expression X → Y where X, Y ⊆ att(R) � A relation R satisfies X → Y iff whenever two tuples in R agree on X, they also agree on Y e.g. SCHEDULE THEATER TITLE Roxy Tropa Artplex Rambo Satisfies THEATER → TITLE SCHEDULE THEATER TITLE Roxy Tropa Artplex Rambo Artplex Platoon Violates THEATER → TITLE, satisfies TITLE → THEATER Slide 9INF1383 BD1 Lossless-join Decomposition � For the case of R = (R1, R2), we require that for all possible relations r on schema R r = ∏R1 (r ) ∏R2 (r ) � A decomposition of R into R1 and R2 is lossless join if and only if at least one of the following dependencies is in F+: � R1 ∩ R2 → R1 � R1 ∩ R2 → R2 Slide 10INF1383 BD1 Dependency Preservation � Let Fi be the set of dependencies F + that include only attributes in Ri. � A decomposition is dependency preserving, if (F1 ∪ F2 ∪ … ∪ Fn )+ = F + �If it is not, then checking updates for violation of functional dependencies may require computing joins, which is expensive. Slide 11INF1383 BD1 11 Technical Problem � Checking if X→A is implied by a set F of fds (F |= X→A) � Notation: F+ = {X→Y | F |= X→Y} � Needed to compute keys and check satisfaction of BCNF and other normal forms. � Solution: “closure” of a set of attributes wrt F: X+ � X+ = {A | F |= X→A} (all attributes determined by X) � So, X→A ∈ F+ iff A ∈ X+ Slide 12INF1383 BD1 12 Example of Closure Computation � R = ABCDEF � F = {A→C, BC→D AD→E} � X = AB � X(0) = AB � X(1) = ABC � X(2) = ABCD � X(3) = ABCDE � X(4) = X(3) � X+ = ABCDE � To check if X is key in R: X+ = R A o B o A o B o C o A o B o C o D o E o A o B o C o D o Slide 13INF1383 BD1 Computing the closure of a set of attributes wrt a set of fd’s � Let F be a set of fd’s over R, and x⊆R. The following computes the closure x+ of x wrt FL 1. X(0) ← X 2. X(i+1) ← X(i) ∪ {Z | V→Z ∈ F, V ⊆ X(i)} We get: X(0) ⊆ X(1) ⊆ … ⊆ X(i) ⊆ X(i+1) ⊆ … ⊆ att(R) Since att(R) is finite, there must exist a k≥0 such that X(k) = X(k+1) (and thus, X(j) = X(k) ∀j≥k). Then, X+ = X(k) Slide 14INF1383 BD1 14 Dependency Preserving Decompositions � Notation: � if F is a set of fd’s over R and X⊆R then piX(F+) = {V→W∈F+ | VW ⊆ X}. these are the fds that apply to the set of attributes X (“local” to X) � Definition: � Let ρ = (R1, …, Rk) be a decomposition for R and F a set of fd’s over R. Then ρ preserves F iff � In other words, all fds in F are implied by ∪ki=1piRi(F+) the set of local fds ∪ki=1piRi(F+) is equivalent to F Slide 15INF1383 BD1 15 Normal Forms � Terminology: � Let R be a relation schema and F a set of fd’s over R. � Key: X⊆ att(R) such that X→att(R) ∈ F+. � Minimal key: X ⊆ att(R) s.t. X→att(R) ∈ F+ and there is no Y ⊂≠ X such that Y→att(R) ∈ F+. � A ∈ att(R) is prime: A ∈ X where X is a minimal key � A is non-prime: A is not a member of any minimal key. Slide 16INF1383 BD1 16 Purpose of normal forms � Eliminate problems of redundancy and anomalies. � Normal forms: � first, second, third, Boyce-Codd � Boyce-Codd Normal Form (BCNF) � A relation scheme R is in BCNF wrt a set of fd’s F over R, iff whenever X→A ∈ F+ and A ∉ X, X is a key for R � Only nontrivial dependencies are those induced by keys Slide 17INF1383 BD1 17 Example � BAD(S#, P#, SNAME, PNAME, SCITY, PCITY, QTY) � not in BCNF wrt F = S#→SNAME SCITY P#→PNAME PCITY S# P#→QTY � S(S#, SCITY, SNAME) is in BCNF wrt S#→SNAME SCITY � P(P#, PCITY, PNAME) is in BCNF wrt P#→PNAME PCITY � SP(S# P# QTY) is in BCNF wrt S# P# → QTY Slide 18INF1383 BD1 Decomposition of a relation schema into BCNF relation schemas, with lossless join � Any relation scheme has a decomposition into BCNF relation schemes, with lossless join � not always dependency preserving � Example R = CITY ZIP ST F = CITY ST → ZIP, ZIP → CITY � R has no decomposition into BCNF schemas, which preserves F � (CITY ST → ZIP is never preserved) Slide 19INF1383 BD1 Algorithm to obtain a lossless join decomposition of a given R wrt F Start with ρ = {R} Apply recursively the following procedure: for each S in ρ not in BCNF, let X→A ∈ F+ be such that XA ⊆ S, A ∉ X, X is not a key of S replace S by S1 and S2, where S1 = XA S2 = S − A until all relation schemes are in BCNF Note lossless join: X → A∈ F+ X is a key for S1 S S1 S2 Slide 20INF1383 BD1 Note � a) � b) If ρ is a lossless join decomposition of R wrt F and ρ ⊆ ρ’ then ρ’ is a lossless join decomposition as well. R RiR1 Rk Si2Si1 Sim Lossless join wrt F Lossless join (for Ri) wrt piRi(F+) R Ri-1R1 RkSi2 … Si1 Sim … R2 … Lossless join wrt F Slide 21INF1383 BD1 Example R = C T H R S G C = course T = teacher H = hour R = room S = student G = grade F: C → T HR → C HT → R CS → G HS → R only key: HS Slide 22INF1383BD1 Example CTHRSG Key = HS, C→T CS →G, HR →C, HS →R, TH →R CSG key = CS CS →G CTHRS key = SH C→T TH→R HR →C HS →R CT key = C C→T CHRS key = SH CH →R HR →C HS →R CHR key = CH, CH →R HR HR →C CHS key = SH SH →C Slide 23INF1383 BD1 Remarks (drawbacks) � Decomposition is not unique � CHRS could be decomposed into CHR and CHS or CHR and RHS � The decomposition does not preserve TH→R � local fds: G = {CS→G C→T CH →R HR →C SH →C} which does not imply TH → R Slide 24INF1383 BD1 BCNF and Dependency Preservation � R = (J, K, L ) F = {JK → L L → K } Two candidate keys = JK and JL � R is not in BCNF � Any decomposition of R will fail to preserve JK → L This implies that testing for JK → L requires a join It is not always possible to get a BCNF decomposition that is dependency preserving Slide 25INF1383 BD1 25 Efficiency � Is there an efficient algorithm for BCNF decomposition? � Most likely NO: It is NP-complete to decide whether a relation scheme R is in BCNF wrt a set of fd’s. Any algorithm for BCNF is most likely exponential Slide 26INF1383 BD1 26 3NF � Problem with BCNF: � Not every relation schema can be decomposed into BCNF relation schemas which preserve the dependencies and have lossless join. � Third Normal Form � A relation scheme R is in Third Normal Form wrt a set F of fd’s over R, if whenever X→A holds in R and A ∉ X then either X is a key or A is prime Weaker than BCNF Slide 27INF1383 BD1 Third Normal Form: Motivation � There are some situations where � BCNF is not dependency preserving, and � efficient checking for FD violation on updates is important � Solution: define a weaker normal form, called Third Normal Form (3NF) � Allows some redundancy (with resultant problems; we will see examples later) � But functional dependencies can be checked on individual relations without computing a join. � There is always a lossless-join, dependency-preserving decomposition into 3NF. Slide 28INF1383 BD1 28 Computing a 3NF decomposition � Minimal covers � A set F of fd’s is minimal iff 1) X→Y ∈ F ⇒ Y is a single attribute 2) for no X→A ∈F is F−{X→A} equivalent to F (no fd can be removed from F) 3) if X→A ∈ F then there is no proper subset Z of X such that F−{X→A}∪{Z→A} is equivalent to F (attributes cannot be removed from fd’s in F) � Fact:. Every set of fd’s is equivalent to some minimal set of fds (called a minimal cover). Slide 29INF1383 BD1 Dependency preserving decompositions into Third Normal Form R a schema F minimal set of fd’s over R ρ = {XA1 … Am | X→Ai∈F}∪{B∈R | B does not occur in F} � Example � R = CTHRSG (see earlier example) � F = C→T CS →G HR →C HS →R (minimal) HT →R � Then ρ = {CT, HRC, HTR, CSG, HRS} Slide 30INF1383 BD1 3NF Decomposition Algorithm (Cont.) � 3NF decomposition algorithm ensures: � each relation schema Ri is in 3NF � decomposition is dependency preserving and lossless-join Slide 31INF1383 BD1 31 Formal Results � Theorem. ρ preserves F and each Ri in ρ is in 3NF wrt piRi(F+). � To obtain decomposition that also has lossless join: � add to ρ a set K where K is a minimal key for R …unless there already is a “piece” in the decomposition whose attributes form a key for R � Theorem ρ ∪ {K} is a 3NF decomposition which is dependency preserving and has lossless join. Slide 32INF1383 BD1 Extraneous Attributes � Consider a set F of FDs and the functional dependency α → β in F. � Attribute A is extraneous in α if A ∈ α and F logically implies (F – {α → β}) ∪ {(α – A) → β}. � Attribute A is extraneous in β if A ∈ β and the set of functional dependencies (F – {α → β}) ∪ {α →(β – A)} logically implies F. � Example: Given F = {A → C, AB → C } � B is extraneous in AB → C because {A → C, AB → C} logically implies A → C (i.e. the result of dropping B from AB → C). � Example: Given F = {A → C, AB → CD} � C is extraneous in AB → CD since AB → C can be inferred even after deleting C Slide 33INF1383 BD1 Testing if an Attribute is Extraneous � Consider a set F of functional dependencies and the functional dependency α → β in F. � To test if attribute A ∈ α is extraneous in α 1. compute ({α} – A)+ using the dependencies in F 2. check that ({α} – A)+ contains A; if it does, A is extraneous � To test if attribute A ∈ β is extraneous in β 1. compute α+ using only the dependencies in F’ = (F – {α → β}) ∪ {α →(β – A)}, 2. check that α+ contains A; if it does, A is extraneous Slide 34INF1383 BD1 Canonical (Minimal) Cover � Sets of functional dependencies may have redundant dependencies that can be inferred from the others � For example: A → C is redundant in: {A → B, B → C} � Parts of a functional dependency may be redundant � E.g.: on RHS: {A → B, B → C, A → CD} can be simplified to {A → B, B → C, A → D} � E.g.: on LHS: {A → B, B → C, AC → D} can be simplified to {A → B, B → C, A → D} � Intuitively, a canonical cover of F is a “minimal” set of functional dependencies equivalent to F, having no redundant dependencies or redundant parts of dependencies Slide 35INF1383 BD1 Canonical (minimal) Cover � A canonical cover for F is a set of dependencies Fc such that � F logically implies all dependencies in Fc, and � Fc logically implies all dependencies in F, and � No functional dependency in Fc contains an extraneous attribute, and � Each left side of functional dependency in Fc is unique. � To compute a canonical cover for F: repeat Use the union rule to replace any dependencies in F α1 → β1 and α1 → β2 with α1 → β1 β2 Find a functional dependency α → β with an extraneous attribute either in α or in β If an extraneous attribute is found, delete it from α → β until F does not change � Note: Union rule may become applicable after some extraneous attributes have been deleted, so it has to be re-applied Slide 36INF1383 BD1 Comparison of BCNF and 3NF � It is always possible to decompose a relation into a set of relations that are in 3NF such that: � the decomposition is lossless � the dependencies are preserved � It is always possible to decompose a relation into a set of relations that are in BCNF such that: � the decomposition is lossless � it may not be possible to preserve dependencies. Slide 37INF1383 BD1 Design Goals � Goal for a relational database design is: � BCNF. � Lossless join. � Dependency preservation. � If we cannot achieve this, we accept one of � Lack of dependency preservation � Redundancy due to use of 3NF � Interestingly, SQL does not provide a direct way of specifying functional dependencies other than superkeys. Can specify FDs using assertions, but they are expensive to test � Even if we had a dependency preserving decomposition, using SQL we would not be able to efficiently test a functional dependency whose left hand side is not a key. Slide 38INF1383 BD1 38 A (simplified) overview of schema design 1. Choose attributes in R 2. Specify dependencies F (tool: “Armstrong relations”) 3. Find a lossless-join, dependency preserving decomposition into Normal Form schemas � BCNF, if possible � 3NF, if not BCNF
Compartilhar