Question :
I am trying to solve this example, but for now I can’t. Can somebody show me, step by step how to decompose this to 2NF and then to 3NF?
R = {ABCDEF, ABC -> EF, B -> E, E -> D}
key of this relation is:
{ABC}
So now I try to split R into 2 relations:
R0 = {ABCEF, ABC -> EF}
R1 = {BED, B -> E, E -> D}
but its all wrong… i got other answers in my book. I dont even try to make 3NF out of it.
Answer :
The goal of 3NF (and BCNF) is to avoid situations when a set of attributes is functionally determined by something that is not a key (or, more formally, a superkey), which may lead to anomalies and redundancy.
Keeping that in mind, we can describe certain functional dependencies as either “bad” (if they prevent the relation from being in 3NF) or “good” (if they do not).
So your solution is wrong, because “your” R1 has the E->D
dependency and E is not R1’s key. Determining a key for a relation is another topic which, I suppose, is beyond the scope of the task you need to accomplish (we are told that ABC is a key). But if you are interested, you can look the method up using “attribute closure” keywords.
Anyway, the E->D
dependency is “bad”. But that is not the only problem. Another one is that your decomposition is not a “lossless-join” one. In other words, if we have some real world data, split the relation according to your proposition and try to join R0 and R1 back, the data will be corrupted. Data corruption is not what normalization is designed for.
We have to solve 2 problems:
- Decide which dependencies are “bad” (the definition of normal forms is the answer to this)
- Decide how to split relations to prevent data corruption (we have Heath’s theorem to help us)
Getting back to your problem, finally.
In R, neither B, nor E is the key. So both B->E
and E->D
are “bad”, and we need to take measures. On the contrary ABC->EF
is “good”, because ABC is a key.
Currently R is in 1NF, because B is a part of a key and functionally determines E, which is not a part of a key. This kind of situation is unacceptable in 2NF.
E->D
dependency is not that “bad”, because E is not a part of a key.
Another thing to mention at this point is that a set of dependencies {B->E, E->D}
implies the dependency B->ED
(according to so called Armstrong axiom of transitivity).
So we decide to split using B->ED
dependency as “the worst” one. That’s the way we do this, according to Heath:
R = ABCDEF
R0 = BED (B->E, E->D, B is a key)
R1 = ABCF (ABC->F, ABC is a key)
Now we have R1, which is in 3NF and R0, which is in 2NF (it has a transitive dependency B->E->D
, which is unacceptable in 3NF).
We need to split R0 and to do so we have to choose a “victim” dependency among B->E
and E->D
. As we see, B->E
is now a “good” one (B is a key) and E->D
is still “bad”.
Our split:
R0 = BED
R2 = BE (B->E, B is a key)
R3 = ED (E->D, E is a key)
R1, R2 and R3 are the final decomposition.
Using something that looks a little more like ZFC:
R = { ABCDEF, ABC → EF, B → E, E → D }
Will be presented as:
R = { A, B, C, D, E, F };
[ ∀ x ∈ ( A × B × C ), ∀ y ∈ ( E × F ) ] ∈ x R1 y;
[ ∀ x ∈ ( B ), ∀ y ∈ ( E ) ] ∈ x R2 y;
[ ∀ x ∈ ( E ), ∀ y ∈ ( D ) ] ∈ x R3 y;
The phasing is that for set R, composed of sets A, B, C, D, E and F; for any tuple x in the the cross product of sets A, B and C and any tuple y in the cross product of E and F, there exists a relationship R1 from tuple x to tuple y; for any tuple x in the set of B and any tuple y in the set of E, there exists a relationship R2 from x to y; for any tuple x in the set of E and any tuple y in the set of F, there exists a relationship R3 from x to y. Simple, right? So let’s get started.
Assertion of the first normal form ( for… fun! ) is as follows:
∀ S ∈ ( R ) ↔ ( ∀ T ∈ S ↔ | T | = 1 );
Which just means S is a set in R if and only if for any set T in S, the carnality of T is equal to one. This is the A of ACID. The 1NF assertion is to say that in set S, defined like { { A }, { B }, { C } }, neither A, B, nor C can contain more than one element, then goes on to assert that set R contains sets only like set S.
The goal here is to ensure that the relationship list R contains no dependencies such that there exists a set y ( non-prime attribute ) in a relationship with a set x ( candidate key ) where x is a proper subset of the superkey.
∀ y ∈ x Rn y ↔ ¬( ∀ z ∈ z Rn y | x ⊂ z );
Using this definition, R2 is in violation, having a dependency y on x where x is indeed a proper subset of the minimal candidate key of R, A × B × C. Arguing that by evaluating sets A and C as ∅ ( the null set ) merely makes the terms subsets rather than proper subsets and the assertion we made by establishing 1NF above means that while those sets can contain ∅, they cannot themselves be ∅, as that would break the set element carnality requirement.
B ⊆ ( A × B × C );
∀ T ∈ S ↔ | T | = 1;
T = { ∅ } → | T | = 1;
T = ∅ → | T | = 0;
A, B, C ≠ ∅;
∴ B ⊂ ( A × B × C ); E ⊂ ( E × F );
All that of course, is well and good, “but what does it mean for moving the original set R to a 2NF compliant multiset,” asked nobody, ever? Splitting the offending relationship off the original set R yields the following results, peeling the E set out of the R1 non-primary attribute list, taking R3 and corresponding non-primary attribute set D, along with it.
R1 = { A, B, C, F };
[ ∀ x ∈ ( A × B × C ), ∀ y ∈ ( F ) ] ∈ x R1 y;
R2 = { B, E, D };
[ ∀ x ∈ ( B ), ∀ y ∈ ( E ) ] ∈ x R2 y;
[ ∀ x ∈ ( E ), ∀ y ∈ ( D ) ] ∈ x R3 y;
Or, as you may perfer:
R1 = { ABCF, ABC → F };
R2 = { BED, B → E, E → D };
Many people feel that 3NF is the first normalization level of any acceptable quality, dealing with entailment or more broadly, logical consequence, stating that the relationship list R may not have any non-prime attribute that is transitively dependent on the superkey of any multiset.
∀ y ∈ x Rn y ↔ ¬( ∀ z ∈ x Rn z | y ⊧ x );
Applying this to our 2NF structure, we see that R3 is now in violation, as set D is dependent on set E, which in turn is dependent on R3‘s set B.
B → E;
E → D;
∴ D ⊧ B;
So we pull that offending relationship out and are given our final results:
R1 = { A, B, C, F };
[ ∀ x ∈ ( A × B × C ), ∀ y ∈ ( F ) ] ∈ x R1 y;
R2 = { B, E };
[ ∀ x ∈ ( B ), ∀ y ∈ ( E ) ] ∈ x R2 y;
R3 = { E, D };
[ ∀ x ∈ ( E ), ∀ y ∈ ( D ) ] ∈ x R3 y;
In the original notation:
R1 = { ABCF, ABC → F };
R2 = { BE, B → E };
R3 = { ED, E → D };