Question :
As an example, let’s say I have a fact table with two dimensions and one measure
FactMoney table
ProjectKey int
PersonKey int
CashAmount money
The two dimensions are defined like this:
DimProject (a type 0 dimension – i.e. static)
ProjectKey int
ProjectName varchar(50)
DimPerson (a type 2 slowly changing dimension)
PersonKey int
PersonNaturalKey int
PersonName varchar(50)
EffectiveStartDate datetime
EffectiveEndDate datetime
IsCurrent bit
Pretty straightforward so far. Now I’ll introduce a Person Category concept.
DimCategory
CategoryKey int
CategoryName varchar(50)
And build an M2M relationship between DimPerson and DimCategory
BridgePersonCategory
PersonKey int
CategoryKey int
So – people can have 1..n categories.
My problem is – as Person is a slowly changing dimension, when a person’s name changes, we add a new person row and update our effective dates and is current flags, no big deal.
But what do we do with the person’s categories? Do we need to add more rows to the bridge table every time a new person version pops up?
And as a corollary, if a person’s categories change, does that mean we need to create a new row in the person table?
(please note I’ve also asked this question at stack-O and as directed by the first commenter, now posted this question here.)
Answer :
Does DimCategory have any Type 2 attributes that you are tracking?
If it is not, then you would only need to add a new record to your bridge table based on whenever a Type 2 attribute in DimPerson changes since you will be inserting a new record in DimPerson with a new surrogate key. You would then need to add or update your bridge table with this new surrogate key for whichever categories are assigned.
However, if an attribute on DimCategory changes, you would just update that record and there would be no change in the surrogate key so the bridge table would be unchanged.
Now if you want to track things like when a Category assignment to a Person changes as a Type 2 attribute, then you would need to build the Effective Start and End Dates in your bridge table to be able to tell at what point in time a Person’s Categories change.
Why new Surrogate Key for SCD 2 .
Soft RI .
MVs , Join Indexes , Cubes etc r nearly relic except projected MVS ,Join Indexes etc with the advent of columnar DBs.
Kimbal design is Relic for many reasons
-
With redundancy denormalization it can never be DW . It is not DW but a physical semantic layer MDM bad use of surrogate Keys Columnar DBs at PHYSICAL SEMANTIC LAYER for performance i.e. denorm without Row Oriented cost business user Reporting Layers r now in Memory as well as huge capability . Basic aggregation on contexts if volume can be reduced will work wonder .
Still All will depend upon data Hierarchy , data changes , user requirement of history .
Regards