We have a goofy process (don’t we all?) that daisy chains (or cascades) on a table that needs flattening out for our data warehouse.
The field “alt_key” is an identifier field. It ends up being the parent to other alt_key and alt_key2 data.
What I have below is an example of how the data looks. This table is named MEMBER
The record where alt_key = 05526396 is our “master” record, and the other records daisy chain down alternating between the alt_key and alt_key2 fields as shown in the picture below:
[![Record Daisy Chain]]
In the end, the records need to look like this. alt_key will be named primary_pmi and alt_key2 will be named pmi. if the row is a parent, alt_key and alt_key2 will be the same as will the primary_member_nbr and member_nbr fields. pmichain_id and pmi_order can be ignored:
[![Final Daisy Chain Transformation]]
Not sure how I can get to the end dataset. I’ve tried a recursive CTE but I think I have too many records to process. How else can I go about this? Records can have many children
A simple way to approach this would be to use a loop.
Start by identifying your records where alt_key does not show up in any alt_key2 values. Load those into a staging table with pmi_order 1, and mark the records as processed in the source table (add a bit field to track this). These are your parent records.
Now start your loop with an index (n) of 2, and keep going up by one for each loop. For each staging record with pmi_order = (n-1), find the unprocessed main table record whose alt_key value matches the staging alt_key2 value. This becomes your pmi_order 2 (i.e. current value of n) record for that primary_member_nbr. (Note, you will alternate matching staging.alt_key2 to main.alt_key and staging.alt_key to main.alt_key2 for each execution of the loop.)
After each loop completes, check to see if any unprocessed records remain in the main table. Once you have exhausted the main table, you are done.