Data that are sometimes entered and sometimes computed

Question :

Suppose I have a database of StorageFacility objects. A StorageFacility has a GUID Id, plus two data columns: Capacity (a number) and Location (a string).

My data also contain a table of StorageTank objects. A StorageTank also has an Id, Capacity and a Location. Application code will frequently treat a collection of StorageTanks at the same Location as an aggregated StorageFacility. The Capacity is the sum of the capacities of the individual StorageTanks.

It is guaranteed that there will never be a StorageTank at a location that also has a non-aggregated StorageFacility.

If it’s relevant, the applications in question access the database via EntityFramework.

My question is about the advantages and disadvantages of different strategies to manage this situation. One could imagine maintaining aggregated StorageFacility objects in the StorageFacility table. Whenever a StorageTank at a location is added, updated, or deleted, one would have to apply a corresponding change to the aggregated StorageFacility at that location. Alternatively, one could imagine not maintaining such entries. Then every query for StorageFacilities would also need to check the StorageTank table.

In the real-life applications, everything is more complicated than in this toy example. Many fields have to match for StorageTanks to be aggregated together, multiple fields need to be aggregated, and the rules for the actual aggregation are more complicated than just adding up some numbers. The StorageTanks have fields that StorageFacilities lack. Additionally, many different queries access the StorageFacilities. For some purposes, an aggregated StorageFacility should be treated the same as a non-aggregated one. For other purposes, they are different.

Answer :

There’s no hard-and-fast rule here.

This pattern is very common, and here you could either:

  1. Have application logic update the StorageFacility.Capacity (etc.) every time a StorageTank is added, removed, or perhaps goes into maintenance.

  2. Store the data only at the StorageTank and roll it up as needed.

Things like the frequency of changes, the complexity of the roll-up rules, and how often the data is queried should drive your decision.
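To make the trade-off concrete, here is a minimal in-memory sketch of both options. It is written in Python rather than against Entity Framework, and the names `rollup_capacity` and `MaintainedTotals` are hypothetical, chosen only for illustration. It assumes the toy rule from the question (capacity is a plain sum per Location); the real aggregation rules would be more involved.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class StorageTank:
    id: str
    location: str
    capacity: float


def rollup_capacity(tanks):
    """Option 2: nothing is stored; sum capacity per location at query time."""
    totals = defaultdict(float)
    for tank in tanks:
        totals[tank.location] += tank.capacity
    return dict(totals)


class MaintainedTotals:
    """Option 1: keep a stored per-location total in step with every tank change."""

    def __init__(self):
        self.tanks = {}     # tank id -> StorageTank
        self.capacity = {}  # location -> stored aggregate capacity

    def add(self, tank):
        self.tanks[tank.id] = tank
        current = self.capacity.get(tank.location, 0.0)
        self.capacity[tank.location] = current + tank.capacity

    def update_capacity(self, tank_id, new_capacity):
        tank = self.tanks[tank_id]
        # Apply only the difference to the stored aggregate.
        self.capacity[tank.location] += new_capacity - tank.capacity
        tank.capacity = new_capacity

    def remove(self, tank_id):
        tank = self.tanks.pop(tank_id)
        if any(t.location == tank.location for t in self.tanks.values()):
            self.capacity[tank.location] -= tank.capacity
        else:
            # Last tank at this location: drop the aggregate entry entirely.
            del self.capacity[tank.location]
```

Option 1 makes reads trivial but every write path (add, update, remove) has to remember to touch the aggregate. Option 2 has no write-side bookkeeping, at the cost of recomputing the roll-up on each query.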

In either case I don’t believe it’s helpful that “Application code should treat a collection of StorageTanks at the same Location as an aggregated StorageFacility”. You should simply have two separate entities.
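One way to read that advice, sketched here in Python under my own assumptions: StorageFacility and StorageTank stay distinct types, and queries that want a combined picture get a read-only result row instead of a tank group posing as a facility. `LocationCapacity`, its `is_aggregated` flag, and `capacity_by_location` are hypothetical names, not anything from the question or from Entity Framework.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class StorageFacility:
    id: str
    location: str
    capacity: float


@dataclass(frozen=True)
class StorageTank:
    id: str
    location: str
    capacity: float


@dataclass(frozen=True)
class LocationCapacity:
    """Read-only query result; never stored as an entity."""
    location: str
    capacity: float
    is_aggregated: bool


def capacity_by_location(facilities, tanks):
    """Combine standalone facilities with per-location tank roll-ups."""
    rows = [LocationCapacity(f.location, f.capacity, False) for f in facilities]
    # groupby only groups adjacent items, so sort by location first.
    ordered = sorted(tanks, key=lambda t: t.location)
    for location, group in groupby(ordered, key=lambda t: t.location):
        total = sum(t.capacity for t in group)
        rows.append(LocationCapacity(location, total, True))
    return rows
```

Queries that should treat both kinds alike can ignore `is_aggregated`; queries that must tell them apart can filter on it, and neither has to guess whether a given StorageFacility row is really a group of tanks.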
