Is there a well-known name for this “poor man’s ref. integrity” schema design pattern?

Posted on

Question :

Is there a name for the following database schema design/pattern? My eventual goal is to find more literature about the subject. Today’s cursory net search was too full of generic words to be able pin down the term (if any exists) for this kind of thing:

Fruit (id, farm)

Apple (fruit_id, color)
    [fruit_id =>]

Banana (fruit_id, length)
    [fruit_id =>]

Orange (fruit_id, is_seedless)
    [fruit_id =>]

FruitPack (id, destination)

FruitPackFruits (fruitpack_id, fruit_id, fruit_type)
    [fruit_id =>, fruit_type => VARCHAR]

Where fruit_type would be a varchar column filled with values like “Apple, Banana, Orange, Cherry”. It’s some kind of “poor man’s referential integrity”. Obviously, one the failures of this kind of design is being able to insert values that don’t resolve out to a useful join (ie: there are no cherries to speak of here).

Here’s another example of such a pattern: A single “log (id, table_name, record_id, timestamp)” table that acts as a sort of tracker for modification-times in various other tables. Strictly speaking, it’s got no ref integrity, but, the (table_name, record_id) part is supposed to refer to some record in another table, requiring a join to actually get the full data.

I’m going to take for granted that the schema is a sufficient caricature of some sort of collection of groups of items for the people here.

The question is: What’s this kind of “poor man’s referential integrity” called?

I’m not trying to learn about referential integrity. I want to identify this poor design’s name and look further into the “let’s design a database schema” aspects (ex: pros, cons, opinions, teachings, etc) that have to do with this commonly seen disaster of a schema.

Answer :

Your design looks a bit like the “supertype/subtype” pattern. Search for that and for “table inheritance”. It needs quite a lot of work to be able to enforce integrity constraints though.

You are missing a generic Fruit table (that’s the “supertype”) and a FruitType table to store the alllowed fruit types:

    fruit_type PK

    fruit_type PK, FK -> FruitType (fruit_type)
    fruit_id   PK

Then the 3 (or 4 or more) tables would be (the “subtype” tables):

    fruit_id PK
    (fruit_type, fruit_id) FK -> Fruit (fruit_type, fruit_id)
    CHECK (fruit_type = 'Apple')

    fruit_type PK
    fruit_id PK
    (fruit_type, fruit_id) FK -> Fruit (fruit_type, fruit_id)
    CHECK (fruit_type = 'Banana')

    fruit_type PK
    fruit_id PK
    (fruit_type, fruit_id) FK -> Fruit (fruit_type, fruit_id)
    CHECK (fruit_type = 'Orange')

And any other table can reference the Fruit table:

    fruitpack_id PK 

    fruitpack_id FK -> FruitPack (fruitpack_id)
    (fruit_type, fruit_id) FK -> Fruit (fruit_type, fruit_id)

It doesn’t look very nice and one column in every “fruit” table seems redundant as it has one and only one allowed value. And every time you need to add a new fruit (say Cherry), you have to add a row in the table FruitType and a new table (Cherry), similar to the other ones. So, it works better if your design is more or less stable. If you find that you may need to add a new “fruit” every few days or if you have a thousand (or more!) different fruits, it’s not the best way.

On the other hand, it enforces integrity and you can’t insert cherries into the Bananas or oranges into the Apples.

Aha! Thanks to @ypercube’s comment about “inheritance”, I managed to make a logic/context leap and found some concrete examples that give the name “Polymorphic Associations” to this kind of schema design.

Indeed, this is “a thing” to be reckoned with in the DBA world, and, IMHO, this is also something to be hated and even burned with hellfire.

In the “let’s log everything” scenario, it’s probably good to follow the wisdom of IBM and create a set of 1-to-1 “_history” (appended) tables plus triggers. It seems this would also apply to tables that act as “i18n + L10n” translation storages.

Not sure what to think about the “let’s group different types of items” scenario at this point, other than “polymorphic associations is probably a bad idea in the long term”. They’re extremely unpleasant to trace/follow/understand/query, in my context of an MVC-based application.

Leave a Reply

Your email address will not be published. Required fields are marked *