Inheritance discriminator column: better as int, varchar, or enum?

Question:

What is the best practice for storing a discriminator column? Does it make any difference?

Any pros/cons of using one over the other ?

I am using MySQL with Entity Framework, but I think the question goes beyond the underlying technology.

Answer:

Always use the smallest data type that fits the values it will hold. ‘Fit’ is a flexible term, depending on things like:

  • Can it hold all current values of the table?
  • Can it hold future values of the table?
  • Is the data in a normal form? Can I create duplicate values? (think a string vs. an integer id that is a foreign key to a lookup table)
  • Can I insert undesired values? Will I control that at db or application level?
  • Is it easy to use/compatible with my application? (think ASCII vs. utf8 encoding) Some data types are not SQL-standard, and some frameworks do not recognize them.
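For example, undesired values can be controlled at the database level with a lookup table and a foreign key. A minimal sketch (table and column names are hypothetical, and FOREIGN KEY enforcement assumes InnoDB):

```sql
-- Hypothetical sketch: restrict the discriminator to known values
-- at the db level with a lookup table and a foreign key.
CREATE TABLE animal_type (
    id   TINYINT UNSIGNED NOT NULL PRIMARY KEY,
    name VARCHAR(30) NOT NULL UNIQUE
);

CREATE TABLE animal (
    id   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    type TINYINT UNSIGNED NOT NULL,  -- discriminator: 1 byte per row
    name VARCHAR(100) NOT NULL,
    FOREIGN KEY (type) REFERENCES animal_type (id)
) ENGINE=InnoDB;
```

The alternative is enforcing the allowed values only in the application layer, which keeps the schema simpler but lets any other client insert arbitrary values.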

With those considerations in mind, use the smallest data type, as it will reduce the memory and disk footprint (potentially reducing IOPS).
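To make the trade-off concrete, here is the same discriminator column declared the three ways under discussion (the table and column names are made up for illustration):

```sql
-- Hypothetical: one discriminator column, three possible declarations.
CREATE TABLE pet_int  (species INT NOT NULL);                  -- always 4 bytes
CREATE TABLE pet_str  (species VARCHAR(20) NOT NULL);          -- 1 + encoded length
CREATE TABLE pet_enum (species ENUM('cat','dog') NOT NULL);    -- 1 byte, fixed set
```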

  • An int will always take 4 bytes, and it can store values from -2147483648 to 2147483647 (or from 0 to 4294967295, if unsigned). Smaller integer types also exist, such as TINYINT, which takes 1 byte.

  • A varchar will take 1 + char_encoding * length bytes (or 2 + char_encoding * length, if the encoded length exceeds 255 bytes). char_encoding can be between 1 and 4 bytes per character, depending on the character and the encoding. Also, that assumes dynamic storage; fixed-row engines like MEMORY will store it at the maximum theoretical size. Usually inefficient for numeric values.

  • An enum will take 1 byte (or 2, if it has more than 255 members), but it will not allow values not defined at creation time without an ALTER TABLE.
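The enum limitation looks like this in practice. A sketch with a hypothetical table; the first insert fails under strict SQL mode (and is stored as the empty value otherwise):

```sql
-- Hypothetical table: adding a new enum member requires DDL.
CREATE TABLE pet (species ENUM('cat','dog') NOT NULL);

INSERT INTO pet (species) VALUES ('bird');  -- rejected under strict SQL mode

ALTER TABLE pet
    MODIFY species ENUM('cat','dog','bird') NOT NULL;

INSERT INTO pet (species) VALUES ('bird');  -- now accepted
```

With an int plus a lookup table, the equivalent change is a plain INSERT into the lookup table instead of an ALTER on the main table.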

So it is always a trade-off between flexibility and performance.

Inheritance serialization, however, deserves its own discussion.
