Postgresql: large table or union of small tables

Posted on

Question :

I have a table which have PK column and json column.
I received new request for versioning of the Json (not in the meaning of history but meaning of the same data just in different json schema) and decided to save the Json in it’s different versions.

From performance perspective, is it better to add new column to the table for each version, or create new table for each version and union the results when needed?

Answer :

Not really clear from the question what you are trying to accomplish, but given that you tagged the question as postgresql-performance here are some general thoughts relevant to your question for you to consider.

  • Postgresql is a row database. Data is stored as rows. See this link on difference between row and column DBs.

  • In a row database you want to have data that needs to be retrieved and used at the same time, or most of the time. Unless your different versions of the JSON need to be read at the same time then you don’t want it in the same row.

  • For keeping track of older / different versions of data you can either use different rows or a separate table. If older versions are not used often, or only for future reference and no current use, you may want to consider having a separate table so that data is not read/cached in memory. This is more relevant if you have a large table. If you have a small table, les than GBs worth of data, then you could use different rows and likely won’t really matter much.

  • Check the rules for normalizing tables. For example here. Normalization is NOT always the answer to all designs, specially when you are doing analytics/OLAP, but it is good to at least be familiar with the concepts as many common design mistakes can be avoided by knowing about normalization.

Leave a Reply

Your email address will not be published. Required fields are marked *