Question :
I’ve started working with an application that has an unusual design:
In addition to the tables that contain the actual data, there is one metadata table that contains all the information about the other tables: their names, their columns, which columns link them to which other tables, and so on… all in a single table.
The application mostly queries this table to figure out how to query for other entities, how to query or update related tables, and how to insert and delete records.
The result of this is that the application’s data access code is quite generic. Most of the data access functions just take a few parameters which represent the names of things (concrete tables and columns) to look up in the metadata table.
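To give a rough idea (the names below are invented for illustration, not the actual schema), a generic lookup looks something like this:

    import java.sql.*;
    import java.util.*;

    // Invented names for illustration: one row of table_metadata per column,
    // e.g. (table_name, column_name, fk_table, fk_column).
    public class GenericDao {
        // Builds a SELECT for any table by looking its columns up in the metadata table.
        static String buildSelect(Connection conn, String tableName, String keyColumn)
                throws SQLException {
            List<String> columns = new ArrayList<>();
            try (PreparedStatement ps = conn.prepareStatement(
                    "SELECT column_name FROM table_metadata WHERE table_name = ?")) {
                ps.setString(1, tableName);
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        columns.add(rs.getString("column_name"));
                    }
                }
            }
            return "SELECT " + String.join(", ", columns)
                 + " FROM " + tableName + " WHERE " + keyColumn + " = ?";
        }
    }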
The main drawback, as far as I can tell, is that when the schema changes, the concrete structures and the content of the metadata table need to be kept in sync, or the application could break. It seems, though, that the metadata table is generated by a custom-written script that also creates/alters the other tables. Maintenance of that script and the original data model is another concern.
I also have to consider that this data access component may either be replaced or preserved for new applications. If it were to be replaced, it would probably be with something more “normal” like Hibernate or some other framework. I’m quite torn on which way to go, since the current code, while a little odd-looking, seems to work fine.
Are there any hidden advantages/disadvantages that I might have missed that might help me make a decision?
Actually, what is this design pattern even called, in case I want to look up other implementations of it? I don’t think I’ve ever seen anything like it before, I don’t really know what to call it or how to research it.
Answer :
I believe this is called a “Big Ball of Mud”. The not-so-hidden disadvantage is that the system appears to have done away with normalization of the data model for the metadata. That is likely to lead to inconsistencies over time. However, it is possible that the script you described simply rebuilds it from the database’s own metadata.
All the databases I have worked with have a normalized data structure for their metadata, in place of the single metadata table you are describing: a separate system table for each kind of metadata. Most also cache the query and the resulting optimized execution plan for recently used queries.
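For example, most SQL databases expose that normalized metadata through the standard information_schema views, so it can already be queried without a hand-rolled metadata table. A minimal JDBC sketch (the connection URL and the table name "employees" are placeholders):

    import java.sql.*;

    public class SchemaInspector {
        public static void main(String[] args) throws SQLException {
            // Placeholder connection URL; substitute your own database and credentials.
            try (Connection conn = DriverManager.getConnection(
                     "jdbc:postgresql://localhost/mydb", "user", "pass");
                 PreparedStatement ps = conn.prepareStatement(
                     "SELECT column_name, data_type "
                   + "FROM information_schema.columns WHERE table_name = ?")) {
                ps.setString(1, "employees"); // hypothetical table name
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        System.out.println(rs.getString("column_name")
                                + " : " + rs.getString("data_type"));
                    }
                }
            }
        }
    }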
If the code is figuring out the required query on the fly by querying the metadata table, it is likely slower than it needs to be.
A query like “select a, b, c from T where d = ?” can be supported by a single prepared statement with a parameter. Binding the value as a parameter also prevents SQL injection when someone enters a value like “1 OR 1 = 1”. In a desktop application it may not be that serious, but many companies have learned how serious it can be.
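In JDBC that looks roughly like the following (T and its columns are the illustrative names from above):

    import java.sql.*;

    public class ParameterizedQuery {
        // Safe: the value of d is bound as a parameter, never concatenated into the SQL.
        public static void findByD(Connection conn, String userInput) throws SQLException {
            String sql = "SELECT a, b, c FROM T WHERE d = ?"; // illustrative table/column names
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                ps.setString(1, userInput); // even "1 OR 1 = 1" is treated as a plain value here
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        System.out.println(rs.getString("a"));
                    }
                }
            }
        }
        // Unsafe by contrast: "SELECT a, b, c FROM T WHERE d = " + userInput
        // lets an input like 1 OR 1 = 1 change the meaning of the query.
    }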
There is a reason many developers use Hibernate, Spring, and other data access layers. Some of these can build the database required to support the application, and they can also work with databases that have already been built.
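As a rough sketch of what the “normal” alternative looks like: a Hibernate/JPA mapping declares the table structure directly on a class, and Hibernate can either generate the backing table from it or map it onto an existing one. The entity below is made up for illustration:

    import jakarta.persistence.*;

    // Hypothetical entity: Hibernate can create the backing table from this mapping
    // (e.g. with hibernate.hbm2ddl.auto=update) or map it onto an existing table.
    @Entity
    @Table(name = "customer")
    public class Customer {
        @Id
        @GeneratedValue(strategy = GenerationType.IDENTITY)
        private Long id;

        @Column(name = "name", nullable = false)
        private String name;

        // getters and setters omitted for brevity
    }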