Is it ok to have 40+ columns in a table

Posted on

Question :

I am currently re-designing some of DB’s for a project. There is one DB in particular that is giving me a little of a headache. The idea is that for every client_id and date (client_id + date would be the Pri. Key) each column would contain the usage for each product that we offer (around 12 products, and for each one Cost, Margin, Total needs to be stored). I want to do domething like this:

| client_id |    date    | prod_1_Total | prod_1_Cost | prod_1_Paid | .... | prod_14_Paid |
| aa1       | 2014-01-01 | 245          | 54.97       | 50.58       | ...  | 431.23       |
| ...       | ...        | ...          | ...         | ...         | ...  | ...          |

We have to note that for a record to exist that client_id has to have had some activity in the given date, in other words, given a client_id and a date, the rest of the columns can’t be all NULL, otherwise that record won’t be on the table.

It seems to me a bad practice to merge all this products in one table like above, specially, when for a given client_id and date many of them will be NULL.

But the alternative that occurs to me, creating separate tables for each of the products, presents the problem that to calculate for example total_paid for all producst for a client_id across some dates, we have to do 14 left joins, moreover if we use lets say product_1 in the query as the main query and we left join against it the result would be incorrect since most-likely the for some days the client used product_x and not product_1 (Full-outer joins are not supported in MySQL)

What design strategy would you recommend?

Thanks for your suggestions.

Answer :

Considering that MySQL is still a relation database management system an (RDBMS if you will), you should considering the following things:

  • Your tables are defined by the data they contain and the relationships to other tables
  • Your data should be as non-redundant as possible

This last point leads to the process of normalization, which is a very important thing to do in a RDBMS.

Have a look at the Wikipedia article on Normalization to have a basic understanding of what that process entails:

I find that, oftentimes, it can be a good idea to take a step back from the “storage” aspect of your database. Try to look at your application, the data models used within and their relationships. Creating such an abstraction is called an ERM: Entity-Relationship Model, or an ERD: an Entity-Relationship Diagram.
More info on that:

For your case specificially:

Apart from the fact that your not using a design that fits a relational database, please consider the fact that you are creating a very inflexible design!
What if — for example — you were to create a new product in the future? Would you add a new column to the table? Surely you’ll agree that that’s pretty stupid.

Instead, it makes MUCH more sense to create something along the lines of

| client_id | client_name | client_address |
| 1         | Snake Logan | ...            |
| 2         | Mad Max     | ...            |
| ...       | ...         | ...            |
+-----------+---------- --+----------------+

| product_id | product_desc  | product_price |
| 1          | Silver Bullet | 14.00         |
| 2          | Hand grenade  | 7.00          |
| ...        | ...           | ...           |

| purchase_id | client_id | product_id | purchase_amount |
| 1           | 1         | 1          | 3               |
| 2           | 1         | 2          | 1               |
| 3           | 2         | 2          | 5               |

I’m not saying this is the greatest design ever, but this is more elegant.

If you want to have more info concerning the purchase, I’d recommand building a view that contains the necessary info by joing the tables together. Please don’t create a field like purchase_total, I think that would once again be redundant data. The data is already there (purchase_amount and product_id), you should need to do the math (purchase_amount * product_price). You can put that in the view if you want. Tables are for storing data, views are there to present said data.

This is perfect candidate for normalization. Given products table with product_id, design your table something similar to this.

  • client_id
  • date
  • product_id
  • prod_total
  • prod_cost
  • prod_paid
  • misc info

Join should only involve 2-3 tables, not 14. Let me know if you have any questions.

Leave a Reply

Your email address will not be published. Required fields are marked *