How using many tables and JOINS affect performance vs. using limited number of tables and lesser JOINS?

Posted on

Question :

I want to design a db schema concerning a schedule.

I. I can use all fields in the same table, for example:

id | shop_id | mo (varchar) | tu (varchar) | we | th | fr | sa | su 
 1 | 1       | 08:00-12:00  | ...

II. or I can use more tables one for schedule with structure:

id | shop_id | mo_id | tu_id | we_id | th_id | fr_id | sa_id | su_id

and then one table for each day… for ex: monday with structure:

id | form_time | to_time 

III. or I can use one table for schedule with structure:

id | shop_id | weekday_id

and then only one table for weekdays with structure:

id | day | form_time | to_time 

Which structure of these 3 structures you think is better in performance?

Answer :

You are obviously familiar with data normalization and joining tables. Not to dismiss your question, but I would focus more on a solid design that captures the information in the simplest way and without putting undo constraints on your application. I think you’ll find then the performance works itself out. I’m wondering what’s wrong with this:

TABLE events
  id | shop_id | title | desc | from (DATETIME) | to (DATETIME)

To figure out a day’s schedule, you then have to do some work on those DATETIME fields to figure out whether they were on Mon, Tue, etc. But that’s pretty easy, and now there’s all sorts of other ways you could slice this data. You could even create 7 VIEWS (S,M,T,W,T,F,S) that return only events on each day of the week, if that’s so important.

It’s going to depend on many factors.

First, you should understand that JOINs don’t necessarily hurt performance. If you have proper indexes they can work very well.

The structure you pick should depend on how your actual data is structured.

If you use the first option, is it likely you will need the row to be inserted all at once or will you be updating later?

How many fields do you need to return with your queries? Will you always need all the fields from all related tables, or just a certain weekday?

A table structure like

id | shop_id | mo (varchar) | tu (varchar) | we | th | fr | sa | su 
 1 | 1       | 08:00-12:00  | ...

looks wrong to me. Example of one problem: what if a shop has two time slots in one day?

Your schema number III looks better but how will you handle something like “Shop A will change its schedule (eg close on Mondays instead of Sundays) during next Christmas period”? So you may want to add validity period to your schedule table.

Anyway, you should not worry about joins and performance at this stage. You will have to cache all this information anyway.

Leave a Reply

Your email address will not be published. Required fields are marked *