Should a counter column with frequent update be stored in a separate table?

Posted on

Question :

I have a MySQL/MariaDB database where posts are stored. Each post has some statistical counters such as the number of times the post has been viewed for the current day, the total number of views, number of likes and dislikes.
For now, I plan to have all of the counter columns updated in real-time every time an action happens – a post gets a view, a like or a dislike. That means that the post_stats table will get updated all the time while the posts table will rarely be updated and will only be read most of the time.

The table schema is as follows:
posts(post_id, author_id, title, slug, content, created_at, updated_at)
post_stats(post_id, total_views, total_views_today, total_likes, total_dislikes)

The two tables are connected with a post_id foreign key. Currently, both tables use InnoDB. The data from both tables will be always queried together to be able to show a post with its counters, so this means there will be an INNER JOIN used all the time. The stats are updated right after reading them (every page view).

My questions are:

  1. For best performance when the tables grow, should I combine the two tables into one since the columns in post_status are directly related to the post entries, or should I keep the counter/summary table separate from the main posts table?
  2. For best performance when the tables grow, should I use MyISAM for the posts table as I can imagine that MyISAM can be more efficient at reads while InnoDB at inserts?

This problem is general for this database and also applies to other tables in the same database such as users (counters such as the total number views of their posts, the total number of comments written by them, the total number of posts written by them, etc.) and categories (the number of posts in that category, etc.).

P.S. The views per day counters are reset once daily at midnight with a cron job. One reason for having posts and post_stats as two tables is concerns about caching.

Answer :

I would consider separate tables. Think about all of the data analysis you may ever need to do in the future, and plan a schema that supports that. With single counts in the post_stats table, you’ll only ever know a post was viewed x time per day, or y times overall, etc..

Instead, consider dropping that table all together, and creating a tables called Views and Likes. For example. the Views table could be foreign keyed to the posts table. In this table you could store details such as…

  • view_id
  • post_id
  • viewed_by
  • datetime_viewed

With this schema, you could query total views by post or by user, and for any time period you choose. For example…

  • views by post for today
  • views by user for the past week
  • views of user’s posts for the past week
  • etc..

You could then create various views to return stats from those tables, as you see the need. Going with this strategy, you’ll will not be limited on what you can analyze later.

Additionally, it would would just be a simple insert anytime an action happens, vs an update that may conflict with another simultaneous update.

The same concept would apply for Like, etc..

Leave a Reply

Your email address will not be published. Required fields are marked *