I’m currently working on a MySQL database where we are seeing a large number of invalidations from the query cache, primarily because of the high number of INSERT, DELETE and UPDATE statements that are being executed on many of the tables.
What I’m trying to determine is whether or not there is any benefit at all to allowing the query cache to be used for SELECT statements that are being run against these tables. Since they get invalidated so quickly, it seems to me the best thing would be to just use SQL_NO_CACHE on SELECT statements with these tables.
Is the overhead of frequent invalidation ever worth it?
Edit: At the request of the user @RolandoMySQLDBA below, here’s the info on MyISAM and INNODB.
- Data Size: 177.414 GB
- Index Size: 114.792 GB
- Table Size: 292.205 GB
- Data Size: 379.762 GB
- Index Size: 80.681 GB
- Table Size: 460.443 GB
- Version: 5.0.85
- query_cache_limit: 1048576
- query_cache_min_res_unit: 4096
- query_cache_size: 104857600
- query_cache_type: ON
- query_cache_wlock_invalidate: OFF
- innodb_buffer_pool_size: 8841592832
- 24GB of RAM
You should just disable the query cache with
[mysqld] query_cache_size = 0
and then restart mysql. Why would I suggest that ???
The Query Cache will always butt heads with InnoDB. It would be nice if InnoDB’s MVCC would let queries be served from the query cache if modifications do not affect repeatable reads for other transactions. Unfortunately, InnoDB just does not do that. Apparently, you have a lot of queries that get invalidated rather quickly and are probably not being reused.
For InnoDB under MySQL 4.0, the query cache was disabled for transactions. For MySQL 4.1+, InnoDB plays traffic cop when allowing access to the query cache on a per-table basis.
From the perspective of your question, I would say that the justification of removing the query cache is not so much the overhead, but how InnoDB manages it.
For more information on how InnoDB interacts with the query cache, please read pages 213-215 of the book “High Performance MySQL (Second Edition)”.
If all or the majority of your data is MyISAM, you could go with your original idea of using SQL_NO_CACHE.
If you have a mix of InnoDB and MyISAM, you will have to find the right balance for your application based on on how high your cache misses are. In fact, pages 209-210 of the same book point out reasons for cache misses :
- The query is not cacheable, either because it contains a nondeterministic construct (such as CURRENT_DATE) or because its
result set is too large to store.Both types of uncacheable queries
increment the Qcache_not_cached status variable.
- The server has never seen the query before, so it never had the chance to cache its result.
- The query’s result was previously cached, but the server removed it. This can happen because there wasn’t enough memory to keep it, because
someone instructed the server to remove it, or because it was
and the root causes of high cache misses with few uncacheable queries may be :
- The query cache is not warm yet. That is the server hasn’t had a chance to fill the cache with result sets.
- The server is seeing queries it hasn’t seen before. If you do not have a lot of repeated queries, this can happen even after the cache
is warmed up.
- There are a lot of cache invalidations.
UPDATE 2012-09-06 10:10 EDT
Looking your latest updated info, you have
query_cache_limit set to 1048576 (1M). This limits any result set to 1M. If you retrieve anything bigger, it will simply not be cached. While you have have
query_cache_size set to 104857600 (100M), this only allows for 100 cached results set in a perfect world. If you perform hundreds of queries, fragmentation will come about rather quickly. You also have 4096 (4K) as the minimum size result set. Unfortunately, mysql has no internal mechanism for defragmenting the query cache.
If you must have the query cache and you have so much RAM, you could execute the following:
SET GLOBAL query_cache_size = 0; SELECT SLEEP(60); SET GLOBAL query_cache_size = 1024 * 1024 * 1024;
in order to purge the query cache. You lose all cached results, so run these lines during off-peak hours.
I would also assign the following:
- query_cache_size = 1G
- query_cache_limit = 8M
That leaves 23G of RAM. I would raise the following:
- innodb_buffer_pool_size = 12G
- key_buffer_size = 4G
That leaves 7G. This should be adequate for OS and DB Connections.
Keep in mind that the key buffer caches only MyISAM index pages, while the InnoDB Buffer Pool caches data and indexes.
One more recommendation: upgrade to MySQL 5.5 so you can configure InnoDB for multiple CPU and multiple threads for read/write I/O.
See my earlier posts on using MySQL 5.5 in conjunction with accessing multiple CPUs for InnoDB
Nov 24, 2011: Why mysql 5.5 slower than 5.1 (linux,using mysqlslap)
Oct 05, 2011: Query runs a long time in some newer MySQL versions
Sep 20, 2011: Multi cores and MySQL Performance
Jun 19, 2011: How do I properly perform a MySQL bake-off?
UPDATE 2012-09-06 14:56 EDT
My method for clearing the query cache is rather extreme because it hoses cached data and forms a completely different segment of RAM. As you did point out in your comment,
FLUSH QUERY CACHE (as you suggested) or even
RESET QUERY CACHE would be better. For clarification, when I said “no internal mechanism,” I meant exactly that. Defragmentation is needed and has to be done manually. It would need to be crontab’d.
If you do DML (INSERTs, UPDATEs, DELETEs) on InnoDB more often than on MyISAM, I would say remove the query cache altogether, which I said in the beginning.
BAD: query_cache_size = 1G
Why? Because of how long a flush will take. That is, when some write occurs, the whole 1GB will be scanned to find any references to the table that was modified. The bigger the QC, the slower this is. I recommend a size of no more than 50M, unless your data rarely changes.
The QC is overhead for both MyISAM and InnoDB. It takes out a global Mutex, and takes it out too soon. This mutex is one reason that MySQL cannot make effective use of more than about 8 cores.
SQL_NO_CACHE is not noticed until after the Mutex is locked ! About the only use for that flag is for benchmarking.
Often it is better to give the RAM to some other cache.
I can think of a perfect case for it, and we have tested thoroughly and run it in production…I call it the “fast lane” clustering strategy:
If you do read-write splitting with a proxy like MaxScale, or your application is capable, you can send some of the reads for those seldom invalidated tables only to slaves that have the query cache turned on, and the rest to other slaves with it turned off.
We do this and handle 4M calls per minute to the cluster during our load tests (not benchmark…the real deal) as a result. The app does wait on master_pos_wait() for some things, so it is throttled by the replication thread, and although we have seen it with a status of waiting on Qcache invalidation at very high throughput, those throughput levels are higher than the cluster is even capable of without Qcache.
This works because there’s rarely anything relevant in the tiny query cache on those machines to invalidate (those queries are only relevant to infrequently updated tables). These boxes are our “fast lane”. For the rest of the queries that the application does, they don’t have to contend with Qcache since they go to boxes without it turned on.