I have a query that takes around 1 second on first execution and 15-50 ms when the same query is run again. The only difference between the EXPLAIN outputs is that the second execution contains many Buffers: shared hit entries, which are probably (and correct me if I'm wrong) responsible for the noticeable speed-up.
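For reference, here is a sketch of the kind of run I mean (the table and filter are hypothetical; the buffer counts are illustrative, not my actual numbers):

```sql
-- Run with the BUFFERS option to see where pages come from:
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE customer_id = 42;

-- First run (cold):   Buffers: shared read=1520   -- pages fetched from disk
-- Second run (warm):  Buffers: shared hit=1520    -- pages found in shared_buffers
```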
So I was wondering, when people talk about queries taking x milliseconds, are they usually referring to the time it takes on first execution?
Do you care about a slow query that, after caching, is actually quite fast? This assumes your application requires reasonable response times (e.g. e-commerce, real-time analytics, etc.).
I’m using PostgreSQL 9.5, but I intend the question to be general enough to be widely useful.
If the reporter had rigor in mind, I would expect the hot-cache / cold-cache information to be included in the report. Otherwise it is like quoting a car at $y without saying which options, warranty, service, or taxes that price includes. Similarly, I would expect it to be stated that subsequent tests were performed under the same conditions, and how that was achieved. In less formal discussion I would assume a hot cache, as that is the usual operating condition, unless told otherwise.
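One way to achieve repeatable cold-cache conditions, sketched for a Linux host (service name and exact paths vary by distribution, and these commands need root):

```shell
# Evict PostgreSQL's shared buffers by restarting the server,
# then drop the OS page cache so the next run really reads from disk.
sudo systemctl stop postgresql
sync                                          # flush dirty pages to disk first
echo 3 | sudo tee /proc/sys/vm/drop_caches    # drop page cache, dentries, inodes
sudo systemctl start postgresql
```

For hot-cache timings the equivalent discipline is simply to run the query a few times and discard the first measurement.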
The two conditions emphasise different aspects of the host configuration. With a cold cache one sees more of the impact of the disk subsystem; with a hot cache it is more about memory pressure and the efficiency of the query plan. In a real-world scenario, however, it would be unusual to find a workload that was purely one or the other.
SQL Server, for one, assumes a cold cache when costing a query plan.