Question :
I have a challenge I set out to do that seemed initially trivial. Not so for my developper brain.
Consider the following simple view, used to validate a cron I wrote that queries a subset of 200 000 statements every saturday.
It goes as follows:
mysql> SELECT
-> DATE_FORMAT(s.created, "%Y-%m-%d") as "Date",
-> count(s.id) AS "Accounts credited",
-> sum(s.withdrawal) "Total Credited"
-> -- 100 * (sum(s.withdrawal) - sum(prev.withdrawal))
-- / sum(prev.withdrawal) "Difference in %"
-> FROM statements s
-> -- LEFT JOIN prev
-> -- s.created - interval 7 DAY
-> -- ON prev.created = s.created - interval 7 DAY
-- AND (prev.status_id = 'OPEN'
-- OR prev.status_id = 'PENDING')
-> WHERE (s.status_id = 'OPEN' OR s.status_id = 'PENDING')
-> GROUP BY YEAR(s.created), MONTH(s.created), DAY(s.created)
-> ORDER BY s.created DESC
-> LIMIT 8;
+------------+-------------------+----------------+
| Date | Accounts credited | Total Credited |
+------------+-------------------+----------------+
| 2019-01-19 | 18175 | 3173.68 |
| 2019-01-12 | 18135 | 4768.43 |
| 2019-01-05 | 17588 | 6968.49 |
| 2018-12-29 | 17893 | 5404.18 |
| 2018-12-22 | 17353 | 7048.18 |
| 2018-12-15 | 16893 | 7181.34 |
| 2018-12-08 | 16220 | 9547.09 |
| 2018-12-01 | 15476 | 7699.59 |
+------------+-------------------+----------------+
8 rows in set (0.79 sec)
As is, the query is efficient and practical. I merely would like to add a column, difference in percentage
, from previous week’s total, as seen with the — commented out code.
I have tried various approaches, but because of the GROUP BY
, adding an inline column to get the sum(withdrawal)
of previous week makes the query run … forever.
I then tried the LEFT JOIN
approach, but this has the same problem, Obviously. I think the added JOIN
has to fetch the sum of previous week for every row of the outer select.
I then had the (not so smart) idea of querying my view, even but then it seems I would have the same issue.
I assume there are much more optimal approaches out there to this simple task.
Is there an elegant way to calculate a percentage from such a query?
Would a stored procedure or some other ‘non-plain-sql’ approach be more optimal?
Answer :
When you use MySQL >= 8.0, you can use CTEs (common table expressions)
;
WITH
(
SELECT DATE(s.created) as "Date",
COUNT(s.id) AS "Accounts credited",
SUM(s.withdrawal) "Total Credited"
FROM statements s
WHERE s.status_id IN ('OPEN', 'PENDING')
GROUP BY DATE(s.created)
ORDER BY s.created DESC
) AS cte
SELECT cte.*,
100 * (cte.`Total Credited` - prev.`Total Credited`) /
prev.`Total Credited` AS "Difference in %"
FROM cte
LEFT JOIN cte prev ON cte.`Date` = prev.`Date` - INTERVAL 7 DAY;
If you use an earlier version, just replace the CTE with appropriate subqueries, like
SELECT <your_fancy_percentage_calculation>
FROM (<cte from above>) a
LEFT JOIN (<cte from above>) b ON a.Date = b.Date - INTERVAL 7 DAY;
MySQL is smart enough to not execute the same query twice, I think.
Or you create a view and use this.
With MySQL 8.0 or MariaDB 10.2, and assuming you have no gaps in the dates, LAG(7)
is a “windowing” function that might provide a straightforward way to do the task.