Question :
I have following MySQL db structure:
content
=======
id | tags
---------
1 | ,abc,def,ghi,
2 | ,abc,foo,
3 | NULL
4 | ,foo,bar,
categories
==========
id
--
1
2
categories_to_content
=====================
id | category_id | content_id
-----------------------------
1 | 1 | 1
2 | 1 | 2
3 | 2 | 3
4 | 2 | 4
What I’m trying to do is to SELECT tags assigned to category and count them. Result for given schema:
tag | category | count
----------------------
abc | 1 | 2
def | 1 | 1
ghi | 1 | 1
foo | 1 | 1
foo | 2 | 1
bar | 2 | 1
Decomposition isn’t possible (I hook up into existing CMS database).
My existing query:
SELECT tags, category_id as category FROM content
JOIN categories_to_content ON content_id = content.id
Answer :
You can achieve your goal by using a tally (number) table, which you can easily create in the following way
CREATE TABLE tally(n INT NOT NULL AUTO_INCREMENT PRIMARY KEY);
INSERT INTO tally (n)
SELECT NULL
FROM
(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a
,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
;
This will create a tally table and fill it with 100 rows with values
1 to 100 accordingly.
You should adjust the number of rows in your table to be equal or more (e.g. double the number to be on the safe side) than a maximum number of delimiters (comma) in content table.
You can figure out the maximum number with a simple query
SELECT MAX(CHAR_LENGTH(tags) - CHAR_LENGTH(REPLACE(tags, ',', ''))) max_tags
FROM content;
Output:
| MAX_TAGS | |----------| | 4 |
Now you can use it to split tags
SELECT category_id, tag, COUNT(*) count
FROM
(
SELECT cc.category_id,
SUBSTRING_INDEX(SUBSTRING_INDEX(c.tags, ',', n.n), ',', -1) tag
FROM categories_to_content cc JOIN content c
ON cc.content_id = c.id CROSS JOIN tally n
WHERE n.n <= 1 + (CHAR_LENGTH(c.tags) - CHAR_LENGTH(REPLACE(c.tags, ',', '')))
) q
WHERE CHAR_LENGTH(tag) > 0
GROUP BY category_id, tag
Output:
| CATEGORY_ID | TAG | COUNT | |-------------|-----|-------| | 1 | abc | 2 | | 1 | def | 1 | | 1 | foo | 1 | | 1 | ghi | 1 | | 2 | bar | 1 | | 2 | foo | 1 |
Here is SQLFiddle demo