Question :
I would like to specify which row to return first based on whether the uid value is equal to a value I specify, when using count with group by.
I have 5 tables. Two tables are ‘intersection entity’ tables that hold foreign keys to the 3rd and 4th tables poster and albums, as well as the foreign key to the fifth table, link which holds the user_id.
if a user has created a link, I need that user_id to be the value in the user_id column, otherwise it doesn’t matter which row is used. I have tried many different approaches and the only one that comes close is using group by imageId, uid = 9, but this breaks the count and I get duplicate rows with a count of 1, instead of the actual count. I apologize if I am missing something obvious as I haven’t used mysql significantly in quite awhile.
SELECT * FROM (
SELECT poster.*, l.user_id as uid, count(*) as c
FROM `poster_links` pl
INNER JOIN poster ON poster.posterID = pl.poster_id
inner join link l on pl.link_id = l.link_id
WHERE pl.poster_id != 167 and l.link_id in(
SELECT pl.link_id from poster_links as pl where poster_id = 167)
group by imageId, uid = 9
UNION ALL
SELECT albums.*, l.user_id as uid, count(*) as c
FROM `album_links` al
INNER JOIN albums ON albums.album_id = al.album_id
inner join link l on al.link_id = l.link_id
WHERE al.album_id != 'null' and l.link_id in(
SELECT pl.link_id from poster_links pl where poster_id = 167)
group by imageId, uid = 9
) T
Using group by imageId, uid = 9(current user) is the only way I have found to force the uid to be equal to the current user if that user is one of the multiple users who have created the link. I am trying to avoid an additional query to get the current users ‘links’, then comparing both query results on the back end.
I am grouping by imageId because it is the only unique field. Also I have tried using case in the select-
CASE l.user_id
WHEN 9 THEN 9
ELSE null
END as uid,
but I only got null values probably because mysql only checks the first value returned which could be any uid.
CREATE TABLE `poster` (
`posterID` int(11) NOT NULL,
`imageId` varchar(255) DEFAULT NULL,
`title` varchar(64) NOT NULL,
`user` int(11) NOT NULL,
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `poster` (`posterID`, `imageId`, `title`, `user`) VALUES
(71, 'C44FB5C807FFC1768271D9988FF41270054D9B6F', 'Testarossa', 1),
(72, 'E009F63FD85E1EE98BE1EED77EA4F8A8E96BEA4F', 'Lambo', 1 ),
(73, 'E192493089403C89E9DAA8B64DDBED12FFF818E5', 'Ferrari', 1),
(74, '8AFFA17561689FCA48BB9000B81AE13CA46E9A05', 'Koenisegg', 1),
(167, 'E13CA46E98AFFA17561689FCA48BB9000B81AA05', 'egg', 1);
CREATE TABLE `albums` (
`album_id` int(11) NOT NULL,
`imageId` varchar(255) DEFAULT NULL,
`title` varchar(255) NOT NULL,
`user_id` int(11) NOT NULL,
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `albums` (`album_id`, `imageId`, `title`, `user_id`) VALUES
(45, '6FJIVC7L4OKNM8YAP1XGZ23DW5SUR0QEHB9', 'Cars', 3),
(46, 'FU6D7MVTZK85SRPLIE1CXGY2WO4QAN03JHB', 'Misc', 3),
(47, '5W498FRDAQGOZN2XUIP7VL3CKJTHSBM0E16', 'SCars', 3),
(48, 'SG8MVX1DLAUHENFOI5JB6W0C2RP379QYZ4K', 'MCars', 3);
CREATE TABLE `link` (
`link_id` int(11) NOT NULL,
`user_id` int(11) NOT NULL,
`timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `link` (`link_id`, `user_id`, `timestamp`) VALUES
(11, 6, '2019-10-29 09:11:25'),
(12, 3, '2019-10-29 07:57:24'),
(13, 9, '2019-10-29 07:59:55'),
(14, 9, '2019-10-29 09:10:25'),
(15, 7, '2019-11-03 11:05:51'),
(16, 9, '2019-11-03 12:05:51');
CREATE TABLE `album_links` (
`link_id` int(11) NOT NULL,
`album_id` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `album_links` (`link_id`, `album_id`) VALUES
(12, 45),
(13, 45),
(13, 46),
(14, 47),
(15, 48);
CREATE TABLE `poster_links` (
`link_id` int(11) NOT NULL,
`poster_id` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `poster_links` (`link_id`, `poster_id`) VALUES
(11, 167),
(11, 71),
(12, 167),
(14, 167),
(12, 71),
(16, 167),
(16, 71),
(14, 72),
I expect to get all the posters and albums grouped by imageId with the addition of a count column and a uid column that should display the value 9, if that is one of the values in the link table. Instead it shows a different uid.
poster | uid | c |
C44FB5.. 9 3
E009F6.. 9 1
6FJIVC.. 3 1
5W498F.. 9 1
Answer :
I’m still having trouble to understand the logic behind your expected result, but here is a wild shot:
WITH tmp as (
SELECT poster.imageId, l.user_id as uid
FROM poster_links t
JOIN poster
ON poster.posterID = t.poster_id
JOIN link l
ON t.link_id = l.link_id
WHERE t.poster_id <> 167
AND t.link_id IN (
SELECT pl.link_id
FROM poster_links as pl
JOIN link AS l
ON l.link_id = pl.link_id
WHERE poster_id = 167
)
UNION ALL
SELECT albums.imageId, l.user_id as uid
FROM `album_links` s
JOIN albums
ON albums.album_id = s.album_id
JOIN link l
ON s.link_id = l.link_id
WHERE s.album_id <> 'null'
AND s.link_id in (
SELECT pl.link_id
FROM poster_links AS pl
JOIN link AS l
ON l.link_id = pl.link_id
WHERE poster_id = 167
)
)
SELECT imageId, uid, cnt
FROM (
SELECT imageId, uid
, COUNT(*) OVER (PARTITION BY imageId) as cnt
FROM tmp t1
) as t
WHERE uid = 9
UNION ALL
SELECT imageId, uid
, COUNT(*) OVER (PARTITION BY imageId) as cnt
FROM tmp t1
WHERE NOT EXISTS (
SELECT 1 FROM tmp t2
WHERE uid = 9
AND t1.imageId = t2.imageId
);
I placed you original query in a Common Table Expression (CTE) and used a Window function for the count.
I split it into 2 cases, where the uid = 9 and images where uid = 9 does not have a link:
imageId uid cnt
5W498FRDAQGOZN2XUIP7VL3CKJTHSBM0E16 9 1
C44FB5C807FFC1768271D9988FF41270054D9B6F 9 3
E009F63FD85E1EE98BE1EED77EA4F8A8E96BEA4F 9 1
6FJIVC7L4OKNM8YAP1XGZ23DW5SUR0QEHB9 3 1
I took a 2 day break and when I checked back this morning I didn’t see any additional alerts so I decided to give up and break the query into several different queries and perform the needed logic in my back end code. While working on this I made a few more attempts and found a solution! It seems all I needed to do was put an additional aggregate function in the field list and test for the needed uid.
SUM(CASE WHEN l.user_id = 9 THEN 9 ELSE 0 END) as uid
Here is what I did-
SELECT * FROM (
SELECT poster.*, 1 AS type, SUM(CASE WHEN l.user_id = 9 THEN 9 ELSE 0 END) as uid, l.link_id, count(*) as c
FROM `poster_links` t
INNER JOIN poster ON poster.posterID = t.poster_id
inner join link l on t.link_id = l.link_id
WHERE t.poster_id != 167 and t.link_id in(
SELECT pl.link_id from poster_links as pl join link as l on l.link_id = pl.link_id where poster_id = 167)
group by imageId
UNION ALL
SELECT albums.*, 2 AS type, SUM(CASE WHEN l.user_id = 9 THEN 9 ELSE 0 END) as uid, l.link_id, count(*) as c
FROM `album_links` s
INNER JOIN albums ON albums.album_id = s.album_id
inner join link l on s.link_id = l.link_id
WHERE s.album_id != 'null' and s.link_id in(
SELECT pl.link_id from poster_links as pl join link as l on l.link_id = pl.link_id where poster_id = 167)
group by imageId
) T
Unfortunately I have no idea how efficient this is, or will be when the tables become enormous, but I am happy I found a simple solution to the problem.
I am also curious if there is a way to do this without running the exact same subquery twice as both select poster.* and select album.* queries are dependent on the same subquery. Any suggestions?
If anyone has any comments on my table structure or the efficiency of this query, or anything else, please provide feedback!