Question :
I have TWO QUERY which work perfect, but when i join those two query as LEFT JOIN
the output become wrong. So what is the problem.?
Query 1:
select Batsman.innings_no,
bowler,
sum(Balls) as B,
ifnull(sum(Runs_In_Over), 0) as R,
ifnull(sum(Zero), 0) as 0s,
ifnull(sum(Four), 0) as 4s,
ifnull(sum(Six), 0) as 6s
from (SELECT A.innings_no,
A.bowler,
count(*) as Balls,
Sum(B.Runs_Scored) As 'Runs_In_Over',
case B.runs_scored
when 0 then count(*) end as Zero,
case B.runs_scored
when 4 then count(*) end as Four,
case B.runs_scored
when 6 then count(*) end as Six
FROM `database`.ball_by_ball A
INNER JOIN `database`.batsman_scored B
using (match_id, over_id, ball_id, innings_no)
where match_id = 981018
and innings_no = 2
GROUP BY A.innings_no, A.bowler, B.runs_scored) as Batsman
group by innings_no, bowler order by bowler;
Query 2:
select innings_no,
bowler,
ifnull(sum(Wides), 0) as WD,
ifnull(sum(NoBalls), 0) as NB,
sum(Extra_runs) as Extra
from (SELECT D.innings_no,
D.bowler,
case E.extra_type_id
when 2 then count(*) end as Wides,
case E.extra_type_id
when 4 then count(*) end as NoBalls,
Sum(E.Extra_Runs) As 'Extra_runs'
FROM `database`.ball_by_ball D
INNER JOIN `database`.extra_runs E
using (match_id, over_id, ball_id, innings_no)
WHERE match_id = 981018
and innings_no = 2
and E.Extra_Type_Id IN (2, 4)
GROUP BY D.innings_no, D.bowler, E.extra_type_id) as Extra
group by innings_no, bowler order by bowler;
Join Query:
select Batsman.innings_no,
Batsman.bowler,
sum(Balls) as B,
ifnull(sum(Runs_In_Over), 0) as R,
ifnull(sum(Zero), 0) as 0s,
ifnull(sum(Four), 0) as 4s,
ifnull(sum(Six), 0) as 6s,
ifnull(sum(Wides), 0) as WD,
ifnull(sum(NoBalls), 0) as NB,
ifnull(sum(Extra_runs), 0) as Extra
from (SELECT A.innings_no,
A.bowler,
count(*) as Balls,
Sum(B.Runs_Scored) As 'Runs_In_Over',
case B.runs_scored
when 0 then count(*) end as Zero,
case B.runs_scored
when 4 then count(*) end as Four,
case B.runs_scored
when 6 then count(*) end as Six
FROM `database`.ball_by_ball A
INNER JOIN `database`.batsman_scored B
using (match_id, over_id, ball_id, innings_no)
where match_id = 981018
and innings_no = 2
GROUP BY A.innings_no, A.bowler, B.runs_scored) as Batsman
LEFT JOIN (SELECT D.innings_no,
D.bowler,
case E.extra_type_id
when 2 then count(*) end as Wides,
case E.extra_type_id
when 4 then count(*) end as NoBalls,
Sum(E.Extra_Runs) As 'Extra_runs'
FROM `database`.ball_by_ball D
INNER JOIN `database`.extra_runs E
using (match_id, over_id, ball_id, innings_no)
WHERE match_id = 981018
and innings_no = 2
and E.Extra_Type_Id IN (2, 4)
GROUP BY D.innings_no, D.bowler, E.extra_type_id) as Extra using (innings_no, bowler)
group by innings_no, bowler order by bowler;
Query 1 Output:
Query 2 Output:
Join Query Output:
If you saw Query 1 and 2 screenshot and compare with join query for specific Bowler
, then it’s clear that when i use left join as sub query the result don’t show exactly what it shows in single query.
So what is the problem.? What i need to do to solve this problem and why this problem happening?
Answer :
You were close, here is what I think you were going for:
select Batsman.innings_no,
Batsman.bowler,
sum(Balls) as B,
ifnull(sum(Runs_In_Over), 0) as R,
ifnull(sum(Zero), 0) as 0s,
ifnull(sum(Four), 0) as 4s,
ifnull(sum(Six), 0) as 6s,
ifnull(sum(Wides), 0) as WD,
ifnull(sum(NoBalls), 0) as NB,
ifnull(sum(Extra_runs), 0) as Extra
from (SELECT A.innings_no,
A.bowler,
count(*) as Balls,
Sum(B.Runs_Scored) As 'Runs_In_Over',
case B.runs_scored
when 0 then count(*) end as Zero,
case B.runs_scored
when 4 then count(*) end as Four,
case B.runs_scored
when 6 then count(*) end as Six
FROM `database`.ball_by_ball A
INNER JOIN `database`.batsman_scored B
using (match_id, over_id, ball_id, innings_no)
where match_id = 981018
and innings_no = 2
GROUP BY A.innings_no, A.bowler) as Batsman
LEFT JOIN (SELECT D.innings_no,
D.bowler,
case E.extra_type_id
when 2 then count(*) end as Wides,
case E.extra_type_id
when 4 then count(*) end as NoBalls,
Sum(E.Extra_Runs) As 'Extra_runs'
FROM `database`.ball_by_ball D
INNER JOIN `database`.extra_runs E
using (match_id, over_id, ball_id, innings_no)
WHERE match_id = 981018
and innings_no = 2
and E.Extra_Type_Id IN (2, 4)
GROUP BY D.innings_no, D.bowler) as Extra using (innings_no, bowler)
group by innings_no, bowler order by bowler;
The difference in my query and your Join Query is that each of the subqueries is being grouped by just innings_no
and bowler
in my query.
How does group by work
The GROUP BY
clause will restrict the resulting query to return just a single row for each unique combination of the columns listed in the GROUP BY
.
In all of your queries, you are initially grouping by 3 columns, which means you are getting a separate row for each combination of innings_no
, bowler
, and a 3rd column. For Query 1 this 3rd column is runs_scored
, for Query 2 this column is extra_type_id
. This is generally not a valid way to use GROUP BY
with aggregate functions, like SUM
, that you used.
However, in both Query 1 and Query 2, you then wrap them in ANOTHER query, having just innings_no
and bowler
as the GROUP BY
, so you are getting the output you expected.
But in Join Query you are applying the LEFT JOIN
before applying that outer GROUP BY
, which was fixing your results in Query 1 and Query 2. Therefore, your Join Query is joining together different results than the output of the other two queries. This is why it doesn’t appear to work as expected.
My query should fix things, but if it doesn’t provide the correct result (because I missed some valid reason you might have had for grouping the way you did) then just use the LEFT JOIN
after applying the outer GROUP BY
s. That will likely result in worse performance, that depending on your data, that may not matter to you.
LEFT JOIN Query 2 to Query 1
Here is how to simply do a LEFT JOIN of the two queries, which should yield the result you want, based on comments.
SELECT * FROM
(
select Batsman.innings_no,
bowler,
sum(Balls) as B,
ifnull(sum(Runs_In_Over), 0) as R,
ifnull(sum(Zero), 0) as 0s,
ifnull(sum(Four), 0) as 4s,
ifnull(sum(Six), 0) as 6s
from (SELECT A.innings_no,
A.bowler,
count(*) as Balls,
Sum(B.Runs_Scored) As 'Runs_In_Over',
case B.runs_scored
when 0 then count(*) end as Zero,
case B.runs_scored
when 4 then count(*) end as Four,
case B.runs_scored
when 6 then count(*) end as Six
FROM `database`.ball_by_ball A
INNER JOIN `database`.batsman_scored B
using (match_id, over_id, ball_id, innings_no)
where match_id = 981018
and innings_no = 2
GROUP BY A.innings_no, A.bowler, B.runs_scored) as Batsman
group by innings_no, bowler
) q1
LEFT JOIN (
select innings_no,
bowler,
ifnull(sum(Wides), 0) as WD,
ifnull(sum(NoBalls), 0) as NB,
sum(Extra_runs) as Extra
from (SELECT D.innings_no,
D.bowler,
case E.extra_type_id
when 2 then count(*) end as Wides,
case E.extra_type_id
when 4 then count(*) end as NoBalls,
Sum(E.Extra_Runs) As 'Extra_runs'
FROM `database`.ball_by_ball D
INNER JOIN `database`.extra_runs E
using (match_id, over_id, ball_id, innings_no)
WHERE match_id = 981018
and innings_no = 2
and E.Extra_Type_Id IN (2, 4)
GROUP BY D.innings_no, D.bowler, E.extra_type_id) as Extra
group by innings_no, bowler
) USING (innings_no, bowler)
ORDER BY bowler
All I did to create this query was take your Query 1 and Query 2, stuck a LEFT JOIN between them, and added a SELECT * FROM
to the start and a USING
clause for the join. I also placed the ORDER BY
outside, as it won’t have any effect inside, unlike the GROUP BY
.
Be aware that there is likely simpler way to get the result you want, but it requires a deeper understanding of what output you want and how your data is organized, and so is beyond the scope of this question.