What is a “PARTITIONED OUTER JOIN”?

Posted on

Question :

This just came up in a question on Reddit, and I’m wondering

  • What is a PARTITIONED OUTER JOIN in Oracle? (definition)
  • What does a simple example look like? (use)
  • How would you write it in terms of PostgreSQL or Standard SQL that otherwise lacks PARTITIONED OUTER JOIN? (equivalence)

Answer :

{1} Partitioned Outer Join: Definition

… “Such a join extends the conventional outer join syntax by applying the outer join to each logical partition defined in a query. Oracle logically partitions the rows in your query based on the expression you specify in the PARTITION BY clause. The result of a partitioned outer join is a UNION of the outer joins of each of the partitions in the logically partitioned table with the table on the other side of the join.” (documentation)

{2} Simple Example

“Data is normally stored in sparse form. That is, if no value exists for a given combination of dimension values, no row exists in the fact table. However, you may want to view the data in dense form, with rows for all combination of dimension values displayed even when no fact data exist for them.”

… “For example, if a product did not sell during a particular time period, you may still want to see the product for that time period with zero sales value next to it.” (quotes from the documentation)

Test tables and data (INSERTs)

-- Oracle 12c
create table sales (
  date_ date
, location_ varchar2( 16 )
, qty_ number
);

create table locations (
  name varchar2( 16 )
);

-- dates for locations are "gappy": 
-- none of the locations has entries for all 3 dates
-- ( date range: 2019-01-15 - 2019-01-17 )
insert into sales ( date_, location_, qty_ ) 
  values ( date '2019-01-17', 'London', 11 ) ;
insert into sales ( date_, location_, qty_ ) 
  values ( date '2019-01-15', 'London', 10 ) ;
insert into sales ( date_, location_, qty_ ) 
  values ( date '2019-01-16', 'Paris', 20 ) ;
insert into sales ( date_, location_, qty_ ) 
  values ( date '2019-01-17', 'Boston', 31 ) ;
insert into sales ( date_, location_, qty_ ) 
  values ( date '2019-01-16', 'Boston', 30 ) ;

-- locations
insert into locations ( name ) values ( 'London' );
insert into locations ( name ) values ( 'Paris' );
insert into locations ( name ) values ( 'Boston' );

Required output

date_       location_  qty_
2019-01-15  London     10
2019-01-15  Paris       0    -- not INSERTed!
2019-01-15  Boston      0    -- not INSERTed!
2019-01-16  London      0    -- not INSERTed!
2019-01-16  Paris      20
2019-01-16  Boston     30
2019-01-17  London     11
2019-01-17  Paris       0    -- not INSERTed!
2019-01-17  Boston     31

Query (partitioned outer join)

select S.date_, S.qty_, L.name
from sales S partition by ( date_ ) 
  right join locations L on S.location_ = L.name
;

-- result
DATE_      QTY_  NAME    
15-JAN-19  NULL  Boston  
15-JAN-19  10    London  
15-JAN-19  NULL  Paris   
16-JAN-19  30    Boston  
16-JAN-19  NULL  London  
16-JAN-19  20    Paris   
17-JAN-19  31    Boston  
17-JAN-19  11    London  
17-JAN-19  NULL  Paris 

Query (version 2, same join)

-- same as above, using NVL(), column aliases, and ORDER BY ...
select S.date_, nvl( S.qty_, 0 ) as sold, L.name as location
from sales S partition by ( date_ ) 
  right join locations L on S.location_ = L.name
order by S.date_, L.name
;

DATE_           SOLD LOCATION        
--------- ---------- ----------------
15-JAN-19          0 Boston          
15-JAN-19         10 London          
15-JAN-19          0 Paris           
16-JAN-19         30 Boston          
16-JAN-19          0 London          
16-JAN-19         20 Paris           
17-JAN-19         31 Boston          
17-JAN-19         11 London          
17-JAN-19          0 Paris           

Dbfiddle here (Oracle 18c)

{3} Equivalence

The following query does roughly the same job as the PARTITION BY ( date_ ) outer join. We are using a combination of a CROSS JOIN (inner SELECT) and LEFT OUTER JOIN.
(Conversion of NULLs to 0s omitted)

Oracle

select SL.*, S.qty_
from
(
  select *
  from (
    select unique date_ from sales
  ) , (
    select unique name from locations
  )
) SL left join (
  select date_, location_, qty_ from sales
) S on SL.name = S.location_ and SL.date_ = S.date_ 
order by SL.date_, SL.name
;

DATE_      NAME    QTY_  
15-JAN-19  Boston  NULL  
15-JAN-19  London  10    
15-JAN-19  Paris   NULL  
16-JAN-19  Boston  30    
16-JAN-19  London  NULL  
16-JAN-19  Paris   20    
17-JAN-19  Boston  31    
17-JAN-19  London  11    
17-JAN-19  Paris   NULL 

PostgreSQL 10 ( dbfiddle )

-- DDL and INSERTs
create table sales (
  date_ date
, location_ varchar( 16 )
, qty_ number
);

create table locations (
  name varchar( 16 )
);

insert into sales ( date_, location_, qty_ ) values 
  ( '2019-01-17', 'London', 11 )
, ( '2019-01-15', 'London', 10 )
, ( '2019-01-16', 'Paris', 20 )
, ( '2019-01-17', 'Boston', 31 )
, ( '2019-01-16', 'Boston', 30 )

insert into locations ( name ) values 
( 'London' ), ( 'Paris' ), ( 'Boston' );

Query (Postgres)

-- SL: all date_ <-> location combinations
-- S: all location_ and qty_ values of table sales
select SL.*, S.qty_
from
(
  select *
  from (
    select distinct date_ from sales
  ) S_ cross join (
    select distinct name from locations
  ) L_
) SL left join (
  select date_, location_, qty_ from sales
) S on SL.name = S.location_ and SL.date_ = S.date_ 
order by SL.date_, SL.name
;

date_        name    qty_
2019-01-15   Boston  
2019-01-15   London  10
2019-01-15   Paris    
2019-01-16   Boston  30
2019-01-16   London    
2019-01-16   Paris   20
2019-01-17   Boston  31
2019-01-17   London  11
2019-01-17   Paris    

Leave a Reply

Your email address will not be published. Required fields are marked *