Pull data from multiple tables in a view or function

Posted on

Question :

There are 3 tables from which I need to pull data, and print it in a specific way.

http://sqlfiddle.com/#!15/59481/8

Refer to this question to see how the query was produced. The version in the fiddle working partially, but I need to modify the query to produce a different output.

I have the table project_report which is storing data for keyword:

1. If a keyword has data for a particular day it will store it in `project_report`
2. A keyword might have no data to store for a particular day.
3. A keyword might store multiple rows of data for a particular day (the data is uniq)
4. A keyword might not have data for a particular day, but it might have for a different day.

I believe to solve my problem a DB function will be the best approach, because I would like to be able to specify a date and get all the data from project_reports for each keyword.

I don’t know how to write this query, to produces the below results, also I don’t know what will be better, to have a view or a db function, I think that with a view in order to select a specific date, I will need for each keyword and each date to produce those outputs. The other problem is that I would also like to use these results in a chart, which will take the data for a range, for that I think a view is better, then calling a function n times.


So if keyword_id: 1 has in the project_report:

ID: 1, keyword_id: 1, project_id: 1, position: 1, created_at: '2014-07-09'
ID: 2, keyword_id: 1, project_id: 1, position: 2, created_at: '2014-07-09'

For the selected date 2014-07-09 I should get (please see that position will be the smallest position instead of [1, 2]):

keyword_id: 1, project_id: 1, position: 1, yesterday_pos: 0, last_week_pos: 0, last_month_pos:0 created_at: '2014-07-09'

If keyword_id: 2 has in the project_report:

ID: 3, keyword_id: 2, project_id: 1, position: 3, created_at: '2014-07-09'
ID: 4, keyword_id: 2, project_id: 1, position: 4, created_at: '2014-07-08'
ID: 5, keyword_id: 2, project_id: 1, position: 5, created_at: '2014-07-08'
ID: 6, keyword_id: 2, project_id: 1, position: 4, created_at: '2014-07-03'

For the selected date 2014-07-09 I should get:

keyword_id: 2, project_id: 1, position: 3, yesterday_pos: 4, last_week_pos: 4, last_month_pos: 0 created_at: '2014-07-09'

If keyword_id: 3 has in the project_report:

ID: 7, keyword_id: 3, project_id: 1, position: 10, created_at: '2014-07-08'
ID: 8, keyword_id: 3, project_id: 1, position: 11, created_at: '2014-07-03'

For the selected date 2014-07-09 I should get:

keyword_id: 3, project_id: 1, position: 0, yesterday_pos: 10, last_week_pos: 11, last_month_pos: 0 created_at: '2014-07-09'

If keyword_id: 4 has in the project_report:

`nothing`

For the selected date 2014-07-09 I should get:

keyword_id: 4, project_id: 1, position: 0, yesterday_pos: 0, last_week_pos: 0, last_month_pos: 0 created_at: '2014-07-09'

Answer :

I suggest a radically new approach with crosstab() from the additional module tablefunc. You need to install it once per database. Detailed instructions:

SELECT k.name, keyword_id, project_id, now()::date AS the_date
     , t AS today, y As yesterday, w AS week, m AS month
FROM   crosstab(
   $$
   SELECT rn
        , pk.keyword_id
        , pk.project_id
        , d.created_at
        , COALESCE(pr.pos, 0) AS pos
   FROM  (SELECT *, row_number() OVER () AS rn FROM project_keyword) pk
   CROSS  JOIN (
      VALUES
         (now()::date)
       , (now()::date - 1)
       , (now()::date - 7)
       ,((now() - interval '1 month')::date)
     ) d(created_at)
   LEFT JOIN (
      SELECT keyword_id
           , project_id
           , created_at::date AS created_at
           , min(position) AS pos
      FROM   project_report
      GROUP  BY keyword_id, project_id, created_at::date
      ) pr USING (keyword_id, project_id, created_at)
   ORDER  BY pk.rn, d.created_at
   $$

  ,$$
   VALUES
         (now()::date)
       , (now()::date - 1)
       , (now()::date - 7)
       , ((now() - interval '1 month')::date)
   $$
   ) ct (rn int, keyword_id int, project_id int
       , t int, y int, w int, m int)
JOIN   keyword k USING (keyword_id);

Returns:

name    keyword_id   project_id   the_date     today   yesterday   week   month
Cheese  1            1            2014-07-11   1       1           1      0
Monitor 2            1            2014-07-11   2       2           2      0
Apple   3            1            2014-07-11   9       9           9      0
Apple1  4            1            2014-07-11   0       0           0      0
Iphone  5            1            2014-07-11   0       33          34     0

Explain

  • We need a single unique column per group for crosstab(). Since you don’t have that per (keyword_id int, project_id) I am adding a surrogate key with row_number() to project_keyword.

  • JOIN that to a set of dates you are interested in (today, yesterday etc.). One row per date.

  • LEFT JOIN to project_reports to attach the position for each date if available. Default to 0 with COALESCE.

  • Cross-tabulate to get a single row with one position column per given date.

  • Finally JOIN to keyword just to add the name. Add the current date and remove rn from the result while being at it.

Aside: “name” is not a good name. I’d use something descriptive instead.

Leave a Reply

Your email address will not be published. Required fields are marked *