Join 2 tables and get the latest record

Posted on

Question :

I have 2 tables table1 contains the columns of id, status,created_by,created_date,name and table2 is a log table which contains id, updated_by,updated_date. Table2 contains multiple records of same id. I’m trying to get the resultant query which support to get the selected columns from both tables by join operation. but when I tried i’m getting the duplicate records associated with that id. I want only one latest record for that id. My query…

select  t.created_by,t.created_date,tl.updated_by,tl.updated_date
    from  test_name t
    join (
        SELECT  updated_by,updated_date,id
            from  test_log
            order by  updated_date desc
            limit  1
         ) tl  ON t.id=tl.id
    where  state = 'active';

Answer :

You could return the sequential number of a row within a partition (in this case id) of a result set, starting at 1 (AND seqnum = 1) for the first row in each partition.

See below

SELECT t.created_by, 
       t.created_date, 
       tl.updated_by, 
       tl.updated_date 
FROM   test_name t 
      INNER JOIN (SELECT tl.*,
            Row_number() 
              OVER ( 
                partition BY id 
                ORDER BY updated_date DESC ) AS seqnum 
             FROM   test_log tl) tl 
         ON t.id = tl.id 
            AND seqnum = 1 
WHERE  state = 'active'; 

You can use a self join on log table to get to he latest row per table 1 something like:

select
    t.created_by,
    t.created_date,
    l1.updated_by,
    l1.updated_date 
from
    test_name t 
    left join test_log l1 
        on t.id = l1.id 
    left join test_log l2 
        on l1.id = l2.id 
        and l1.updated_date < l2.updated_date 
where
    t.state = 'active' 
    and l2.id is null ;

http://mysql.rjweb.org/doc.php/groupwise_max discusses how to deal with dups when doing “groupwise max”.

SELECT RowNum,
            username,
            p.purchased,
            p.product_id
                FROM (
                    SELECT ROW_NUMBER() OVER (ORDER BY  users.id) AS RowNum, users.id AS uid,username
                    FROM users
                ) AS users1
                LEFT JOIN  (SELECT orders.*, ROW_NUMBER() OVER (partition by user_id  ORDER BY created_at DESC) AS seqnum
                FROM orders) p ON uid = p.user_id  AND p.seqnum = 1
            WHERE RowNum >= 0 AND uid ='111'
            ORDER BY RowNum

If you create your table structure as below then you get perfect result using following query.

table1 contains columns like
pk_id_table1,
status,
created_by,
created_date,
name

table2 contains columns like
pk_id_table2,
pk_id_table1,
updated_by,
updated_date

SELECT table2.*, table1.name 
FROM table2 
LEFT JOIN table1 ON table2.pk_id_table1=table1.pk_id_table1  
WHERE table2.pk_id_table2 
  IN (SELECT max(pk_id_table2) 
  FROM table2 
  GROUP BY pk_id_table1) 
ORDER BY pk_id_table2 desc

If you want to get the latest records from log for each name then this can’t be performed by queries only in an easy way (at least within MySQL dialect). The simplest way is to create the copy of the log table log_last but with UNIQUE index on the id and INSERT ON DUPLICATE KEY UPDATE syntax. Each record written into the log should be written into log_last too. IODKU ensure that old data will be overwritten by new ones and when you SELECT * FROM log_last you’ll get all and only latest log lines for each id.

Leave a Reply

Your email address will not be published. Required fields are marked *