Is there way to get transaction commit timestamp in Postgres?

Posted on

Question :

I have data-pulling functionality that once in 5 seconds grabs all the data from Postgres table basing on modified_timestamp column. It works the following way:

  1. SELECT * FROM my_table WHERE modified_timestamp > _some_persisted_timestamp
  2. _some_persisted_timestamp = CURRENT_TIMESTAMP
  3. Processing data received from step 1
  4. Sleep for 5s
  5. Go to step 1

Where modified_timestamp updated with trigger (after any row update modified_timestamp becomes CURRENT_TIMESTAMP).
It worked fine until I noticed that CURRENT_TIMESTAMP in Postgres is transaction start timestamp in fact and some of the updates are lost. Why are they lost? That’s quite simple – at the moment when I execute query SELECT * FROM my_table WHERE modified_timestamp > _some_persisted_timestamp some of the changes have already occurred, but modified_timestamp is before updated _some_persisted_timestamp because transaction is in progress still.

This problem could be easily solved by assigning in step 2 timestamp when update becomes visible for other transactions (transaction commit timestamp in other words) instead of CURRENT_TIMESTAMP or clock_timestamp().

I read documentation, but have found nothing related to transaction commit timestamp. Could you kindly suggest smth?

Btw, I’m aware of logical decoding and I know that this mechanism suits better for my needs in theory, but there are certain practical problems not allowing me to use it.

Answer :

This problem could be easily solved by assigning in step 2 timestamp
when update becomes visible for other transactions (transaction commit
timestamp in other words) instead of CURRENT_TIMESTAMP or
clock_timestamp().

This is logically impossible. Postgres writes new row versions before it finally commits to make them visible. It would require prophetic capabilities to write a future timestamp yet unknown at the time of writing.

However, you can get commit timestamps from a different source: since Postgres 9.5, there is a GUC setting track_commit_timestamp to start logging commit timestamps globally.

Then you can get commit timestamps with the utility function pg_xact_commit_timestamp(xid). Your query could look like:

SELECT * FROM my_table t
WHERE   pg_xact_commit_timestamp(t.xmin) > _some_persisted_timestamp;

Be aware that commit timestamps are not kept around forever. After
two billion transactions transactions (2^31), transaction IDs are “frozen”.
That does not delete it right away, but after 4 billion transactions, the information is gone for certain. That’s a big number of transactions, and only very busy databases burn that much over a lifetime. But there can be programming errors burning through transaction numbers more quickly than expected …

Your step 2 and step 3 trade positions, and you record the commit timestamp instead of CURRENT_TIMESTAMP – or xmin from any freshly updated row to derive the commit timestamp with pg_xact_commit_timestamp() once more.

More:

About xmin:

But I am not completely sure I understand your task. Maybe you need a queuing tool or process rows one by one like discussed here:

Leave a Reply

Your email address will not be published. Required fields are marked *