Question :
I have an existing foreign key that has ON DELETE NO ACTION
defined. I need to change this foreign key to ON DELETE CASCADE
. I can do this within a transaction:
begin;
alter table posts drop constraint posts_blog_id_fkey;
alter table posts add constraint posts_blog_id_fkey foreign key (blog_id) references blogs (id) on update no action on delete cascade;
commit;
The problem is that the posts
table is large (4 million rows) which means that validating the foreign key can take a non-trivial amount of time (I’ve tested this with a copy of the database). Dropping/adding the foreign key acquires an ACCESS EXCLUSIVE
lock on posts
. So, adding the foreign key blocks all access to the posts
table for a decent amount of time because the lock is held while constraint validation occurs. I need to perform an online migration (I don’t have a dedicated downtime window).
I know that I can perform 2 transactions to help with the check taking a long time:
begin;
alter table posts drop constraint posts_blog_id_fkey;
alter table posts add constraint posts_blog_id_fkey foreign key (blog_id) references blogs (id) on update no action on delete cascade not valid;
commit;
begin;
alter table posts validate constraint posts;
commit;
The advantage of this approach is that the ACCESS EXCLUSIVE
lock is held for a very short time for dropping/adding the constraint and then for validating the constraint only a SHARE UPDATE EXCLUSIVE
on posts
and ROW SHARE
lock on blogs
since I’m on Postgres 9.5.
Are there downsides to this? I know that adding NOT VALID
to the constraints means that existing data is not validated, but any rows inserted/updated before the VALIDATE CONSTRAINT
will be checked. Because the foreign key is dropped/added in the same transaction, is there any possibility of creating inconsistent data?
Answer :
The docs say this about NOT VALID
ADD table_constraint [ NOT VALID ]
This form adds a new constraint to a table using the same syntax as
CREATE TABLE
, plus the optionNOT VALID
, which is currently only allowed for foreign key and CHECK constraints. If the constraint is markedNOT VALID
, the potentially-lengthy initial check to verify that all rows in the table satisfy the constraint is skipped. The constraint will still be enforced against subsequent inserts or updates (that is, they’ll fail unless there is a matching row in the referenced table, in the case of foreign keys; and they’ll fail unless the new row matches the specified check constraints). But the database will not assume that the constraint holds for all rows in the table, until it is validated by using theVALIDATE CONSTRAINT
option.
Your concern,
I know that adding
NOT VALID
to the constraints means that existing data is not validated, but any rows inserted/updated before theVALIDATE CONSTRAINT
will be checked.
They’ll only be checked during validation AFTER you tell it to validate, so you can delay that until you have scheduled downtime.
Because the foreign key is dropped/added in the same transaction, is there any possibility of creating inconsistent data?
No, because the second you add the NOT VALID
it applies to all rows inserted AFTER the statement as if they were always there. VALIDATION is for the rejection of creating an FOREIGN KEY
when the referenced rows do not exist. It has nothing to do with cascading, observe
CREATE TABLE foo
AS
SELECT 1 AS a;
CREATE TABLE bar
AS
SELECT a
FROM ( VALUES (1),(2) )
AS t(a);
ALTER TABLE foo
ADD PRIMARY KEY (a);
ALTER TABLE bar
ADD FOREIGN KEY (a)
REFERENCES foo
ON DELETE CASCADE
NOT VALID;
DELETE FROM foo;
TABLE foo;
a
---
(0 rows)
test=# TABLE bar;
a
---
2
(1 row)
At this point you can see
- the deletion from
bar
cascaded tobar
- the lack of validation on bar means it still has a row that is invalid
The constraint still can’t be validated (as shown below), but for the purposes of cascaded deletion all things are good.
ALTER TABLE bar VALIDATE CONSTRAINT bar_a_fkey ;
ERROR: insert or update on table "bar" violates foreign key constraint "bar_a_fkey"
DETAIL: Key (a)=(2) is not present in table "foo".
Btw you can write this
begin;
alter table posts drop constraint posts_blog_id_fkey;
alter table posts add constraint posts_blog_id_fkey foreign key (blog_id) references blogs (id) on update no action on delete cascade not valid;
commit;
Like this
alter table posts
drop constraint posts_blog_id_fkey,
add constraint posts_blog_id_fkey
foreign key (blog_id)
references blogs (id)
on update no action
on delete cascade
not valid;
You don’t have to wrap it in a txn. You also don’t have to wrap any single statement in a txn — PostgreSQL isn’t MySQL. Everything is already transactional.