Question :
Server details
We have one Galera cluster Mariadb with 3 nodes . On primary node database size is 200GB & diskspace size is 400GB . On other 2 secondary nodes database size is 200GB & diskspace size is 240GB .
Issue summery:-
We have ran delete query on primary node against one table which size is 100GB . On primary node delete query completed successfully . But on both secondary nodes we are facing diskspace issue and after sometime both secondary nodes are down .
Request :-
Can someone please suggest what is happen here and how mysql delete query works internally .
Any suggestions will be appreciates .
Answer :
-
200GB in 240GB — That is too tight. You may not be able to run an
ALTER
. And other queries may blow out the disk space. Since two nodes are that tight, it is likely that when it happens, Galera will lose quorum and stop accepting writes. -
DELETE
must save the old rows in anticipation of aROLLBACK
or crash. That is what caused the crash. -
Probably the only thing you can do now is to blow away the two small nodes and pretend you are doing an initial load — letting the one live node rebuild the other two.
-
In the future, keep half the disk free.
-
In the future,
DELETE
rows in chunks. More: http://mysql.rjweb.org/doc.php/deletebig -
Look at your schema (or provide it to us for critique). For example: A
BIGINT
takes 8 bytes, but such big numbers are rarely needed;INT
takes half the space. Again, keep in mind that changingBIGINT
toINT
will involve anALTER
, which may need to copy the table over — leading to disk full in your current configuration. -
Meanwhile, since the
DELETE
could notCOMMIT
on the other two Galera nodes, that should have triggered failure on the originating node. -
Always check for errors on
COMMIT
in a Galera setup. -
OPTIMIZE TABLE
is unlikely to be safe for a large file on a small disk. That is, I don’t see it as useful now. Nor later.
Your problem may be due to the number of rows you deleted.
In InnoDB, each node in the Galera cluster has to maintain an undo log entry for every row you delete while the DELETE
command is running.
When the DELETE
command completes, node1 has to receive an acknowledgement from Node2 and Node3 that the transaction is ready to be committed on those nodes as well. All the deleted rows will thus sit in the system tablespace file (your ibdata1 file) and will grow.
Please note the InnoDB Architecture (Pictorial Representation by Percona CTO Vadim Tkachenko)
In this picture, please note the Rollback Segments which are the undo logs. All the rows you are deleting are piling up in them until the DELETE
transaction is done. When it commits, the used space is not reclaimed. That is the sad reality of doing huge transactions like this. Node1 has to do this, and so does Node2 and Node3.
I once posted info on this over 8 years ago : How can Innodb ibdata1 file grows by 5X even with innodb_file_per_table set?
This explains where the used diskspace went.
What you should have done is delete the rows in chunks of 1000 rows at a time.
This gives Node2 and Node3 breathing room to digest the deletes. In turn, ibdata1 would have remained its original size on all 3 nodes.