Question :
I ran a few HammerDB benchmarks and mess with the configurations. And I notice different values of slave_parallel_workers
have a considerable impact on TPM.
As far as I understand, slave_parallel_workers
only affects how quickly the transactions are being applied from the relay log. Yet, all the possible HA/DR setups, including async replication, semi-sync replication, and group replication will not wait for the transactions to be applied on the secondary node(s) before committing. To my understanding, slave_parallel_workers
should only affect replication lag and have zero impact on TPM while this is clearly not the case.
Am I missing something?
Edit: To clear any confusion, I expect the value of slave_parallel_workers
will have no impact on TPM. While the reality is the higher the value of slave_parallel_workers
is, the higher the TPM is. So I’m hoping someone would be able to explain the reason for that.
Edit 2: All transactions are done on the primary node.
Answer :
You have many transactions running on the Primary. In some cases, there is no conflict over the rows that they access, so they proceed in parallel. Perhaps sometimes they conflict and one get delayed or rolled back due to a deadlock, but let’s say that ‘most’ proceed with no apparent conflicts.
At each COMMIT
, the entire transaction is replicated to the Replica(s), where they must be replayed. Assuming binlog_format=ROW, the replaying on the Replica is not as complex.
With slave_parallel_workers
> 1 and slave_parallel_type = LOGICAL_CLOCK
, “Transactions that are part of the same binary log group commit on a source are applied in parallel on a replica.”
That is, “transactions per minute” is likely to be increased.
More discussion: https://dev.mysql.com/doc/refman/5.7/en/replication-options-replica.html
I would expect the speedup factor to be ‘up to’ the value of slave_parallel_workers
. And I would expect that going above the number of CPU cores to give diminishing improvements.
With lots of conflicting transactions, I would expect both the Primary and the Replica to suffer.
If you have only one database, slave_parallel_type = DATABASE
probably prevents any parallelism.