MDEV-36025: backup taken from a replica with optimistic parallel replication fails to restore most of the time #4888
hemantdangi-gc wants to merge 1 commit into 10.11
How does it demonstrate that? At any rate, the commit message should be more verbose in this part. Please describe that scenario.
What I am saying here is that MDEV-742 didn't fix the issue at hand, so we do have to port MDEV-21168 to handle the MDEV-36025 error. I wanted the commit message to give a reason why MDEV-21168 is needed, so I added this line.
@hemantdangi-gc, whatever MDEV-742 failed to fix has to be described in full detail in this ticket and in the PR.
I expected to see that failure scenario in a test, and that is exactly what a good commit message must point to. The solution section needs better structure too.
As MDEV-36025 is reported for the slave, the refined issue description must either confirm that this is indeed a slave-side problem or exonerate 😄 the good old slave (in which case the blame is on the general server). PS. If you need to discuss the technical side of the issue, I'll be available from next Tue.
Issue:
The commit 5836191 (MDEV-21168) was deliberately NOT ported to 10.5+. It added an optional --rollback-xa flag to mariabackup in 10.4 only, with this note in the commit message:
"The fix MUST NOT be ported on 10.5+, as MDEV-742 fix solves the issue for slaves."
However, MDEV-742 does not solve the problem for internal XA transactions, as MDEV-36025 demonstrates. The --rollback-xa option, SRV_OPERATION_RESTORE_ROLLBACK_XA, and related code are completely absent from the 10.6 codebase.
Solution:
Port the MDEV-21168 fix to MariaDB 10.6.
Add SRV_OPERATION_RESTORE_ROLLBACK_XA server operation mode and --rollback-xa option (enabled by default) to mariabackup --prepare. This automatically rolls back prepared XA transactions during prepare, since the backup does not contain the binary log needed to resolve them.
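To make the intended behaviour concrete, here is a minimal sketch of the decision described above. It is not the patch's code: `OperationMode`, `TrxState`, `RecoveredTrx` and `xids_to_roll_back` are hypothetical names made up for illustration, and `RestoreRollbackXa` only stands in for SRV_OPERATION_RESTORE_ROLLBACK_XA.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Illustrative stand-ins only; these are NOT the InnoDB definitions.
enum class OperationMode {
  Restore,           // prepare without rollback of prepared XA transactions
  RestoreRollbackXa  // modelled on SRV_OPERATION_RESTORE_ROLLBACK_XA
};

enum class TrxState { Prepared, Committed };

// Hypothetical view of a transaction found while applying the redo log.
struct RecoveredTrx {
  std::string xid;
  TrxState state;
};

// Returns the XIDs of prepared transactions that this sketch would roll back
// once recovery has finished. The backup carries no binary log, so in the
// rollback mode no prepared transaction is left waiting for a decision.
std::vector<std::string> xids_to_roll_back(const std::vector<RecoveredTrx>& trxs,
                                           OperationMode mode)
{
  std::vector<std::string> result;
  if (mode != OperationMode::RestoreRollbackXa)
    return result;  // prepared transactions stay prepared
  for (const RecoveredTrx& trx : trxs) {
    if (trx.state == TrxState::Prepared)
      result.push_back(trx.xid);
  }
  return result;
}
```

In this model, a prepared transaction survives a plain restore untouched and is rolled back only when the rollback mode is selected.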
Reject the incompatible combination of the --rollback-xa and --export options. The combination creates an mmap state inconsistency in InnoDB's MTR system, leading to a crash.
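A rough sketch of such a guard follows, assuming the check runs on already-parsed option values. `PrepareOptions` and `check_prepare_options` are hypothetical names, not mariabackup's variables or functions. How the enabled-by-default --rollback-xa interacts with a plain --export is not stated here, so the sketch takes both values as explicit inputs.

```cpp
#include <optional>
#include <string>

// Hypothetical holder for the two relevant --prepare options.
struct PrepareOptions {
  bool rollback_xa;  // value of --rollback-xa after option parsing
  bool do_export;    // value of --export after option parsing
};

// Returns an error message when the options conflict, std::nullopt otherwise.
std::optional<std::string> check_prepare_options(const PrepareOptions& opts)
{
  if (opts.rollback_xa && opts.do_export)
    return std::string("--rollback-xa cannot be combined with --export");
  return std::nullopt;
}
```

Failing early, before recovery starts, keeps the conflicting combination from ever reaching InnoDB.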