CEP-45: Incremental repair for mutation tracking by aweisberg · Pull Request #4696 · apache/cassandra

aweisberg · 2026-03-27T19:59:39Z

No description provided.

maedhroz · 2026-03-31T19:34:36Z

+        for (Shard shard : overlappingShards)
+        {
+            ShardSyncState state = new ShardSyncState(shard, liveHostIds);
+            shardStates.put(shard.range, state);


Trying to reason about the thread safety of shardStates here...

The assignment is clearly visible after the CAS above, but are the iterations inside the callbacks later guaranteed to see the results of the put()s here?

Registering the sync coordinator is a write barrier because it does a put into a ConcurrentHashMap, right? So any prior writes will be visible. As long as shardStates is effectively immutable after that point, that particular map should be OK.

This could be an ImmutableMap which might make it a little clearer so I'll make that change.
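
The visibility argument above can be sketched in isolation. This is a minimal illustration under stated assumptions, not the PR's code: `ShardStatePublication`, `Coordinator`, `REGISTRY`, and the `String`/`Integer` types are hypothetical stand-ins, and `Map.copyOf` (Java 10+) stands in for Guava's `ImmutableMap`. It assumes the map is fully built before registration and never mutated afterwards.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ShardStatePublication
{
    // Hypothetical stand-in for a sync coordinator; not the PR's class.
    public static final class Coordinator
    {
        // Final field holding an immutable copy: contents are frozen at construction.
        public final Map<String, Integer> shardStates;

        public Coordinator(Map<String, Integer> states)
        {
            this.shardStates = Map.copyOf(states);
        }
    }

    // Hypothetical registry that callbacks would use to look up a coordinator.
    public static final ConcurrentHashMap<String, Coordinator> REGISTRY = new ConcurrentHashMap<>();

    public static Coordinator register(String id, Map<String, Integer> states)
    {
        Coordinator coordinator = new Coordinator(states);
        // Actions before placing an object into a concurrent collection happen-before
        // actions after another thread reads that element from the collection.
        REGISTRY.put(id, coordinator);
        return coordinator;
    }

    // Iterates the published map on a different thread, as a callback would.
    public static int sumFromOtherThread(String id)
    {
        int[] sum = new int[1];
        Thread reader = new Thread(() -> {
            for (int value : REGISTRY.get(id).shardStates.values())
                sum[0] += value;
        });
        reader.start();
        try
        {
            reader.join(); // join() makes the reader's writes visible to this thread
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sum[0];
    }
}
```

With an immutable map in a final field, the safety argument no longer depends on every reader going through the registry first, which is arguably what makes the intent clearer.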

maedhroz · 2026-04-01T04:17:28Z

+            finally
+            {
+                if (!allSucceeded)
+                    syncCoordinator.cancel();


What happens if the try block above produces InterruptedException? Do we need to cancel the rest of the sync coordinators (that hadn't been processed yet)?

We should cancel them so they don't have to wait for their timeout to elapse before cleaning up. I'll rework the exception handling here to catch Exception instead of RuntimeException.
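
One possible shape for that rework, as a rough sketch only: the `SyncCoordinator` interface, the `Recording` helper, and `awaitAll` below are hypothetical names for illustration and do not mirror the PR's actual classes. The idea is that catching `Exception` also covers `InterruptedException`, so the failed coordinator and every coordinator not yet processed get cancelled immediately.

```java
import java.util.List;

public class CancelRemainingSketch
{
    // Hypothetical coordinator interface; illustrative only.
    public interface SyncCoordinator
    {
        void awaitCompletion() throws Exception;
        void cancel();
    }

    // Hypothetical helper that records whether cancel() was called.
    public static final class Recording implements SyncCoordinator
    {
        private final Exception failure;
        public boolean cancelled;

        public Recording(Exception failure)
        {
            this.failure = failure;
        }

        @Override
        public void awaitCompletion() throws Exception
        {
            if (failure != null)
                throw failure;
        }

        @Override
        public void cancel()
        {
            cancelled = true;
        }
    }

    // Waits on each coordinator in order. On any exception, cancels the
    // failing coordinator and all coordinators that were not yet processed.
    public static boolean awaitAll(List<? extends SyncCoordinator> coordinators)
    {
        int next = 0;
        try
        {
            for (; next < coordinators.size(); next++)
                coordinators.get(next).awaitCompletion();
            return true;
        }
        catch (Exception e)
        {
            // Preserve the interrupt status for callers further up the stack.
            if (e instanceof InterruptedException)
                Thread.currentThread().interrupt();
            for (int i = next; i < coordinators.size(); i++)
                coordinators.get(i).cancel();
            return false;
        }
    }
}
```

Coordinators that already completed are left alone; only the one that failed and those after it are cancelled.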

maedhroz · 2026-04-01T04:38:47Z

-            CLUSTER.get(1).nodetoolResult("repair", specification.keyspaceName()).asserts().success();
+            // Background reconciliation doesn't exist/work so incremental repair just hangs waiting for reconciliation that never occurs
+            if (specification.replicationType.isTracked())
+                CLUSTER.get(1).nodetoolResult("repair", "-full", specification.keyspaceName()).asserts().success();


Should an incremental repair request succeed after a successful full repair? I tried this, and it appears to hang, but I'm not sure why yet...

"node1_Repair#4:1" #270 daemon prio=5 os_prio=31 cpu=0.08ms elapsed=92.27s tid=0x000000013de75200 nid=0x2530b waiting on condition [0x000000036944a000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at jdk.internal.misc.Unsafe.park(java.base@11.0.19/Native Method)
    at java.util.concurrent.locks.LockSupport.parkNanos(java.base@11.0.19/LockSupport.java:357)
    at org.apache.cassandra.utils.concurrent.AsyncFuture.awaitUntil(AsyncFuture.java:221)
    at org.apache.cassandra.utils.concurrent.Awaitable$Defaults.await(Awaitable.java:114)
    at org.apache.cassandra.utils.concurrent.AbstractFuture.await(AbstractFuture.java:482)
    at org.apache.cassandra.utils.concurrent.AbstractFuture.get(AbstractFuture.java:252)
    at org.apache.cassandra.replication.MutationTrackingSyncCoordinator.awaitCompletion(MutationTrackingSyncCoordinator.java:351)
    at org.apache.cassandra.repair.MutationTrackingIncrementalRepairTask.waitForSyncCompletion(MutationTrackingIncrementalRepairTask.java:127)

Anyway, I think the lack of background reconciliation still means that this won't work. The transfer IDs are only there to make sure read reconciliation works.

I misunderstood the test. I don't think IR should hang in this test because we aren't relying on background reconciliation. There aren't any down nodes at all.

Ah right, now I remember. The test inserts data using executeInternal, which gives the mutation an ID and applies it locally correctly, but because it's only applied locally it never propagates, since there is no background reconciliation.

Mutations applied via execute/StorageProxy are given to ActiveLogReconciler, which is basically in-memory hinted handoff for mutation tracking.

So this is working as intended for now in that we need to use full repair here instead of IR since IR can't complete until background reconciliation is done.

maedhroz · 2026-04-01T19:36:47Z

+    public int hashCode()
+    {
+        return Objects.hash(desc, offsetsByShard);
+    }


nit: Do we ever actually put MutationTrackingSyncResponse in a collection?

I'll remove hashCode and equals. I ran the tests and I don't think they get used anymore.

It's needed for RepairMessageSerializationsTest
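
For context, a serialization round-trip check typically compares the deserialized object to the original by value, which only works if `equals` (and, by contract, `hashCode`) is overridden. The sketch below assumes that is the reason here; `RoundTripEqualitySketch`, `Response`, and the wire format are hypothetical and only loosely mirror the `desc` and `offsetsByShard` fields, not the real MutationTrackingSyncResponse or its serializer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;

public class RoundTripEqualitySketch
{
    // Hypothetical message type; not the PR's class.
    public static final class Response
    {
        final String desc;
        final Map<String, Long> offsetsByShard;

        public Response(String desc, Map<String, Long> offsetsByShard)
        {
            this.desc = desc;
            this.offsetsByShard = new TreeMap<>(offsetsByShard);
        }

        @Override
        public boolean equals(Object o)
        {
            if (this == o) return true;
            if (!(o instanceof Response)) return false;
            Response that = (Response) o;
            return desc.equals(that.desc) && offsetsByShard.equals(that.offsetsByShard);
        }

        @Override
        public int hashCode()
        {
            return Objects.hash(desc, offsetsByShard);
        }
    }

    // Serializes to bytes and back using a made-up format.
    public static Response roundTrip(Response in)
    {
        try
        {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeUTF(in.desc);
            out.writeInt(in.offsetsByShard.size());
            for (Map.Entry<String, Long> entry : in.offsetsByShard.entrySet())
            {
                out.writeUTF(entry.getKey());
                out.writeLong(entry.getValue());
            }
            out.flush();

            DataInputStream data = new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()));
            String desc = data.readUTF();
            int size = data.readInt();
            Map<String, Long> offsets = new TreeMap<>();
            for (int i = 0; i < size; i++)
            {
                String shard = data.readUTF(); // key is read before the value
                offsets.put(shard, data.readLong());
            }
            return new Response(desc, offsets);
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
    }
}
```

Without the `equals` override, the round-tripped copy would compare unequal to the original under the default identity-based `Object.equals`, even though every field matches.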

maedhroz · 2026-04-01T19:38:38Z

+        {
+            logger.warn("Mutation tracking sync failed for keyspace {}", keyspace, error);
+            resultPromise.tryFailure(error);
+            return;