Workflow and workspace items are mistakenly added to the replication queue when they are modified.
Looking closely at METSReplicateConsumer.java, unarchived items (like those sitting in a user's workspace or moving through an approval workflow) can indeed trigger replication events, even though the original implementation tried to prevent it.
There is a check in place for item creation:
// if NOT (Create & Item)
// (i.e. We don't want to replicate items UNTIL they are Installed)
if (!(subjType == Constants.ITEM && evType == CREATE)) {
However, equivalent checks are missing for two other groups of event types:
- MODIFY and MODIFY_METADATA: If a user edits the metadata of a workspace or workflow item, DSpace fires a MODIFY_METADATA event. The code only checks that the object exists and has a Handle (id != null). Because DSpace often assigns Handles before items are fully archived, these in-progress changes are immediately queued for replication.
- ADD: When a workspace item is initially mapped or added to a Collection, it triggers an ADD event on the Collection. The consumer currently queues both the Collection and the mapped Item for replication without checking whether that Item is actually installed/archived.
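The missing guard for both cases could look roughly like the sketch below. This is not the actual METSReplicateConsumer code: the `Item` class, the `EventType` enum, and the `shouldQueueItem` and `queueForAdd` helpers are hypothetical stand-ins so the example runs on its own. In the real consumer the `archived` flag would presumably come from `Item.isArchived()`, and the sketch assumes the Collection should still be queued on ADD even when the Item is skipped.

```java
import java.util.ArrayList;
import java.util.List;

public class ReplicationGuardSketch {

    // Hypothetical stand-in for the DSpace event types discussed above.
    enum EventType { CREATE, MODIFY, MODIFY_METADATA, ADD }

    // Hypothetical stand-in for a DSpace Item: only what the guard needs.
    static final class Item {
        final String handle;     // may be assigned before the item is archived
        final boolean archived;  // false for workspace and workflow items

        Item(String handle, boolean archived) {
            this.handle = handle;
            this.archived = archived;
        }
    }

    // Decide whether an Item touched by an event should be queued.
    static boolean shouldQueueItem(EventType evType, Item item) {
        if (item == null || item.handle == null) {
            return false; // existing behaviour: no Handle, nothing to replicate
        }
        if (evType == EventType.CREATE) {
            return false; // existing behaviour: wait until the item is installed
        }
        // Proposed extra check: a Handle alone is not enough,
        // the item must also be archived.
        return item.archived;
    }

    // ADD on a Collection: queue the Collection (assumption), and the
    // Item only if it passes the archived check.
    static List<String> queueForAdd(String collectionHandle, Item added) {
        List<String> queue = new ArrayList<>();
        queue.add(collectionHandle);
        if (shouldQueueItem(EventType.ADD, added)) {
            queue.add(added.handle);
        }
        return queue;
    }

    public static void main(String[] args) {
        Item workspaceItem = new Item("123456789/10", false);
        Item archivedItem = new Item("123456789/11", true);

        System.out.println(shouldQueueItem(EventType.MODIFY_METADATA, workspaceItem)); // false
        System.out.println(shouldQueueItem(EventType.MODIFY_METADATA, archivedItem));  // true
        System.out.println(queueForAdd("123456789/2", workspaceItem)); // [123456789/2]
        System.out.println(queueForAdd("123456789/2", archivedItem));  // [123456789/2, 123456789/11]
    }
}
```

Putting the archived check in one helper means MODIFY, MODIFY_METADATA, and ADD all apply the same rule, which avoids the current situation where only CREATE is guarded.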