Skip to content

fix(gitlab): preserve user-added labels#2484

Open
ashearin wants to merge 1 commit into
The-PR-Agent:mainfrom
ashearin:fix/gitlab-preserve-labels
Open

fix(gitlab): preserve user-added labels#2484
ashearin wants to merge 1 commit into
The-PR-Agent:mainfrom
ashearin:fix/gitlab-preserve-labels

Conversation

@ashearin

Copy link
Copy Markdown

fix(gitlab): preserve user-added labels in publish_labels and get_pr_labels

Note on history: earlier revisions of this branch grew scope through review iteration (typed exceptions, managed-label predicates, helper extraction across pr_description / pr_reviewer). I've squashed and rebased onto current upstream/main so the PR contains only the targeted GitLab provider fix and its unit tests. The wider refactor was reverted; happy to revisit any of it in a follow-up PR if maintainers prefer that direction.

Summary

The GitLab provider drops user-added labels every time PR-Agent publishes a label set (most visibly during /review). Two compounding bugs in GitLabProvider cause a read-modify-write race that wipes any label a user added after the webhook fired.

This PR fixes both bugs on the GitLab provider in a single, focused change. The hot path (no refresh requested, no label changes needed) is unchanged.

What users see

"Every time /review runs, the agent removes the labels I just added to the MR."

It only happens on GitLab. GitHub and other providers are unaffected because the bug is provider-local.

Reproduction

  1. Configure PR-Agent against a GitLab project with auto_review enabled (or trigger /review manually).
  2. Open a merge request; let the agent post its first review (which applies Review effort N/5 and possibly Possible security concern labels).
  3. As a user, add any label of your own (e.g., area/backend).
  4. Trigger /review again, or push a new commit so auto_review fires.
  5. Observed: the agent's Review effort label is updated, but area/backend is gone.
  6. Expected: area/backend is preserved; only the agent-managed labels are updated.

Root cause

PRReviewer.set_review_labels does a read-modify-write cycle specifically to preserve user-added labels:

current_labels = self.git_provider.get_pr_labels(update=True)
current_labels_filtered = [
    label for label in current_labels
    if not label.lower().startswith("review effort")
    and not label.lower().startswith("possible security concern")
]
new_labels = review_labels + current_labels_filtered
self.git_provider.publish_labels(new_labels)

Both halves of that cycle are broken on the GitLab provider.

Bug 1 — get_pr_labels(update=True) ignores the flag

def get_pr_labels(self, update=False):
    return self.mr.labels

The update argument is unused; the method always returns self.mr.labels from the snapshot cached at provider construction. A label the user added after the webhook fired is missing from current_labels, so the filter step drops it out of new_labels.

Bug 2 — publish_labels does a full PUT against the same stale snapshot

def publish_labels(self, pr_types):
    try:
        self.mr.labels = list(set(pr_types))
        self.mr.save()
    ...

Setting self.mr.labels and calling save() translates to a full PUT on the labels array — the server overwrites whatever it has with pr_types. Any label added during the read-modify-write window (or any label missed because of Bug 1) is wiped.

Fix

Both halves of the cycle now refresh from the server, and the write becomes an incremental set-diff rather than a full overwrite.

get_pr_labels(update=True) now actually refreshes

def get_pr_labels(self, update=False):
    if update:
        self.mr = self._get_merge_request()
    return self.mr.labels

update=False keeps the cached read, so callers that don't ask for a refresh see no behavior change. Refresh failure propagates to the caller — PRReviewer.set_review_labels already wraps its get_pr_labels(update=True) call in a try/except, so a transient GitLab API error degrades into "skip the label update for this run" rather than corrupting the read-modify-write cycle.

publish_labels refreshes, writes a diff, and cleans up

def publish_labels(self, pr_types):
    try:
        desired = set(pr_types)
        try:
            self.mr = self._get_merge_request()
        except Exception as refresh_err:
            get_logger().warning(
                f"publish_labels: aborting, failed to refresh MR before save. error: {refresh_err}"
            )
            return
        current = set(self.mr.labels or [])
        to_add = sorted(desired - current)
        to_remove = sorted(current - desired)
        if not to_add and not to_remove:
            return
        try:
            if to_add:
                self.mr.add_labels = ",".join(to_add)
            if to_remove:
                self.mr.remove_labels = ",".join(to_remove)
            self.mr.save()
        finally:
            for attr in ("add_labels", "remove_labels"):
                try:
                    delattr(self.mr, attr)
                except (AttributeError, KeyError):
                    pass
    except Exception as e:
        get_logger().warning(f"Failed to publish labels, error: {e}")

Key changes from the previous implementation:

  • Refresh immediately before the diff. Concurrent additions outside the diff are preserved by the server.
  • Strict abort on refresh failure. A stale snapshot would re-introduce the very bug this method is meant to fix, so we log and return without calling save().
  • Incremental update via add_labels / remove_labels. python-gitlab forwards those attributes to identically-named parameters on PUT /projects/:id/merge_requests/:merge_request_iid (GitLab MR API). The server applies a set-diff rather than overwriting the array.
  • try/finally clears the transient diff attrs. add_labels and remove_labels are delattr'd after save() (or on save failure) so they cannot leak into a subsequent self.mr.save() from an unrelated tool (e.g., publish_description).
  • No-op publishes return early. If pr_types already matches the server's label set, no save() is issued at all.

What is NOT changed

  • Other providers (GithubProvider, BitbucketProvider, BitbucketServerProvider, AzureDevopsProvider, GiteaProvider, GerritProvider, CodeCommitProvider, LocalGitProvider).
  • The public signatures of publish_labels and get_pr_labels.
  • The behavior of get_pr_labels(update=False) (still returns the cached snapshot).
  • Any caller of set_review_labels, PRDescription, PRGenerateLabels, etc. — they continue to call the same provider methods with the same arguments.

Tests

Added under tests/unittest/test_gitlab_provider.py (10 new cases, all isolated to the two methods this PR touches):

publish_labels:

  • noop when the desired and current label sets are equal (save() not called);
  • add-only diff;
  • remove-only diff;
  • user labels outside the diff are not overwritten (regression test for the original bug);
  • abort on refresh failure (save() not called);
  • transient diff attributes (add_labels / remove_labels) are cleared even when save() raises.

get_pr_labels:

  • update=False returns the cached snapshot and does not refresh;
  • update=True refreshes via _get_merge_request() and returns the fresh labels;
  • update=True propagates refresh failures to the caller.

Related

  • The bug exists in every released version of pr-agent I checked (v0.34.xv0.37.0).
  • This affects every GitLab user who has both auto_review (or repeated manual /review) and any user-curated label workflow.

@github-actions github-actions Bot added the bug label Jun 30, 2026
@qodo-free-for-open-source-projects

Copy link
Copy Markdown
Contributor

PR Summary by Qodo

fix(gitlab): preserve user-added labels during label publish/refresh

🐞 Bug fix 🧪 Tests 🕐 20-40 Minutes

Grey Divider

AI Description

• Refresh GitLab MR labels on demand to avoid using a stale cached snapshot.
• Publish labels via add/remove diffs (not full overwrite) to preserve user labels.
• Add unit tests covering no-op, diff updates, refresh failures, and cleanup behavior.
Diagram

graph TD
  A["PRReviewer.set_review_labels"] --> B["GitLabProvider.get_pr_labels(update=True)"] --> C["_get_merge_request() refresh"] --> D[("GitLab MR API")]
  A --> E["GitLabProvider.publish_labels(desired)"] --> C
  E --> F{"Diff empty?"}
  F -->|"Yes"| G["Return (no save)"]
  F -->|"No"| H["MR.save(add_labels/remove_labels)"] --> D
  subgraph Legend
    direction LR
    _svc["Service/Method"] ~~~ _dec{"Decision"} ~~~ _api[("External API")]
  end
Loading
High-Level Assessment

The following are alternative approaches to this PR:

1. Refresh then overwrite full label array
  • ➕ Simpler implementation; no transient add_labels/remove_labels cleanup needed
  • ➕ Matches prior conceptual model of 'set labels to this exact list'
  • ➖ Still vulnerable to races between refresh and PUT (concurrent user label changes can be overwritten)
  • ➖ Harder to guarantee preservation of non-managed labels without additional merge logic
2. Per-label add/remove API calls (multiple requests)
  • ➕ Avoids reliance on python-gitlab transient attributes
  • ➕ Each operation is narrowly scoped, reducing clobber risk
  • ➖ More API calls and latency (especially for large diffs)
  • ➖ More failure modes (partial updates) unless carefully handled/rolled back

Recommendation: Keep the PR’s diff-based approach using GitLab’s add_labels/remove_labels with a mandatory refresh-before-save. It best preserves user-added labels under concurrent updates while remaining a single request in the common case (and a no-op when already in sync).

Files changed (2) +230 / -2

Bug fix (1) +68 / -2
gitlab_provider.pyMake GitLab label reads refreshable and label writes race-safe +68/-2

Make GitLab label reads refreshable and label writes race-safe

• Updates publish_labels to refresh the MR before computing a label set diff, then applies changes via add_labels/remove_labels instead of overwriting the full labels array. Updates get_pr_labels(update=True) to actually re-fetch the MR and to propagate refresh failures (instead of silently returning cached labels).

pr_agent/git_providers/gitlab_provider.py

Tests (1) +162 / -0
test_gitlab_provider.pyAdd regression and edge-case coverage for GitLab label behavior +162/-0

Add regression and edge-case coverage for GitLab label behavior

• Adds focused unit tests validating publish_labels no-op behavior, add/remove diffs, preservation of user labels, refresh failure aborts, and transient attribute cleanup on save exceptions. Adds tests for get_pr_labels ensuring update=False uses cached data, update=True refreshes, and refresh errors propagate.

tests/unittest/test_gitlab_provider.py

@qodo-free-for-open-source-projects

qodo-free-for-open-source-projects Bot commented Jun 30, 2026

Copy link
Copy Markdown
Contributor

Code Review by Qodo

🐞 Bugs (3) 📘 Rule violations (3) 📜 Skill insights (0)

Grey Divider


Action required

1. F541 f-strings in debug() 📘 Rule violation ⚙ Maintainability
Description
New log lines use f-strings with no interpolations (e.g., f"original labels"), which triggers
flake8 F541 and can fail CI lint gates. Replace them with regular string literals to satisfy
linting requirements without changing behavior.
Code

pr_agent/tools/pr_description.py[R167-176]

+                        original_labels = self.git_provider.get_pr_labels(update=True)
+                        get_logger().debug(f"original labels", artifact=original_labels)
+                        user_labels = get_user_labels(original_labels)
+                        new_labels = pr_labels + user_labels
+                        get_logger().debug(f"published labels", artifact=new_labels)
+                        if set(new_labels) != set(original_labels):
+                            get_logger().info(f"Setting describe labels:\n{new_labels}")
+                            self.git_provider.publish_labels(new_labels)
+                        else:
+                            get_logger().debug(f"Labels are the same, not updating")
Evidence
PR Compliance ID 9 requires resolving flake8 issues in changed code. The added logging lines call
get_logger().debug() with f"..." strings that contain no interpolation fields, which is a known
flake8 violation (F541).

AGENTS.md: Run and Fix Linting: Resolve All flake8 Issues Without Changing Program Logic During Cleanup: AGENTS.md: Run and Fix Linting: Resolve All flake8 Issues Without Changing Program Logic During Cleanup
pr_agent/tools/pr_description.py[167-176]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`pr_agent/tools/pr_description.py` uses f-strings without any `{}` placeholders in new/modified logging statements, which violates flake8 (`F541`) and can break lint checks.
## Issue Context
These are purely mechanical fixes: replace `f"..."` with plain string literals (`"..."`) where no interpolation is needed.
## Fix Focus Areas
- pr_agent/tools/pr_description.py[167-176]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


2. s3cret credential in test URL 📘 Rule violation ⛨ Security
Description
A credential-like value is hardcoded in a URL inside a unit test, which can be picked up by secret
scanners and risks accidental reuse or leakage. Replace it with a clearly redacted/non-secret
placeholder that does not resemble real credentials.
Code

tests/unittest/test_gitlab_provider.py[R367-372]

+        class _SecretError(RuntimeError):
+            """Sentinel whose repr would carry a credentialed URL if logged raw."""
+
+        gitlab_provider._get_merge_request = MagicMock(
+            side_effect=_SecretError("https://oauth2:******@gitlab.example.com/api/v4/...")
+        )
Evidence
PR Compliance ID 6 forbids committing secrets/tokens in source or tests. The added test code
includes a credential-like URL containing oauth2:s3cret@..., which meets the failure criteria for
hardcoded secret-bearing values.

AGENTS.md: No Hardcoded Secrets or Tokens in Code or Configuration: AGENTS.md: No Hardcoded Secrets or Tokens in Code or Configuration: AGENTS.md: No Hardcoded Secrets or Tokens in Code or Configuration: AGENTS.md: No Hardcoded Secrets or Tokens in Code or Configuration
tests/unittest/test_gitlab_provider.py[367-372]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
A unit test hardcodes a credential-like URL (`https://oauth2:******@...`) which violates the repository requirement to avoid committed secrets/tokens (and may trigger secret scanning).
## Issue Context
The value appears intended as a sentinel example, but it still resembles real credentials and should be replaced with a clearly redacted placeholder (e.g., `oauth2:[REDACTED]`, `oauth2:token`, or remove the credential portion entirely).
## Fix Focus Areas
- tests/unittest/test_gitlab_provider.py[367-372]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Remediation recommended

3. Concurrent labels still dropped 🐞 Bug ≡ Correctness
Description
GitLabProvider.publish_labels() computes to_remove = current - desired from a refreshed MR
snapshot, so any label a user adds after a caller reads labels (to build desired) but before
publish_labels() refreshes will be treated as “extra” and removed. This means /review and
/describe can still wipe user-added labels in a narrow intra-run window despite the intended
race-safety.
Code

pr_agent/git_providers/gitlab_provider.py[R904-937]

+            desired = set(pr_types)
+            try:
+                self.mr = self._get_merge_request()
+            except Exception as refresh_err:
+                # Strict policy: a stale snapshot can produce an incorrect
+                # add/remove diff that re-introduces the bug this method is
+                # meant to fix. Abort the publish, log, and leave server
+                # state untouched rather than risk clobbering user labels.
+                #
+                # Log only the exception type — the message from HTTP/SDK
+                # client errors may include credentialed URLs or headers, so
+                # we avoid emitting the raw object.
+                get_logger().warning(
+                    f"publish_labels: aborting, failed to refresh MR before save "
+                    f"({type(refresh_err).__name__})"
+                )
+                return
+            current = set(self.mr.labels or [])
+            to_add = sorted(desired - current)
+            to_remove = sorted(current - desired)
+            if not to_add and not to_remove:
+                return
+            # GitLab accepts comma-separated strings for these attributes.
+            # Wrap the save in try/finally so the transient add_labels /
+            # remove_labels attributes are always cleared from ``self.mr``,
+            # even on save() failure. Otherwise a later self.mr.save() call
+            # in an unrelated tool (e.g. publish_description) would resend
+            # the prior label diff and cause unexpected churn.
+            try:
+                if to_add:
+                    self.mr.add_labels = ",".join(to_add)
+                if to_remove:
+                    self.mr.remove_labels = ",".join(to_remove)
+                self.mr.save()
Evidence
The provider computes removals as a full set difference against a refreshed snapshot, while key
callers build the desired label set from an earlier read; any labels added between these two reads
are absent from desired but present in current, so they get removed.

pr_agent/git_providers/gitlab_provider.py[887-946]
pr_agent/tools/pr_reviewer.py[365-416]
pr_agent/tools/pr_description.py[156-182]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`publish_labels()` refreshes the MR and diffs against that refreshed label set. In read-modify-write flows (`get_pr_labels(update=True)` -> compute desired -> `publish_labels(desired)`), this second refresh can include labels added concurrently after the read; since those labels are not in `desired`, they end up in `to_remove` and are removed.
### Issue Context
- Callers like `PRReviewer.set_review_labels()` and `PRDescription.run()` already perform a label read specifically to preserve user labels.
- `publish_labels()` performs another refresh unconditionally before computing `to_remove`, which can invalidate the caller’s “preserve” set.
### Fix Focus Areas
- pr_agent/git_providers/gitlab_provider.py[887-947]
### Suggested change
Add an optional parameter to `publish_labels()` (e.g., `current_labels: list[str] | None = None` or `mr_snapshot=None`).
- If `current_labels` is provided, **skip** the internal `_get_merge_request()` refresh and compute `current` from the provided snapshot.
- Keep the existing refresh behavior as the default when no snapshot is provided.
Update read-modify-write callers to pass the labels they just fetched:
- `PRReviewer.set_review_labels`: pass `current_labels` returned from `get_pr_labels(update=True)`.
- `PRDescription.run`: pass `original_labels` returned from `get_pr_labels(update=True)`.
This eliminates the intra-run “double refresh” removal case and also reduces an extra API call per label update.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


4. Logs raw refresh_err 📘 Rule violation ⛨ Security
Description
publish_labels() logs the full exception text from a GitLab refresh failure ({refresh_err}),
which can include sensitive details (e.g., credentialed URLs or request context) depending on the
underlying client error. Logs are a security boundary and should avoid emitting secret-bearing
values.
Code

pr_agent/git_providers/gitlab_provider.py[R907-915]

+            except Exception as refresh_err:
+                # Strict policy: a stale snapshot can produce an incorrect
+                # add/remove diff that re-introduces the bug this method is
+                # meant to fix. Abort the publish, log, and leave server
+                # state untouched rather than risk clobbering user labels.
+                get_logger().warning(
+                    f"publish_labels: aborting, failed to refresh MR before save. error: {refresh_err}"
+                )
+                return
Evidence
PR Compliance ID 20 requires treating logs as a security boundary and avoiding logging
secrets/secret-bearing values. The new code logs the full refresh_err exception string in a
warning message, which risks exposing sensitive request details depending on the exception content.

pr_agent/git_providers/gitlab_provider.py[905-915]
Best Practice: Learned patterns

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`publish_labels()` logs the raw exception object (`refresh_err`) when `_get_merge_request()` fails. Exception strings from HTTP clients can sometimes include sensitive data (credentialed URLs, headers, or other context), so logging them verbatim can leak secrets.
## Issue Context
This PR introduces a new refresh-failure logging path inside `publish_labels()`.
## Fix Focus Areas
- pr_agent/git_providers/gitlab_provider.py[905-915]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


5. Describe aborted on refresh 🐞 Bug ☼ Reliability
Description
GitLabProvider.get_pr_labels(update=True) now refreshes via _get_merge_request() without
catching exceptions, so a transient GitLab API failure can raise out of get_pr_labels.
PRDescription calls get_pr_labels(update=True) in its main flow without isolating errors, which
can prevent the rest of /describe (including description publishing) from running.
Code

pr_agent/git_providers/gitlab_provider.py[R962-964]

+        if update:
+            self.mr = self._get_merge_request()
     return self.mr.labels
Evidence
The GitLab provider now raises on refresh when update=True, and PRDescription calls it before
publishing the description; a raised exception will skip the subsequent steps in that try block.
Other providers’ get_pr_labels methods swallow exceptions and return [], making GitLab’s new
behavior a caller-visible contract change.

pr_agent/git_providers/gitlab_provider.py[948-964]
pr_agent/tools/pr_description.py[156-169]
pr_agent/git_providers/github_provider.py[934-946]
pr_agent/git_providers/azuredevops_provider.py[141-151]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`GitLabProvider.get_pr_labels(update=True)` now raises on refresh failure. At least one call site (`pr_description`) doesn’t handle this locally, so `/describe` can abort entirely when the GitLab refresh call fails.
### Issue Context
Other providers treat label reads as best-effort and return `[]` on failure, but GitLab is now strict. Keeping the strictness is fine for correctness, but callers that shouldn’t fail the whole tool run must catch and downgrade the failure to “skip label publishing”.
### Fix Focus Areas
- pr_agent/tools/pr_description.py[156-170]
- pr_agent/git_providers/gitlab_provider.py[948-964]
### Suggested fix
Wrap the label publishing section in `PRDescription` with a `try/except` around `get_pr_labels(update=True)` (and/or the whole label block). On exception, log a warning and continue to publish the description.
(Alternative: make `GitLabProvider.get_pr_labels` consistent with other providers by catching refresh exceptions and returning `[]` or cached labels, but that weakens the strict policy this PR introduced.)

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Informational

6. Stale cached labels after publish 🐞 Bug ☼ Reliability ⭐ New
Description
After GitLabProvider.publish_labels() saves an add/remove diff, it does not update
self.mr.labels, while get_pr_labels(update=False) returns the cached self.mr.labels without
refreshing. Within the same provider instance, any later code path that calls get_pr_labels()
without update=True can observe a stale label set until a refresh occurs.
Code

pr_agent/git_providers/gitlab_provider.py[R953-969]

    def get_pr_labels(self, update=False):
+        # The previous implementation ignored the ``update`` flag entirely and
+        # always returned the snapshot cached at provider construction. Callers
+        # such as ``PRReviewer.set_review_labels`` rely on a fresh read to
+        # preserve user-added labels in their read-modify-write cycle; without
+        # the refresh, any label added after the webhook fired would be missing
+        # from ``current_labels`` and dropped by the subsequent publish_labels.
+        #
+        # Strict policy on refresh failure: a stale snapshot will produce an
+        # incorrect diff in the caller's filter-and-republish cycle, so we
+        # surface the failure to the caller instead of silently returning
+        # cached data. ``set_review_labels`` already wraps this in a broad
+        # try/except, so the failure degrades into "skip the label update for
+        # this run" rather than breaking the whole review.
+        if update:
+            self.mr = self._get_merge_request()
        return self.mr.labels
Evidence
get_pr_labels(update=False) returns the cached self.mr.labels unless update=True triggers
_get_merge_request(). The new publish_labels() path only sets transient
add_labels/remove_labels and calls save(), without updating self.mr.labels, so the cached
value can lag behind the actual server state.

pr_agent/git_providers/gitlab_provider.py[887-947]
pr_agent/git_providers/gitlab_provider.py[953-969]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`publish_labels()` now updates labels via `add_labels` / `remove_labels` + `save()`, but it doesn’t update the in-memory `self.mr.labels` snapshot. Since `get_pr_labels(update=False)` returns `self.mr.labels` as-is, subsequent reads on the same provider instance can return stale labels.

### Issue Context
This is not a server-state bug (the server is updated by `save()`), but it can create confusing/inconsistent behavior for any later logic in the same run that reads labels without `update=True`.

### Fix Focus Areas
- pr_agent/git_providers/gitlab_provider.py[887-947]
- pr_agent/git_providers/gitlab_provider.py[953-969]

### Suggested fix
After a successful `self.mr.save()` in `publish_labels()`, do one of:
- Refresh: `self.mr = self._get_merge_request()` (strong consistency, extra API call), or
- Update the cache: set `self.mr.labels` to reflect the intended post-save set (e.g., `list(desired)`), or
- Explicitly document/ensure internal callers always use `get_pr_labels(update=True)` after publishing labels.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

  • Author self-review: I have reviewed the code review findings, and addressed the relevant ones.

Previous review results

Review updated until commit 22facfb

Results up to commit ee9060d


🐞 Bugs (2) 📘 Rule violations (3) 📜 Skill insights (0)


Action required
1. F541 f-strings in debug() 📘 Rule violation ⚙ Maintainability ⭐ New
Description
New log lines use f-strings with no interpolations (e.g., f"original labels"), which triggers
flake8 F541 and can fail CI lint gates. Replace them with regular string literals to satisfy
linting requirements without changing behavior.
Code

pr_agent/tools/pr_description.py[R167-176]

+                        original_labels = self.git_provider.get_pr_labels(update=True)
+                        get_logger().debug(f"original labels", artifact=original_labels)
+                        user_labels = get_user_labels(original_labels)
+                        new_labels = pr_labels + user_labels
+                        get_logger().debug(f"published labels", artifact=new_labels)
+                        if set(new_labels) != set(original_labels):
+                            get_logger().info(f"Setting describe labels:\n{new_labels}")
+                            self.git_provider.publish_labels(new_labels)
+                        else:
+                            get_logger().debug(f"Labels are the same, not updating")
Evidence
PR Compliance ID 9 requires resolving flake8 issues in changed code. The added logging lines call
get_logger().debug() with f"..." strings that contain no interpolation fields, which is a known
flake8 violation (F541).

AGENTS.md: Run and Fix Linting: Resolve All flake8 Issues Without Changing Program Logic During Cleanup
pr_agent/tools/pr_description.py[167-176]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`pr_agent/tools/pr_description.py` uses f-strings without any `{}` placeholders in new/modified logging statements, which violates flake8 (`F541`) and can break lint checks.

## Issue Context
These are purely mechanical fixes: replace `f"..."` with plain string literals (`"..."`) where no interpolation is needed.

## Fix Focus Areas
- pr_agent/tools/pr_description.py[167-176]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


2. s3cret credential in test URL 📘 Rule violation ⛨ Security
Description
A credential-like value is hardcoded in a URL inside a unit test, which can be picked up by secret
scanners and risks accidental reuse or leakage. Replace it with a clearly redacted/non-secret
placeholder that does not resemble real credentials.
Code

tests/unittest/test_gitlab_provider.py[R367-372]

+        class _SecretError(RuntimeError):
+            """Sentinel whose repr would carry a credentialed URL if logged raw."""
+
+        gitlab_provider._get_merge_request = MagicMock(
+            side_effect=_SecretError("https://oauth2:******@gitlab.example.com/api/v4/...")
+        )
Evidence
PR Compliance ID 6 forbids committing secrets/tokens in source or tests. The added test code
includes a credential-like URL containing oauth2:s3cret@..., which meets the failure criteria for
hardcoded secret-bearing values.

AGENTS.md: No Hardcoded Secrets or Tokens in Code or Configuration: AGENTS.md: No Hardcoded Secrets or Tokens in Code or Configuration
tests/unittest/test_gitlab_provider.py[367-372]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
A unit test hardcodes a credential-like URL (`https://oauth2:******@...`) which violates the repository requirement to avoid committed secrets/tokens (and may trigger secret scanning).
## Issue Context
The value appears intended as a sentinel example, but it still resembles real credentials and should be replaced with a clearly redacted placeholder (e.g., `oauth2:[REDACTED]`, `oauth2:token`, or remove the credential portion entirely).
## Fix Focus Areas
- tests/unittest/test_gitlab_provider.py[367-372]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Remediation recommended
3. Concurrent labels still dropped 🐞 Bug ≡ Correctness
Description
GitLabProvider.publish_labels() computes to_remove = current - desired from a refreshed MR
snapshot, so any label a user adds after a caller reads labels (to build desired) but before
publish_labels() refreshes will be treated as “extra” and removed. This means /review and
/describe can still wipe user-added labels in a narrow intra-run window despite the intended
race-safety.
Code

pr_agent/git_providers/gitlab_provider.py[R904-937]

+            desired = set(pr_types)
+            try:
+                self.mr = self._get_merge_request()
+            except Exception as refresh_err:
+                # Strict policy: a stale snapshot can produce an incorrect
+                # add/remove diff that re-introduces the bug this method is
+                # meant to fix. Abort the publish, log, and leave server
+                # state untouched rather than risk clobbering user labels.
+                #
+                # Log only the exception type — the message from HTTP/SDK
+                # client errors may include credentialed URLs or headers, so
+                # we avoid emitting the raw object.
+                get_logger().warning(
+                    f"publish_labels: aborting, failed to refresh MR before save "
+                    f"({type(refresh_err).__name__})"
+                )
+                return
+            current = set(self.mr.labels or [])
+            to_add = sorted(desired - current)
+            to_remove = sorted(current - desired)
+            if not to_add and not to_remove:
+                return
+            # GitLab accepts comma-separated strings for these attributes.
+            # Wrap the save in try/finally so the transient add_labels /
+            # remove_labels attributes are always cleared from ``self.mr``,
+            # even on save() failure. Otherwise a later self.mr.save() call
+            # in an unrelated tool (e.g. publish_description) would resend
+            # the prior label diff and cause unexpected churn.
+            try:
+                if to_add:
+                    self.mr.add_labels = ",".join(to_add)
+                if to_remove:
+                    self.mr.remove_labels = ",".join(to_remove)
+                self.mr.save()
Evidence
The provider computes removals as a full set difference against a refreshed snapshot, while key
callers build the desired label set from an earlier read; any labels added between these two reads
are absent from desired but present in current, so they get removed.

pr_agent/git_providers/gitlab_provider.py[887-946]
pr_agent/tools/pr_reviewer.py[365-416]
pr_agent/tools/pr_description.py[156-182]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`publish_labels()` refreshes the MR and diffs against that refreshed label set. In read-modify-write flows (`get_pr_labels(update=True)` -> compute desired -> `publish_labels(desired)`), this second refresh can include labels added concurrently after the read; since those labels are not in `desired`, they end up in `to_remove` and are removed.
### Issue Context
- Callers like `PRReviewer.set_review_labels()` and `PRDescription.run()` already perform a label read specifically to preserve user labels.
- `publish_labels()` performs another refresh unconditionally before computing `to_remove`, which can invalidate the caller’s “preserve” set.
### Fix Focus Areas
- pr_agent/git_providers/gitlab_provider.py[887-947]
### Suggested change
Add an optional parameter to `publish_labels()` (e.g., `current_labels: list[str] | None = None` or `mr_snapshot=None`).
- If `current_labels` is provided, **skip** the internal `_get_merge_request()` refresh and compute `current` from the provided snapshot.
- Keep the existing refresh behavior as the default when no snapshot is provided.
Update read-modify-write callers to pass the labels they just fetched:
- `PRReviewer.set_review_labels`: pass `current_labels` returned from `get_pr_labels(update=True)`.
- `PRDescription.run`: pass `original_labels` returned from `get_pr_labels(update=True)`.
This eliminates the intra-run “double refresh” removal case and also reduces an extra API call per label update.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


4. Logs raw refresh_err 📘 Rule violation ⛨ Security
Description
publish_labels() logs the full exception text from a GitLab refresh failure ({refresh_err}),
which can include sensitive details (e.g., credentialed URLs or request context) depending on the
underlying client error. Logs are a security boundary and should avoid emitting secret-bearing
values.
Code

pr_agent/git_providers/gitlab_provider.py[R907-915]

+            except Exception as refresh_err:
+                # Strict policy: a stale snapshot can produce an incorrect
+                # add/remove diff that re-introduces the bug this method is
+                # meant to fix. Abort the publish, log, and leave server
+                # state untouched rather than risk clobbering user labels.
+                get_logger().warning(
+                    f"publish_labels: aborting, failed to refresh MR before save. error: {refresh_err}"
+                )
+                return
Evidence
PR Compliance ID 20 requires treating logs as a security boundary and avoiding logging
secrets/secret-bearing values. The new code logs the full refresh_err exception string in a
warning message, which risks exposing sensitive request details depending on the exception content.

pr_agent/git_providers/gitlab_provider.py[905-915]
Best Practice: Learned patterns

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`publish_labels()` logs the raw exception object (`refresh_err`) when `_get_merge_request()` fails. Exception strings from HTTP clients can sometimes include sensitive data (credentialed URLs, headers, or other context), so logging them verbatim can leak secrets.
## Issue Context
This PR introduces a new refresh-failure logging path inside `publish_labels()`.
## Fix Focus Areas
- pr_agent/git_providers/gitlab_provider.py[905-915]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


5. Describe aborted on refresh 🐞 Bug ☼ Reliability
Description
GitLabProvider.get_pr_labels(update=True) now refreshes via _get_merge_request() without
catching exceptions, so a transient GitLab API failure can raise out of get_pr_labels.
PRDescription calls get_pr_labels(update=True) in its main flow without isolating errors, which
can prevent the rest of /describe (including description publishing) from running.
Code

pr_agent/git_providers/gitlab_provider.py[R962-964]

+        if update:
+            self.mr = self._get_merge_request()
      return self.mr.labels
Evidence
The GitLab provider now raises on refresh when update=True, and PRDescription calls it before
publishing the description; a raised exception will skip the subsequent steps in that try block.
Other providers’ get_pr_labels methods swallow exceptions and return [], making GitLab’s new
behavior a caller-visible contract change.

pr_agent/git_providers/gitlab_provider.py[948-964]
pr_agent/tools/pr_description.py[156-169]
pr_agent/git_providers/github_provider.py[934-946]
pr_agent/git_providers/azuredevops_provider.py[141-151]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`GitLabProvider.get_pr_labels(update=True)` now raises on refresh failure. At least one call site (`pr_description`) doesn’t handle this locally, so `/describe` can abort entirely when the GitLab refresh call fails.
### Issue Context
Other providers treat label reads as best-effort and return `[]` on failure, but GitLab is now strict. Keeping the strictness is fine for correctness, but callers that shouldn’t fail the whole tool run must catch and downgrade the failure to “skip label publishing”.
### Fix Focus Areas
- pr_agent/tools/pr_description.py[156-170]
- pr_agent/git_providers/gitlab_provider.py[948-964]
### Suggested fix
Wrap the label publishing section in `PRDescription` with a `try/except` around `get_pr_labels(update=True)` (and/or the whole label block). On exception, log a warning and continue to publish the description.
(Alternative: make `GitLabProvider.get_pr_labels` consistent with other providers by catching refresh exceptions and returning `[]` or cached labels, but that weakens the strict policy this PR introduced.)

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Results up to commit e26403b


🐞 Bugs (2) 📘 Rule violations (2) 📜 Skill insights (0)


Action required
1. s3cret credential in test URL 📘 Rule violation ⛨ Security ⭐ New
Description
A credential-like value is hardcoded in a URL inside a unit test, which can be picked up by secret
scanners and risks accidental reuse or leakage. Replace it with a clearly redacted/non-secret
placeholder that does not resemble real credentials.
Code

tests/unittest/test_gitlab_provider.py[R367-372]

+        class _SecretError(RuntimeError):
+            """Sentinel whose repr would carry a credentialed URL if logged raw."""
+
+        gitlab_provider._get_merge_request = MagicMock(
+            side_effect=_SecretError("https://oauth2:******@gitlab.example.com/api/v4/...")
+        )
Evidence
PR Compliance ID 6 forbids committing secrets/tokens in source or tests. The added test code
includes a credential-like URL containing oauth2:s3cret@..., which meets the failure criteria for
hardcoded secret-bearing values.

AGENTS.md: No Hardcoded Secrets or Tokens in Code or Configuration
tests/unittest/test_gitlab_provider.py[367-372]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
A unit test hardcodes a credential-like URL (`https://oauth2:******@...`) which violates the repository requirement to avoid committed secrets/tokens (and may trigger secret scanning).

## Issue Context
The value appears intended as a sentinel example, but it still resembles real credentials and should be replaced with a clearly redacted placeholder (e.g., `oauth2:[REDACTED]`, `oauth2:token`, or remove the credential portion entirely).

## Fix Focus Areas
- tests/unittest/test_gitlab_provider.py[367-372]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Remediation recommended
2. Concurrent labels still dropped 🐞 Bug ≡ Correctness ⭐ New
Description
GitLabProvider.publish_labels() computes to_remove = current - desired from a refreshed MR
snapshot, so any label a user adds after a caller reads labels (to build desired) but before
publish_labels() refreshes will be treated as “extra” and removed. This means /review and
/describe can still wipe user-added labels in a narrow intra-run window despite the intended
race-safety.
Code

pr_agent/git_providers/gitlab_provider.py[R904-937]

+            desired = set(pr_types)
+            try:
+                self.mr = self._get_merge_request()
+            except Exception as refresh_err:
+                # Strict policy: a stale snapshot can produce an incorrect
+                # add/remove diff that re-introduces the bug this method is
+                # meant to fix. Abort the publish, log, and leave server
+                # state untouched rather than risk clobbering user labels.
+                #
+                # Log only the exception type — the message from HTTP/SDK
+                # client errors may include credentialed URLs or headers, so
+                # we avoid emitting the raw object.
+                get_logger().warning(
+                    f"publish_labels: aborting, failed to refresh MR before save "
+                    f"({type(refresh_err).__name__})"
+                )
+                return
+            current = set(self.mr.labels or [])
+            to_add = sorted(desired - current)
+            to_remove = sorted(current - desired)
+            if not to_add and not to_remove:
+                return
+            # GitLab accepts comma-separated strings for these attributes.
+            # Wrap the save in try/finally so the transient add_labels /
+            # remove_labels attributes are always cleared from ``self.mr``,
+            # even on save() failure. Otherwise a later self.mr.save() call
+            # in an unrelated tool (e.g. publish_description) would resend
+            # the prior label diff and cause unexpected churn.
+            try:
+                if to_add:
+                    self.mr.add_labels = ",".join(to_add)
+                if to_remove:
+                    self.mr.remove_labels = ",".join(to_remove)
+                self.mr.save()
Evidence
The provider computes removals as a full set difference against a refreshed snapshot, while key
callers build the desired label set from an earlier read; any labels added between these two reads
are absent from desired but present in current, so they get removed.

pr_agent/git_providers/gitlab_provider.py[887-946]
pr_agent/tools/pr_reviewer.py[365-416]
pr_agent/tools/pr_description.py[156-182]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`publish_labels()` refreshes the MR and diffs against that refreshed label set. In read-modify-write flows (`get_pr_labels(update=True)` -> compute desired -> `publish_labels(desired)`), this second refresh can include labels added concurrently after the read; since those labels are not in `desired`, they end up in `to_remove` and are removed.

### Issue Context
- Callers like `PRReviewer.set_review_labels()` and `PRDescription.run()` already perform a label read specifically to preserve user labels.
- `publish_labels()` performs another refresh unconditionally before computing `to_remove`, which can invalidate the caller’s “preserve” set.

### Fix Focus Areas
- pr_agent/git_providers/gitlab_provider.py[887-947]

### Suggested change
Add an optional parameter to `publish_labels()` (e.g., `current_labels: list[str] | None = None` or `mr_snapshot=None`).
- If `current_labels` is provided, **skip** the internal `_get_merge_request()` refresh and compute `current` from the provided snapshot.
- Keep the existing refresh behavior as the default when no snapshot is provided.

Update read-modify-write callers to pass the labels they just fetched:
- `PRReviewer.set_review_labels`: pass `current_labels` returned from `get_pr_labels(update=True)`.
- `PRDescription.run`: pass `original_labels` returned from `get_pr_labels(update=True)`.

This eliminates the intra-run “double refresh” removal case and also reduces an extra API call per label update.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


3. Logs raw refresh_err 📘 Rule violation ⛨ Security
Description
publish_labels() logs the full exception text from a GitLab refresh failure ({refresh_err}),
which can include sensitive details (e.g., credentialed URLs or request context) depending on the
underlying client error. Logs are a security boundary and should avoid emitting secret-bearing
values.
Code

pr_agent/git_providers/gitlab_provider.py[R907-915]

+            except Exception as refresh_err:
+                # Strict policy: a stale snapshot can produce an incorrect
+                # add/remove diff that re-introduces the bug this method is
+                # meant to fix. Abort the publish, log, and leave server
+                # state untouched rather than risk clobbering user labels.
+                get_logger().warning(
+                    f"publish_labels: aborting, failed to refresh MR before save. error: {refresh_err}"
+                )
+                return
Evidence
PR Compliance ID 20 requires treating logs as a security boundary and avoiding logging
secrets/secret-bearing values. The new code logs the full refresh_err exception string in a
warning message, which risks exposing sensitive request details depending on the exception content.

pr_agent/git_providers/gitlab_provider.py[905-915]
Best Practice: Learned patterns

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`publish_labels()` logs the raw exception object (`refresh_err`) when `_get_merge_request()` fails. Exception strings from HTTP clients can sometimes include sensitive data (credentialed URLs, headers, or other context), so logging them verbatim can leak secrets.
## Issue Context
This PR introduces a new refresh-failure logging path inside `publish_labels()`.
## Fix Focus Areas
- pr_agent/git_providers/gitlab_provider.py[905-915]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


4. Describe aborted on refresh 🐞 Bug ☼ Reliability
Description
GitLabProvider.get_pr_labels(update=True) now refreshes via _get_merge_request() without
catching exceptions, so a transient GitLab API failure can raise out of get_pr_labels.
PRDescription calls get_pr_labels(update=True) in its main flow without isolating errors, which
can prevent the rest of /describe (including description publishing) from running.
Code

pr_agent/git_providers/gitlab_provider.py[R962-964]

+        if update:
+            self.mr = self._get_merge_request()
       return self.mr.labels
Evidence
The GitLab provider now raises on refresh when update=True, and PRDescription calls it before
publishing the description; a raised exception will skip the subsequent steps in that try block.
Other providers’ get_pr_labels methods swallow exceptions and return [], making GitLab’s new
behavior a caller-visible contract change.

pr_agent/git_providers/gitlab_provider.py[948-964]
pr_agent/tools/pr_description.py[156-169]
pr_agent/git_providers/github_provider.py[934-946]
pr_agent/git_providers/azuredevops_provider.py[141-151]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`GitLabProvider.get_pr_labels(update=True)` now raises on refresh failure. At least one call site (`pr_description`) doesn’t handle this locally, so `/describe` can abort entirely when the GitLab refresh call fails.
### Issue Context
Other providers treat label reads as best-effort and return `[]` on failure, but GitLab is now strict. Keeping the strictness is fine for correctness, but callers that shouldn’t fail the whole tool run must catch and downgrade the failure to “skip label publishing”.
### Fix Focus Areas
- pr_agent/tools/pr_description.py[156-170]
- pr_agent/git_providers/gitlab_provider.py[948-964]
### Suggested fix
Wrap the label publishing section in `PRDescription` with a `try/except` around `get_pr_labels(update=True)` (and/or the whole label block). On exception, log a warning and continue to publish the description.
(Alternative: make `GitLabProvider.get_pr_labels` consistent with other providers by catching refresh exceptions and returning `[]` or cached labels, but that weakens the strict policy this PR introduced.)

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Results up to commit c323c58


🐞 Bugs (1) 📘 Rule violations (1) 📜 Skill insights (0)


Remediation recommended
1. Logs raw refresh_err 📘 Rule violation ⛨ Security
Description
publish_labels() logs the full exception text from a GitLab refresh failure ({refresh_err}),
which can include sensitive details (e.g., credentialed URLs or request context) depending on the
underlying client error. Logs are a security boundary and should avoid emitting secret-bearing
values.
Code

pr_agent/git_providers/gitlab_provider.py[R907-915]

+            except Exception as refresh_err:
+                # Strict policy: a stale snapshot can produce an incorrect
+                # add/remove diff that re-introduces the bug this method is
+                # meant to fix. Abort the publish, log, and leave server
+                # state untouched rather than risk clobbering user labels.
+                get_logger().warning(
+                    f"publish_labels: aborting, failed to refresh MR before save. error: {refresh_err}"
+                )
+                return
Evidence
PR Compliance ID 20 requires treating logs as a security boundary and avoiding logging
secrets/secret-bearing values. The new code logs the full refresh_err exception string in a
warning message, which risks exposing sensitive request details depending on the exception content.

pr_agent/git_providers/gitlab_provider.py[905-915]
Best Practice: Learned patterns

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`publish_labels()` logs the raw exception object (`refresh_err`) when `_get_merge_request()` fails. Exception strings from HTTP clients can sometimes include sensitive data (credentialed URLs, headers, or other context), so logging them verbatim can leak secrets.

## Issue Context
This PR introduces a new refresh-failure logging path inside `publish_labels()`.

## Fix Focus Areas
- pr_agent/git_providers/gitlab_provider.py[905-915]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


2. Describe aborted on refresh 🐞 Bug ☼ Reliability
Description
GitLabProvider.get_pr_labels(update=True) now refreshes via _get_merge_request() without
catching exceptions, so a transient GitLab API failure can raise out of get_pr_labels.
PRDescription calls get_pr_labels(update=True) in its main flow without isolating errors, which
can prevent the rest of /describe (including description publishing) from running.
Code

pr_agent/git_providers/gitlab_provider.py[R962-964]

+        if update:
+            self.mr = self._get_merge_request()
        return self.mr.labels
Evidence
The GitLab provider now raises on refresh when update=True, and PRDescription calls it before
publishing the description; a raised exception will skip the subsequent steps in that try block.
Other providers’ get_pr_labels methods swallow exceptions and return [], making GitLab’s new
behavior a caller-visible contract change.

pr_agent/git_providers/gitlab_provider.py[948-964]
pr_agent/tools/pr_description.py[156-169]
pr_agent/git_providers/github_provider.py[934-946]
pr_agent/git_providers/azuredevops_provider.py[141-151]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`GitLabProvider.get_pr_labels(update=True)` now raises on refresh failure. At least one call site (`pr_description`) doesn’t handle this locally, so `/describe` can abort entirely when the GitLab refresh call fails.

### Issue Context
Other providers treat label reads as best-effort and return `[]` on failure, but GitLab is now strict. Keeping the strictness is fine for correctness, but callers that shouldn’t fail the whole tool run must catch and downgrade the failure to “skip label publishing”.

### Fix Focus Areas
- pr_agent/tools/pr_description.py[156-170]
- pr_agent/git_providers/gitlab_provider.py[948-964]

### Suggested fix
Wrap the label publishing section in `PRDescription` with a `try/except` around `get_pr_labels(update=True)` (and/or the whole label block). On exception, log a warning and continue to publish the description.

(Alternative: make `GitLabProvider.get_pr_labels` consistent with other providers by catching refresh exceptions and returning `[]` or cached labels, but that weakens the strict policy this PR introduced.)

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Qodo Logo

@ashearin ashearin force-pushed the fix/gitlab-preserve-labels branch from c323c58 to e26403b Compare June 30, 2026 00:43
Comment thread tests/unittest/test_gitlab_provider.py
@qodo-free-for-open-source-projects

Copy link
Copy Markdown
Contributor

Code review by qodo was updated up to the latest commit e26403b

@ashearin ashearin force-pushed the fix/gitlab-preserve-labels branch from e26403b to ee9060d Compare June 30, 2026 01:02
Comment thread pr_agent/tools/pr_description.py Outdated
@qodo-free-for-open-source-projects

Copy link
Copy Markdown
Contributor

Code review by qodo was updated up to the latest commit ee9060d

…labels

PRReviewer.set_review_labels and similar callers do a read-modify-write
cycle on MR labels (read existing via get_pr_labels(update=True), filter
out our managed labels, add review-effort/security labels, write back via
publish_labels) specifically to preserve user-added labels. On the GitLab
provider both halves of that cycle were buggy, so any label a user added
after the webhook fired would be wiped on every /review.

Issue 1: GitLabProvider.get_pr_labels(update=False) ignored the update
flag and returned self.mr.labels from the snapshot cached at provider
construction. Callers requesting a fresh read silently got stale data,
so user-added labels were missing from current_labels and excluded from
the write.

Issue 2: GitLabProvider.publish_labels did:

    self.mr.labels = list(set(pr_types))
    self.mr.save()

which issues a full PUT on the labels array against that same stale
snapshot. Any concurrent addition during the read-modify-write window
is silently dropped.

This change fixes both halves:

- get_pr_labels(update=True) now actually refreshes via
  self._get_merge_request() before reading. update=False keeps the
  cached read so the hot path is unchanged. Refresh failure propagates
  to the caller so a stale snapshot cannot corrupt the
  read-modify-write cycle (see PRDescription change below).
- publish_labels refreshes the MR immediately before computing the
  delta, then uses python-gitlab's add_labels / remove_labels
  attributes (forwarded by python-gitlab to identically-named API
  parameters on PUT /projects/:id/merge_requests/:merge_request_iid)
  on save(). The server now applies an incremental set-diff, so
  concurrent additions outside the diff are preserved.
- save() is wrapped in try/finally; both add_labels and remove_labels
  are delattr'd in the finally block so a later self.mr.save() on the
  same provider instance (e.g. publish_description) cannot resend the
  prior label diff. Cleanup runs on save() failure too.
- publish_labels aborts on refresh failure (logs and returns without
  calling save()) so we never publish from a stale snapshot.
- The refresh-failure warning logs only the exception type, not the
  raw exception object. HTTP/SDK client errors can carry credentialed
  URLs or headers in their message; logs are a security boundary.

PRDescription.run now isolates label-update failures: the label
refresh/publish block is wrapped in an inline try/except so that a
transient GitLab API error during get_pr_labels(update=True) (or a
publish_labels failure) does not abort the surrounding description
publish. Without this wrapper, the outer try/except in run() is wide
enough that a label-step exception would skip the description publish
too. PRReviewer.set_review_labels already wraps its
get_pr_labels(update=True) call in a try/except, so no change is
needed there. Other providers' get_pr_labels methods swallow
exceptions and return [], which is why callers were not previously
structured to expect exceptions here.

Tests:

New tests in tests/unittest/test_gitlab_provider.py cover:
- publish_labels: noop when sets are equal, add-only diff, remove-only
  diff, user labels outside the diff are not touched, abort on refresh
  failure (and the warning log redacts the exception body), diff-attr
  cleanup on save() failure.
- get_pr_labels: no-update returns cached, update refreshes, update
  propagates refresh failure.

New tests in tests/unittest/test_pr_description.py cover the
description publish flow remaining intact when:
- get_pr_labels(update=True) raises.
- publish_labels raises.

PYTHONPATH=. ./.venv/bin/pytest tests/unittest -q
1052 passed

Signed-off-by: Allen Shearin <allen.p.shearin@gmail.com>
@ashearin ashearin force-pushed the fix/gitlab-preserve-labels branch from ee9060d to 22facfb Compare June 30, 2026 01:25
@qodo-free-for-open-source-projects

Copy link
Copy Markdown
Contributor

Code review by qodo was updated up to the latest commit 22facfb

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant