fix(webapp,run-engine): honor per-queue length cap on concurrency-key queues #3558

Open
ericallam wants to merge 5 commits into main from fix/ck-queue-length-cap-and-dashboard
Conversation

@ericallam
Member

@ericallam ericallam commented May 12, 2026

Summary

Queues that use concurrency keys (CK) can no longer bypass the per-queue length cap, and the "Queued | Running" columns in the dashboard now show the true total across all CK variants instead of 0.

The cap and the dashboard both relied on ZCARD of the base queue key, but CK-keyed runs live under <base>:ck:<variant> keys. Any queue that used concurrency keys therefore read 0, which let a single CK variant grow unbounded past the user's configured cap.

Fix

Two per-base-queue counters are maintained inside the CK Lua scripts: <base>:lengthCounter and <base>:runningCounter. Non-CK enqueue/dequeue paths are untouched.
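
The key shapes named above can be sketched as follows. This is only an illustration: the helper names are hypothetical and are not taken from keyProducer.ts; only the key suffixes come from the description.

```typescript
// Hypothetical helpers; only the suffixes (:ck:, :lengthCounter, :runningCounter)
// come from the PR description.
function ckVariantKey(base: string, variant: string): string {
  // Sorted set holding the queued runs for one concurrency-key variant.
  return `${base}:ck:${variant}`;
}

function lengthCounterKey(base: string): string {
  // Integer counter: queued runs summed across all CK variants of the base queue.
  return `${base}:lengthCounter`;
}

function runningCounterKey(base: string): string {
  // Integer counter: running runs summed across all CK variants of the base queue.
  return `${base}:runningCounter`;
}
```

Because both counters hang off the base queue key, one GET per counter is enough to cover every variant.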

Counters are lazy-initialized the first time a CK enqueue (or nack) lands on a queue: the Lua script sums ZCARD across the variants tracked by ckIndex, sets the counter, then INCRs. Pre-existing CK backlog on already-populated queues is captured automatically — no batch migration required.
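
The lazy-init order described above (sum, set, then increment) can be sketched against an in-memory stand-in for Redis. The `Store` type and `enqueueCk` function are assumptions made for illustration; the real logic lives in Lua scripts and is not reproduced here. The sketch also assumes the run is not already queued.

```typescript
// In-memory stand-in for Redis, for illustration only.
type Store = {
  zsets: Map<string, Set<string>>; // key -> members (scores omitted)
  counters: Map<string, number>; // key -> integer value
  ckIndex: Map<string, Set<string>>; // base queue -> CK variant keys
};

function zcard(store: Store, key: string): number {
  return store.zsets.get(key)?.size ?? 0;
}

// Hypothetical enqueue: seeds the counter from existing backlog when it is
// missing, then counts the new entry. Assumes runId is not already queued.
function enqueueCk(store: Store, base: string, variant: string, runId: string): number {
  const variantKey = `${base}:ck:${variant}`;
  const counterKey = `${base}:lengthCounter`;

  if (!store.counters.has(counterKey)) {
    // Lazy init: sum the backlog already sitting in every tracked variant.
    // This runs before the new member is added, so it is not counted twice.
    let backlog = 0;
    (store.ckIndex.get(base) ?? new Set<string>()).forEach((key) => {
      backlog += zcard(store, key);
    });
    store.counters.set(counterKey, backlog);
  }

  // Track the variant in the index and add the run to its set.
  const index = store.ckIndex.get(base) ?? new Set<string>();
  index.add(variantKey);
  store.ckIndex.set(base, index);

  const members = store.zsets.get(variantKey) ?? new Set<string>();
  members.add(runId);
  store.zsets.set(variantKey, members);

  // INCR for the entry that was just added.
  const next = (store.counters.get(counterKey) ?? 0) + 1;
  store.counters.set(counterKey, next);
  return next;
}
```

With 31 runs already spread across variants and no counter present, the first call seeds the counter to 31 and returns 32.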

INCR/DECR is gated on ZADD/SADD returning 1 (a new entry vs an idempotent no-op), so duplicate enqueues or re-dequeues don't inflate the counter.
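
A minimal sketch of that gating, using a plain `Map` in place of a Redis sorted set. The `zadd` and `trackedEnqueue` names are hypothetical; the only behaviour assumed is that a ZADD-style insert reports 1 for a new member and 0 for an existing one.

```typescript
// ZADD-like insert: returns 1 when the member is new, 0 when it already existed
// (an existing member only has its score updated).
function zadd(zset: Map<string, number>, member: string, score: number): 0 | 1 {
  const isNew = !zset.has(member);
  zset.set(member, score);
  return isNew ? 1 : 0;
}

// Hypothetical tracked enqueue: bump the counter only for genuinely new entries.
function trackedEnqueue(
  zset: Map<string, number>,
  counter: { value: number },
  runId: string,
  score: number
): void {
  if (zadd(zset, runId, score) === 1) {
    counter.value += 1;
  }
}
```

Enqueuing the same run twice leaves the counter at 1, which keeps it aligned with the set's cardinality.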

The counter is SET with a 24-hour TTL on init. INCR/DECR do not extend the TTL, so the counter expires daily and the next CK operation re-seeds it from ckIndex. This bounds any drift that accumulates during the rolling-deploy overlap window — where old (un-Tracked) and new (Tracked) webapp instances briefly coexist — to ≤24 hours, with no admin sweep or background reconciler needed.
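
The expiry behaviour can be modelled without Redis. In this sketch the `CounterEntry` type, the `applyDelta` function, and the `reseed` callback (standing in for the ckIndex sum) are all assumptions made for illustration.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

type CounterEntry = { value: number; expiresAt: number };

// Hypothetical model of one INCR/DECR against a counter that is seeded with a
// 24-hour TTL and never has that TTL extended by later updates.
function applyDelta(
  entry: CounterEntry | undefined,
  delta: number,
  nowMs: number,
  reseed: () => number
): CounterEntry {
  if (entry === undefined || nowMs >= entry.expiresAt) {
    // Missing or expired: re-seed from the source of truth and start a fresh TTL.
    return { value: reseed() + delta, expiresAt: nowMs + DAY_MS };
  }
  // Live counter: change the value but keep the original expiry.
  return { value: entry.value + delta, expiresAt: entry.expiresAt };
}
```

Any drift in `value` survives only until `expiresAt`; the first update after that point starts again from the re-seeded total.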

Read paths pipeline ZCARD/SCARD on the base key + GET on the counter and sum. A missing counter is treated as 0, so pure non-CK queues see the same answer as before.
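
As a sketch of the read-side arithmetic, assuming the two pipelined replies have already been fetched: the `QueueSnapshot` type and `totalLength` function are hypothetical names, and the only Redis behaviour relied on is that GET returns a string for an existing key and null for a missing one.

```typescript
type QueueSnapshot = {
  baseCardinality: number; // ZCARD (queued) or SCARD (running) of the base key
  counter: string | null; // GET on the counter; null when the key is missing
};

// Sum the base key's cardinality with the CK counter, treating a missing
// counter as 0 so pure non-CK queues get the same answer as before.
function totalLength(snapshot: QueueSnapshot): number {
  const ckCount = snapshot.counter === null ? 0 : Number.parseInt(snapshot.counter, 10);
  return snapshot.baseCardinality + (Number.isNaN(ckCount) ? 0 : ckCount);
}
```

For example, a base key holding 2 runs plus a counter of "16" reads as 18, while a non-CK queue with 5 runs and no counter still reads as 5.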

The counter-aware scripts ship alongside the originals with a Tracked suffix for rolling-deploy safety; a follow-up PR will drop the originals once this has rolled out.

Test plan

  • pnpm run test --filter @internal/run-engine — 116 tests pass, including a new ckCounters.test.ts covering lazy init from pre-existing backlog, churn, floor-at-zero, the non-CK regression case, mixed CK + non-CK on the same base queue, idempotent re-enqueue (ZADD-already-exists), 24h TTL on the counter, and nack re-seeding after counter expiry.
  • Verified end-to-end against a live local environment:
    • Triggered 24 CK enqueues across 4 variants → lengthCounter=16, runningCounter=8, dashboard showed Queued=16 / Running=8 for the CK queue.
    • Set the env queue cap to 16, triggered 12 more enqueues → 8 succeeded, 4 rejected with QueueSizeLimitExceededError.
    • Deleted the counter on a queue with 31 messages already sitting in CK variants, triggered one more enqueue → counter materialized to 31 from the ckIndex sum, then INCR'd.

@changeset-bot

changeset-bot Bot commented May 12, 2026

⚠️ No Changeset found

Latest commit: 1866ff1

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


@coderabbitai
Contributor

coderabbitai Bot commented May 12, 2026

Review Change Stack
No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 420fe872-ee88-4713-9fe6-4a2d620e91ad

📥 Commits

Reviewing files that changed from the base of the PR and between ad7d627 and 1866ff1.

📒 Files selected for processing (1)
  • internal-packages/run-engine/src/run-queue/index.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • internal-packages/run-engine/src/run-queue/index.ts
📜 Recent review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (27)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
  • GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
  • GitHub Check: sdk-compat / Cloudflare Workers
  • GitHub Check: units / e2e-webapp / 🧪 E2E Tests: Webapp
  • GitHub Check: sdk-compat / Node.js 20.20 (ubuntu-latest)
  • GitHub Check: sdk-compat / Node.js 22.12 (ubuntu-latest)
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
  • GitHub Check: sdk-compat / Deno Runtime
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
  • GitHub Check: sdk-compat / Bun Runtime
  • GitHub Check: typecheck / typecheck

Walkthrough

This PR fixes a bug where per-queue length limits and dashboard metrics for queues using concurrency keys (CK) incorrectly reported 0, allowing the per-queue cap to be bypassed. The fix introduces per-base-queue counter keys and tracked Redis Lua commands. Query methods now aggregate base set/zset cardinality with GET-backed counters. CK enqueue, dequeue, message read, acknowledge, nack, dead-letter, release, and TTL expiry paths were updated to atomically maintain length and running counters with floored DECR guards.
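
A floored decrement of the kind the walkthrough mentions might look like the sketch below. The `flooredDecr` name is hypothetical, and leaving a missing counter untouched is an assumption of this sketch rather than confirmed behaviour of the scripts.

```typescript
// Decrement a counter without letting it go below zero.
// Assumption: a missing counter is left missing instead of being created.
function flooredDecr(counters: Map<string, number>, key: string): number {
  const current = counters.get(key);
  if (current === undefined) {
    return 0;
  }
  const next = Math.max(0, current - 1);
  counters.set(key, next);
  return next;
}
```

The floor matters when a decrement arrives for a run that was never counted, for example one enqueued by an instance that was not yet maintaining the counter.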

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main fix: honoring per-queue length caps for concurrency-key queues, which is the core problem addressed in the changeset. |
| Description check | ✅ Passed | The description covers Summary, Fix, and Test plan sections, providing substantial technical detail. However, the Checklist section is incomplete: all items are unchecked, and the Testing, Changelog, and Screenshots sections (from the template) are either missing or unfilled. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Contributor

@devin-ai-integration devin-ai-integration Bot left a comment


Devin Review found 0 potential issues.

View 3 additional findings in Devin Review.


Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal-packages/run-engine/src/run-queue/index.ts`:
- Around line 420-424: The code is incorrectly treating Redis command errors as
missing keys by converting baseErr/ctrErr to 0; update the logic around the
results tuple (baseErr, baseVal, ctrErr, ctrVal) so that if baseErr or ctrErr is
truthy you surface/propagate that error (throw or return it) instead of coercing
to 0, while still treating a null/undefined value as 0 (i.e., keep the baseVal
== null ? 0 behavior). Locate the block using results[0]/results[1] and
variables baseErr, baseVal, ctrErr, ctrVal, compute baseCount/ctrCount only when
no error is present, and return or rethrow the encountered error so callers can
handle Redis command failures rather than silently undercounting.

In `@internal-packages/run-engine/src/run-queue/tests/ckCounters.test.ts`:
- Line 4: The test imports describe from node:test which conflicts with the
Vitest runner; replace the import so the test uses Vitest's API (i.e., import
describe from "vitest" or use Vitest globals) so the file ckCounters.test.ts
runs under Vitest; update any other test helpers (e.g., it/expect) in the same
file to use Vitest symbols if needed.
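
The behaviour requested in the first inline comment can be sketched as follows. This assumes pipeline results shaped as `[error, value]` tuples, as the comment's `baseErr`/`baseVal`/`ctrErr`/`ctrVal` names suggest; the `sumPipelineCounts` function is hypothetical and is not the code in index.ts.

```typescript
// One pipelined reply: an error (or null) paired with the command's value.
type PipelineResult = [Error | null, unknown];

// Sum the base cardinality and the counter, surfacing command errors instead
// of coercing them to 0. A null/undefined value still counts as 0.
function sumPipelineCounts(results: PipelineResult[] | null): number {
  if (!results || results.length < 2) {
    throw new Error("pipeline returned fewer than two results");
  }
  let total = 0;
  for (const [err, val] of results.slice(0, 2)) {
    if (err) {
      throw err; // a failed command is not the same as a missing key
    }
    if (val == null) {
      continue; // missing key: contributes 0
    }
    const n = Number(val);
    if (!Number.isFinite(n)) {
      throw new Error(`unexpected non-numeric reply: ${String(val)}`);
    }
    total += n;
  }
  return total;
}
```

Callers then see a thrown error on a Redis failure and can decide how to handle it, while a queue with no counter keeps reading its base cardinality.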

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: f0aabc18-c004-4d2b-894b-a0397acc1811

📥 Commits

Reviewing files that changed from the base of the PR and between 2b84545 and ede7864.

⛔ Files ignored due to path filters (1)
  • references/hello-world/src/trigger/ckCounters.ts is excluded by !references/**
📒 Files selected for processing (5)
  • .server-changes/fix-ck-queue-length-cap-and-dashboard.md
  • internal-packages/run-engine/src/run-queue/index.ts
  • internal-packages/run-engine/src/run-queue/keyProducer.ts
  • internal-packages/run-engine/src/run-queue/tests/ckCounters.test.ts
  • internal-packages/run-engine/src/run-queue/types.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 8)
  • GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
🧰 Additional context used
📓 Path-based instructions (10)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.{ts,tsx}: Use types over interfaces for TypeScript
Avoid using enums; prefer string unions or const objects instead

Files:

  • internal-packages/run-engine/src/run-queue/types.ts
  • internal-packages/run-engine/src/run-queue/keyProducer.ts
  • internal-packages/run-engine/src/run-queue/tests/ckCounters.test.ts
  • internal-packages/run-engine/src/run-queue/index.ts
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use function declarations instead of default exports

Files:

  • internal-packages/run-engine/src/run-queue/types.ts
  • internal-packages/run-engine/src/run-queue/keyProducer.ts
  • internal-packages/run-engine/src/run-queue/tests/ckCounters.test.ts
  • internal-packages/run-engine/src/run-queue/index.ts
**/*.ts

📄 CodeRabbit inference engine (.cursor/rules/otel-metrics.mdc)

**/*.ts: When creating or editing OTEL metrics (counters, histograms, gauges), ensure metric attributes have low cardinality by using only enums, booleans, bounded error codes, or bounded shard IDs
Do not use high-cardinality attributes in OTEL metrics such as UUIDs/IDs (envId, userId, runId, projectId, organizationId), unbounded integers (itemCount, batchSize, retryCount), timestamps (createdAt, startTime), or free-form strings (errorMessage, taskName, queueName)
When exporting OTEL metrics via OTLP to Prometheus, be aware that the exporter automatically adds unit suffixes to metric names (e.g., 'my_duration_ms' becomes 'my_duration_ms_milliseconds', 'my_counter' becomes 'my_counter_total'). Account for these transformations when writing Grafana dashboards or Prometheus queries

Files:

  • internal-packages/run-engine/src/run-queue/types.ts
  • internal-packages/run-engine/src/run-queue/keyProducer.ts
  • internal-packages/run-engine/src/run-queue/tests/ckCounters.test.ts
  • internal-packages/run-engine/src/run-queue/index.ts
{apps,internal-packages}/**/*.{ts,tsx,js}

📄 CodeRabbit inference engine (CLAUDE.md)

Use pnpm run typecheck to verify changes in apps and internal packages (apps/*, internal-packages/*) instead of build, which proves almost nothing about correctness

Files:

  • internal-packages/run-engine/src/run-queue/types.ts
  • internal-packages/run-engine/src/run-queue/keyProducer.ts
  • internal-packages/run-engine/src/run-queue/tests/ckCounters.test.ts
  • internal-packages/run-engine/src/run-queue/index.ts
{package.json,**/*.{ts,tsx,js}}

📄 CodeRabbit inference engine (CLAUDE.md)

Pin Zod to version 3.25.76 exactly across the entire monorepo - never use a different version or version range

Files:

  • internal-packages/run-engine/src/run-queue/types.ts
  • internal-packages/run-engine/src/run-queue/keyProducer.ts
  • internal-packages/run-engine/src/run-queue/tests/ckCounters.test.ts
  • internal-packages/run-engine/src/run-queue/index.ts
**/*.{ts,tsx,js}

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.{ts,tsx,js}: Import from @trigger.dev/core using subpaths only, never the root export
Always import tasks from @trigger.dev/sdk, never from @trigger.dev/sdk/v3 or deprecated client.defineJob
Add crumbs to code using // @crumbs comments or // #region @crumbs blocks for debug tracing during development

Files:

  • internal-packages/run-engine/src/run-queue/types.ts
  • internal-packages/run-engine/src/run-queue/keyProducer.ts
  • internal-packages/run-engine/src/run-queue/tests/ckCounters.test.ts
  • internal-packages/run-engine/src/run-queue/index.ts
**/*.{ts,tsx,js,jsx,json,md,css,scss}

📄 CodeRabbit inference engine (AGENTS.md)

Code formatting is enforced using Prettier. Run pnpm run format before committing

Files:

  • internal-packages/run-engine/src/run-queue/types.ts
  • internal-packages/run-engine/src/run-queue/keyProducer.ts
  • internal-packages/run-engine/src/run-queue/tests/ckCounters.test.ts
  • internal-packages/run-engine/src/run-queue/index.ts
**/*.{test,spec}.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use vitest for all tests in the Trigger.dev repository

Files:

  • internal-packages/run-engine/src/run-queue/tests/ckCounters.test.ts
**/*.test.{ts,tsx,js}

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.test.{ts,tsx,js}: Use vitest exclusively for testing and never mock anything - use testcontainers instead
Place test files next to source files using the pattern MyService.ts -> MyService.test.ts

**/*.test.{ts,tsx,js}: Use vitest for unit testing and run tests with pnpm run test
Test files should live beside the files under test with descriptive describe and it blocks
Tests should avoid mocks or stubs and use helpers from @internal/testcontainers when Redis or Postgres are needed

Files:

  • internal-packages/run-engine/src/run-queue/tests/ckCounters.test.ts
**/*.test.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Use testcontainers with redisTest, postgresTest, or containerTest from @internal/testcontainers for testing with Redis/PostgreSQL dependencies

Files:

  • internal-packages/run-engine/src/run-queue/tests/ckCounters.test.ts
🧠 Learnings (2)
📚 Learning: 2026-03-22T13:26:12.060Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3244
File: apps/webapp/app/components/code/TextEditor.tsx:81-86
Timestamp: 2026-03-22T13:26:12.060Z
Learning: In the triggerdotdev/trigger.dev codebase, do not flag `navigator.clipboard.writeText(...)` calls for `missing-await`/`unhandled-promise` issues. These clipboard writes are intentionally invoked without `await` and without `catch` handlers across the project; keep that behavior consistent when reviewing TypeScript/TSX files (e.g., usages like in `apps/webapp/app/components/code/TextEditor.tsx`).

Applied to files:

  • internal-packages/run-engine/src/run-queue/types.ts
  • internal-packages/run-engine/src/run-queue/keyProducer.ts
  • internal-packages/run-engine/src/run-queue/tests/ckCounters.test.ts
  • internal-packages/run-engine/src/run-queue/index.ts
📚 Learning: 2026-03-22T19:24:14.403Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 3187
File: apps/webapp/app/v3/services/alerts/deliverErrorGroupAlert.server.ts:200-204
Timestamp: 2026-03-22T19:24:14.403Z
Learning: In the triggerdotdev/trigger.dev codebase, webhook URLs are not expected to contain embedded credentials/secrets (e.g., fields like `ProjectAlertWebhookProperties` should only hold credential-free webhook endpoints). During code review, if you see logging or inclusion of raw webhook URLs in error messages, do not automatically treat it as a credential-leak/secrets-in-logs issue by default—first verify the URL does not contain embedded credentials (for example, no username/password in the URL, no obvious secret/token query params or fragments). If the URL is credential-free per this project’s conventions, allow the logging.

Applied to files:

  • internal-packages/run-engine/src/run-queue/types.ts
  • internal-packages/run-engine/src/run-queue/keyProducer.ts
  • internal-packages/run-engine/src/run-queue/tests/ckCounters.test.ts
  • internal-packages/run-engine/src/run-queue/index.ts
