
feat(ci): auto-build DocumentDB images on new upstream release #410

Open
Ritvik-Jayaswal wants to merge 9 commits into documentdb:main from Ritvik-Jayaswal:developer/auto-build-documentdb-images


@Ritvik-Jayaswal

Collaborator

Summary

Closes #360.

When a new version of DocumentDB is released upstream, the extension and gateway images should be built automatically, and the default version bumped in code and docs. This wires up that automation by reusing the existing build/release workflows and adding a scheduled watcher.

What changed

  • New watch_documentdb_images.yml — polls the upstream documentdb/documentdb releases/latest on a cron (0 */6 * * *, plus manual workflow_dispatch with version override and dry_run). When a release newer than the current DEFAULT_DOCUMENTDB_IMAGE is found, it builds candidate images and opens a version-bump PR.
  • build_documentdb_images.yml — added a workflow_call trigger exposing documentdb_version and image_tag outputs; resolves the version from the inputs context; cosign verify now uses an identity regexp so it passes when run as a reusable workflow.
  • release_documentdb_images.yml — added a workflow_call trigger mirroring its inputs so the watcher can chain into it.
  • Docs — added an "Automatic Release Detection" section to docs/designs/image-management.md and a workflow entry in AGENTS.md.

Behavior

Scheduled poll → build candidate images → promote + open a chore: bump DocumentDB images PR. Merging that PR is the human gate that makes the new version the default for new installs.

Idempotent: once images are promoted, the documentdb:<version> tag exists, so later cron ticks short-circuit until the bump PR merges (which advances the default and stops detection for that version). Drafts and pre-releases are excluded because detection uses GitHub's releases/latest.
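
As a rough illustration of the "newer than the current default" check, the comparison could look like the sketch below. It assumes both strings are already plain dotted versions and that `sort -V` (a GNU/BSD extension) is available. `is_newer` is a hypothetical helper name, not necessarily what watch_documentdb_images.yml uses.

```shell
# Hypothetical helper: succeeds only if $1 is strictly newer than $2.
# Assumes plain dotted versions and a sort that supports -V (version sort).
is_newer() {
  candidate="$1"
  current="$2"
  # Equal versions are not "newer".
  [ "$candidate" != "$current" ] &&
    # After a version sort, the newest entry is last.
    [ "$(printf '%s\n%s\n' "$candidate" "$current" | sort -V | tail -n 1)" = "$candidate" ]
}

# Example use (not executed here):
#   if is_newer "$UPSTREAM_VERSION" "$DEFAULT_VERSION"; then echo "should_release=true"; fi
```

A version sort is used rather than a plain string comparison so that, for example, 0.9.0 is not treated as newer than 0.110.0.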

Notes

  • Uses the no-PAT reusable-workflow pattern (secrets: inherit). As with the existing manual release flow, the bump PR is created with GITHUB_TOKEN, so CI on that PR is re-triggered by a maintainer on review.
  • The build workflow keeps its repository_dispatch: documentdb-release trigger, so a real upstream webhook can still be wired later if a cross-repo PAT becomes available.

Testing

  • YAML validates with no errors.
  • watch_documentdb_images.yml can be exercised manually via workflow_dispatch with dry_run: true to confirm detection without building.

Closes documentdb#360. Add watch_documentdb_images.yml which polls the upstream documentdb/documentdb repo on a schedule and, when a newer release than the current default is published, builds candidate documentdb + gateway images and opens a version-bump PR (human-merged gate).

Make build_documentdb_images.yml and release_documentdb_images.yml reusable via workflow_call (build exposes documentdb_version and image_tag outputs) so the watcher can chain them. Document the automation in image-management.md and AGENTS.md.

Signed-off-by: Ritvik Jayaswal <rjayaswal@microsoft.com>

Copilot AI left a comment

Contributor


Pull request overview

Adds CI automation to detect upstream documentdb/documentdb releases and drive the “database image track” (build candidate images → promote to release tags → open a version-bump PR) with a scheduled watcher, reusing existing build/release workflows via workflow_call.

Changes:

  • Introduces .github/workflows/watch_documentdb_images.yml to poll releases/latest every 6 hours (or manually) and chain into build + release workflows.
  • Exposes workflow_call interfaces for build_documentdb_images.yml (with outputs) and release_documentdb_images.yml (with inputs) to support chaining.
  • Updates documentation (image-management.md, AGENTS.md) to describe the new automation and workflow entry.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Show a summary per file
| File | Description |
| --- | --- |
| docs/designs/image-management.md | Documents the new scheduled upstream release watcher and end-to-end flow. |
| AGENTS.md | Adds the watcher workflow to the CI workflow inventory. |
| .github/workflows/watch_documentdb_images.yml | New scheduled/manual workflow to detect upstream releases and invoke build + release workflows. |
| .github/workflows/release_documentdb_images.yml | Adds workflow_call trigger inputs to allow reusable invocation from the watcher. |
| .github/workflows/build_documentdb_images.yml | Adds workflow_call trigger outputs and updates version resolution + cosign verify identity matching for reusable runs. |

Comment on lines +115 to +122
# Already promoted? If the release tag exists, the bump PR is likely
# pending review/merge, so don't rebuild on every cron tick.
echo "${{ secrets.GITHUB_TOKEN }}" | docker login ghcr.io -u ${{ github.actor }} --password-stdin
if docker manifest inspect "ghcr.io/${{ github.repository }}/documentdb:${NEW}" >/dev/null 2>&1; then
  echo "Release image documentdb:${NEW} already exists; version-bump PR is likely pending merge. Skipping."
  echo "should_release=false" >> "$GITHUB_OUTPUT"
  exit 0
fi
documentdb-triage-tool (bot) added the CI/CD, documentation, and enhancement labels on Jun 24, 2026
@documentdb-triage-tool


🤖 Auto-triaged by documentdb-triage-tool.

Applied: CI/CD, documentation, enhancement
Project fields suggested: Component ci · Priority P2 · Effort M · Status Needs Review
Confidence: 0.30 (deterministic)

Reasoning

component from path globs (ci, docs); effort from diff stats (235+2 LOC, 5 files); LLM failed: Invalid response body while trying to fetch https://api.anthropic.com/v1/messages: Premature close

If a label is wrong, remove it manually and ping @patty-chow so the rules can be tuned. The bot will not re-label items that already have component labels.

@xgerman xgerman left a comment

Collaborator


We want to build the documentdb images from the official packages @guanzhousongmicrosoft is creating. Please adjust the build process.

We also want them built so that we can integrate them into the official CNPG image gallery (see https://github.com/xgerman/postgres-extensions-containers/tree/xgerman/documentdb).

@xgerman

xgerman commented Jun 26, 2026

Collaborator

Also, instead of a watch on the documentdb repo, we were thinking about using webhooks to trigger builds of new versions upon release.

…hook trigger

Address review feedback on PR documentdb#410:

- Install postgresql-18-documentdb from the official DocumentDB APT repo (documentdb.io/deb) instead of GitHub-release .deb assets; the meta-package pulls in Citus/RUM/pgvector/PostGIS so explicit cron/pgvector/postgis installs are dropped. APT package version is pinnable per build.

- Make repository_dispatch (documentdb-release) the primary trigger for watch_documentdb_images.yml; demote cron to a daily safety-net. Add reference upstream sender workflow as docs/designs/upstream-release-dispatch-sender.md.

- Idempotency: skip only when BOTH documentdb and gateway release tags exist; rebuild on partial promotion.

- Gateway build unchanged (still from documentdb-local public image).

Co-authored-by: Copilot <copilot@github.com>
Signed-off-by: Ritvik Jayaswal <rjayaswal@microsoft.com>
@Ritvik-Jayaswal

Ritvik-Jayaswal commented Jun 26, 2026

Collaborator Author

Thanks for the review @xgerman, pushed a rework (48bad40) addressing your three points:

  1. Official packages: the extension image now installs postgresql-18-documentdb (from Guanzhou) from the official APT repo (documentdb.io/deb) instead of GitHub-release .deb assets. The meta-package pulls in Citus/RUM/pgvector/PostGIS, so the manual cron/pgvector/postgis installs are dropped. The APT package version is pinnable per build (documentdb_apt_version).
  2. Webhook trigger: repository_dispatch (documentdb-release) is now the primary trigger for the new version workflow; cron is demoted to a daily safety net. Added a reference upstream sender workflow at docs/designs/upstream-release-dispatch-sender.md (it has to live in documentdb/documentdb and needs a cross-repo token).
  3. CNPG gallery: heads up, the gallery doesn't have a documentdb image yet (no folder/PR/package in cloudnative-pg/postgres-extensions-containers main). So I kept a gallery-compatible self-build for now; flipping DEFAULT_DOCUMENTDB_IMAGE to the gallery image is a ~1-line change once it's published upstream. Also note that upstream requires packages from Debian main/PGDG plus CNCF-allowlisted licenses, which the documentdb.io/deb channel would need to satisfy.

Also fixed a flagged idempotency bug: the watch job now skips only when both documentdb and gateway release tags exist, and rebuilds on a partial promotion.
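
The "skip only when both tags exist" rule can be sketched roughly as follows. This is an illustration under assumptions, not the workflow's actual step: `tag_exists` and `should_release` are hypothetical names, and the existence probe reuses `docker manifest inspect` as in the reviewed snippet.

```shell
# Hypothetical probe: succeeds if the image reference resolves in the registry.
tag_exists() {
  docker manifest inspect "$1" >/dev/null 2>&1
}

# Prints "false" only when BOTH release tags already exist; a partial
# promotion (one tag present, one missing) still yields "true" so it is rebuilt.
should_release() {
  repo="$1"     # e.g. ghcr.io/<owner>/<repo>
  version="$2"  # e.g. 0.113.0
  if tag_exists "${repo}/documentdb:${version}" && tag_exists "${repo}/gateway:${version}"; then
    echo false
  else
    echo true
  fi
}

# Example use inside a workflow step (not executed here):
#   echo "should_release=$(should_release "$REPO" "$NEW")" >> "$GITHUB_OUTPUT"
```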

One open question: your branch pins 1.0.0 while this repo tracks 0.110.x. Which version line does the official APT stable channel publish? That determines what we pin.

…documentdb-images

Signed-off-by: Ritvik Jayaswal <rjayaswal@microsoft.com>

# Conflicts:
#	.github/workflows/build_documentdb_images.yml
@Ritvik-Jayaswal Ritvik-Jayaswal requested a review from xgerman June 26, 2026 18:27
Comment thread .github/workflows/build_documentdb_images.yml Outdated
Comment thread .github/workflows/build_documentdb_images.yml Outdated
The official APT repo serves dashed Debian versions (e.g. 0.113-0), not dotted semver. Default the apt pin to VERSION_DASH and verify the exact postgresql-18-documentdb=<version> is present in the stable index (with bounded retry to absorb publish lag) before building, instead of only checking that the repo responds.
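
A minimal sketch of the two ideas in this commit, assuming the mapping is MAJOR.MINOR.PATCH to MAJOR.MINOR-PATCH (consistent with 0.113.0 and 0.113-0 as they appear in this PR). `to_dashed_version` and `retry` are hypothetical helper names, not the names used in the workflow.

```shell
# Convert dotted semver (MAJOR.MINOR.PATCH) to the dashed form MAJOR.MINOR-PATCH.
# Input that does not match the dotted pattern is passed through unchanged.
to_dashed_version() {
  printf '%s\n' "$1" | sed -E 's/^([0-9]+\.[0-9]+)\.([0-9]+)$/\1-\2/'
}

# Bounded retry: run a command up to $1 times, sleeping $2 seconds between tries.
retry() {
  attempts="$1"
  delay="$2"
  shift 2
  n=1
  while ! "$@"; do
    # Give up once the attempt budget is exhausted.
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}

# Example use (not executed here; check_apt_index is a placeholder command):
#   retry 5 30 check_apt_index "postgresql-18-documentdb" "$(to_dashed_version "$VERSION")"
```

Bounding the retry keeps publish lag from failing the build immediately while still failing fast when the version is never published.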

Addresses review feedback from WentingWu666666 on PR documentdb#410.

Co-authored-by: Copilot <copilot@github.com>
Signed-off-by: Ritvik Jayaswal <rjayaswal@microsoft.com>
@WentingWu666666

Collaborator

e2e should exercise Dockerfile_extension, and the build-mode consumer needs updating for the new APT contract.

Three linked points:

1. Trigger e2e on Dockerfile changes. test-e2e.yml's pull_request.paths doesn't include .github/dockerfiles/**, so changes to Dockerfile_extension (this PR) don't run e2e at all. Add:

    paths:
      - '.github/dockerfiles/**'

2. Triggering alone won't build it — registry mode short-circuits. test-build-and-package.yml's probe-images selects registry mode whenever documentdb:<version> + gateway:<version> already exist in GHCR (the common case here, since the version is unchanged), and just pulls the prebuilt image — the modified Dockerfile is never built. To make this a real gate, force a Dockerfile build (build mode) when .github/dockerfiles/Dockerfile_extension is in the diff.

3. Fix the build-mode build args for the new contract. In build mode it currently runs:

docker buildx build --build-arg PG_MAJOR=18 \
  --build-arg DEB_PACKAGE_REL_PATH=packages/$DEB_FILE \
  -f .github/dockerfiles/Dockerfile_extension .

After this PR, DEB_PACKAGE_REL_PATH is no longer consumed and DOCUMENTDB_APT_VERSION is unset, so it silently installs the latest published package instead of the version under test — a green build of the wrong artifact. The deb filename already yields the dashed version (…documentdb_0.111-0_amd64.deb), so pass it through:

--build-arg DOCUMENTDB_APT_VERSION=<dashed version parsed from $DEB_FILE>
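
One way to parse the dashed version out of the filename, sketched under the assumption that the file follows the usual Debian `<package>_<version>_<arch>.deb` naming. `deb_version_from_filename` is a hypothetical helper, and the full filename in the usage comment is illustrative only.

```shell
# Extract the version field from a Debian-style filename
# of the form <package>_<version>_<arch>.deb.
deb_version_from_filename() {
  base="${1##*/}"      # strip any leading directory
  base="${base%.deb}"  # strip the .deb extension
  rest="${base#*_}"    # drop "<package>_" (package names contain no underscore)
  printf '%s\n' "${rest%_*}"  # drop the trailing "_<arch>"
}

# Example use (not executed here):
#   docker buildx build \
#     --build-arg DOCUMENTDB_APT_VERSION="$(deb_version_from_filename "$DEB_FILE")" \
#     -f .github/dockerfiles/Dockerfile_extension .
```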

One deeper limitation worth calling out: the new Dockerfile installs only from the published APT repo, so build mode can no longer build an image from a locally-compiled .deb of an unpublished documentdb_ref. If we still need source-built/unpublished coverage, the Dockerfile needs an optional local-deb path; otherwise we should document that build mode only works for already-published versions.

@WentingWu666666

Collaborator

Please attach evidence of a full end-to-end run on your fork before merge.

None of the new workflows (watch_documentdb_images.yml, build_documentdb_images.yml, release_documentdb_images.yml) can run from a PR branch: schedule, repository_dispatch, and workflow_call only register from the default branch, so this chain isn't exercised by CI here. Given the dotted-vs-dashed pin that slipped through earlier (which lived in the build path), I'd ask for a real run rather than just a dry run: dry_run: true only exercises version detection and skips build/release entirely, so it wouldn't have caught that bug or validated any of the parts that actually changed.

Could you dispatch watch_documentdb_images.yml on your fork with dry_run: false and link the run? A fork run is side-effect-free — images push to your fork's GHCR (ghcr.io/<you>/...), and the bump PR opens against your fork, never the official registry.

Expected if it's working end-to-end:

  • detect: should_release=true, emits the new dotted version + dashed apt_version.
  • build: the "Verify package available in the APT index" step passes, then documentdb + gateway candidate images build/push/sign per-arch — i.e. apt-get install postgresql-18-documentdb=<dashed> actually resolves.
  • release: cosign verify of the candidates passes (with the new --certificate-identity-regexp identity), candidate tags promote to release tags, and update-defaults opens the chore: bump documentDbVersion PR.
  • Evidence to link: the run URL, the candidate + release tags in your fork's packages, the cosign-verify step log, and the auto-opened bump PR.

One practical note: run it against a version that's already published to the APT repo and has a matching documentdb-local:pg17-<version> payload (e.g. the current default). Otherwise the APT-verify and gateway-pull steps will fail for reasons unrelated to your code.

Addresses WentingWu666666's e2e/build-mode review feedback on PR documentdb#410:

1. test-e2e.yml now triggers on .github/dockerfiles/** so Dockerfile_extension changes run e2e.

2. probe-images forces build mode when Dockerfile_extension is in the PR diff, so registry mode no longer short-circuits the modified Dockerfile.

3. Dockerfile_extension gains an optional DOCUMENTDB_DEB_PACKAGE build arg (installed via RUN --mount=type=bind) so build mode validates the source-built/unpublished .deb instead of silently installing the latest published package. Build-mode workflows pass it instead of the now-unused DEB_PACKAGE_REL_PATH; runtime deps still resolve from the official APT repo.

Co-authored-by: Copilot <copilot@github.com>
Signed-off-by: Ritvik Jayaswal <rjayaswal@microsoft.com>
@Ritvik-Jayaswal

Collaborator Author

Thanks, all three addressed in ba17048:

  1. e2e on Dockerfile changes: test-e2e.yml's pull_request.paths now includes .github/dockerfiles/**.
  2. Registry mode no longer short-circuits: probe-images now forces build mode when .github/dockerfiles/Dockerfile_extension is in the PR diff (computed via git diff against the PR base SHA), so the modified Dockerfile is actually built.
  3. Build-mode contract: rather than just passing a version pin, I kept build mode able to test source-built/unpublished .debs. Dockerfile_extension gains an optional DOCUMENTDB_DEB_PACKAGE build arg (installed via RUN --mount=type=bind so no COPY is needed on the release path), taking precedence over DOCUMENTDB_APT_VERSION. The build-mode workflows now pass DOCUMENTDB_DEB_PACKAGE=packages/$DEB_FILE instead of the now-unused DEB_PACKAGE_REL_PATH, so the compiled deb under test is what gets installed; its runtime deps (Citus/RUM/pgvector/PostGIS) still resolve from the official APT repo. DOCKER_BUILDKIT=1 is set on the plain docker build consumers since --mount needs BuildKit.

This addresses your "deeper limitation" point directly — build mode keeps source-built/unpublished coverage rather than only working for already-published versions.
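
The mode selection described in point 2 might look roughly like this. It is a sketch under assumptions: `select_image_mode` is a hypothetical name, the changed-file list is fed on stdin, and falling back to build mode when release images are missing is an assumption about probe-images rather than something stated in this thread.

```shell
# Decide between "build" and "registry" mode.
# stdin: changed paths, one per line.
# $1:    "true" if the release images for this version already exist.
select_image_mode() {
  # A change to the extension Dockerfile always forces a real build.
  if grep -qxF '.github/dockerfiles/Dockerfile_extension'; then
    echo build
  elif [ "$1" = "true" ]; then
    echo registry
  else
    echo build
  fi
}

# Example use (not executed here):
#   git diff --name-only "$BASE_SHA" HEAD | select_image_mode "$IMAGES_EXIST"
```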

rjayaswal and others added 3 commits June 30, 2026 14:23
The verify step matched the package version with grep -A1 '^Package:' |
grep -qx 'Version: ...', which assumes Version is the line immediately
after Package. In the real Debian Packages index the stanza order is
Package / Source / Version, so the Version line was never captured and
the check produced a false negative even when the version was published
(e.g. 0.113-0). Replace it with a stanza-aware awk parser.
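
A stanza-aware check could be sketched along these lines, using awk's paragraph mode so that field order inside a stanza does not matter. `has_package_version` is a hypothetical name and this is not necessarily the parser the commit added.

```shell
# Succeeds if a Debian Packages index (read from stdin) contains a stanza
# whose Package and Version fields match $1 and $2, in any field order.
has_package_version() {
  awk -v pkg="$1" -v ver="$2" '
    # RS="" puts awk in paragraph mode: each blank-line-separated stanza is
    # one record, and FS="\n" makes each line of the stanza one field.
    BEGIN { RS = ""; FS = "\n"; found = 0 }
    {
      p = ""; v = ""
      for (i = 1; i <= NF; i++) {
        if ($i ~ /^Package: /)      p = substr($i, 10)
        else if ($i ~ /^Version: /) v = substr($i, 10)
      }
      if (p == pkg && v == ver) found = 1
    }
    END { exit !found }
  '
}

# Example use (not executed here):
#   curl -fsSL "$PACKAGES_URL" | has_package_version postgresql-18-documentdb 0.113-0
```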

Co-authored-by: Copilot <copilot@github.com>
Signed-off-by: Ritvik Jayaswal <rjayaswal@microsoft.com>

GHCR rejects tags whose repository path contains uppercase characters
(repository name must be lowercase). The image references were built
directly from github.repository, which preserves the owner login case,
so builds failed on forks whose owner has uppercase letters
(e.g. Ritvik-Jayaswal). Normalize the owner/repo to lowercase for all
ghcr.io image refs in the documentdb build, watch idempotency check, and
promote workflows. The cosign certificate-identity-regexp keeps the
original case to match the OIDC certificate subject.
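
The normalization can be illustrated with a small sketch. `ghcr_image_ref` is a hypothetical helper, and the argument layout is an assumption made for the example.

```shell
# Build a ghcr.io image reference with the owner/repo path lowercased.
# $1: owner/repo as reported by github.repository (case preserved)
# $2: image name, $3: tag
ghcr_image_ref() {
  repo_lower="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
  printf 'ghcr.io/%s/%s:%s\n' "$repo_lower" "$2" "$3"
}

# Example use (not executed here):
#   IMAGE="$(ghcr_image_ref "$GITHUB_REPOSITORY" documentdb "$VERSION")"
```

Only the image path is lowercased here; the identity pattern passed to cosign would keep the original case, as the commit message notes.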

Co-authored-by: Copilot <copilot@github.com>
Signed-off-by: Ritvik Jayaswal <rjayaswal@microsoft.com>

The extension install RUN uses 'set -eux' (set -u). Docker does not
inject an ARG into the RUN environment when it has no value, so the APT
path (which passes only DOCUMENTDB_APT_VERSION) failed with
'DOCUMENTDB_DEB_PACKAGE: parameter not set'. Use ${VAR:-} default
expansion for both optional ARGs so an unset value is treated as empty.
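
The effect of default expansion under `set -u` can be shown with a short sketch. `choose_install_source` and its output strings are hypothetical; the precedence (local .deb over APT pin, otherwise the latest published package) follows the behavior described elsewhere in this PR.

```shell
# Runs in a subshell so that set -eu does not leak into the caller.
choose_install_source() (
  set -eu
  # ${VAR:-} expands to an empty string when VAR is unset, so set -u
  # does not abort when an optional build arg was not passed.
  deb="${DOCUMENTDB_DEB_PACKAGE:-}"
  apt="${DOCUMENTDB_APT_VERSION:-}"
  if [ -n "$deb" ]; then
    echo "deb:$deb"
  elif [ -n "$apt" ]; then
    echo "apt:postgresql-18-documentdb=$apt"
  else
    echo "apt:postgresql-18-documentdb"
  fi
)
```

Referencing `$DOCUMENTDB_DEB_PACKAGE` directly in the same function would abort with an unset-variable error whenever only the APT pin is supplied, which is the failure this commit describes.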

Co-authored-by: Copilot <copilot@github.com>
Signed-off-by: Ritvik Jayaswal <rjayaswal@microsoft.com>

The built-in GITHUB_TOKEN cannot push changes under .github/workflows/**
(GitHub blocks it without the 'workflow' scope, which that token cannot
be granted), so create-pull-request failed with 'refusing to allow a
GitHub App to create or update workflow ... without workflows
permission'. Drop the sed edits that rewrote DEFAULT_DOCUMENTDB_VERSION
and dispatch 'default:' values in build_documentdb_images.yml and
release_documentdb_images.yml. Those are only manual-dispatch fallbacks;
the watch workflow resolves the real version dynamically from upstream
releases, so leaving them static is harmless and keeps the automation
self-contained (no PAT/App token required). Update the PR body to match.

Co-authored-by: Copilot <copilot@github.com>
Signed-off-by: Ritvik Jayaswal <rjayaswal@microsoft.com>
@Ritvik-Jayaswal

Ritvik-Jayaswal commented Jun 30, 2026

Collaborator Author

Full workflow evidence: full watch → build → release → bump-PR chain runs green end-to-end

I ran the complete chain on my fork against a real upstream version (0.113.0) and it went fully green, including the auto-opened version-bump PR.

Evidence run

1. APT version verified before build (dashed 0.113-0, both arches)

APT package version: 0.113-0
Looking for postgresql-18-documentdb=0.113-0 (amd64) in https://documentdb.io/deb/dists/stable/main/binary-amd64/Packages
Found postgresql-18-documentdb=0.113-0 (amd64).
Looking for postgresql-18-documentdb=0.113-0 (arm64) in https://documentdb.io/deb/dists/stable/main/binary-arm64/Packages
Found postgresql-18-documentdb=0.113-0 (arm64).

2. Multi-arch images built, pushed, signed (cosign / keyless) and verified

Signing manifest-list@sha256:711e3065ddf2ffbb0e2504fb2d3b0f993071c15068e1396d639021ca31c2cdf1
tlog entry created with index: 2025736783
Pushing signature to: ghcr.io/ritvik-jayaswal/documentdb-kubernetes-operator/documentdb
Verification for ghcr.io/ritvik-jayaswal/documentdb-kubernetes-operator/documentdb@sha256:711e3065... --
  - The cosign claims were validated
  - The code-signing certificate was verified using trusted certificate authority certificates

3. Promoted release tags exist on both fork GHCR packages

| Package | Release tag |
| --- | --- |
| ghcr.io/ritvik-jayaswal/documentdb-kubernetes-operator/documentdb | 0.113.0 (+ 0.113.0-build-28470139299-1-5f7c17f, per-arch, .sig) |
| ghcr.io/ritvik-jayaswal/documentdb-kubernetes-operator/gateway | 0.113.0 (+ 0.113.0-build-28470139299-1-5f7c17f, per-arch, .sig) |

4. Auto-opened version-bump PR

  • PR: chore: bump DocumentDB default images to 0.113.0 (fork PR #2)
  • The bump PR only touches substantive version sources — no workflow files are edited:
    • operator/src/internal/utils/constants.go
    • operator/cnpg-plugins/sidecar-injector/internal/config/config.go + config_test.go
    • operator/documentdb-helm-chart/values.yaml
    • .github/dockerfiles/Dockerfile_gateway_public_image

The workflow version fallbacks remain intentionally static (the real version is resolved dynamically at run time), so the bump PR no longer rewrites workflow defaults (otherwise the workflow would need more permissions).
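
If a guard against regressions is wanted, a check that the bump change set stays clear of workflow files could be sketched as below. This is purely illustrative: `touches_workflows` is a hypothetical helper and no such guard is claimed to exist in the PR.

```shell
# Succeeds if any path read from stdin lives under .github/workflows/.
touches_workflows() {
  grep -q '^\.github/workflows/'
}

# Example use before opening the bump PR (not executed here):
#   if git diff --name-only | touches_workflows; then
#     echo "bump would edit workflow files; GITHUB_TOKEN cannot push these" >&2
#     exit 1
#   fi
```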


Labels

CI/CD documentation Improvements or additions to documentation enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

When a new version of DocumentDB is released, the extension and gateway images should be built automatically

5 participants