
Include notebooks and data preparations in tag/action selection #2196

Draft
rbscgh wants to merge 1 commit into dataform-co:main from rbscgh:fix_notebook_tag_inclusion

rbscgh commented Jun 2, 2026

Summary

prune() (cli/api/commands/prune.ts) — the function that turns a compiled graph plus a run config (selected tags/actions) into the set of actions to execute — only considered tables, operations and assertions. Notebooks and data preparations were:

  • Excluded from the selection union, so a --tags (or --actions) selector could never match them, and dependency/dependent traversal never reached them.
  • Never filtered in the returned graph: they were carried through the ...compiledGraph spread unchanged, so tag selection neither included nor excluded them correctly.

Net effect: a notebook registered with a tag via the JS API, e.g.

notebook({
  name: "run_notebook",
  filename: "notebooks/release_reconciliation_diffs.ipynb",
  tags: ["reconciliation"],
  dependencyTargets: [{ name: "release_reconciliation_diffs" }]
})

would not appear when you "execute by tag" in the CLI or the UI — the tagged notebook was silently dropped from the selected-actions list (e.g. 5 actions shown instead of 6).
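
To make the defect concrete, here is a minimal model of the pre-fix behavior. It is a sketch, not the code in cli/api/commands/prune.ts: the `SimpleAction` and `SimpleGraph` shapes, the function names, and the assumption that the upstream table shares the notebook's tag are all hypothetical and chosen only for illustration.

```typescript
// Hypothetical, simplified action shape; the real compiled protos carry far more fields.
interface SimpleAction {
  name: string;
  tags: string[];
}

interface SimpleGraph {
  tables: SimpleAction[];
  operations: SimpleAction[];
  assertions: SimpleAction[];
  notebooks: SimpleAction[];
  dataPreparations: SimpleAction[];
}

// Models the pre-fix selection: only three action kinds enter the union.
function selectByTagBeforeFix(graph: SimpleGraph, tags: string[]): Set<string> {
  const wanted = new Set(tags);
  const allActions = [...graph.tables, ...graph.operations, ...graph.assertions];
  return new Set(
    allActions.filter(a => a.tags.some(t => wanted.has(t))).map(a => a.name)
  );
}

// Models the pre-fix output: notebooks and dataPreparations ride along in the spread.
function pruneBeforeFix(graph: SimpleGraph, tags: string[]): SimpleGraph {
  const included = selectByTagBeforeFix(graph, tags);
  const keep = (a: SimpleAction): boolean => included.has(a.name);
  return {
    ...graph,
    tables: graph.tables.filter(keep),
    operations: graph.operations.filter(keep),
    assertions: graph.assertions.filter(keep)
  };
}

const buggyGraph: SimpleGraph = {
  // Assumption for illustration: the upstream table shares the notebook's tag.
  tables: [{ name: "release_reconciliation_diffs", tags: ["reconciliation"] }],
  operations: [],
  assertions: [],
  notebooks: [
    { name: "run_notebook", tags: ["reconciliation"] },
    { name: "other_notebook", tags: ["unrelated"] }
  ],
  dataPreparations: []
};

const buggySelection = selectByTagBeforeFix(buggyGraph, ["reconciliation"]);
const buggyPruned = pruneBeforeFix(buggyGraph, ["reconciliation"]);

console.log(buggySelection.has("run_notebook")); // false: the tagged notebook is never matched
console.log(buggyPruned.notebooks.length); // 2: both notebooks pass through unfiltered
```

In this model the two symptoms are independent: the selection set misses the tagged notebook, while the returned graph still carries every notebook regardless of tag.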

The notebook proto itself is fine — core/actions/notebook.ts correctly sets proto.tags, and notebook_test.ts already asserts the tags survive compilation. The defect was purely in the downstream selection step.

This isn't a niche path — tagging notebooks via the JS API is a documented, commonly-used pattern (e.g. this walkthrough: https://gtm-gear.com/posts/dataform-notebooks/).

Changes

  • Add notebooks and dataPreparations to the allActions union in computeIncludedActionNames, so tag/action selectors match them and dependency/dependent traversal includes them.
  • Filter notebooks and dataPreparations in the returned graph instead of leaking them through the spread.
  • Widen the CompileAction type to include INotebook | IDataPreparation.

Data preparations had the identical defect (same tags + dependencyTargets shape, same exclusion), so they're fixed symmetrically.
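
The shape of the fix can be sketched as follows. This is a simplified model under stated assumptions, not the PR's diff: `SketchAction`, `SketchGraph`, `SketchRunConfig`, `computeIncludedNamesSketch` and `pruneSketch` are hypothetical stand-ins, dependencies are resolved by bare name only, and dependent traversal is omitted.

```typescript
// Hypothetical, simplified stand-ins for the compiled graph and run config.
interface SketchTarget {
  name: string;
}

interface SketchAction {
  name: string;
  tags: string[];
  dependencyTargets: SketchTarget[];
}

interface SketchGraph {
  tables: SketchAction[];
  operations: SketchAction[];
  assertions: SketchAction[];
  notebooks: SketchAction[];
  dataPreparations: SketchAction[];
}

interface SketchRunConfig {
  tags: string[];
  includeDependencies: boolean;
}

// The union now covers all five action kinds.
function allActionsOf(graph: SketchGraph): SketchAction[] {
  return [
    ...graph.tables,
    ...graph.operations,
    ...graph.assertions,
    ...graph.notebooks,
    ...graph.dataPreparations
  ];
}

function computeIncludedNamesSketch(graph: SketchGraph, config: SketchRunConfig): Set<string> {
  const allActions = allActionsOf(graph);
  const byName = new Map<string, SketchAction>();
  allActions.forEach(a => byName.set(a.name, a));

  const wanted = new Set(config.tags);
  const included = new Set<string>(
    allActions.filter(a => a.tags.some(t => wanted.has(t))).map(a => a.name)
  );

  if (config.includeDependencies) {
    // Walk dependencyTargets transitively from every tag-selected action.
    const queue = Array.from(included);
    while (queue.length > 0) {
      const current = byName.get(queue.pop() as string);
      if (current === undefined) {
        continue;
      }
      for (const dep of current.dependencyTargets) {
        if (byName.has(dep.name) && !included.has(dep.name)) {
          included.add(dep.name);
          queue.push(dep.name);
        }
      }
    }
  }
  return included;
}

// Every action kind is filtered explicitly; nothing leaks through a spread.
function pruneSketch(graph: SketchGraph, config: SketchRunConfig): SketchGraph {
  const included = computeIncludedNamesSketch(graph, config);
  const keep = (a: SketchAction): boolean => included.has(a.name);
  return {
    tables: graph.tables.filter(keep),
    operations: graph.operations.filter(keep),
    assertions: graph.assertions.filter(keep),
    notebooks: graph.notebooks.filter(keep),
    dataPreparations: graph.dataPreparations.filter(keep)
  };
}

const fixedGraph: SketchGraph = {
  tables: [{ name: "source_table", tags: [], dependencyTargets: [] }],
  operations: [{ name: "op_tag2", tags: ["tag2"], dependencyTargets: [] }],
  assertions: [],
  notebooks: [
    { name: "nb_tag1", tags: ["tag1"], dependencyTargets: [{ name: "source_table" }] },
    { name: "nb_tag2", tags: ["tag2"], dependencyTargets: [] }
  ],
  dataPreparations: [{ name: "dp_tag1", tags: ["tag1"], dependencyTargets: [] }]
};

const fixedPruned = pruneSketch(fixedGraph, { tags: ["tag1"], includeDependencies: true });
console.log(fixedPruned.notebooks.map(a => a.name)); // [ 'nb_tag1' ]
console.log(fixedPruned.tables.map(a => a.name)); // [ 'source_table' ]
```

With `includeDependencies` set to false in this model, `source_table` would drop out and only the tag1 notebook and data preparation would remain.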

Test plan

  • Added the test "prune notebooks and data preparations with --tags": a tag1 notebook and data preparation are selected, and the tag2 ones are dropped (matching the behavior of the operation alongside them).
  • Added the test "prune notebooks with --tags pulls in dependencies": selecting by a notebook's tag with includeDependencies pulls in its untagged dependency.
  • bazel test //tests/api:api.spec passes (all 6 pre-existing prune tests + 2 new).
  • bazel test //core:actions/notebook_test //core:actions/data_preparation_test passes.

Note (out of scope): the OSS Builder (cli/api/commands/build.ts) only turns tables/operations/assertions into executable tasks; notebooks are executed by the managed service, not the OSS CLI runner. This PR fixes selection (the pruned graph the UI reads to show "N actions selected for execute"), which is the reported bug.

🤖 Generated with Claude Code

prune() only considered tables, operations and assertions when computing
the set of actions selected by --tags / --actions. Notebooks and data
preparations were excluded from the selection union, so a tag selector
could never match them, and they leaked through the returned graph
unfiltered (passed via the spread but never filtered).

This meant a notebook registered with a tag via the JS API, e.g.
  notebook({ name: "run_notebook", filename: "...", tags: ["reconciliation"] })
would not appear when executing by that tag in the CLI or UI.

Add notebooks and data preparations to both the selection union (so tag
and action selectors match them and dependency/dependent traversal
includes them) and the filtered output graph.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

google-cla Bot commented Jun 2, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.
