feat: batch per-column PII tag ALTERs into multi-action statements #33

Merged
pquadri merged 2 commits into main from feat/batched-alter-pii-tag-sync
Jun 10, 2026

@pquadri pquadri commented Jun 10, 2026

Collaborator

What

Collapses per-column ALTER TABLE ... MODIFY COLUMN ... SET/UNSET TAG into single multi-action ALTER TABLE statements during PII tag sync. SET and UNSET are kept as separate statements, no-op on empty, and chunked at 100 actions/statement to stay under Snowflake's statement-size limit.
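As a rough illustration of the batching and chunking described above, here is a minimal sketch for the SET side only. The function and helper names (`build_set_tag_statements`, `quote_ident`, `quote_literal`) are hypothetical and not taken from the PR's code; only the 100-action chunk size and the no-op-on-empty behavior come from the description.

```python
MAX_ACTIONS_PER_STATEMENT = 100  # chunk size stated in the PR description


def quote_ident(name: str) -> str:
    """Double-quote an identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value: str) -> str:
    """Single-quote a string literal, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_set_tag_statements(
    table: str,
    tag: str,
    values_by_column: dict[str, str],
    chunk_size: int = MAX_ACTIONS_PER_STATEMENT,
) -> list[str]:
    """Build multi-action ALTER TABLE statements that SET one tag on many columns.

    Hypothetical helper: returns an empty list when there is nothing to do,
    and splits the actions into chunks of at most `chunk_size` per statement.
    """
    actions = [
        f"{quote_ident(column)} SET TAG {tag}={quote_literal(value)}"
        for column, value in values_by_column.items()
    ]
    statements = []
    for start in range(0, len(actions), chunk_size):
        chunk = actions[start:start + chunk_size]
        statements.append(f"ALTER TABLE {table} MODIFY COLUMN " + ", ".join(chunk))
    return statements
```

With 250 tagged columns this sketch yields three statements (100, 100, and 50 actions) instead of 250 single-column ALTERs.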

Why

Wide tables previously emitted dozens of individual ALTER statements per sync — one per column tag change. Batching cuts that to a handful, reducing cloud-services cost by roughly 20-40x on wide tables.

Compatibility

The old _set_column_tag / _unset_column_tag are retained as single-element shims, so existing callers keep working unchanged.
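One way such single-element shims could look is sketched below. The class, the method signatures, and `_set_column_tags_batch` are assumptions made for illustration (the real signatures are not shown in this PR page); the batch methods here only record calls instead of executing SQL.

```python
class TagSyncSketch:
    """Hypothetical stand-in for the sync class; records calls instead of running SQL."""

    def __init__(self) -> None:
        self.calls: list[tuple] = []

    def _set_column_tags_batch(self, table: str, tag: str, values_by_column: dict[str, str]) -> None:
        # Assumed batch entry point: does nothing when there are no columns to tag.
        if not values_by_column:
            return
        self.calls.append(("set", table, tag, dict(values_by_column)))

    def _unset_column_tags_batch(self, table: str, tags_by_column: dict[str, list[str]]) -> None:
        # Assumed batch entry point: does nothing when there are no tags to remove.
        if not tags_by_column:
            return
        self.calls.append(("unset", table, {c: list(t) for c, t in tags_by_column.items()}))

    # Backward-compatible shims: each old single-column call becomes a one-element batch.
    def _set_column_tag(self, table: str, column: str, tag: str, value: str) -> None:
        self._set_column_tags_batch(table, tag, {column: value})

    def _unset_column_tag(self, table: str, column: str, tag: str) -> None:
        self._unset_column_tags_batch(table, {column: [tag]})
```

Routing the old methods through the batch path keeps a single code path for statement generation, so existing callers do not need to change.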

Tests

60 tests pass (uv run pytest), ruff check clean.

⚠️ Open item for reviewer / author before merge

The multi-column ALTER TABLE ... MODIFY COLUMN ... batched syntax was validated only against mocked cursors — the agent's Snowflake MCP access is read-only, so no live execution was possible. A reviewer or the author must smoke-test the generated multi-action ALTER against a scratch table in a writable dev schema before merging to confirm Snowflake accepts the batched statement shape.

Collapse per-column ALTER TABLE ... MODIFY COLUMN ... SET/UNSET TAG into
single multi-action ALTER TABLE statements (SET and UNSET kept separate),
no-op on empty, chunked at 100 actions/statement to stay under Snowflake's
statement-size limit. Cuts cloud-services cost from dozens of statements
per table to a handful. Old _set_column_tag/_unset_column_tag retained as
single-element shims for backward compatibility.
@pquadri pquadri requested a review from giamo June 10, 2026 09:05

pquadri commented Jun 10, 2026

Collaborator Author

Resolved the batched-ALTER grammar open item — all forms now live-validated against Snowflake (8/8 checks pass).

Final grammar (asymmetric, per Snowflake):

  • Column SET batches across columns in ONE statement: ALTER TABLE t MODIFY COLUMN "C1" SET TAG g.p.pii='a', "C2" SET TAG g.p.pii='b'
  • Column UNSET is one statement per column — Snowflake rejects multi-column UNSET. Multiple tags on the same column are combined in that column's single statement: ALTER TABLE t MODIFY COLUMN "C1" UNSET TAG t1, t2
  • Table-level SET/UNSET batching across tags was already correct and left unchanged ✅

The earlier multi-column-UNSET emission (which Snowflake rejected) is fixed: _unset_column_tags_batch now groups by column and emits one ALTER per column, while SET keeps batching across columns through _execute_column_tag_actions. Unit tests cover multi-column UNSET (one stmt/col), multi-tag-per-column UNSET, SET batching, chunking, SET/UNSET separation, and the no-op case — all green.
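The per-column UNSET grouping described above can be sketched as follows. This is not the PR's `_unset_column_tags_batch`; `build_unset_tag_statements` and `quote_ident` are hypothetical names, and the input shape (a list of column/tag pairs) is an assumption. The statement shape follows the UNSET example in this comment.

```python
def quote_ident(name: str) -> str:
    """Double-quote an identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_unset_tag_statements(table: str, column_tag_pairs: list[tuple[str, str]]) -> list[str]:
    """Emit one ALTER per column, combining all of that column's tags in one statement.

    Hypothetical helper: columns keep first-seen order, and a tag repeated for
    the same column is only listed once.
    """
    tags_by_column: dict[str, list[str]] = {}
    for column, tag in column_tag_pairs:
        tags = tags_by_column.setdefault(column, [])
        if tag not in tags:
            tags.append(tag)
    return [
        f"ALTER TABLE {table} MODIFY COLUMN {quote_ident(column)} UNSET TAG {', '.join(tags)}"
        for column, tags in tags_by_column.items()
    ]
```

An empty input produces no statements, which matches the no-op case listed among the unit tests.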

@pquadri pquadri changed the title from "perf: batch per-column PII tag ALTERs into multi-action statements" to "feat: batch per-column PII tag ALTERs into multi-action statements" Jun 10, 2026
@pquadri pquadri merged commit 6936e9f into main Jun 10, 2026
4 checks passed