Conversation
Engineering spec for the /audit-js-assets . Covers sweep protocol, Chrome DevTools MCP tooling, heuristic filtering, slug generation, init and diff modes. Closes #606
prk-Jr
left a comment
There was a problem hiding this comment.
Summary
Reviewed the engineering spec and left a few implementation-facing questions inline. The main areas are URL normalization, wildcarding, third-party boundary matching, path derivation, and inject_in_head semantics for dynamic scripts.
| |---|---| | ||
| | `slug` | `{publisher_prefix}:{asset_stem}` — see slug algorithm below | | ||
| | `path` | `/{publisher_prefix}/{asset_stem}.js`, or wildcard variant if versioned path detected | | ||
| | `origin_url` | Full captured URL, with wildcard substitution applied if versioned | |
There was a problem hiding this comment.
❓ question — origin_url needs an explicit normalization policy
This reads as though we would persist the captured URL more or less verbatim. That seems likely to commit cache-busters, per-session tokens, consent params, or other transient query values into js-assets.toml, which would create churn and could leak noisy values into config.
Could we define the normalization rules here: whether fragments are dropped, which query params are preserved or stripped, and whether wildcarding happens before or after query normalization?
|
|
||
| Path segments matching either pattern are replaced with `*`: | ||
| - Semver: `\d+\.\d+[\.\d-]*` (e.g., `1.19.8-hcskhn`) | ||
| - Hash-like: `[a-f0-9]{6,}` or `[A-Za-z0-9]{8,}` between path separators |
There was a problem hiding this comment.
❓ question — The hash-like wildcard heuristic looks broad enough to overmatch stable paths
[A-Za-z0-9]{8,} will match a lot of stable path segments in the wild, not just hashes. That seems likely to over-wildcard legitimate routes and produce unstable proxy config.
Would it make sense to tighten this to a higher-entropy pattern or otherwise define a more hash-specific rule so engineering does not have to guess where the false-positive boundary should be?
| 2. Open Chrome via `mcp__chrome-devtools__new_page`, navigate to target URL via `mcp__chrome-devtools__navigate_page` | ||
| 3. Wait for full page load + ~6s settle window for async script loads (`mcp__chrome-devtools__wait_for`) | ||
| 4. In parallel: | ||
| - `mcp__chrome-devtools__list_network_requests` → filter for requests where URL ends in `.js` or `Content-Type: application/javascript`, and origin ≠ `publisher.domain` |
There was a problem hiding this comment.
❓ question — Third-party boundary and filter matching semantics need to be explicit
origin ≠ publisher.domain leaves a few important cases open: www. variants, publisher-owned subdomains/CDNs, and how the heuristic filter list should match real hosts like www.googletagmanager.com instead of just apex domains.
Could we specify whether host matching here is exact-host, eTLD+1, or suffix-with-dot-boundary matching? That will directly affect what gets surfaced vs filtered.
| | Field | Derivation | | ||
| |---|---| | ||
| | `slug` | `{publisher_prefix}:{asset_stem}` — see slug algorithm below | | ||
| | `path` | `/{publisher_prefix}/{asset_stem}.js`, or wildcard variant if versioned path detected | |
There was a problem hiding this comment.
❓ question — The path derivation formula conflicts with the examples below
The table says /{publisher_prefix}/{asset_stem}.js, but the examples later use /sdk/aB3kR7mN.js and /raven-static/*. That leaves the implementation having to infer the canonical route shape.
Can we pick one formula and update the examples so they match it exactly?
| | `path` | `/{publisher_prefix}/{asset_stem}.js`, or wildcard variant if versioned path detected | | ||
| | `origin_url` | Full captured URL, with wildcard substitution applied if versioned | | ||
| | `ttl_sec` | Omitted — proxy defaults to 1800 (wildcard) or 3600 (fixed) | | ||
| | `inject_in_head` | `true` if URL appeared in head script list from DOM evaluation, else `false` | |
There was a problem hiding this comment.
❓ question — How should inject_in_head behave for dynamically inserted scripts that are no longer in document.head at snapshot time?
The current rule makes sense for static markup, but some loaders briefly insert or move script tags and then remove them. In those cases the network sweep may see the request even though the later DOM query does not.
Do we want the source of truth here to be the final DOM snapshot only, or should the implementation treat some dynamic head insertions as inject_in_head = true as well?
aram356
left a comment
There was a problem hiding this comment.
Summary
Well-structured engineering spec for the JS Asset Auditor. The Auditor/Proxy separation and diff mode design are strong. However, several MCP tool assumptions are incorrect and would cause the skill to fail at implementation time, and the path derivation algorithm contradicts its own examples.
Blocking
🔧 wrench
- MCP tool names are wrong: every
mcp__chrome-devtools__*reference uses an incorrect prefix — actual names aremcp__plugin_chrome-devtools-mcp_chrome-devtools__*(line 36+) wait_formisuse: tool waits for text on page, not a settle window — needevaluate_scriptwithsetTimeoutinstead (line 37)list_network_requestsfiltering: tool supportsresourceTypesfiltering, not URL/Content-Type filtering — post-processing needed (line 39)- Path derivation contradicts examples: algorithm says
/{publisher_prefix}/{asset_stem}.jsbut example shows/sdk/aB3kR7mN.js— these are different structures (line 68 vs 119)
❓ question
- Slug concatenation separator:
sha256(publisher.domain + origin_url)doesn't specify a separator — and base62 encoding isn't standardized (character set ordering). Both must be pinned for Auditor/Proxy consistency (line 88) - Skill location: spec says
.claude/skills/audit-js-assets.mdbut project uses.claude/commands/for all slash commands — reason for new directory? (line 186)
Non-blocking
🤔 thinking
- Wildcard regex too broad:
[A-Za-z0-9]{8,}matches stable path segments likeanalytics,bootstrap(line 100) stale_ttl_secomitted: Proxy spec defines this field but Auditor doesn't mention it in generated entries (line 70)platform.twitter.com: may need updating toplatform.x.compost-rebrand (line 58)- 6s settle window: may miss late-loading ad tech scripts — consider making configurable (line 37)
- MCP permission grants:
close_pageandwait_forare not pre-approved in.claude/settings.json— will trigger permission prompts - No error handling specified: DNS failures, ad blockers, browser crashes during sweep — worth documenting expected behavior
🌱 seedling
docs/superpowers/specs/is new: existing docs usedocs/guide/,docs/internal/,docs/epics/— convention should be documented- Wildcard detection asymmetry: only defined in Auditor spec, not cross-referenced in Proxy spec — future tools won't know the patterns
⛏ nitpick
- Relative link to Proxy spec: dead if this PR merges first (line 6, 207)
👍 praise
- Diff mode design: commented-out TOML + never-auto-remove is excellent
- Auditor/Proxy separation: clean architectural boundary with minimal coupling
- Terminal summary: shows filtered-with-reason, slug mapping, wildcard indicators
inject_in_headDOM detection: sound approach usingdocument.head.querySelectorAll
CI Status
- cargo fmt: PASS
- cargo test: PASS
- vitest: PASS
- format-typescript: PASS
- format-docs: FAIL
- CodeQL: PASS
- integration tests: PASS
| 2. Open Chrome via `mcp__chrome-devtools__new_page`, navigate to target URL via `mcp__chrome-devtools__navigate_page` | ||
| 3. Wait for full page load + ~6s settle window for async script loads (`mcp__chrome-devtools__wait_for`) | ||
| 4. In parallel: | ||
| - `mcp__chrome-devtools__list_network_requests` → filter for requests where URL ends in `.js` or `Content-Type: application/javascript`, and origin ≠ `publisher.domain` |
There was a problem hiding this comment.
🔧 wrench — MCP tool names are wrong throughout the spec
Every mcp__chrome-devtools__* reference uses an incorrect prefix. The actual tool names in .claude/settings.json and .claude/settings.local.json are:
mcp__plugin_chrome-devtools-mcp_chrome-devtools__new_page
mcp__plugin_chrome-devtools-mcp_chrome-devtools__navigate_page
mcp__plugin_chrome-devtools-mcp_chrome-devtools__list_network_requests
...etc
Affects 12 occurrences across the Sweep Protocol (lines 36-43) and Implementation (lines 188-193) sections. A skill implemented from this spec will fail on every MCP call.
Fix: Replace all mcp__chrome-devtools__ with mcp__plugin_chrome-devtools-mcp_chrome-devtools__.
| 3. Wait for full page load + ~6s settle window for async script loads (`mcp__chrome-devtools__wait_for`) | ||
| 4. In parallel: | ||
| - `mcp__chrome-devtools__list_network_requests` → filter for requests where URL ends in `.js` or `Content-Type: application/javascript`, and origin ≠ `publisher.domain` | ||
| - `mcp__chrome-devtools__evaluate_script` → `Array.from(document.head.querySelectorAll('script[src]')).map(s => s.src)` → collect head-loaded script URLs |
There was a problem hiding this comment.
🔧 wrench — wait_for does not provide a settle window
The spec says: "Wait for full page load + ~6s settle window for async script loads (wait_for)"
But wait_for waits for specific text to appear on the page (params: text — list of strings, timeout — max wait ms). It is not a network-idle or timeout mechanism.
Fix: Use evaluate_script with:
await new Promise(r => setTimeout(r, 6000))Or poll list_network_requests to detect when no new script requests appear.
🤔 thinking — 6s may be too short for complex ad tech pages
Header bidding waterfalls, consent-gated loading, and lazy demand partners can take 10-15s. Consider making this configurable (e.g., --settle 10s) or documenting the tradeoff.
| - `mcp__chrome-devtools__list_network_requests` → filter for requests where URL ends in `.js` or `Content-Type: application/javascript`, and origin ≠ `publisher.domain` | ||
| - `mcp__chrome-devtools__evaluate_script` → `Array.from(document.head.querySelectorAll('script[src]')).map(s => s.src)` → collect head-loaded script URLs | ||
| 5. Apply heuristic filter (see below) | ||
| 6. For each surviving asset, generate a `[[js_assets]]` entry (see below) |
There was a problem hiding this comment.
🔧 wrench — list_network_requests filtering claim is inaccurate
The spec says: "filter for requests where URL ends in .js or Content-Type: application/javascript"
The tool supports resourceTypes filtering (e.g., ["script"]) but does not support URL pattern or Content-Type filtering. Those must be done in post-processing.
Fix: Rewrite to: "list_network_requests with resourceTypes: ["script"] → post-filter to exclude first-party origins matching publisher.domain"
| | Social embeds | `platform.twitter.com`, `connect.facebook.net` | | ||
|
|
||
| **`googletagmanager.com` is not filtered** — GTM is ad tech and should be proxied. | ||
|
|
There was a problem hiding this comment.
🤔 thinking — platform.twitter.com may be outdated
Twitter rebranded to X. The domain may already be platform.x.com or will change. Consider listing both, or verifying the current domain.
| | Field | Derivation | | ||
| |---|---| | ||
| | `slug` | `{publisher_prefix}:{asset_stem}` — see slug algorithm below | | ||
| | `path` | `/{publisher_prefix}/{asset_stem}.js`, or wildcard variant if versioned path detected | |
There was a problem hiding this comment.
🔧 wrench — Path derivation algorithm contradicts the examples
This table says path = "/{publisher_prefix}/{asset_stem}.js". For the first example (publisher_prefix = aB3kR7mN, asset_stem = prebid-load), this yields /aB3kR7mN/prebid-load.js.
But the example on line 119 shows path = "/sdk/aB3kR7mN.js" — a completely different structure. The Proxy spec shows the same contradictory examples.
Since path is the routing key between Auditor and Proxy, this must be resolved.
Fix: Align the algorithm description with the examples (or vice versa).
| | `slug` | `{publisher_prefix}:{asset_stem}` — see slug algorithm below | | ||
| | `path` | `/{publisher_prefix}/{asset_stem}.js`, or wildcard variant if versioned path detected | | ||
| | `origin_url` | Full captured URL, with wildcard substitution applied if versioned | | ||
| | `ttl_sec` | Omitted — proxy defaults to 1800 (wildcard) or 3600 (fixed) | |
There was a problem hiding this comment.
🤔 thinking — stale_ttl_sec is not mentioned
The Proxy spec defines stale_ttl_sec: Option<u32> (default 86400) in the JsAsset struct. This field is omitted from the Auditor's Asset Entry Generation table and generated TOML examples. While the proxy defaults handle the omission, it would be consistent to mention it (like ttl_sec is): "Omitted — proxy defaults to 86400".
|
|
||
| **MCP tools used:** | ||
| - `mcp__chrome-devtools__new_page` — open browser tab | ||
| - `mcp__chrome-devtools__navigate_page` — load publisher URL |
There was a problem hiding this comment.
❓ question — Skill location: .claude/skills/ vs .claude/commands/
The spec places the skill at .claude/skills/audit-js-assets.md, but the project has no .claude/skills/ directory. All existing slash commands live in .claude/commands/ (check-ci.md, verify.md, test-all.md, etc.).
Is there a reason to introduce a new directory? If not, .claude/commands/audit-js-assets.md would be consistent with existing patterns.
| **Date:** 2026-04-01 | ||
| **Status:** Approved for engineering breakdown | ||
| **Related:** [JS Asset Proxy spec](2026-04-01-js-asset-proxy-design.md) | ||
|
|
There was a problem hiding this comment.
⛏ nitpick — Relative link to Proxy spec may be broken at merge time
[JS Asset Proxy spec](2026-04-01-js-asset-proxy-design.md) assumes the Proxy spec exists in the same directory. If this PR merges first, the link is dead. Same issue on line 207.
| ``` | ||
|
|
||
| --- | ||
|
|
There was a problem hiding this comment.
👍 praise — Diff mode design is excellent
Appending new entries as commented-out TOML blocks keeps the file valid, requires explicit operator action, and provides full provenance. The "never auto-remove" policy for missing assets is the right default for a monitoring tool.
| - `mcp__chrome-devtools__evaluate_script` → `Array.from(document.head.querySelectorAll('script[src]')).map(s => s.src)` → collect head-loaded script URLs | ||
| 5. Apply heuristic filter (see below) | ||
| 6. For each surviving asset, generate a `[[js_assets]]` entry (see below) | ||
| 7. Write output (init or diff mode) |
There was a problem hiding this comment.
👍 praise — inject_in_head detection via DOM query is sound
document.head.querySelectorAll('script[src]') correctly captures what's in <head> at load time, which is exactly what the Proxy needs to replicate at runtime.
Summary
Engineering spec for the JS Asset Auditor — a Claude Code slash command that sweeps a publisher page via Chrome DevTools MCP, detects third-party JS assets, and generates or diffs
js-assets.toml.Closes #606
Docs added
docs/superpowers/specs/2026-04-01-js-asset-auditor-design.md— engineering spec covering sweep protocol, Chrome DevTools MCP tooling, heuristic filtering, slug generation, init and diff modesTest plan