feat(scanning): advisory npm vulnerability scanning with admin UI#8

Merged
albanm merged 41 commits into main from feat-vuln-scan
Jun 9, 2026

@albanm albanm commented Jun 9, 2026

Add advisory, admin-only vulnerability scanning of npm artefacts via the bundled osv-scanner v2 binary. Scans run on upload (async, non-blocking), on a periodic interval (which also refreshes the offline OSV DB), and on demand (`POST /artefacts/:id/scan`). A summary lives on the artefact's `scan` field; full findings are stored in the `artefact-scans` collection and served by `GET /artefacts/:id/scan`. All scan data is stripped for non-admin callers (reads, listings, and upload responses). The UI adds an admin per-artefact scan panel, a vulnerabilities column and sort on the artefact list, and a fleet-wide overview on the admin dashboard. Scanning is off by default (`scanning.enabled`); when the binary is absent, scans fail gracefully.

Why: give admins visibility into known vulnerabilities in bundled node_modules artefacts without ever blocking uploads or downloads.

Regression risks:

  • Scanning is off by default and additive to response shapes (admins gain a scan field; non-admin shapes unchanged), so deployments without it are unaffected.
  • sort=vulnerabilities and the fleet worstOffenders aggregation sort on scan.summary.* without an index (consistent with existing unindexed sorts); watch the aggregation sort-memory limit at fleet scale.
  • On boot and on the rescan interval the server refreshes the ~200 MB OSV DB and scans all npm artefacts, per replica — operational load to be aware of on restart.

albanm and others added 30 commits June 4, 2026 17:24
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…spike)

Verified osv-scanner v2.2.3 detects lockfile-less bundled node_modules via
--experimental-plugins javascript/packagejson on `scan source`. Offline local-DB
mode confirmed. SBOM fallback not needed. Plan updated with pinned invocations.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adds a readOnly `scan` summary sub-document to the artefact JSON schema,
regenerates TypeScript types, adds ArtefactScan/ScanFinding/ScanLicense
types and the `artefactScans` collection getter to mongo.ts.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Downloads the statically-linked osv-scanner v2.2.3 binary in a dedicated
build stage and copies it to /usr/local/bin in the runtime image, making
it available on PATH for the scanning service.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ersion, polish

- runner: cache osv-scanner --version (was spawned per scan); populate
  vulnDbUpdatedAt from the local OSV DB file mtime so admins can gauge result
  freshness (the field was declared but never written).
- operations: guard firstFixedVersion against cross-ecosystem affected ranges.
- service: re-queue keeps the prior summary visible (dotted scan.status/queuedAt
  instead of replacing the whole scan object); persist vulnDbUpdatedAt on the
  scan summary and the artefact-scans doc.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
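
The re-queue behaviour described in the commit above (dotted `scan.status`/`queuedAt` writes instead of replacing the whole `scan` object) can be illustrated with a minimal sketch. The document shape, the `requeueUpdate` helper and the in-memory `applySet` function are hypothetical and exist only to show the effect without a database; the actual service code is not shown on this page.

```typescript
// Hypothetical update builder: touches only two dotted paths, so any
// existing scan.summary on the document is left untouched.
function requeueUpdate(now: Date): { $set: Record<string, unknown> } {
  return {
    $set: {
      "scan.status": "queued",
      "scan.queuedAt": now.toISOString(),
    },
  };
}

// Tiny in-memory stand-in for a MongoDB-style $set with dotted paths
// (illustration only; a real deployment would pass the update to the driver).
function applySet(
  doc: Record<string, unknown>,
  update: { $set: Record<string, unknown> },
): Record<string, unknown> {
  const copy = JSON.parse(JSON.stringify(doc)) as Record<string, unknown>;
  for (const [path, value] of Object.entries(update.$set)) {
    const parts = path.split(".");
    const last = parts.pop() as string;
    let target = copy;
    for (const key of parts) {
      if (typeof target[key] !== "object" || target[key] === null) {
        target[key] = {};
      }
      target = target[key] as Record<string, unknown>;
    }
    target[last] = value;
  }
  return copy;
}

const before = {
  _id: "example-artefact",
  scan: { status: "done", summary: { critical: 1, high: 2 } },
};
const after = applySet(before, requeueUpdate(new Date(0)));
console.log(JSON.stringify(after.scan));
```

Replacing the whole `scan` object with `{ status: "queued" }` would instead drop `summary`, which is the behaviour the commit moves away from.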
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ted)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…scan race

- pruneExtracted: anchor the tmp-dir skip to a trailing `.tmp.<pid>` regex so it
  can't be confused with a slot whose encoded id contains ".tmp." mid-string.
- service: comment that an artefact uploaded mid-rescan may be pruned once and
  self-heal on its next scan.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
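
As a rough sketch of the anchoring fix in the commit above: a pattern that matches `.tmp.<pid>` only at the end of the directory name. The constant and function names are hypothetical, and the sketch assumes the pid is written as decimal digits.

```typescript
// Matches only names that END in ".tmp.<digits>" (assumed pid format).
const TMP_DIR_SUFFIX = /\.tmp\.\d+$/;

// Hypothetical helper: true for in-progress extraction dirs that pruning should skip.
function isTmpExtractDir(name: string): boolean {
  return TMP_DIR_SUFFIX.test(name);
}

console.log(isTmpExtractDir("some-slot.tmp.4242"));     // trailing suffix: skipped
console.log(isTmpExtractDir("pkg.tmp.helpers%401.0.0")); // ".tmp." mid-string: not skipped
```

An unanchored `includes(".tmp.")` check would treat the second name as a temporary directory, which is the confusion the commit removes.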
…ner install

- api/config/development.js: enable scanning, osv-scanner on PATH, dbDir under ./data
- AGENTS.md: dev install instructions for the osv-scanner v2 binary
- test: rename the POST /:id/scan case (dev default is now enabled; assertion
  already tolerates 202 or 503)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…irs scan

The scan cache lives under a git-ignored path in dev (./data) and could be any
ignored/mounted path in prod; osv-scanner honors .gitignore by default and was
silently skipping the whole extracted tree (0 findings). A bundled .gitignore
inside an artefact could hide files too. Disable ignore handling for both
scanDir and refreshDb.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ng is on

Dev now enables scanning, so npm uploads auto-scan and would race/overwrite the
injected scan state (and create a scan doc, breaking the "never scanned" 404
case). Use file artefacts (which never auto-scan) for the inject-based read/strip
tests, and a never-uploaded id for the 404 case. Deterministic whether scanning
is enabled or not.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The prior fix's replace-all matched only refreshDb's args (8-space indent inside
a try block); scanDir's args (6-space indent) never got --no-ignore, so real
artefact scans still skipped the git-ignored cache dir and returned 0 findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
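
One way to avoid a fix landing on only one of two call sites, as happened with `--no-ignore` above, is to build both argument lists from a shared constant. This is a sketch under assumptions, not the PR's code: the function names mirror `scanDir` and `refreshDb` from the commit message, but the exact flag lists here (beyond `--no-ignore` and the `--experimental-plugins javascript/packagejson` invocation quoted earlier) are illustrative guesses.

```typescript
// Flags every osv-scanner invocation should carry; defined once so a change
// cannot reach one call site and miss the other.
const COMMON_FLAGS: readonly string[] = ["--no-ignore", "--format", "json"];

// Assumed argument list for scanning an extracted artefact directory.
function scanDirArgs(dir: string): string[] {
  return [
    "scan", "source",
    ...COMMON_FLAGS,
    "--offline",
    "--experimental-plugins", "javascript/packagejson",
    dir,
  ];
}

// Assumed argument list for refreshing the offline OSV database.
function refreshDbArgs(dir: string): string[] {
  return [
    "scan", "source",
    ...COMMON_FLAGS,
    "--offline",
    "--download-offline-databases",
    dir,
  ];
}

console.log(scanDirArgs("/tmp/extracted").join(" "));
console.log(refreshDbArgs("/tmp/db-probe").join(" "));
```

A unit test asserting that both lists contain `--no-ignore` would have caught the indentation-dependent replace described in the commit.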
Scanning is a security control but was nearly silent in container logs:
only errors surfaced, and a failed osv-scan was caught, written to the doc
as status:error, and never logged at all (the internalError safety nets at
the call sites don't fire since runScanNow doesn't reject).

Add [scan]-prefixed info logging across the lifecycle:
- enqueueScan: "scan queued" (covers upload + on-demand triggers)
- runScanNow: "scan started", and a grep-able result line with severity
  breakdown, install-scripts flag, and duration on success
- runScanNow catch: now logs the failure with artefact id via internalError
  (also bumps the Prometheus counter)
- rescanAll: db-refresh timing and rescan bounds (count + duration)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This route lets tests build a deterministic fleet without going through
the upload pipeline, which would auto-trigger a scan and race injected state.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
albanm and others added 11 commits June 9, 2026 17:33
Non-admins silently fall back to the default dataUpdatedAt sort so scan
data cannot be inferred from ordering.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
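
The silent fallback described above might look roughly like the following. The helper name and the set of admin-only sort keys are assumptions for illustration; only `sort=vulnerabilities` and the `dataUpdatedAt` default come from this PR.

```typescript
const DEFAULT_SORT = "dataUpdatedAt";

// Sort keys that would reveal scan data through ordering (assumed list).
const ADMIN_ONLY_SORTS: ReadonlySet<string> = new Set(["vulnerabilities"]);

// Hypothetical helper: resolve the sort actually applied to a listing request.
function effectiveSort(requested: string | undefined, isAdmin: boolean): string {
  if (!requested) return DEFAULT_SORT;
  // No error for non-admins: the request succeeds with the default ordering.
  if (ADMIN_ONLY_SORTS.has(requested) && !isAdmin) return DEFAULT_SORT;
  return requested;
}

console.log(effectiveSort("vulnerabilities", true));
console.log(effectiveSort("vulnerabilities", false));
```

Falling back silently rather than returning an error keeps non-admin response shapes and status codes unchanged, in line with the regression notes in the PR description.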
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The src/utils auto-import scan in a long-running dev server does not always
pick up a newly-added util module, leaving severityColor/SEVERITY_ORDER
undefined at runtime and crashing the admin page render. Importing them
explicitly is robust across dev and production builds.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…erity color

Addresses final-review findings: gate the dashboard Vulnerabilities section-tabs
on scanning.enabled so it isn't an empty card on the default (scanning-off)
config; reuse the shared severityColor helper in artefact-admin instead of an
inline copy. Adds tests for the pending health bucket / null oldestScanAt and
strengthens the non-admin sort-fallback assertion.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Picks up severity.ts exports and the VulnerabilitySection component, as
regenerated by unplugin from the running dev server.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The scan endpoint returns findings grouped by package in scanner order, which
buries criticals below lower-severity rows. Sort client-side by severity
(critical first), tie-breaking by package then advisory id.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
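
A minimal sketch of the client-side ordering described above, assuming a simplified finding shape. `SEVERITY_ORDER` is named in an earlier commit, but its contents and the `Finding` fields used here (`severity`, `pkg`, `id`) are assumptions.

```typescript
type Severity = "critical" | "high" | "medium" | "low" | "unknown";

// Assumed contents: most severe first.
const SEVERITY_ORDER: readonly Severity[] = ["critical", "high", "medium", "low", "unknown"];

// Hypothetical, simplified finding shape.
interface Finding {
  severity: Severity;
  pkg: string; // package name
  id: string;  // advisory id
}

// Plain code-unit string comparison, to keep ordering locale-independent.
function compareStrings(a: string, b: string): number {
  return a < b ? -1 : a > b ? 1 : 0;
}

// Severity first (critical on top), then package, then advisory id.
function sortFindings(findings: readonly Finding[]): Finding[] {
  const rank = (s: Severity): number => SEVERITY_ORDER.indexOf(s);
  return [...findings].sort(
    (a, b) =>
      rank(a.severity) - rank(b.severity) ||
      compareStrings(a.pkg, b.pkg) ||
      compareStrings(a.id, b.id),
  );
}

const sorted = sortFindings([
  { severity: "low", pkg: "lodash", id: "GHSA-0003" },
  { severity: "critical", pkg: "zlib", id: "GHSA-0001" },
  { severity: "critical", pkg: "axios", id: "GHSA-0002" },
]);
console.log(sorted.map((f) => `${f.severity} ${f.pkg} ${f.id}`));
```

The input array is copied before sorting, so the response object from the scan endpoint is not mutated.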
Two admin-only-scan-data leaks/cruft found in pre-PR review:

- Upload 201 responses returned the committed artefact verbatim, without
  stripScan. On a re-upload, enqueueScan preserves the prior scan.summary
  via dotted-path writes, so an upload-API-key (non-admin) caller got the
  summary back. Add an omitScan primitive (stripScan now routes through it)
  and apply it to both the npm and file upload responses — uploaders are
  never admin session callers, so the strip is unconditional.
- deleteArtefact left an orphan artefact-scans doc, so an artefact later
  re-created with the same id surfaced stale findings via GET /:id/scan.
  Delete the findings doc alongside the artefact.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
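
The relationship between `omitScan` and `stripScan` described above can be sketched as follows. The function names come from the commit message; the signatures and the artefact shape are assumptions, since the actual implementation is not shown here.

```typescript
// Unconditional primitive: return a copy of the artefact without its scan field.
function omitScan<T extends { scan?: unknown }>(artefact: T): Omit<T, "scan"> {
  const { scan: _scan, ...rest } = artefact;
  return rest;
}

// Conditional wrapper: admins see scan data, everyone else gets it removed.
function stripScan<T extends { scan?: unknown }>(
  artefact: T,
  isAdmin: boolean,
): T | Omit<T, "scan"> {
  return isAdmin ? artefact : omitScan(artefact);
}

// Hypothetical artefact as it might be committed after a re-upload.
const committed = {
  _id: "example-artefact",
  name: "example",
  scan: { status: "queued", summary: { critical: 1 } },
};

// Upload responses: always stripped, regardless of caller.
console.log(omitScan(committed));
// Reads and listings: stripped unless the caller is an admin.
console.log(stripScan(committed, false));
```

Routing `stripScan` through `omitScan` keeps a single place that knows which field to remove, so the upload path and the read path cannot drift apart.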
@albanm albanm merged commit b6fc699 into main Jun 9, 2026
2 checks passed
@albanm albanm deleted the feat-vuln-scan branch June 9, 2026 16:38