feat(audit): item-counting grammar to config, harness-gated (#6855 phase 1)#6906

Closed
chubes4 wants to merge 1 commit into main from feat/language-grammar-phase1-retry

@chubes4 chubes4 commented Jun 28, 2026

Member

Retry of #6896 gated by the new runtime audit regression harness (#6903).

Extracts structural.rs item counting into a config-driven LanguageGrammar. Extends the fixture to cover the symbol-graph / cross-reference detectors (the ones #6896 broke) and asserts that the live audit finding set is IDENTICAL before and after.

Scope (tight — learned from #6896)

  • ONLY changes how structural.rs::count_top_level_items counts items.
  • enum Language, symbol_graph, import_matching, field_patterns, edit_op_apply untouched.
  • AuditConfig::is_empty() semantics UNCHANGED: language_grammars is deliberately NOT added to is_empty() (suspected contributor to #6896's regression; documented inline).
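
To make the is_empty() point concrete, here is a minimal sketch of what "deliberately NOT added" could look like. The real AuditConfig is not shown in this PR, so the `detectors` field is a hypothetical stand-in for the pre-existing config fields, and the grammar type is reduced to a plain string.

```rust
/// Hypothetical, trimmed-down stand-in for the real AuditConfig.
#[derive(Debug, Default)]
pub struct AuditConfig {
    /// Stand-in for the fields that already existed before this PR (assumption).
    pub detectors: Vec<String>,
    /// New in this PR. Reduced to strings here for brevity (assumption).
    pub language_grammars: Vec<String>,
}

impl AuditConfig {
    /// Same answer as before the migration: only the pre-existing fields
    /// decide emptiness. `language_grammars` is intentionally not consulted,
    /// so adding grammars cannot flip a previously "empty" config to
    /// "non-empty" and change downstream behavior.
    pub fn is_empty(&self) -> bool {
        self.detectors.is_empty()
    }
}
```

Under this sketch, a config that carries only grammars still reports as empty, which is the unchanged-semantics property the scope note describes.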

What moved

  • New LanguageGrammar config (file_extensions, item_declaration_prefixes, visibility_prefixes, modifier_prefixes, ignore_after_line_equals) + generic count_items.
  • AuditConfig.language_grammars field (#[serde(default, skip_serializing_if = "Vec::is_empty")]) + merge.
  • structural.rs now applies grammar generically — ZERO language keyword literals in detector code; deleted count_rust_items/count_php_items/count_js_items/is_rust_item_declaration.
  • Wired grammars through descriptor_runtime.rs only.
  • rs/php/js grammars applied in fixture homeboy.json and the real homeboy.json so counts are preserved.
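
For readers who have not opened the diff, the following is a rough sketch of how a config-driven counter of this kind can work. Only the field names and the name count_items come from the list above; the field types, the prefix-stripping order, the "unindented lines only" rule for top-level items, and the meaning of ignore_after_line_equals are all assumptions made for illustration, not the code in this PR.

```rust
/// Field names follow the PR description; types and semantics are assumed.
#[derive(Debug, Clone, Default)]
pub struct LanguageGrammar {
    pub file_extensions: Vec<String>,
    pub item_declaration_prefixes: Vec<String>,
    pub visibility_prefixes: Vec<String>,
    pub modifier_prefixes: Vec<String>,
    /// Assumed meaning: stop counting once a trimmed line equals one of these.
    pub ignore_after_line_equals: Vec<String>,
}

/// Strip leading prefixes repeatedly until none of them matches.
fn strip_all<'a>(mut line: &'a str, prefixes: &[String]) -> &'a str {
    loop {
        let mut stripped = false;
        for p in prefixes {
            if p.is_empty() {
                continue; // an empty prefix would match forever
            }
            if let Some(rest) = line.strip_prefix(p.as_str()) {
                line = rest.trim_start();
                stripped = true;
            }
        }
        if !stripped {
            return line;
        }
    }
}

/// Count top-level item declarations using only the grammar data,
/// with no language keywords hardcoded in this function.
pub fn count_items(source: &str, grammar: &LanguageGrammar) -> usize {
    let mut count = 0;
    for raw in source.lines() {
        let trimmed = raw.trim();
        if grammar.ignore_after_line_equals.iter().any(|m| m == trimmed) {
            break;
        }
        // Assumption: "top-level" means the line is not indented.
        if raw.starts_with(' ') || raw.starts_with('\t') {
            continue;
        }
        let rest = strip_all(trimmed, &grammar.visibility_prefixes);
        let rest = strip_all(rest, &grammar.modifier_prefixes);
        let is_item = grammar
            .item_declaration_prefixes
            .iter()
            .any(|p| !p.is_empty() && rest.starts_with(p.as_str()));
        if is_item {
            count += 1;
        }
    }
    count
}
```

A Rust grammar under these assumptions might list "pub " and "pub(crate) " as visibility prefixes, "async " and "unsafe " as modifiers, "fn ", "struct " and "enum " as item prefixes, and "#[cfg(test)]" as the cut-off line. Any difference between such data and what the deleted count_rust_items matched would shift counts on real files, which is why the parity tests matter.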

Harness gate

  • Extended tests/fixtures/audit_runtime with exports.rs/consumer.rs to exercise cross-file symbol resolution (a referenced export is suppressed; orphaned exports surface). EXPECTED_FINDINGS now covers structural AND symbol-graph.
  • cargo test --lib audit_runtime_regression passes with the finding set IDENTICAL before/after the migration; the #6896 break does not reproduce.
  • cargo test --lib structural (24 tests incl. parity) green; cargo build --lib --tests clean.
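
The gate above boils down to a set comparison between a committed snapshot and a live audit run. A minimal sketch of that check follows; the Finding shape and the check_snapshot helper are hypothetical, since the harness code from #6903 is not reproduced in this thread.

```rust
use std::collections::BTreeSet;

/// Hypothetical finding record; the real harness type is not shown here.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub struct Finding {
    pub detector: String,
    pub path: String,
}

impl Finding {
    pub fn new(detector: &str, path: &str) -> Self {
        Self { detector: detector.to_string(), path: path.to_string() }
    }
}

/// Compare as sets so that ordering differences between runs do not count
/// as drift. Any missing or unexpected finding fails the gate.
pub fn check_snapshot(expected: &[Finding], actual: &[Finding]) -> Result<(), String> {
    let expected: BTreeSet<&Finding> = expected.iter().collect();
    let actual: BTreeSet<&Finding> = actual.iter().collect();
    let missing: Vec<&&Finding> = expected.difference(&actual).collect();
    let unexpected: Vec<&&Finding> = actual.difference(&expected).collect();
    if missing.is_empty() && unexpected.is_empty() {
        Ok(())
    } else {
        Err(format!("missing: {:?}\nunexpected: {:?}", missing, unexpected))
    }
}
```

A check like this is only as strong as the tree it runs against: it proves the fixture's findings are stable, not that the real repository's are.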

homeboy-ci Bot commented Jun 28, 2026

Contributor

Homeboy Results — homeboy

Lint

lint — passed

ℹ️ Full options: homeboy docs commands/lint
ℹ️ Save lint baseline: homeboy lint homeboy --baseline
Deep dive: homeboy lint homeboy --changed-since 5c01d04

Artifacts and drill-down
  • CI results artifact: homeboy-ci-results-homeboy-lint-homeboy-Linux contains immediate command JSON for this action invocation.
  • Observation artifact: homeboy-observations-homeboy-lint-homeboy-Linux contains exported Homeboy run history for deeper queries.
  • Drill-down: download the observation artifact, then run homeboy runs import <dir>, homeboy runs list, and homeboy runs findings <run-id>.
  • Artifacts are attached to the workflow run: https://github.com/Extra-Chill/homeboy/actions/runs/28327612808

Test

⚠️ test — baseline red

ℹ️ No tests ran — the runner failed before producing results. See raw_output.stderr_tail / raw_output.stdout_tail for the underlying error (bootstrap failure, missing deps, DB connection, etc.).
ℹ️ To run specific tests: homeboy test homeboy -- --filter=TestName
ℹ️ Auto-fix lint issues: homeboy refactor homeboy --from lint --write
ℹ️ Collect coverage: homeboy test homeboy --coverage
ℹ️ Analyze failures: homeboy test homeboy --analyze
ℹ️ Pass args to test runner: homeboy test -- [args]
ℹ️ Full options: homeboy docs commands/test
Deep dive: homeboy test homeboy --changed-since 5c01d04

Artifacts and drill-down
  • CI results artifact: homeboy-ci-results-homeboy-test-homeboy-Linux contains immediate command JSON for this action invocation.
  • Observation artifact: homeboy-observations-homeboy-test-homeboy-Linux contains exported Homeboy run history for deeper queries.
  • Drill-down: download the observation artifact, then run homeboy runs import <dir>, homeboy runs list, and homeboy runs findings <run-id>.
  • Artifacts are attached to the workflow run: https://github.com/Extra-Chill/homeboy/actions/runs/28327612808

Audit

audit — failed

  • core_boundary_leak:core-agnostic-source — 70 finding(s)
  • structural — 56 finding(s)
  • intra-method-duplication — 47 finding(s)
  • field_patterns — 35 finding(s)
  • dead_code — 19 finding(s)
  • near-duplication — 17 finding(s)
  • test_quality — 13 finding(s)
  • duplication — 9 finding(s)
  • remote_execution_preflight — 9 finding(s)
  • thin_command_adapter — 8 finding(s)
  • Total: 309 finding(s)

Deep dive: homeboy audit homeboy --changed-since 5c01d04

Artifacts and drill-down
  • CI results artifact: homeboy-ci-results-homeboy-audit-homeboy-Linux contains immediate command JSON for this action invocation.
  • Observation artifact: homeboy-observations-homeboy-audit-homeboy-Linux contains exported Homeboy run history for deeper queries.
  • Drill-down: download the observation artifact, then run homeboy runs import <dir>, homeboy runs list, and homeboy runs findings <run-id>.
  • Artifacts are attached to the workflow run: https://github.com/Extra-Chill/homeboy/actions/runs/28327612808

Tooling versions
  • Homeboy CLI: homeboy 0.268.0+656b8ad09816+6ef88b923
  • Extension: rust from https://github.com/Extra-Chill/homeboy-extensions
  • Extension revision: c2f541c8
  • Action: unknown@unknown

@chubes4 chubes4 marked this pull request as draft June 28, 2026 15:56
chubes4 commented Jun 28, 2026

Member Author

⚠️ Held as draft — real-codebase audit regresses (+298) despite green harness

The local runtime audit regression harness passes (audit_runtime_regression_matches_snapshot ok — findings on the fixture tree are identical before/after, including the new symbol-graph fixtures consumer.rs/exports.rs). Structural parity tests: 24 passed. Lint ✅ Test ✅.

But the CI Audit gate still fails: delta: +298, drift_increased: true — same broad cross-detector explosion as #6896 (158 CoreBoundaryLeak, 94 IntraMethodDuplicate, 83 HighItemCount, 30 GodFile, etc.) on the REAL codebase.

Ruled out this round

  • NOT the AuditConfig::is_empty() bug (this PR deliberately leaves language_grammars out of is_empty(), with a comment).
  • NOT a JSON break (real homeboy.json is valid; only audit.language_grammars added, nothing removed; 169 baseline fingerprints unchanged).
  • NOT baseline drift (branch is 0 commits behind main; identical baseline).
  • The migration is correct in isolation (parity + harness pass).

The real finding: the harness has a FIDELITY GAP

The regression reproduces ONLY on the real repository, NOT on the small fixture tree, so the harness fixture is too simple to exercise whatever real-codebase condition the migration perturbs. The most likely cause: item counts in real source files shift subtly under the config-driven count_items compared with the original hardcoded path and cascade into GodFile/HighItemCount, together with some symbol- or structure-level effect that the fixture does not trigger at scale.

Recommended next step (harness improvement, then retry)

Make the regression harness audit the actual homeboy repository tree (or a large, representative snapshot of it) against a committed finding snapshot — not just a toy fixture. A toy fixture can't catch real-codebase-only regressions like this. Once the harness audits the real tree, this exact +298 would fail locally and be diagnosable without a CI round-trip. That harness upgrade is the prerequisite for safely landing #6855 Phase 1.
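
As a rough illustration of what a real-tree gate would compute, the sketch below diffs a committed baseline against a current run and reports the same two values the CI gate prints (delta and drift_increased), plus a per-detector breakdown of new findings. The DriftReport type, the drift function, and the (detector, fingerprint) pair representation are assumptions for illustration; how Homeboy actually derives its delta is not shown in this thread.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical report; `delta` and `drift_increased` mirror the CI output names.
#[derive(Debug, PartialEq, Eq)]
pub struct DriftReport {
    pub delta: i64,
    pub drift_increased: bool,
    /// Findings present now but absent from the baseline, grouped by detector.
    pub new_by_detector: BTreeMap<String, usize>,
}

/// Each finding is a (detector, fingerprint) pair (assumed representation).
pub fn drift(baseline: &[(&str, &str)], current: &[(&str, &str)]) -> DriftReport {
    let base: BTreeSet<(&str, &str)> = baseline.iter().copied().collect();
    let cur: BTreeSet<(&str, &str)> = current.iter().copied().collect();

    let mut new_by_detector: BTreeMap<String, usize> = BTreeMap::new();
    for (detector, _fingerprint) in cur.difference(&base) {
        *new_by_detector.entry(detector.to_string()).or_insert(0) += 1;
    }

    // Assumption: delta is the change in the number of distinct findings.
    let delta = cur.len() as i64 - base.len() as i64;
    DriftReport { delta, drift_increased: delta > 0, new_by_detector }
}
```

Run locally against the real repository tree, a report like this would surface which detectors the new findings belong to without waiting for a CI round-trip.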

Draft left open; contract/config/structural migration are correct and salvageable once the harness can actually gate against the real codebase.

chubes4 commented Jun 28, 2026

Member Author

Narrowed the +298: test files are being scanned that shouldn't be

Drilled into the new CoreBoundaryLeak findings — they land on tests/ files (e.g. agent_task_controller_service/tests/failure_summary_tests.rs, tests/resume_tests.rs, tests/run_next_fan_out_tests.rs). These are auto-skipped by the walker's test-path detection on main, but get scanned on this branch.

Yet I verified:

  • The branch does NOT touch walker.rs or source_policy.rs (test-path logic unchanged).
  • core-agnostic-source exclude_path_contains is byte-identical main vs branch.
  • AuditConfig::is_empty() was deliberately NOT changed.

So something about the migration causes the walker/detector to stop treating these as test paths at runtime, even though no test-path code or config changed. This is a subtle runtime coupling: static inspection has been exhausted, so pinning it down requires tracing the actual audit binary against the real repo. This reinforces the recommendation: the regression harness must audit the real repository tree so that this fails locally and is debuggable. Stopping autonomous attempts here pending that harness upgrade.
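
One way to start that trace is to dump the list of files each build actually scans and diff the two lists. The sketch below assumes such lists can be obtained (for example from debug logging, which is not confirmed to exist in Homeboy), and its looks_like_test_path check is a deliberately simple stand-in: the real rules live in walker.rs / source_policy.rs and are not reproduced here.

```rust
use std::collections::BTreeSet;
use std::path::Path;

/// Simplified stand-in for test-path detection (assumption): a path counts
/// as a test path if any component is exactly `tests`.
pub fn looks_like_test_path(path: &str) -> bool {
    Path::new(path).components().any(|c| c.as_os_str() == "tests")
}

/// Paths scanned on the branch but not on main that look like test paths.
/// These are the files whose skip decision changed between the two builds.
pub fn newly_scanned_test_paths(main_scanned: &[&str], branch_scanned: &[&str]) -> Vec<String> {
    let main: BTreeSet<&str> = main_scanned.iter().copied().collect();
    branch_scanned
        .iter()
        .copied()
        .filter(|p| !main.contains(p) && looks_like_test_path(p))
        .map(String::from)
        .collect::<BTreeSet<String>>() // dedupe and sort for stable output
        .into_iter()
        .collect()
}
```

If the returned list is non-empty, the skip decision itself changed between builds; if it is empty while the findings still appear, the files were scanned on main too and something later in the pipeline filtered them there.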

chubes4 commented Jun 28, 2026

Member Author

Status: held as draft. Root cause is a real-codebase-specific audit regression that synthetic-fixture harnesses can't reproduce (proven in #6855 / #6917). Blocked on #6920 (real-repo drift harness). Full consolidated trail: #6855. Migration code here is correct and salvageable once #6920 lands.

chubes4 commented Jun 28, 2026

Member Author

Closing as superseded by #6915, which is the current grammar phase 1 draft.

@chubes4 chubes4 closed this Jun 28, 2026