feat(ci): add AI security audit for pull requests#923
feat(ci): add AI security audit for pull requests#923davidapp wants to merge 2 commits intotronprotocol:developfrom
Conversation
Add a GitHub Actions workflow that automatically runs AI-powered security audits on every pull request using Claude. The workflow analyzes PR diffs for vulnerabilities and posts structured audit reports as PR comments. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Code Review1. [HIGH] Prompt Injection via PR Diff ContentThe PR diff (attacker-controlled) is directly embedded into the Claude prompt. A malicious PR could include code comments or strings like: This could trick the AI into producing a false "clean" report, defeating the purpose of the audit. Recommendations (combine for defense in depth):
2. [MEDIUM] stderr Leaks into Audit Comment (line 108)AUDIT_RESULT=$(echo "$FULL_PROMPT" | claude -p --output-format text 2>&1) || true
Fix: Separate stderr and handle failure explicitly: AUDIT_RESULT=$(printf '%s\n' "$FULL_PROMPT" | claude -p --output-format text 2>/tmp/claude_err.txt) || true
if [ ! -s audit_result.md ]; then
echo "AI audit failed to produce results. Check workflow logs for details." > audit_result.md
fi3. [MEDIUM] Unnecessary
|
| Severity | Count | Items |
|---|---|---|
| High | 1 | Prompt injection |
| Medium | 2 | stderr leak, unnecessary fetch-depth |
| Low | 3 | echo fragility, duplicate code, no version pin |
| Info | 1 | expression best practice |
The workflow structure is well thought out — the diff size guard, comment replacement mechanism, and API key validation are solid. The main concern is the prompt injection surface: since this workflow's value proposition is security gating, it's worth hardening against crafted diffs that could manipulate audit output. The most impactful fix is switching from claude -p to the API with proper system/user message separation.
The AI security audit now exits with failure if any CRITICAL severity issues are detected, preventing PRs with critical vulnerabilities from showing a green check. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Additional Review FindingsBeyond the 7 issues raised in the previous review (none addressed yet), here are additional concerns: 1. [HIGH] Fork PRs will fail —
|
| # | Issue | Severity | Status |
|---|---|---|---|
| prev-1 | Prompt injection via diff content | HIGH | ❌ Open |
| prev-2 | stderr leaks into PR comment | MEDIUM | ❌ Open |
| prev-3 | Unnecessary fetch-depth: 0 |
MEDIUM | ❌ Open |
| prev-4 | echo fragility for complex content |
LOW | ❌ Open |
| prev-5 | Duplicate comment-deletion logic | LOW | ❌ Open |
| prev-6 | No version pinning for Claude Code | LOW | ❌ Open |
| prev-7 | GitHub expression inline in run: |
INFO | ❌ Open |
| new-1 | Fork PRs fail (read-only token) | HIGH | ❌ Open |
| new-2 | Unnecessary checkout = attack surface | HIGH | ❌ Open |
| new-3 | Shell variable expansion injection | MEDIUM | ❌ Open |
| new-4 | || true swallows all errors |
MEDIUM | ❌ Open |
| new-5 | [CRITICAL] grep unreliable |
LOW | ❌ Open |
| new-6 | wallet-cli not ideal target repo | INFO | ❌ Open |
Total: 3 HIGH, 4 MEDIUM, 4 LOW, 2 INFO — 0 addressed.
The workflow idea has merit, but in its current state it has more security issues than it would catch. Recommend addressing at minimum the 3 HIGH issues before merging.
Summary
claude -pTest plan
🤖 Generated with Claude Code