Skip to content

fix: filter multi-segment identifiers from Jira issue key matching#902

Merged
dangrondahl merged 4 commits into
mainfrom
900-fix-jira-ticket-regex-cve-false-positives
May 21, 2026
Merged

fix: filter multi-segment identifiers from Jira issue key matching#902
dangrondahl merged 4 commits into
mainfrom
900-fix-jira-ticket-regex-cve-false-positives

Conversation

@dangrondahl
Copy link
Copy Markdown
Contributor

@dangrondahl dangrondahl commented May 21, 2026

Summary

  • Adds FindJiraIssueKeys() to the jira package, encapsulating regex matching and post-filtering in one function. Matches that are always followed by -<digit> in the source text (e.g. CVE-2026-41284) are discarded as false positives.
  • Updates attestJira.go to use FindJiraIssueKeys instead of raw pattern matching via MatchPatternInCommitMessageORBranchName.
  • Adds 10 unit tests covering CVE, CWE, multi-segment identifiers, project key filtering, and edge cases.

Closes #900

Test plan

  • go test ./internal/jira/ passes (10 new tests + 2 existing)
  • make lint passes with 0 issues
  • make test_integration passes (requires local server)

…VE-2026-41284

Adds FindJiraIssueKeys to the jira package which encapsulates pattern
matching and filtering in one function. Matches that are always followed
by -<digit> in the source text are discarded, preventing false positives
from CVE/CWE-style identifiers. (#900)
…itives

Replaces MatchPatternInCommitMessageORBranchName with direct
GetCommitInfoFromCommitSHA + FindJiraIssueKeys, which encapsulates
both regex matching and multi-segment identifier filtering. (#900)
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 21, 2026

Claude finished @dangrondahl's task in 3m 52s —— View job


PR Review: fix: filter multi-segment identifiers from Jira issue key matching

  • Phase 1: Static analysis (go build, go vet, go mod tidy)
  • Phase 2: Read and review changed files
  • Phase 3: Post review feedback

Static Analysis

All checks pass:

  • go build ./... — clean
  • go vet ./... — clean
  • go mod tidy — no diff

Review Summary

This is a clean, well-structured PR. The TDD workflow is evident in the commit history (red → green → refactor → cleanup), and the previous review feedback (precondition comment, edge case tests) was addressed.

What works well

  • Separation of concerns: Moving pattern matching + filtering into FindJiraIssueKeys is a good design choice. The old MatchPatternInCommitMessageORBranchName bundled git operations with regex matching; the new code splits these cleanly.
  • The isPartialMultiSegment algorithm is correct: It checks all occurrences of a candidate match in the text, only filtering when every occurrence is followed by -<digit>. This correctly handles the "standalone + multi-segment" case (e.g., CVE-2026 appearing both in CVE-2026-41284 and on its own).
  • Test coverage is thorough: 12 tests covering CVE, CWE, deduplication, project key filtering, mixed standalone/multi-segment, empty input, and boundary cases.

Minor observations

  1. Behavioral equivalence verified: The old code searched commit message, branch, and secondary source independently, then merged. The new code joins with \n and searches once. Since the Jira key regex [A-Z][A-Z0-9]{1,9}-[0-9]+ cannot match across \n, this is equivalent. The multi-segment filtering works correctly across the combined text because isPartialMultiSegment scans all positions.

  2. Return type change: The old path returned []string{} for no matches; FindJiraIssueKeys returns nil. This is safe since the result is only consumed by range, where nil and empty slice behave identically.

  3. Regex compilation per call: FindJiraIssueKeys compiles two regexes on each call. For a CLI command invoked once, this is negligible — no action needed.

Overall: LGTM. No bugs, no security concerns, no performance issues. The filtering logic is sound and well-tested.

Comment thread internal/jira/jira.go
Comment thread internal/jira/jira_test.go
Address review feedback: document the precondition on
isPartialMultiSegment, add tests for key appearing both standalone
and in a multi-segment identifier, and for dedup behavior.
Comment thread cmd/kosli/attestJira.go
searchTexts = append(searchTexts, o.secondarySource)
}
combinedText := strings.Join(searchTexts, "\n")
issueIDs := jira.FindJiraIssueKeys(combinedText, o.projectKeys)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: the old MatchPatternInCommitMessageORBranchName returned []string{} for no matches, while FindJiraIssueKeys returns nil. This is fine here since range nil is a no-op in Go — just calling it out for awareness.

@mbevc1
Copy link
Copy Markdown
Contributor

mbevc1 commented May 21, 2026

It's a minor nit, but I suggest we address Regex compilation per call: FindJiraIssueKeys compiles two regexes on each call. For a CLI command invoked once, this is negligible — no action needed.

@dangrondahl
Copy link
Copy Markdown
Contributor Author

dangrondahl commented May 21, 2026

It's a minor nit, but I suggest we address Regex compilation per call: FindJiraIssueKeys compiles two regexes on each call. For a CLI command invoked once, this is negligible — no action needed.

Considered this. Moving the regexes to package-level compiled vars is possible, but the readability cost outweighs the benefit here. As you said, the CLI command runs once per invocation, so the per-call compilation is diminishing here. Keeping the regexes local to the function makes the code easier to follow. Leaving as-is.

@dangrondahl dangrondahl enabled auto-merge (squash) May 21, 2026 11:12
@dangrondahl dangrondahl merged commit 9da9ee6 into main May 21, 2026
20 checks passed
@dangrondahl dangrondahl deleted the 900-fix-jira-ticket-regex-cve-false-positives branch May 21, 2026 11:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

attest jira: ticket regex matches partial CVE identifiers as Jira references

2 participants