Skip to content

[SREP-4791]: Add sre-agent integration to osdctl AI commands#890

Open
TheUndeadKing wants to merge 2 commits into
openshift:masterfrom
TheUndeadKing:SREP-4791
Open

[SREP-4791]: Add sre-agent integration to osdctl AI commands#890
TheUndeadKing wants to merge 2 commits into
openshift:masterfrom
TheUndeadKing:SREP-4791

Conversation

@TheUndeadKing
Copy link
Copy Markdown
Contributor

@TheUndeadKing TheUndeadKing commented May 13, 2026

Summary

Integrates the sre-agent Python tool as a new osdctl ai sre-agent command with automatic installation, configuration, and validation capabilities.

Files Added

  • cmd/ai/sre_agent/sre_agent.go - Main command and orchestration
  • cmd/ai/sre_agent/sre_agent_install.go - Installation logic
  • cmd/ai/sre_agent/sre_agent_config.go - Configuration management
  • cmd/ai/sre_agent/helper.go - Shared utilities (git, venv, prompts)

Summary by CodeRabbit

  • New Features

    • Added an "ai" command with a new "sre-agent" subcommand to run SRE incident investigations.
    • sre-agent accepts --pd-url (required), --auto-execute (optional), and --output (optional) and supports fully automated or interactive modes.
    • Interactive setup assists with installing/locating the agent environment and required SOP assets.
  • Documentation

    • Added user docs for the ai command and sre-agent usage and examples.

@openshift-ci openshift-ci Bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 13, 2026
@openshift-ci openshift-ci Bot requested review from Nikokolas3270 and bergmannf May 13, 2026 08:58
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 13, 2026

Walkthrough

Adds an ai command group to osdctl and an sre-agent subcommand that validates a per-user sre-agent virtualenv and configuration, optionally copies required assets (venv and ops-sop), then invokes the external sre-agent binary with flags and stdin/stdout/stderr wired to the current process.

Changes

AI Command Group and SRE Agent Subcommand

Layer / File(s) Summary
Base AI command and root integration
cmd/ai/cmd.go, cmd/cmd.go
Adds NewCmdAI() defining osdctl ai (no args) and wires ai into the root CLI via rootCmd.AddCommand(ai.NewCmdAI()).
CLI orchestration: sre-agent command
cmd/ai/sre_agent/sre_agent.go
Adds NewCmdSreAgent() defining flags (--pd-url required, --auto-execute, --output), enforces --pd-url, resolves home dir, runs validation checks, constructs args, and executes the external ~/.local/share/sre-agent/venv/bin/sre-agent subprocess with optional working directory.
Installation validation
cmd/ai/sre_agent/validate_sre_agent.go
Adds validateSreAgent(homeDir string) bool to check for venv/bin/sre-agent under ~/.local/share/sre-agent, prompt user for a venv path if missing, create base directory, and copy the provided venv into baseDir/venv.
Configuration validation and update
cmd/ai/sre_agent/validate_sre_agent_config.go
Adds checkSreAgentConfig(homeDir string) bool to read ~/.config/sre-agent/config.yaml, require sop.directory, prompt for an ops-sop repo path, copy it into ~/.local/share/sre-agent/ops-sop if absent, and update/write the config when the path differs.
Helper utilities
cmd/ai/sre_agent/helper.go
Adds unexported helpers: copyRepository(sourcePath, destPath string) error (runs cp -r and streams output to os.Stderr) and promptUserInput() (string, error) (reads trimmed, lowercased stdin line).
Documentation updates
docs/README.md, docs/osdctl.md, docs/osdctl_ai.md, docs/osdctl_ai_sre-agent.md
Documents osdctl ai and osdctl ai sre-agent in CLI overview and dedicated command pages, listing flags and examples.
sequenceDiagram
    actor User
    participant CLI as osdctl CLI
    participant Cmd as ai sre-agent Cmd
    participant Validate as Validation Logic
    participant FS as File System
    participant Binary as External sre-agent Binary

    User->>CLI: osdctl ai sre-agent --pd-url=<url>
    CLI->>Cmd: execute subcommand
    Cmd->>Validate: validateSreAgent(homeDir)
    Validate->>FS: stat ~/.local/share/sre-agent/venv/bin/sre-agent
    alt not present
        Validate->>User: prompt for venv path
        User->>Validate: provide path
        Validate->>FS: copy venv -> ~/.local/share/sre-agent/venv
    end
    Cmd->>Validate: checkSreAgentConfig(homeDir)
    Validate->>FS: read ~/.config/sre-agent/config.yaml
    alt ops-sop missing or path differs
        Validate->>User: prompt for ops-sop path
        User->>Validate: provide path
        Validate->>FS: copy ops-sop -> ~/.local/share/sre-agent/ops-sop
        Validate->>FS: write updated config.yaml
    end
    Cmd->>Binary: exec ~/.local/share/sre-agent/venv/bin/sre-agent (stdio wired)
    Binary-->>CLI: exit code / output
    CLI-->>User: display result
Loading

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 12
✅ Passed checks (12 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title clearly and concisely summarizes the main change: adding sre-agent integration to osdctl AI commands, which is directly supported by the changeset (new ai command with sre-agent subcommand, validation, and documentation).
Docstring Coverage ✅ Passed Docstring coverage is 83.33% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names ✅ Passed The pull request contains no Ginkgo test files or test code. The custom check for "Stable and Deterministic Test Names" is not applicable as there are no tests to assess.
Test Structure And Quality ✅ Passed PR adds no Ginkgo test code. The custom check for Ginkgo test structure and quality is not applicable as there are no _test.go files or test blocks in this PR.
Microshift Test Compatibility ✅ Passed This PR adds CLI command implementations and documentation, not Ginkgo e2e tests. The custom check targets new Ginkgo e2e tests and is not applicable.
Single Node Openshift (Sno) Test Compatibility ✅ Passed No Ginkgo e2e tests are added in this PR. The PR adds CLI command implementations and documentation only. The custom check for SNO test compatibility is not applicable.
Topology-Aware Scheduling Compatibility ✅ Passed PR adds CLI command code only. No Kubernetes manifests, operator code, or controllers are present. Topology-aware scheduling check is not applicable.
Ote Binary Stdout Contract ✅ Passed OTE Binary Stdout Contract check is not applicable. osdctl is a CLI tool (Cobra-based), not an OTE test binary. All new code properly routes messages to stderr.
Ipv6 And Disconnected Network Test Compatibility ✅ Passed PR adds CLI commands and helpers, not Ginkgo e2e tests. No Ginkgo tests found. Custom check does not apply.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented May 13, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: TheUndeadKing
Once this PR has been reviewed and has the lgtm label, please assign matesaary for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

🧹 Nitpick comments (2)
cmd/ai/sre_agent/ops_sop.go (1)

12-39: 💤 Low value

Consider renaming to reflect the function's full behavior.

The function name opsSopIsPresent suggests it only checks whether ops-sop is present, but it also updates an existing repository or clones a new one. A more descriptive name like ensureOpsSopReady or setupOpsSopRepo would better communicate that the function performs side effects beyond just checking presence.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/ai/sre_agent/ops_sop.go` around lines 12 - 39, Rename the function
opsSopIsPresent to a name that reflects its side effects (e.g.,
ensureOpsSopReady or setupOpsSopRepo) and update all call sites accordingly;
inside the function, keep the existing logic (checking
utils.FolderExists(opsSopPath), calling gitPullAndValidate(opsSopPath) when
present, and cloneRepository(opsSopRepoURL, opsSopPath) when missing) but change
the function name and exported visibility if needed so callers clearly
understand it performs cloning/updating, not a pure presence check.
cmd/ai/sre_agent/validate_cluster.go (1)

37-41: ⚡ Quick win

Consider defining organization IDs in a single location.

The allowed organization IDs are duplicated in the description (lines 23-25), here in code, and again in sre_agent.go (lines 27-29). If these IDs ever need to change, you'll need to update them in multiple places.

♻️ Suggestion: Define once and reference

Consider defining these in a shared constant or variable in one file and referencing it from the description and validation logic.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/ai/sre_agent/validate_cluster.go` around lines 37 - 41, The hard-coded
slice knownOrgIDs is duplicated; extract it into a single shared symbol (e.g.,
an exported package-level variable or constant named AllowedOrgIDs) and replace
the inline knownOrgIDs in validate_cluster.go and the duplicate list in
sre_agent.go and the description to reference that one symbol; ensure the new
symbol has the appropriate visibility for where it’s referenced and update all
uses (validateCluster logic that reads knownOrgIDs and the description
generation that lists allowed org IDs) to use AllowedOrgIDs so changes are made
only once.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cmd/ai/sre_agent/helper.go`:
- Around line 117-133: gitPullAndValidate currently only prints a warning and
ignores the error from running "git status --porcelain"; change it so after the
pull you capture the output and error from exec.Command("git", "-C", repoPath,
"status", "--porcelain").Output(), return a wrapped error if that call fails,
and if the output is non-empty return a descriptive error (e.g.
fmt.Errorf("repository has uncommitted changes: %s",
strings.TrimSpace(string(output)))); update references in the function
gitPullAndValidate (the second cmd variable / output handling) and ensure all
returned errors are wrapped with context.
- Around line 136-140: The promptUserInput function currently ignores the error
from reader.ReadString('\n'); update promptUserInput to return (string, error),
check the error from ReadString and if non-nil return "", err (or a wrapped
fmt.Errorf) after trimming and lowercasing only on success, and update all
callers of promptUserInput to handle the returned error appropriately
(propagate, log, or exit). Ensure you reference the promptUserInput function and
reader.ReadString in your changes so callers are adjusted to the new signature.
- Around line 48-79: The getLatestWheelURL function currently parses the curl
JSON output with string operations; replace that with proper JSON unmarshaling
using encoding/json: run the command to get output as before, define small
structs matching the GitLab release/asset structure (e.g., Release -> Assets ->
Links with fields Name and URL), json.Unmarshal the output into those structs,
iterate the Links slice to find the link with Name == "Python wheel" and return
its URL, and propagate clear errors if unmarshaling fails or no matching link is
found; update error messages in getLatestWheelURL accordingly and remove the
brittle string-index logic.

In `@cmd/ai/sre_agent/sre_agent_config.go`:
- Around line 96-99: The code in checkSreAgentConfig incorrectly treats
updateSopDirectory's false return as a failure; updateSopDirectory only signals
whether it performed an update (true) or no change was needed (false). Modify
checkSreAgentConfig so it does not return false when updateSopDirectory returns
false—either remove the conditional return and ignore the boolean result, or
call updateSopDirectory without treating its return as an error; ensure
references to updateSopDirectory and checkSreAgentConfig are used to locate the
change.

In `@cmd/ai/sre_agent/sre_agent_install.go`:
- Around line 27-30: The code currently calls cmdutil.CheckErr(fmt.Errorf(...))
after os.MkdirAll(baseDir, 0755) and then has a reachable-looking return false
which is actually unreachable because CheckErr exits; remove the redundant
return false so control flow is consistent (keep the CheckErr call and its error
message), or alternatively replace the CheckErr call with direct error handling
that returns false (choose one approach and apply it consistently); locate this
in the block containing os.MkdirAll, baseDir, cmdutil.CheckErr and adjust
accordingly.

In `@cmd/ai/sre_agent/sre_agent.go`:
- Line 97: sreAgentCmd.MarkFlagRequired("pd-url") currently ignores its error
return; capture and handle the error from MarkFlagRequired (called on
sreAgentCmd) by assigning it to a variable (e.g., err) and reacting
appropriately (returning the error from the initializer, logging and exiting, or
calling run-time error handling consistent with surrounding code) so the
function checks and handles any failure from MarkFlagRequired.

In `@cmd/ai/sre_agent/validate_cluster.go`:
- Line 60: The call to validateClusterCmd.MarkFlagRequired("cluster-id") ignores
the returned error; capture its error (e.g., err :=
validateClusterCmd.MarkFlagRequired("cluster-id")) and handle it per existing
command conventions (return the error, log and exit, or use the package's error
helper) so failures are not silently ignored; update the code around
validateClusterCmd.MarkFlagRequired to check err and handle/report it
appropriately.

---

Nitpick comments:
In `@cmd/ai/sre_agent/ops_sop.go`:
- Around line 12-39: Rename the function opsSopIsPresent to a name that reflects
its side effects (e.g., ensureOpsSopReady or setupOpsSopRepo) and update all
call sites accordingly; inside the function, keep the existing logic (checking
utils.FolderExists(opsSopPath), calling gitPullAndValidate(opsSopPath) when
present, and cloneRepository(opsSopRepoURL, opsSopPath) when missing) but change
the function name and exported visibility if needed so callers clearly
understand it performs cloning/updating, not a pure presence check.

In `@cmd/ai/sre_agent/validate_cluster.go`:
- Around line 37-41: The hard-coded slice knownOrgIDs is duplicated; extract it
into a single shared symbol (e.g., an exported package-level variable or
constant named AllowedOrgIDs) and replace the inline knownOrgIDs in
validate_cluster.go and the duplicate list in sre_agent.go and the description
to reference that one symbol; ensure the new symbol has the appropriate
visibility for where it’s referenced and update all uses (validateCluster logic
that reads knownOrgIDs and the description generation that lists allowed org
IDs) to use AllowedOrgIDs so changes are made only once.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: d5b4b4bf-dd36-43ae-bf6a-9f478bb8407f

📥 Commits

Reviewing files that changed from the base of the PR and between 5b99f69 and f3bb554.

📒 Files selected for processing (8)
  • cmd/ai/cmd.go
  • cmd/ai/sre_agent/helper.go
  • cmd/ai/sre_agent/ops_sop.go
  • cmd/ai/sre_agent/sre_agent.go
  • cmd/ai/sre_agent/sre_agent_config.go
  • cmd/ai/sre_agent/sre_agent_install.go
  • cmd/ai/sre_agent/validate_cluster.go
  • cmd/cmd.go

Comment thread cmd/ai/sre_agent/helper.go Outdated
Comment on lines +48 to +79
func getLatestWheelURL() (string, error) {
cmd := exec.Command("curl", "-s", gitlabReleasesURL)
output, err := cmd.Output()
if err != nil {
return "", fmt.Errorf("failed to fetch releases: %w", err)
}

// Parse JSON to find Python wheel URL
// Looking for: "name":"Python wheel","url":"<wheel_url>"
outputStr := string(output)

// Find first occurrence of "Python wheel" link
wheelIdx := strings.Index(outputStr, `"name":"Python wheel"`)
if wheelIdx == -1 {
return "", fmt.Errorf("no Python wheel found in latest release")
}

// Find the URL after "Python wheel"
urlStart := strings.Index(outputStr[wheelIdx:], `"url":"`)
if urlStart == -1 {
return "", fmt.Errorf("failed to parse wheel URL")
}
urlStart += wheelIdx + len(`"url":"`)

urlEnd := strings.Index(outputStr[urlStart:], `"`)
if urlEnd == -1 {
return "", fmt.Errorf("failed to parse wheel URL end")
}

wheelURL := outputStr[urlStart : urlStart+urlEnd]
return wheelURL, nil
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Replace manual JSON string parsing with proper JSON unmarshaling.

Using string index operations to parse JSON is fragile and error-prone. The JSON structure could change, field order could vary, or there could be escaped characters that break the parsing. The Go standard library provides encoding/json for reliable JSON parsing.

📦 Proposed fix using encoding/json
+import (
+	"encoding/json"
+)
+
+type GitLabRelease struct {
+	Assets struct {
+		Links []struct {
+			Name string `json:"name"`
+			URL  string `json:"url"`
+		} `json:"links"`
+	} `json:"assets"`
+}
+
 func getLatestWheelURL() (string, error) {
 	cmd := exec.Command("curl", "-s", gitlabReleasesURL)
 	output, err := cmd.Output()
 	if err != nil {
 		return "", fmt.Errorf("failed to fetch releases: %w", err)
 	}
 
-	// Parse JSON to find Python wheel URL
-	// Looking for: "name":"Python wheel","url":"<wheel_url>"
-	outputStr := string(output)
-
-	// Find first occurrence of "Python wheel" link
-	wheelIdx := strings.Index(outputStr, `"name":"Python wheel"`)
-	if wheelIdx == -1 {
-		return "", fmt.Errorf("no Python wheel found in latest release")
-	}
-
-	// Find the URL after "Python wheel"
-	urlStart := strings.Index(outputStr[wheelIdx:], `"url":"`)
-	if urlStart == -1 {
-		return "", fmt.Errorf("failed to parse wheel URL")
-	}
-	urlStart += wheelIdx + len(`"url":"`)
-
-	urlEnd := strings.Index(outputStr[urlStart:], `"`)
-	if urlEnd == -1 {
-		return "", fmt.Errorf("failed to parse wheel URL end")
+	var releases []GitLabRelease
+	if err := json.Unmarshal(output, &releases); err != nil {
+		return "", fmt.Errorf("failed to parse releases JSON: %w", err)
+	}
+
+	if len(releases) == 0 {
+		return "", fmt.Errorf("no releases found")
+	}
+
+	// Find Python wheel in the first (latest) release
+	for _, link := range releases[0].Assets.Links {
+		if link.Name == "Python wheel" {
+			return link.URL, nil
+		}
 	}
 
-	wheelURL := outputStr[urlStart : urlStart+urlEnd]
-	return wheelURL, nil
+	return "", fmt.Errorf("no Python wheel found in latest release")
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/ai/sre_agent/helper.go` around lines 48 - 79, The getLatestWheelURL
function currently parses the curl JSON output with string operations; replace
that with proper JSON unmarshaling using encoding/json: run the command to get
output as before, define small structs matching the GitLab release/asset
structure (e.g., Release -> Assets -> Links with fields Name and URL),
json.Unmarshal the output into those structs, iterate the Links slice to find
the link with Name == "Python wheel" and return its URL, and propagate clear
errors if unmarshaling fails or no matching link is found; update error messages
in getLatestWheelURL accordingly and remove the brittle string-index logic.

Comment thread cmd/ai/sre_agent/helper.go Outdated
Comment on lines +117 to +133
func gitPullAndValidate(repoPath string) error {
cmd := exec.Command("git", "-C", repoPath, "pull")
cmd.Stdout = os.Stderr
cmd.Stderr = os.Stderr
if err := cmd.Run(); err != nil {
return fmt.Errorf("git pull failed: %w", err)
}

// Validate repository status
cmd = exec.Command("git", "-C", repoPath, "status", "--porcelain")
output, _ := cmd.Output()
if len(output) > 0 {
fmt.Fprintln(os.Stderr, "Warning: Repository has uncommitted changes")
}

return nil
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Function doesn't validate as its name suggests.

The function gitPullAndValidate indicates it will validate the repository state, and the AI summary states it should "fail if git status --porcelain is non-empty after pulling." However, the implementation only prints a warning when uncommitted changes are detected (line 129) and never returns an error. Additionally, the error from git status is silently ignored on line 127.

🔧 Proposed fix to match the function's contract
 func gitPullAndValidate(repoPath string) error {
 	cmd := exec.Command("git", "-C", repoPath, "pull")
 	cmd.Stdout = os.Stderr
 	cmd.Stderr = os.Stderr
 	if err := cmd.Run(); err != nil {
 		return fmt.Errorf("git pull failed: %w", err)
 	}
 
 	// Validate repository status
 	cmd = exec.Command("git", "-C", repoPath, "status", "--porcelain")
-	output, _ := cmd.Output()
+	output, err := cmd.Output()
+	if err != nil {
+		return fmt.Errorf("git status failed: %w", err)
+	}
 	if len(output) > 0 {
-		fmt.Fprintln(os.Stderr, "Warning: Repository has uncommitted changes")
+		return fmt.Errorf("repository has uncommitted changes:\n%s", string(output))
 	}
 
 	return nil
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/ai/sre_agent/helper.go` around lines 117 - 133, gitPullAndValidate
currently only prints a warning and ignores the error from running "git status
--porcelain"; change it so after the pull you capture the output and error from
exec.Command("git", "-C", repoPath, "status", "--porcelain").Output(), return a
wrapped error if that call fails, and if the output is non-empty return a
descriptive error (e.g. fmt.Errorf("repository has uncommitted changes: %s",
strings.TrimSpace(string(output)))); update references in the function
gitPullAndValidate (the second cmd variable / output handling) and ensure all
returned errors are wrapped with context.

Comment thread cmd/ai/sre_agent/helper.go Outdated
Comment thread cmd/ai/sre_agent/sre_agent_config.go Outdated
Comment thread cmd/ai/sre_agent/sre_agent_install.go Outdated
Comment on lines +27 to +30
if err := os.MkdirAll(baseDir, 0755); err != nil {
cmdutil.CheckErr(fmt.Errorf("failed to create base directory: %w", err))
return false
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Inconsistent error handling with cmdutil.CheckErr.

cmdutil.CheckErr typically calls os.Exit(1) when given a non-nil error, which means the return false on line 29 is unreachable. This creates confusion about the error handling flow. Either remove the CheckErr wrapper and handle the error directly, or remove the unreachable return statement.

🔧 Proposed fix with consistent error handling
 	// Create base directory
 	if err := os.MkdirAll(baseDir, 0755); err != nil {
-		cmdutil.CheckErr(fmt.Errorf("failed to create base directory: %w", err))
+		fmt.Fprintf(os.Stderr, "Failed to create base directory: %v\n", err)
 		return false
 	}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if err := os.MkdirAll(baseDir, 0755); err != nil {
cmdutil.CheckErr(fmt.Errorf("failed to create base directory: %w", err))
return false
}
if err := os.MkdirAll(baseDir, 0755); err != nil {
fmt.Fprintf(os.Stderr, "Failed to create base directory: %v\n", err)
return false
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/ai/sre_agent/sre_agent_install.go` around lines 27 - 30, The code
currently calls cmdutil.CheckErr(fmt.Errorf(...)) after os.MkdirAll(baseDir,
0755) and then has a reachable-looking return false which is actually
unreachable because CheckErr exits; remove the redundant return false so control
flow is consistent (keep the CheckErr call and its error message), or
alternatively replace the CheckErr call with direct error handling that returns
false (choose one approach and apply it consistently); locate this in the block
containing os.MkdirAll, baseDir, cmdutil.CheckErr and adjust accordingly.

Comment thread cmd/ai/sre_agent/sre_agent.go Outdated
Comment thread cmd/ai/sre_agent/validate_cluster.go Outdated
}

validateClusterCmd.Flags().StringVarP(&clusterID, "cluster-id", "C", "", "Cluster ID (internal or external)")
validateClusterCmd.MarkFlagRequired("cluster-id")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Check the error return from MarkFlagRequired.

The error returned by MarkFlagRequired should be checked. While failures are rare (typically only if the flag doesn't exist), Go conventions and the static analysis tool both recommend checking all errors.

✅ Proposed fix
 	validateClusterCmd.Flags().StringVarP(&clusterID, "cluster-id", "C", "", "Cluster ID (internal or external)")
-	validateClusterCmd.MarkFlagRequired("cluster-id")
+	if err := validateClusterCmd.MarkFlagRequired("cluster-id"); err != nil {
+		fmt.Fprintf(os.Stderr, "Error marking flag as required: %v\n", err)
+		os.Exit(1)
+	}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
validateClusterCmd.MarkFlagRequired("cluster-id")
validateClusterCmd.Flags().StringVarP(&clusterID, "cluster-id", "C", "", "Cluster ID (internal or external)")
if err := validateClusterCmd.MarkFlagRequired("cluster-id"); err != nil {
fmt.Fprintf(os.Stderr, "Error marking flag as required: %v\n", err)
os.Exit(1)
}
🧰 Tools
🪛 golangci-lint (2.12.2)

[error] 60-60: Error return value of validateClusterCmd.MarkFlagRequired is not checked

(errcheck)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/ai/sre_agent/validate_cluster.go` at line 60, The call to
validateClusterCmd.MarkFlagRequired("cluster-id") ignores the returned error;
capture its error (e.g., err :=
validateClusterCmd.MarkFlagRequired("cluster-id")) and handle it per existing
command conventions (return the error, log and exit, or use the package's error
helper) so failures are not silently ignored; update the code around
validateClusterCmd.MarkFlagRequired to check err and handle/report it
appropriately.

@TheUndeadKing TheUndeadKing force-pushed the SREP-4791 branch 2 times, most recently from f3bb554 to 7941c1d Compare May 13, 2026 09:37
@openshift-ci openshift-ci Bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 13, 2026
Comment thread cmd/ai/sre_agent/helper.go Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (5)
cmd/ai/sre_agent/helper.go (2)

45-49: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Handle the error from ReadString.

The error returned by reader.ReadString('\n') is silently ignored. If reading from stdin fails (e.g., stdin is closed or encounters an error), the function will return an empty string with no indication that something went wrong, potentially causing unexpected behavior in callers.

🛡️ Proposed fix to handle read errors
-func promptUserInput() string {
+func promptUserInput() (string, error) {
 	reader := bufio.NewReader(os.Stdin)
-	input, _ := reader.ReadString('\n')
-	return strings.ToLower(strings.TrimSpace(input))
+	input, err := reader.ReadString('\n')
+	if err != nil {
+		return "", fmt.Errorf("failed to read input: %w", err)
+	}
+	return strings.ToLower(strings.TrimSpace(input)), nil
 }

Note: This change will require updating all callers of promptUserInput() to handle the returned error.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/ai/sre_agent/helper.go` around lines 45 - 49, The promptUserInput
function currently ignores the error from reader.ReadString and should surface
it: change promptUserInput to return (string, error), check the error returned
by reader.ReadString('\n') and return an appropriate trimmed/lowercased string
on success or the error on failure, and then update all callers to handle the
new (string, error) signature (e.g., in any place that calls promptUserInput) so
read failures are propagated instead of silently returning an empty string.

26-42: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Function doesn't validate as its name suggests.

The function gitPullAndValidate indicates it will validate the repository state. However, the implementation only prints a warning when uncommitted changes are detected (line 38) and never returns an error. Additionally, the error from git status is silently ignored on line 36.

🔧 Proposed fix to match the function's contract
 func gitPullAndValidate(repoPath string) error {
 	cmd := exec.Command("git", "-C", repoPath, "pull")
 	cmd.Stdout = os.Stderr
 	cmd.Stderr = os.Stderr
 	if err := cmd.Run(); err != nil {
 		return fmt.Errorf("git pull failed: %w", err)
 	}
 
 	// Validate repository status
 	cmd = exec.Command("git", "-C", repoPath, "status", "--porcelain")
-	output, _ := cmd.Output()
+	output, err := cmd.Output()
+	if err != nil {
+		return fmt.Errorf("git status failed: %w", err)
+	}
 	if len(output) > 0 {
-		fmt.Fprintln(os.Stderr, "Warning: Repository has uncommitted changes")
+		return fmt.Errorf("repository has uncommitted changes:\n%s", string(output))
 	}
 
 	return nil
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/ai/sre_agent/helper.go` around lines 26 - 42, The function
gitPullAndValidate currently ignores errors from the "git status" call and only
prints a warning for uncommitted changes; change it so gitPullAndValidate
returns an error when repository is not clean and surfaces the git status error
instead of ignoring it. Specifically, check and handle the error returned by
cmd.Output() from the "git -C repoPath status --porcelain" invocation (return
fmt.Errorf with context on failure), and if the output indicates uncommitted
changes (non-empty), return a descriptive error rather than only printing to
stderr; keep the existing behavior for git pull (exec.Command("git", "-C",
repoPath, "pull")) but propagate any errors up through gitPullAndValidate.
cmd/ai/sre_agent/sre_agent_config.go (1)

97-99: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Incorrect handling of updateSopDirectory return value.

The function updateSopDirectory returns true when the config was updated and false when no change was needed (see lines 200-223). However, line 97 treats a false return as a failure condition by returning false from checkSreAgentConfig. This means the configuration setup will fail when the SOP directory is already correct, which is not the intended behavior.

🐛 Proposed fix
 	// Update SOP directory path
-	if !updateSopDirectory(config, homeDir) {
-		return false
-	}
+	updateSopDirectory(config, homeDir)

The return value can be ignored here since updateSopDirectory never fails—it only indicates whether an update was made.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/ai/sre_agent/sre_agent_config.go` around lines 97 - 99, The check in
checkSreAgentConfig incorrectly treats updateSopDirectory returning false as an
error; updateSopDirectory only signals whether it modified the config (true) or
no change was needed (false). Remove the conditional failure: call
updateSopDirectory(config, homeDir) and ignore its boolean return (or only log
the result) rather than returning false from checkSreAgentConfig when it returns
false, keeping the rest of the function flow unchanged.
cmd/ai/sre_agent/sre_agent_install.go (1)

26-29: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Unreachable return after cmdutil.CheckErr.

cmdutil.CheckErr calls os.Exit(1) when given a non-nil error, which means the return false on line 28 is unreachable. This creates confusion about the error handling flow. Either remove the CheckErr wrapper and handle the error directly, or remove the unreachable return statement.

🔧 Proposed fix with consistent error handling
 	// Create base directory
 	if err := os.MkdirAll(baseDir, 0755); err != nil {
-		cmdutil.CheckErr(fmt.Errorf("failed to create base directory: %w", err))
+		fmt.Fprintf(os.Stderr, "Failed to create base directory: %v\n", err)
 		return false
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/ai/sre_agent/sre_agent_install.go` around lines 26 - 29, The call to
cmdutil.CheckErr after os.MkdirAll makes the subsequent return false unreachable
because CheckErr exits on error; update the error handling in the function
containing this code (the os.MkdirAll call) by choosing one consistent flow:
either remove cmdutil.CheckErr and handle the error locally (e.g., log or wrap
the error and return false), or keep cmdutil.CheckErr and remove the unreachable
return false; specifically edit the block around os.MkdirAll(baseDir, 0755) to
use only one of these approaches and ensure the function's return behavior
matches (no dead code).
cmd/ai/sre_agent/sre_agent.go (1)

85-85: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Check the error return from MarkFlagRequired.

The error returned by MarkFlagRequired should be checked. While failures are rare, Go conventions and the static analysis tool both recommend checking all errors.

✅ Proposed fix
 	// Mark pd-url as required
-	sreAgentCmd.MarkFlagRequired("pd-url")
+	if err := sreAgentCmd.MarkFlagRequired("pd-url"); err != nil {
+		// This should never happen with a valid flag name, but we check for completeness
+		panic(fmt.Sprintf("failed to mark flag as required: %v", err))
+	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/ai/sre_agent/sre_agent.go` at line 85, MarkFlagRequired call on
sreAgentCmd (sreAgentCmd.MarkFlagRequired("pd-url")) ignores its error; change
it to capture and handle the returned error according to project conventions:
assign the error (e.g., err := sreAgentCmd.MarkFlagRequired("pd-url")), check if
err != nil and return or log/exit appropriately within the enclosing
init/command setup function. Ensure you reference sreAgentCmd and the "pd-url"
flag when locating the call and use the same error handling style used elsewhere
in the file.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cmd/ai/sre_agent/helper.go`:
- Around line 12-23: In copyRepository, validate sourcePath before invoking the
external cp: call os.Stat(sourcePath) (and use os.IsNotExist to distinguish
missing paths) and return a clear, contextual error if the path does not exist
or is inaccessible; only proceed to exec.Command("cp", "-r", sourcePath,
destPath) after the check (you can also optionally verify it's a directory via
FileInfo.IsDir()), so errors are deterministic and more informative than the raw
cp failure.

In `@cmd/ai/sre_agent/sre_agent_config.go`:
- Around line 64-67: The call to cmdutil.CheckErr makes the subsequent return
false unreachable; replace the cmdutil.CheckErr(fmt.Errorf(...)) inside the
os.MkdirAll error branch with direct error reporting and a normal return so
control flow matches the rest of the function: log or write the formatted error
message (e.g., via fmt.Fprintf(os.Stderr, "failed to create config directory:
%v\n", err) or the project’s standard error writer) and then return false;
update the branch that uses os.MkdirAll(configDir, 0755) to remove
cmdutil.CheckErr and perform explicit error output followed by return false.

In `@cmd/ai/sre_agent/sre_agent.go`:
- Around line 48-52: The explicit "return" after calling cmdutil.CheckErr in the
Run callback is unreachable because cmdutil.CheckErr exits on error; remove the
redundant "return" following the cmdutil.CheckErr call (the block around
os.UserHomeDir / err handling in sre_agent.go) so the function flow is
clear—keep the CheckErr call as-is and delete the subsequent "return".

---

Duplicate comments:
In `@cmd/ai/sre_agent/helper.go`:
- Around line 45-49: The promptUserInput function currently ignores the error
from reader.ReadString and should surface it: change promptUserInput to return
(string, error), check the error returned by reader.ReadString('\n') and return
an appropriate trimmed/lowercased string on success or the error on failure, and
then update all callers to handle the new (string, error) signature (e.g., in
any place that calls promptUserInput) so read failures are propagated instead of
silently returning an empty string.
- Around line 26-42: The function gitPullAndValidate currently ignores errors
from the "git status" call and only prints a warning for uncommitted changes;
change it so gitPullAndValidate returns an error when repository is not clean
and surfaces the git status error instead of ignoring it. Specifically, check
and handle the error returned by cmd.Output() from the "git -C repoPath status
--porcelain" invocation (return fmt.Errorf with context on failure), and if the
output indicates uncommitted changes (non-empty), return a descriptive error
rather than only printing to stderr; keep the existing behavior for git pull
(exec.Command("git", "-C", repoPath, "pull")) but propagate any errors up
through gitPullAndValidate.

In `@cmd/ai/sre_agent/sre_agent_config.go`:
- Around line 97-99: The check in checkSreAgentConfig incorrectly treats
updateSopDirectory returning false as an error; updateSopDirectory only signals
whether it modified the config (true) or no change was needed (false). Remove
the conditional failure: call updateSopDirectory(config, homeDir) and ignore its
boolean return (or only log the result) rather than returning false from
checkSreAgentConfig when it returns false, keeping the rest of the function flow
unchanged.

In `@cmd/ai/sre_agent/sre_agent_install.go`:
- Around line 26-29: The call to cmdutil.CheckErr after os.MkdirAll makes the
subsequent return false unreachable because CheckErr exits on error; update the
error handling in the function containing this code (the os.MkdirAll call) by
choosing one consistent flow: either remove cmdutil.CheckErr and handle the
error locally (e.g., log or wrap the error and return false), or keep
cmdutil.CheckErr and remove the unreachable return false; specifically edit the
block around os.MkdirAll(baseDir, 0755) to use only one of these approaches and
ensure the function's return behavior matches (no dead code).

In `@cmd/ai/sre_agent/sre_agent.go`:
- Line 85: MarkFlagRequired call on sreAgentCmd
(sreAgentCmd.MarkFlagRequired("pd-url")) ignores its error; change it to capture
and handle the returned error according to project conventions: assign the error
(e.g., err := sreAgentCmd.MarkFlagRequired("pd-url")), check if err != nil and
return or log/exit appropriately within the enclosing init/command setup
function. Ensure you reference sreAgentCmd and the "pd-url" flag when locating
the call and use the same error handling style used elsewhere in the file.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: e68ddeed-d9c5-45b1-a8c5-ea2610b47ee9

📥 Commits

Reviewing files that changed from the base of the PR and between f3bb554 and f5d55c6.

📒 Files selected for processing (7)
  • cmd/ai/cmd.go
  • cmd/ai/sre_agent/helper.go
  • cmd/ai/sre_agent/ops_sop.go
  • cmd/ai/sre_agent/sre_agent.go
  • cmd/ai/sre_agent/sre_agent_config.go
  • cmd/ai/sre_agent/sre_agent_install.go
  • cmd/cmd.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • cmd/ai/cmd.go
  • cmd/cmd.go

Comment thread cmd/ai/sre_agent/helper.go
Comment thread cmd/ai/sre_agent/sre_agent_config.go Outdated
Comment on lines +64 to +67
if err := os.MkdirAll(configDir, 0755); err != nil {
cmdutil.CheckErr(fmt.Errorf("failed to create config directory: %w", err))
return false
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Unreachable return after cmdutil.CheckErr.

cmdutil.CheckErr calls os.Exit(1) when given a non-nil error, making the return false on line 66 unreachable. For consistency with the rest of the function's error handling pattern, replace cmdutil.CheckErr with direct error reporting.

🔧 Proposed fix with consistent error handling
 	// Create config directory
 	if err := os.MkdirAll(configDir, 0755); err != nil {
-		cmdutil.CheckErr(fmt.Errorf("failed to create config directory: %w", err))
+		fmt.Fprintf(os.Stderr, "Failed to create config directory: %v\n", err)
 		return false
 	}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if err := os.MkdirAll(configDir, 0755); err != nil {
cmdutil.CheckErr(fmt.Errorf("failed to create config directory: %w", err))
return false
}
if err := os.MkdirAll(configDir, 0755); err != nil {
fmt.Fprintf(os.Stderr, "Failed to create config directory: %v\n", err)
return false
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/ai/sre_agent/sre_agent_config.go` around lines 64 - 67, The call to
cmdutil.CheckErr makes the subsequent return false unreachable; replace the
cmdutil.CheckErr(fmt.Errorf(...)) inside the os.MkdirAll error branch with
direct error reporting and a normal return so control flow matches the rest of
the function: log or write the formatted error message (e.g., via
fmt.Fprintf(os.Stderr, "failed to create config directory: %v\n", err) or the
project’s standard error writer) and then return false; update the branch that
uses os.MkdirAll(configDir, 0755) to remove cmdutil.CheckErr and perform
explicit error output followed by return false.

Comment thread cmd/ai/sre_agent/sre_agent.go
@TheUndeadKing TheUndeadKing force-pushed the SREP-4791 branch 2 times, most recently from 4b3d4d8 to 1048df2 Compare May 13, 2026 10:53
@TheUndeadKing
Copy link
Copy Markdown
Contributor Author

/hold

@openshift-ci openshift-ci Bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 13, 2026
@TheUndeadKing
Copy link
Copy Markdown
Contributor Author

/test all

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (4)
cmd/ai/sre_agent/helper.go (2)

12-23: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Validate source path exists before attempting copy.

The function does not check whether sourcePath exists before invoking cp -r. If the source is missing or inaccessible, the cp command will fail with a potentially unclear error message.

🛡️ Proposed fix to add source validation
 func copyRepository(sourcePath, destPath string) error {
+	// Validate source exists
+	if _, err := os.Stat(sourcePath); err != nil {
+		if os.IsNotExist(err) {
+			return fmt.Errorf("source path does not exist: %s", sourcePath)
+		}
+		return fmt.Errorf("cannot access source path: %w", err)
+	}
+
 	fmt.Fprintf(os.Stderr, "Copying repository to %s...\n", destPath)
 	cmd := exec.Command("cp", "-r", sourcePath, destPath)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/ai/sre_agent/helper.go` around lines 12 - 23, In copyRepository, validate
that sourcePath exists and is accessible before running exec.Command: use
os.Stat on sourcePath and if it returns an error (or os.IsNotExist(err)) return
a descriptive error mentioning the missing/unreadable sourcePath; only proceed
to create and run the cp command (cmd := exec.Command("cp", "-r", sourcePath,
destPath)) when the stat succeeds to avoid opaque cp failures.

26-30: ⚠️ Potential issue | 🟡 Minor | ⚖️ Poor tradeoff

Handle the error from ReadString.

The error returned by reader.ReadString('\n') is silently ignored. If reading from stdin fails (e.g., stdin is closed or encounters an EOF/error), the function will return an empty string with no indication that something went wrong, potentially causing unexpected behavior in the validation logic.

🛡️ Proposed fix to handle read errors

Note: This requires updating the function signature and all callers in validate_sre_agent.go:26 and validate_sre_agent_config.go:50.

-func promptUserInput() string {
+func promptUserInput() (string, error) {
 	reader := bufio.NewReader(os.Stdin)
-	input, _ := reader.ReadString('\n')
-	return strings.ToLower(strings.TrimSpace(input))
+	input, err := reader.ReadString('\n')
+	if err != nil {
+		return "", fmt.Errorf("failed to read input: %w", err)
+	}
+	return strings.ToLower(strings.TrimSpace(input)), nil
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/ai/sre_agent/helper.go` around lines 26 - 30, The promptUserInput
function currently ignores the error from reader.ReadString('\n'); change its
signature to return (string, error), check and handle the error from
ReadString('\n') (return "", err on failure), and only on success return the
trimmed lowercase string; then update all callers (validate_sre_agent.go's call
at line referenced and validate_sre_agent_config.go's call) to handle the error
return (propagate or surface a user-facing error) and adjust any validation flow
accordingly.
cmd/ai/sre_agent/sre_agent.go (2)

50-51: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Unreachable return after cmdutil.CheckErr.

cmdutil.CheckErr calls os.Exit(1) when given a non-nil error, making the return on line 51 unreachable. Since the Run callback has no return value, the explicit return after CheckErr is unnecessary.

🔧 Proposed fix
 			homeDir, err := os.UserHomeDir()
 			if err != nil {
 				cmdutil.CheckErr(fmt.Errorf("failed to get home directory: %w", err))
-				return
 			}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/ai/sre_agent/sre_agent.go` around lines 50 - 51, Remove the redundant
unreachable return after cmdutil.CheckErr: locate the Run callback where
cmdutil.CheckErr(fmt.Errorf("failed to get home directory: %w", err)) is called
and delete the following explicit "return" statement since cmdutil.CheckErr
exits on error and Run has no return value; keep the error check call but remove
the dead return line.

80-80: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Check the error return from MarkFlagRequired.

The error returned by MarkFlagRequired should be checked. While failures are rare (typically only programming errors like marking a non-existent flag), Go conventions and the static analysis tool both recommend checking all errors.

✅ Proposed fix
 	// Mark pd-url as required
-	sreAgentCmd.MarkFlagRequired("pd-url")
+	if err := sreAgentCmd.MarkFlagRequired("pd-url"); err != nil {
+		// This should never happen unless there's a programming error
+		panic(fmt.Sprintf("failed to mark pd-url as required: %v", err))
+	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/ai/sre_agent/sre_agent.go` at line 80, The call to
sreAgentCmd.MarkFlagRequired("pd-url") currently ignores its error; capture and
handle the returned error (e.g., err := sreAgentCmd.MarkFlagRequired("pd-url")
and if err != nil { log.Fatalf(...) } or return the error from init/setup) so
failures are not silently ignored. Update the code around
sreAgentCmd.MarkFlagRequired to check the error and handle it appropriately (use
your existing logger or return the error from the enclosing function) to satisfy
Go error-handling conventions.
🧹 Nitpick comments (1)
cmd/ai/sre_agent/sre_agent.go (1)

38-83: Consider the installation approach feedback.

A past review comment from bergmannf raised concerns about embedding installation logic in osdctl, suggesting that tools should be packaged via Homebrew or RPM instead. While the current implementation prompts for an existing installation path rather than automatically downloading/installing, the team should consider whether this copy-based approach aligns with the project's packaging strategy and maintenance model.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/ai/sre_agent/sre_agent.go` around lines 38 - 83, The command currently
embeds installation/copy behavior via validateSreAgent and checkSreAgentConfig;
instead remove any automatic install/copy attempts and make NewCmdSreAgent only
validate presence of a user-managed installation: update validateSreAgent and
checkSreAgentConfig to return false with a clear error message directing users
to install the tool via supported packaging (Homebrew/RPM) or provide an
existing path, ensure executeSreAgent only runs when the binary exists at
sreAgentPath, and adjust command help/flags or docs to document the expected
packaging/installation approach rather than performing installation in code.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cmd/ai/sre_agent/validate_sre_agent_config.go`:
- Line 45: Check that sop["directory"] exists and is a string before using it:
replace the silent type assertion currentSopDir, _ := sop["directory"].(string)
with an explicit check that sop["directory"] is present and of type string
(e.g., v, ok := sop["directory"]; s, okType := v.(string)), and if not present
or not a string, return or append a validation error (so the config is rejected
or flagged) rather than proceeding with an empty currentSopDir; update the
validation flow that uses currentSopDir to rely on this guard.

In `@cmd/ai/sre_agent/validate_sre_agent.go`:
- Around line 36-39: The call to cmdutil.CheckErr after os.MkdirAll makes the
subsequent "return false" unreachable because CheckErr exits the process; update
the error handling in the block around os.MkdirAll (in validateSREAgent or the
function containing os.MkdirAll and cmdutil.CheckErr) by either removing the
unreachable "return false" or replacing cmdutil.CheckErr with non-fatal handling
that logs the error and returns false (so the function can continue to return a
boolean instead of exiting).

---

Duplicate comments:
In `@cmd/ai/sre_agent/helper.go`:
- Around line 12-23: In copyRepository, validate that sourcePath exists and is
accessible before running exec.Command: use os.Stat on sourcePath and if it
returns an error (or os.IsNotExist(err)) return a descriptive error mentioning
the missing/unreadable sourcePath; only proceed to create and run the cp command
(cmd := exec.Command("cp", "-r", sourcePath, destPath)) when the stat succeeds
to avoid opaque cp failures.
- Around line 26-30: The promptUserInput function currently ignores the error
from reader.ReadString('\n'); change its signature to return (string, error),
check and handle the error from ReadString('\n') (return "", err on failure),
and only on success return the trimmed lowercase string; then update all callers
(validate_sre_agent.go's call at line referenced and
validate_sre_agent_config.go's call) to handle the error return (propagate or
surface a user-facing error) and adjust any validation flow accordingly.

In `@cmd/ai/sre_agent/sre_agent.go`:
- Around line 50-51: Remove the redundant unreachable return after
cmdutil.CheckErr: locate the Run callback where
cmdutil.CheckErr(fmt.Errorf("failed to get home directory: %w", err)) is called
and delete the following explicit "return" statement since cmdutil.CheckErr
exits on error and Run has no return value; keep the error check call but remove
the dead return line.
- Line 80: The call to sreAgentCmd.MarkFlagRequired("pd-url") currently ignores
its error; capture and handle the returned error (e.g., err :=
sreAgentCmd.MarkFlagRequired("pd-url") and if err != nil { log.Fatalf(...) } or
return the error from init/setup) so failures are not silently ignored. Update
the code around sreAgentCmd.MarkFlagRequired to check the error and handle it
appropriately (use your existing logger or return the error from the enclosing
function) to satisfy Go error-handling conventions.

---

Nitpick comments:
In `@cmd/ai/sre_agent/sre_agent.go`:
- Around line 38-83: The command currently embeds installation/copy behavior via
validateSreAgent and checkSreAgentConfig; instead remove any automatic
install/copy attempts and make NewCmdSreAgent only validate presence of a
user-managed installation: update validateSreAgent and checkSreAgentConfig to
return false with a clear error message directing users to install the tool via
supported packaging (Homebrew/RPM) or provide an existing path, ensure
executeSreAgent only runs when the binary exists at sreAgentPath, and adjust
command help/flags or docs to document the expected packaging/installation
approach rather than performing installation in code.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 370a85fd-2ec3-4694-8662-a588c6bc162c

📥 Commits

Reviewing files that changed from the base of the PR and between f5d55c6 and 52752a0.

📒 Files selected for processing (6)
  • cmd/ai/cmd.go
  • cmd/ai/sre_agent/helper.go
  • cmd/ai/sre_agent/sre_agent.go
  • cmd/ai/sre_agent/validate_sre_agent.go
  • cmd/ai/sre_agent/validate_sre_agent_config.go
  • cmd/cmd.go
✅ Files skipped from review due to trivial changes (1)
  • cmd/cmd.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • cmd/ai/cmd.go

Comment thread cmd/ai/sre_agent/validate_sre_agent_config.go Outdated
Comment thread cmd/ai/sre_agent/validate_sre_agent.go
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
cmd/ai/sre_agent/validate_sre_agent.go (1)

40-42: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Inconsistent error handling violates function contract.

While the unreachable return false statement was removed from the previous version, the underlying issue remains: cmdutil.CheckErr calls os.Exit(1) on error, which is inconsistent with the rest of this function's error handling pattern (lines 29, 36, 48 all return false). This violates the function's boolean return contract and causes the process to exit instead of allowing the caller to handle the failure.

🔧 Proposed fix
 	// Create base directory
 	if err := os.MkdirAll(baseDir, 0755); err != nil {
-		cmdutil.CheckErr(fmt.Errorf("failed to create base directory: %w", err))
+		fmt.Fprintf(os.Stderr, "Failed to create base directory: %v\n", err)
+		return false
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/ai/sre_agent/validate_sre_agent.go` around lines 40 - 42, The code calls
cmdutil.CheckErr after os.MkdirAll which exits the process, breaking this
function's boolean-return contract; replace the cmdutil.CheckErr call with
non-fatal error handling that mirrors the other branches (e.g., log the error
including context for baseDir and return false). Specifically, in the block that
currently uses cmdutil.CheckErr(fmt.Errorf("failed to create base directory:
%w", err)), remove the CheckErr call, emit an error via the same logging
mechanism used elsewhere (or fmt.Fprintf(os.Stderr, ...)), and then return false
so callers can handle the failure; reference os.MkdirAll, baseDir and
cmdutil.CheckErr to locate and update the code.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cmd/ai/sre_agent/validate_sre_agent.go`:
- Around line 24-30: The prompt accepts userVenvPath from promptUserInput but
doesn't validate it's absolute; update the validation after reading userVenvPath
(in the same block that currently checks err) to use
filepath.IsAbs(userVenvPath) and, if false, print a clear error to os.Stderr
(e.g., "path must be absolute") and return false (or re-prompt if desired) so
downstream file/copy operations use a guaranteed absolute path; ensure you
import "path/filepath" if not already present and reference userVenvPath and
promptUserInput in the change.

---

Duplicate comments:
In `@cmd/ai/sre_agent/validate_sre_agent.go`:
- Around line 40-42: The code calls cmdutil.CheckErr after os.MkdirAll which
exits the process, breaking this function's boolean-return contract; replace the
cmdutil.CheckErr call with non-fatal error handling that mirrors the other
branches (e.g., log the error including context for baseDir and return false).
Specifically, in the block that currently uses
cmdutil.CheckErr(fmt.Errorf("failed to create base directory: %w", err)), remove
the CheckErr call, emit an error via the same logging mechanism used elsewhere
(or fmt.Fprintf(os.Stderr, ...)), and then return false so callers can handle
the failure; reference os.MkdirAll, baseDir and cmdutil.CheckErr to locate and
update the code.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: a66b92a9-e5b0-42cc-9680-b414e2d4e707

📥 Commits

Reviewing files that changed from the base of the PR and between 52752a0 and a7a328b.

📒 Files selected for processing (8)
  • cmd/ai/sre_agent/helper.go
  • cmd/ai/sre_agent/sre_agent.go
  • cmd/ai/sre_agent/validate_sre_agent.go
  • cmd/ai/sre_agent/validate_sre_agent_config.go
  • docs/README.md
  • docs/osdctl.md
  • docs/osdctl_ai.md
  • docs/osdctl_ai_sre-agent.md
✅ Files skipped from review due to trivial changes (1)
  • docs/osdctl.md
🚧 Files skipped from review as they are similar to previous changes (3)
  • cmd/ai/sre_agent/sre_agent.go
  • cmd/ai/sre_agent/helper.go
  • cmd/ai/sre_agent/validate_sre_agent_config.go

Comment on lines +24 to +30
// Ask for path to sre-agent venv
fmt.Fprint(os.Stderr, "Enter the absolute path to sre-agent venv directory: ")
userVenvPath, err := promptUserInput()
if err != nil {
fmt.Fprintf(os.Stderr, "Failed to read input: %v\n", err)
return false
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Validate that the user-provided path is absolute.

The prompt on line 25 explicitly requests an "absolute path", but there's no validation to ensure the provided userVenvPath is actually absolute. If the user provides a relative path, the subsequent file checks and copy operations may behave unexpectedly depending on the current working directory, leading to confusing errors.

🛡️ Proposed fix to validate absolute path
 	// Ask for path to sre-agent venv
 	fmt.Fprint(os.Stderr, "Enter the absolute path to sre-agent venv directory: ")
 	userVenvPath, err := promptUserInput()
 	if err != nil {
 		fmt.Fprintf(os.Stderr, "Failed to read input: %v\n", err)
 		return false
 	}
+	if !filepath.IsAbs(userVenvPath) {
+		fmt.Fprintln(os.Stderr, "Error: path must be absolute")
+		return false
+	}
 
 	// Validate venv binary exists in provided path
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/ai/sre_agent/validate_sre_agent.go` around lines 24 - 30, The prompt
accepts userVenvPath from promptUserInput but doesn't validate it's absolute;
update the validation after reading userVenvPath (in the same block that
currently checks err) to use filepath.IsAbs(userVenvPath) and, if false, print a
clear error to os.Stderr (e.g., "path must be absolute") and return false (or
re-prompt if desired) so downstream file/copy operations use a guaranteed
absolute path; ensure you import "path/filepath" if not already present and
reference userVenvPath and promptUserInput in the change.

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented May 14, 2026

@TheUndeadKing: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants