Skip to content

feat(dynamo_mocker): add GPU-free LLM inference simulation workload#895

Open
saivishal1999 wants to merge 8 commits into
NVIDIA:mainfrom
saivishal1999:spothula/dynamo-mocker
Open

feat(dynamo_mocker): add GPU-free LLM inference simulation workload#895
saivishal1999 wants to merge 8 commits into
NVIDIA:mainfrom
saivishal1999:spothula/dynamo-mocker

Conversation

@saivishal1999
Copy link
Copy Markdown

Adds the DynamoMocker workload to cloudai — a lightweight GPU-free LLM inference simulator using dynamo.mocker + dynamo.frontend, benchmarked with genai-perf or aiperf.

  • Pydantic models for engine, worker, frontend, and benchmark config
  • StandaloneCommandGenStrategy that invokes dynamo_mocker.sh via an isolated uv-managed venv (PythonEnvironment installable)
  • ReportGenerationStrategy reusing AIDynamoReportGenerationStrategy for benchmark_report.csv parsing
  • Shell scripts: dynamo_mocker.sh, genai_perf.sh, aiperf.sh
  • Supports combined and disaggregated (prefill/decode) topologies
  • Supports round_robin and kv_router frontend modes
  • Staging configs under conf/experimental/dynamo_mocker/
  • 55 unit tests covering all CLI arg paths and flag passthrough

Summary

Provide a concise summary of the changes introduced by this pull request. Detail the purpose and scope of the changes, referencing any relevant issues or discussions. Explain how these changes address the problem or improve the project.

Test Plan

In this section, describe the testing you have performed to verify the changes. Include:

  • A clear description of the testing environment.
  • The steps you followed to test the new features or bug fixes.
  • Any specific commands used during testing, along with their outputs.
  • A description of the results and observations from your testing.
    This information is crucial for reviewers to understand how the changes have been validated.

Additional Notes

Include any other notes or comments about the pull request here. This can include challenges faced, future considerations, or context that reviewers might find helpful.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 15, 2026

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds a DynamoMocker workload: configs, Pydantic command models and test definition, shell orchestration for NATS/frontend/mockers, genai-perf/aiperf wrappers, a standalone command-gen strategy that builds wrapper invocations, report parsing gating, registry wiring, and tests for command generation.

Changes

DynamoMocker Workload

Layer / File(s) Summary
Configuration and workload models
conf/experimental/dynamo_mocker/system/standalone_system.toml, conf/experimental/dynamo_mocker/test/dynamo_mocker.toml, conf/experimental/dynamo_mocker/test_scenario/dynamo_mocker.toml, src/cloudai/workloads/dynamo_mocker/dynamo_mocker.py, src/cloudai/workloads/dynamo_mocker/__init__.py
TOML configurations define a standalone system, a base test with engine/worker/frontend and aiperf parameters, and a test scenario. Pydantic models specify MockerEngineArgs, MockerWorkerConfig (supporting combined or prefill/decode disaggregation), MockerFrontendArgs, and two benchmark tool arg models (MockerGenAIPerfArgs, MockerAIPerfArgs). DynamoMockerCmdArgs bundles all settings, and DynamoMockerTestDefinition specifies Python 3.12 with ai-dynamo/genai-perf/aiperf dependencies and evaluates success via marker files and benchmark_report.csv presence.
Shell orchestration and service startup
src/cloudai/workloads/dynamo_mocker/dynamo_mocker.sh
Main orchestration script that parses CLI arguments (disaggregation mode, worker topology, ports, benchmark tool selection), sets thread limits and logging controls, launches nats-server with port availability checks, starts dynamo.frontend with optional KV router mode tuning, launches either a single or disaggregated prefill/decode dynamo.mocker instances with topology and KV bandwidth settings, waits for service readiness (healthz, models endpoint, worker log patterns), detects fatal errors in logs, runs the selected benchmark tool, verifies output, writes success/failure markers, and performs cleanup.
Benchmark tool wrapper scripts
src/cloudai/workloads/dynamo_mocker/genai_perf.sh, src/cloudai/workloads/dynamo_mocker/aiperf.sh
genai_perf.sh and aiperf.sh each wrap their respective benchmark CLI tools. Both parse required context flags (result-dir, model, port), optional flags (report-name, cmd override), and forward remaining arguments as profile parameters. They construct the benchmark invocation, run it against the localhost mocker endpoint with artifact/output directories, locate the generated CSV report, copy it to the result directory, and error if the CSV is missing.
Command generation strategy
src/cloudai/workloads/dynamo_mocker/standalone_command_gen_strategy.py
DynamoMockerStandaloneCommandGenStrategy converts DynamoMockerCmdArgs into dynamo_mocker.sh CLI arguments by flattening nested model fields into prefixed flags (--engine-, --mocker-, --frontend-, --genai-perf- or --aiperf-*), handling disaggregation mode differences (combined mode emits --num-workers; disaggregated mode emits prefill/decode node counts and KV settings), forwarding nested prefill/decode arguments with distinct prefixes, lowercasing boolean values, and creating a wrapper bash script that shell-quotes all arguments and redirects stdout/stderr to files. Persists TestRunDetails as TOML.
Report generation strategy
src/cloudai/workloads/dynamo_mocker/report_generation_strategy.py
DynamoMockerReportGenerationStrategy gates handling on test type (DynamoMockerTestDefinition) and presence of benchmark_report.csv, inheriting CSV parsing from AIDynamoReportGenerationStrategy.
Registry and framework wiring
src/cloudai/registration.py
register_all() imports DynamoMocker strategies and test definition, registers DynamoMockerStandaloneCommandGenStrategy for StandaloneSystem, adds DynamoMockerTestDefinition to the test definition registry, and adds DynamoMockerReportGenerationStrategy to the report generation registry.
Command generation test suite
tests/workloads/dynamo_mocker/__init__.py, tests/workloads/dynamo_mocker/test_command_gen_strategy_standalone.py, tests/test_init.py, tests/test_test_scenario.py
Comprehensive test module validates DynamoMockerStandaloneCommandGenStrategy across worker modes (combined vs. disaggregated presence of flags), benchmark tool selection and flag routing, forwarding of tool-specific numeric fields and optional extra/cmd fields with correct boolean lowercase conversion, wrapper script generation (bash invocation, script creation, stdout/stderr redirection), extra args passthrough for engine/worker/frontend with correct prefixes (--mocker-* for worker, not --worker-*), nested prefill/decode configuration forwarding only when disaggregation is enabled, and updates to registration/registry-related test expectations.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Suggested reviewers

  • srivatsankrishnan
  • podkidyshev
  • jeffnvidia

Poem

🐇 I hopped through configs, flags, and shell,
Prefill and decode in a neat little dell,
Benchmarks sing CSVs bright,
Wrapper scripts chase logs through the night,
A tiny rabbit cheers: tests pass well.

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change: adding a GPU-free LLM inference simulation workload (DynamoMocker) to the codebase.
Description check ✅ Passed The description is related to the changeset and provides context about the DynamoMocker workload, key components, and testing coverage.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/cloudai/workloads/dynamo_mocker/aiperf.sh`:
- Around line 115-129: Before launching aiperf, ensure any stale artifacts are
removed so process_results cannot pick up old *aiperf*.csv files: locate the
artifact_dir variable and the aiperf invocation (the block using "${run_cmd[@]}"
with --artifact-dir "$artifact_dir") and add a cleanup step that creates/clears
that directory (e.g., mkdir -p and remove existing aiperf artifacts or delete
and recreate artifact_dir) so it’s empty prior to running; this prevents
process_results from reading stale CSVs produced by prior runs.

In `@src/cloudai/workloads/dynamo_mocker/dynamo_mocker.sh`:
- Around line 280-285: In resolve_python(), currently when venv_python is set
but not executable the script silently falls back to system python; change it to
fail fast: if venv_python is non-empty and not -x, print a clear error to stderr
(including the value of venv_python) and exit with a non-zero status; otherwise
keep the existing branch that sets PYTHON and prepends dirname(venv_python) to
PATH. Reference symbols: resolve_python, venv_python, PYTHON, PATH.
- Around line 774-778: The current conditional uses [[ "$benchmark_tool" ==
"aiperf" ]] and falls through to launch_genai_perf for any other value; change
it to explicitly handle valid options and reject unknown values: check
benchmark_tool for "aiperf" to call launch_aiperf, check for the expected genai
option (e.g., "genai" or the exact value used elsewhere) to call
launch_genai_perf, and in an else branch print a clear error (including the
invalid value) to stderr and exit with a non-zero status; refer to the
benchmark_tool variable and the launch_aiperf / launch_genai_perf functions to
locate and update the logic.
- Around line 183-273: The parse_args function reads $2 directly in many case
branches (e.g., --venv-python, --engine-*, --mocker-*, --prefill-extra-args,
--prefill-args-*, --aiperf-*, --genai-perf-*) which breaks under set -u when an
option value is missing; add a small validator (e.g., require_arg or
check_next_arg) invoked before assigning each "$2" that ensures a next argument
exists and is not another flag (starts with "-" ), prints a clear error/usage
message, and returns non-zero; update all branches that currently do direct
assignments to call this validator first and only then assign and shift 2 so
missing values are handled gracefully.

In `@src/cloudai/workloads/dynamo_mocker/genai_perf.sh`:
- Around line 113-123: Before launching genai-perf, remove or empty any prior
artifacts in the artifact directory to avoid copying stale CSVs: ensure the
created local variable artifact_dir (used when calling "${run_cmd[@]}"
--artifact-dir "$artifact_dir") is cleaned (rm -rf "$artifact_dir"/* or recreate
the directory) and recreated as an empty directory prior to invoking the
command, and keep process_results unchanged so it consumes only the fresh
artifacts.
- Around line 54-66: The process_args function currently takes flag values (for
--result-dir, --model, --port, --report-name, --cmd) blindly from $2 which lets
malformed invocations shift parsing; update process_args to validate each flag's
argument before shifting: for each case (--result-dir, --model, --report-name,
--cmd) check that "$2" exists and does not start with '-' (e.g. [[ -n "$2" &&
${2:0:1} != "-" ]]) and on failure print a clear error to stderr and exit
non‑zero; for --port additionally validate numeric (e.g. a regex or [[ "$2" =~
^[0-9]+$ ]]) before assigning and shifting; preserve the existing --) handling
and ensure you only call shift 2 when the validation passes.

In `@src/cloudai/workloads/dynamo_mocker/standalone_command_gen_strategy.py`:
- Around line 171-175: The wrapper command currently interpolates unquoted paths
and filenames, which can break when self._script_path, stdout_file, or
stderr_file contain spaces or shell metacharacters; update the construction of
wrapper_lines (and keep quoted_args as is) so that self._script_path,
stdout_file, and stderr_file are each safely shell-quoted (e.g., via
shlex.quote) before interpolation into the f-string that builds the bash
invocation line in wrapper_lines.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 07cd659f-a1a4-41e4-bcde-6e4ce91ff3d1

📥 Commits

Reviewing files that changed from the base of the PR and between 4bdc465 and 9654f3a.

📒 Files selected for processing (13)
  • conf/experimental/dynamo_mocker/system/standalone_system.toml
  • conf/experimental/dynamo_mocker/test/dynamo_mocker.toml
  • conf/experimental/dynamo_mocker/test_scenario/dynamo_mocker.toml
  • src/cloudai/registration.py
  • src/cloudai/workloads/dynamo_mocker/__init__.py
  • src/cloudai/workloads/dynamo_mocker/aiperf.sh
  • src/cloudai/workloads/dynamo_mocker/dynamo_mocker.py
  • src/cloudai/workloads/dynamo_mocker/dynamo_mocker.sh
  • src/cloudai/workloads/dynamo_mocker/genai_perf.sh
  • src/cloudai/workloads/dynamo_mocker/report_generation_strategy.py
  • src/cloudai/workloads/dynamo_mocker/standalone_command_gen_strategy.py
  • tests/workloads/dynamo_mocker/__init__.py
  • tests/workloads/dynamo_mocker/test_command_gen_strategy_standalone.py

Comment thread src/cloudai/workloads/dynamo_mocker/aiperf.sh
Comment thread src/cloudai/workloads/dynamo_mocker/dynamo_mocker.sh
Comment thread src/cloudai/workloads/dynamo_mocker/dynamo_mocker.sh
Comment thread src/cloudai/workloads/dynamo_mocker/dynamo_mocker.sh
Comment thread src/cloudai/workloads/dynamo_mocker/genai_perf.sh
Comment thread src/cloudai/workloads/dynamo_mocker/genai_perf.sh
@saivishal1999 saivishal1999 force-pushed the spothula/dynamo-mocker branch from 9654f3a to 19a2f75 Compare May 15, 2026 22:48
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/cloudai/workloads/dynamo_mocker/aiperf.sh`:
- Line 64: The --port case currently assigns any non-empty token into port after
calling _require_value; add a numeric validation step in the --port) branch (the
case handling the --port option) to check $2 matches a whole-number regex (e.g.
^[0-9]+$) before assigning port="$2", and on failure print an error to stderr
and exit with non-zero status (or call the script's existing error helper) so
only valid numeric ports are accepted.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: bf45ad1e-9704-4a2e-b0ce-e337b811aa0f

📥 Commits

Reviewing files that changed from the base of the PR and between 9654f3a and 19a2f75.

📒 Files selected for processing (13)
  • conf/experimental/dynamo_mocker/system/standalone_system.toml
  • conf/experimental/dynamo_mocker/test/dynamo_mocker.toml
  • conf/experimental/dynamo_mocker/test_scenario/dynamo_mocker.toml
  • src/cloudai/registration.py
  • src/cloudai/workloads/dynamo_mocker/__init__.py
  • src/cloudai/workloads/dynamo_mocker/aiperf.sh
  • src/cloudai/workloads/dynamo_mocker/dynamo_mocker.py
  • src/cloudai/workloads/dynamo_mocker/dynamo_mocker.sh
  • src/cloudai/workloads/dynamo_mocker/genai_perf.sh
  • src/cloudai/workloads/dynamo_mocker/report_generation_strategy.py
  • src/cloudai/workloads/dynamo_mocker/standalone_command_gen_strategy.py
  • tests/workloads/dynamo_mocker/__init__.py
  • tests/workloads/dynamo_mocker/test_command_gen_strategy_standalone.py

Comment thread src/cloudai/workloads/dynamo_mocker/aiperf.sh Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

♻️ Duplicate comments (3)
src/cloudai/workloads/dynamo_mocker/genai_perf.sh (1)

64-66: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Validate --port as numeric before building the URL.

Right now any non-empty token becomes localhost:${port}, so a typo is deferred into a less actionable genai-perf failure instead of being rejected here.

🔧 Suggested fix
-      --port)        _require_value "$1" "${2-}"; port="$2";        shift 2 ;;
+      --port)
+        _require_value "$1" "${2-}"
+        [[ "$2" =~ ^[0-9]+$ ]] || { log "ERROR: --port must be numeric (got: '$2')" >&2; exit 1; }
+        port="$2"
+        shift 2
+        ;;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/workloads/dynamo_mocker/genai_perf.sh` around lines 64 - 66, The
--port parsing currently accepts any non-empty token into variable port, so add
a numeric validation step right after the --port assignment (the parsing branch
that calls _require_value and sets port="$2") to reject non-numeric or
out-of-range values before the script uses "localhost:${port}"; implement a
check (e.g., regex to ensure all digits and optionally enforce 1-65535) and if
invalid print a clear error and exit non-zero, referencing the --port option and
the port variable so the parsing block will fail fast on typos.
src/cloudai/workloads/dynamo_mocker/aiperf.sh (1)

62-65: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Reject non-numeric --port values early.

localhost:${port} is assembled unconditionally, so bad input only fails later inside aiperf with a less direct error.

🔧 Suggested fix
-      --port)        _require_value "$1" "${2-}"; port="$2";        shift 2 ;;
+      --port)
+        _require_value "$1" "${2-}"
+        [[ "$2" =~ ^[0-9]+$ ]] || { log "ERROR: --port must be numeric (got: '$2')" >&2; exit 1; }
+        port="$2"
+        shift 2
+        ;;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/workloads/dynamo_mocker/aiperf.sh` around lines 62 - 65, The
--port parsing branch currently assigns port="$2" without validation, so
non-numeric values later break when assembling localhost:${port}; after the
existing _require_value call in the --port case (the branch that sets port="$2"
and shift 2), validate that port contains only digits (e.g., use a regex like
^[0-9]+$ or an integer check) and if it fails, print a clear error and exit
non‑zero; update the error path near the --port handling (not the aiperf
invocation) so invalid ports are rejected early.
src/cloudai/workloads/dynamo_mocker/dynamo_mocker.sh (1)

219-273: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard the remaining passthrough branches before reading $2.

The fixed-value branches are protected now, but these wildcard/raw passthrough cases still consume $2 blindly. A malformed invocation like --engine-foo --frontend-bar baz will still shift parsing or die under set -u before you emit a controlled error.

🔧 Suggested fix pattern
       --engine-*)
+        _require_value "$1" "${2-}"
         engine_extra_args["--${1#--engine-}"]="$2"
         shift 2 ;;
       --mocker-*)
+        _require_value "$1" "${2-}"
         mocker_extra_args["--${1#--mocker-}"]="$2"
         shift 2 ;;

Apply the same guard to the other "$2" consumers in this block as well.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/workloads/dynamo_mocker/dynamo_mocker.sh` around lines 219 - 273,
The wildcard/raw passthrough branches (e.g., the lines that set
engine_extra_args["--${1#--engine-}"]="$2"],
mocker_extra_args["--${1#--mocker-}"]="$2"],
frontend_extra_args["--${1#--frontend-}"]="$2"], prefill_extra_args_raw="$2",
prefill_args_extra["--${1#--prefill-args-}"]="$2"], decode_extra_args_raw="$2",
decode_args_extra["--${1#--decode-args-}"]="$2"], aiperf_cmd="$2",
aiperf_extra_args_raw="$2", aiperf_extra_args["--${1#--aiperf-}"]="$2"],
genai_perf_cmd="$2", genai_perf_extra_args_raw="$2",
genai_perf_extra_args["--${1#--genai-perf-}"]="$2"]) read $2 unguarded and will
break under malformed input; add the same guard used in the fixed-value branches
before consuming $2 (check that $# -ge 2 and/or that $2 is not another flag) and
emit a clear error and exit when the next argument is missing or looks like a
flag, then only assign and shift 2 when the guard passes.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/cloudai/workloads/dynamo_mocker/aiperf.sh`:
- Around line 106-113: The override branch that builds run_cmd from the provided
cmd variable doesn't include the required "profile" subcommand, causing custom
--cmd to run without "profile"; update the code that processes cmd (variable cmd
and run_cmd) to ensure the "profile" subcommand is present—parse read -ra
run_cmd <<< "$cmd", then if the first positional element is not "profile" (or if
"profile" is not present in run_cmd), append "profile" to run_cmd so both the
default path (run_cmd=("$AIPERF" profile)) and the custom --cmd path always
execute the "profile" subcommand when invoking AIPERF.

In `@src/cloudai/workloads/dynamo_mocker/dynamo_mocker.sh`:
- Around line 397-405: The write_node_roles() block currently hardcodes
"genai_perf" into the roles file; change it to record the chosen benchmark tool
instead by replacing the fixed echo "$(hostname),genai_perf" with an echo that
uses the benchmark selection variable (e.g., echo
"$(hostname),${benchmark_tool}" or the script's variable that holds the
--benchmark-tool value). Keep the rest of the function (roles_file,
disaggregation_mode branches, and other roles) unchanged so the roles file
reflects the actual selected benchmark (e.g., "aiperf") rather than always
"genai_perf".

In `@src/cloudai/workloads/dynamo_mocker/genai_perf.sh`:
- Around line 111-118: When a custom --cmd is provided the script currently sets
run_cmd from cmd but omits the required "profile" subcommand; update the cmd
branch in genai_perf.sh so after read -ra run_cmd <<< "$cmd" you check whether
run_cmd already contains "profile" and if not insert "profile" immediately after
the executable (i.e. at index 1) so run_cmd becomes ("<cmd-exe>" "profile" ...).
Use the existing variables run_cmd, cmd and GENAI_PERF to locate the logic to
modify.

In `@src/cloudai/workloads/dynamo_mocker/standalone_command_gen_strategy.py`:
- Around line 1-2: Update the copyright header year from 2025 to 2026 in this
file; specifically change the two header comment lines containing "2025 NVIDIA
CORPORATION & AFFILIATES" to "2026 NVIDIA CORPORATION & AFFILIATES" so the
top-of-file SPDX and copyright declarations match CI expectations.

In `@tests/test_test_scenario.py`:
- Line 627: The test currently only asserts len(Registry().reports_map) == 21
which can miss wrong or shifted registrations; update the test to also assert
that the specific DynamoMocker reporter key exists in Registry().reports_map and
that its value is the expected reporter (e.g., instance/class name or type) —
locate the Registry() usage and reports_map access in the test and add a
key-level assertion like checking "DynamoMocker" (or the exact key used during
registration) in Registry().reports_map and assert the mapped object is of the
expected class/type.

---

Duplicate comments:
In `@src/cloudai/workloads/dynamo_mocker/aiperf.sh`:
- Around line 62-65: The --port parsing branch currently assigns port="$2"
without validation, so non-numeric values later break when assembling
localhost:${port}; after the existing _require_value call in the --port case
(the branch that sets port="$2" and shift 2), validate that port contains only
digits (e.g., use a regex like ^[0-9]+$ or an integer check) and if it fails,
print a clear error and exit non‑zero; update the error path near the --port
handling (not the aiperf invocation) so invalid ports are rejected early.

In `@src/cloudai/workloads/dynamo_mocker/dynamo_mocker.sh`:
- Around line 219-273: The wildcard/raw passthrough branches (e.g., the lines
that set engine_extra_args["--${1#--engine-}"]="$2"],
mocker_extra_args["--${1#--mocker-}"]="$2"],
frontend_extra_args["--${1#--frontend-}"]="$2"], prefill_extra_args_raw="$2",
prefill_args_extra["--${1#--prefill-args-}"]="$2"], decode_extra_args_raw="$2",
decode_args_extra["--${1#--decode-args-}"]="$2"], aiperf_cmd="$2",
aiperf_extra_args_raw="$2", aiperf_extra_args["--${1#--aiperf-}"]="$2"],
genai_perf_cmd="$2", genai_perf_extra_args_raw="$2",
genai_perf_extra_args["--${1#--genai-perf-}"]="$2"]) read $2 unguarded and will
break under malformed input; add the same guard used in the fixed-value branches
before consuming $2 (check that $# -ge 2 and/or that $2 is not another flag) and
emit a clear error and exit when the next argument is missing or looks like a
flag, then only assign and shift 2 when the guard passes.

In `@src/cloudai/workloads/dynamo_mocker/genai_perf.sh`:
- Around line 64-66: The --port parsing currently accepts any non-empty token
into variable port, so add a numeric validation step right after the --port
assignment (the parsing branch that calls _require_value and sets port="$2") to
reject non-numeric or out-of-range values before the script uses
"localhost:${port}"; implement a check (e.g., regex to ensure all digits and
optionally enforce 1-65535) and if invalid print a clear error and exit
non-zero, referencing the --port option and the port variable so the parsing
block will fail fast on typos.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 6cafdb9c-7b84-4efb-8b55-362180f9fd39

📥 Commits

Reviewing files that changed from the base of the PR and between 19a2f75 and 9bd1ace.

📒 Files selected for processing (6)
  • src/cloudai/workloads/dynamo_mocker/aiperf.sh
  • src/cloudai/workloads/dynamo_mocker/dynamo_mocker.sh
  • src/cloudai/workloads/dynamo_mocker/genai_perf.sh
  • src/cloudai/workloads/dynamo_mocker/standalone_command_gen_strategy.py
  • tests/test_init.py
  • tests/test_test_scenario.py

Comment thread src/cloudai/workloads/dynamo_mocker/aiperf.sh
Comment thread src/cloudai/workloads/dynamo_mocker/dynamo_mocker.sh
Comment thread src/cloudai/workloads/dynamo_mocker/genai_perf.sh
Comment thread src/cloudai/workloads/dynamo_mocker/standalone_command_gen_strategy.py Outdated
Comment thread tests/test_test_scenario.py
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/workloads/dynamo_mocker/test_command_gen_strategy_standalone.py`:
- Around line 418-501: The tests in test_command_gen_strategy_standalone.py rely
on forwarded --prefill-*/--decode-* flags but do not explicitly set
disaggregation_mode, making behavior ambiguous; update each affected
DynamoMockerCmdArgs construction in the listed tests (e.g.,
test_prefill_args_extra_forwarded, test_decode_args_extra_forwarded,
test_prefill_and_decode_different_values, test_prefill_args_not_in_decode_flags,
test_decode_args_not_in_prefill_flags, test_prefill_args_bool_lowercased and the
ones around lines 533-554) to include disaggregation_mode="prefill_decode" on
the top-level DynamoMockerCmdArgs so the tests explicitly exercise the
disaggregated prefill/decode forwarding path when building args via
strategy._build_script_args.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: daab1138-36f1-4549-99b9-15b8f1f241ec

📥 Commits

Reviewing files that changed from the base of the PR and between 9bd1ace and ad4eb27.

📒 Files selected for processing (9)
  • conf/experimental/dynamo_mocker/system/standalone_system.toml
  • conf/experimental/dynamo_mocker/test/dynamo_mocker.toml
  • conf/experimental/dynamo_mocker/test_scenario/dynamo_mocker.toml
  • src/cloudai/workloads/dynamo_mocker/__init__.py
  • src/cloudai/workloads/dynamo_mocker/dynamo_mocker.py
  • src/cloudai/workloads/dynamo_mocker/report_generation_strategy.py
  • src/cloudai/workloads/dynamo_mocker/standalone_command_gen_strategy.py
  • tests/workloads/dynamo_mocker/__init__.py
  • tests/workloads/dynamo_mocker/test_command_gen_strategy_standalone.py

@podkidyshev
Copy link
Copy Markdown
Contributor

@saivishal1999 please resolve coderabbit comments and ping me directly again

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (1)
src/cloudai/workloads/dynamo_mocker/dynamo_mocker.sh (1)

2-2: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Update the copyright header year to 2026.

This file is new but still carries 2024 in the copyright line.

🔧 Proposed fix
-# Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES All rights reserved.
+# Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/workloads/dynamo_mocker/dynamo_mocker.sh` at line 2, Update the
copyright header year from 2024 to 2026 by editing the top-of-file copyright
line (the single comment that currently reads "Copyright (c) 2024 NVIDIA
CORPORATION & AFFILIATES All rights reserved.") so it reads 2026 instead of
2024; ensure the exact punctuation/capitalization is preserved except for the
year change.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/cloudai/workloads/dynamo_mocker/aiperf.sh`:
- Line 2: Update the copyright header in the aiperf.sh file from "2024" to
"2026" so it matches repo header expectations; locate the top-of-file header
comment in src/cloudai/workloads/dynamo_mocker/aiperf.sh and replace the year
token "2024" with "2026" in that copyright line.

In `@src/cloudai/workloads/dynamo_mocker/genai_perf.sh`:
- Line 2: Update the copyright header string "Copyright (c) 2024 NVIDIA
CORPORATION & AFFILIATES All rights reserved." in genai_perf.sh to use the year
2026 instead of 2024 so the file header matches current source header checks.

In `@src/cloudai/workloads/dynamo_mocker/standalone_command_gen_strategy.py`:
- Around line 199-200: test_cmd is built with naive double-quote wrapping which
breaks when args contain quotes/backslashes; update the assembly in
StandaloneCommandGenStrategy to use shell-safe quoting (e.g., shlex.quote) for
self._script_path and each item in self._script_args and similarly quote
self._wrapper_path when building full_cmd so the persisted test_cmd/ full_cmd is
a valid, safe shell command even when args contain special characters.

---

Duplicate comments:
In `@src/cloudai/workloads/dynamo_mocker/dynamo_mocker.sh`:
- Line 2: Update the copyright header year from 2024 to 2026 by editing the
top-of-file copyright line (the single comment that currently reads "Copyright
(c) 2024 NVIDIA CORPORATION & AFFILIATES All rights reserved.") so it reads 2026
instead of 2024; ensure the exact punctuation/capitalization is preserved except
for the year change.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: e8fedb74-ba19-434f-84a0-c77564d66165

📥 Commits

Reviewing files that changed from the base of the PR and between ad4eb27 and 8b88767.

📒 Files selected for processing (6)
  • conf/experimental/dynamo_mocker/test/dynamo_mocker.toml
  • src/cloudai/workloads/dynamo_mocker/aiperf.sh
  • src/cloudai/workloads/dynamo_mocker/dynamo_mocker.py
  • src/cloudai/workloads/dynamo_mocker/dynamo_mocker.sh
  • src/cloudai/workloads/dynamo_mocker/genai_perf.sh
  • src/cloudai/workloads/dynamo_mocker/standalone_command_gen_strategy.py

Comment thread src/cloudai/workloads/dynamo_mocker/aiperf.sh Outdated
Comment thread src/cloudai/workloads/dynamo_mocker/genai_perf.sh Outdated
Comment thread src/cloudai/workloads/dynamo_mocker/standalone_command_gen_strategy.py Outdated
Adds the DynamoMocker workload to cloudai — a lightweight GPU-free LLM
inference simulator using dynamo.mocker + dynamo.frontend, benchmarked
with genai-perf or aiperf.

- Pydantic models for engine, worker, frontend, and benchmark config
- StandaloneCommandGenStrategy that invokes dynamo_mocker.sh via an
  isolated uv-managed venv (PythonEnvironment installable)
- ReportGenerationStrategy reusing AIDynamoReportGenerationStrategy
  for benchmark_report.csv parsing
- Shell scripts: dynamo_mocker.sh, genai_perf.sh, aiperf.sh
- Supports combined and disaggregated (prefill/decode) topologies
- Supports round_robin and kv_router frontend modes
- Staging configs under conf/experimental/dynamo_mocker/
- 55 unit tests covering all CLI arg paths and flag passthrough
- tests/test_init.py: add DynamoMocker to CMD_GEN_STRATEGIES and
  test_definitions list; update expected counts (25→26, +1 strategy)
- tests/test_test_scenario.py: update reports_map expected count (20→21)
- dynamo_mocker.sh: add _require_value() validation for all value-bearing
  flags; fail-fast on invalid --venv-python; explicit error for unknown
  benchmark_tool
- genai_perf.sh: add _require_value() validation; remove stale artifact
  dir before each run
- aiperf.sh: remove stale artifact dir before each run
- standalone_command_gen_strategy.py: apply shlex.quote() to script_path,
  stdout_file, and stderr_file in wrapper script generation
…L files

All dynamo_mocker Python files had 2025 in their copyright headers, but
they were first committed to cloudai in 2026 — the test derives the
expected year from git log in this repo, so 2026 is correct.

The three TOML conf files were missing the Apache-2.0 copyright header
entirely; add it with year 2026.
…ommand

nats-server is not bundled with ai-dynamo so users must supply it.
Mirrors ai_dynamo's nats_cmd pattern: add nats_cmd field to
DynamoMockerCmdArgs (default "nats-server -js") so users can supply a
full path when nats-server is not on PATH. Wire --nats-cmd through
standalone_command_gen_strategy.py and dynamo_mocker.sh.

Also append 'profile' subcommand after parsing a custom --cmd override
in aiperf.sh and genai_perf.sh, making --cmd a binary-path override
consistent with the default path. Update field descriptions accordingly.
…test_run quoting

Update copyright year from 2024 to 2026 in dynamo_mocker.sh,
aiperf.sh, and genai_perf.sh.

Replace naive double-quote wrapping in store_test_run() with
shlex.quote() for both script_path and all args, matching the
shell-safe quoting used in gen_exec_command().
…ine wrap

Add regex validation (^[0-9]+$) for --port in aiperf.sh and
genai_perf.sh, and for --http-port in dynamo_mocker.sh, so invalid
values fail early with a clear message rather than at tool invocation.

Reformat store_test_run() shlex.quote expression to satisfy ruff E501.
…orter test; explicit disaggregation_mode in tests

dynamo_mocker.sh: replace hardcoded 'genai_perf' in write_node_roles()
with \${benchmark_tool} so the roles file reflects the actual selected
benchmark tool (aiperf or genai_perf).

test_test_scenario.py: add (DynamoMockerTestDefinition,
{DynamoMockerReportGenerationStrategy}) to test_custom_reporters
parametrize list, providing key-level registration verification.

test_command_gen_strategy_standalone.py: add
disaggregation_mode="prefill_decode" to the 8 tests in
TestPrefillDecodeExtraArgs that exercise --prefill-args-* /
--decode-args-* / --prefill-extra-args / --decode-extra-args
forwarding, making the intended disaggregated-mode contract explicit.
Move dynamo_mocker import to correct alphabetical position (after deepep,
before jax_toolbox) to satisfy ruff I001 import-order lint rule.
@saivishal1999 saivishal1999 force-pushed the spothula/dynamo-mocker branch from 66c68d5 to fe16067 Compare May 18, 2026 21:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants