feat: update definitions generation by pdp2121 · Pull Request #1002 · XRPLF/xrpl-py

pdp2121 · 2026-05-26T16:34:57Z

High Level Overview of Change

Replace the regex-based definitions.json generation with artifact-based approach that downloads server_definitions.json directly from rippled CI.

Added definitions validation step to CI (unit-test) to ensure definitions.json is being checked for correctness before any PR is merged.

Context of Change

rippled PR#6858 added server_definitions as artifact for every merged PRs, making it usable for downstream consumers

Type of Change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Refactor (non-breaking change that only restructures code)
Tests (You added tests for code that already exists, or your new feature included in this PR)
Documentation Updates
Release

Did you update CHANGELOG.md?

Yes
No, this change does not impact library users

Test Plan

CI passes.

coderabbitai · 2026-05-26T16:35:12Z

Warning

Review limit reached

@pdp2121, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 10 minutes and 1 second. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: e232d7b0-bd18-4106-85b7-9801b61c0e67

📥 Commits

Reviewing files that changed from the base of the PR and between c50b6b1 and 14440a1.

📒 Files selected for processing (1)

CONTRIBUTING.md

Walkthrough

This PR modernizes the rippled definitions workflow by replacing header-parsing logic with GitHub Actions artifact retrieval, adding automated validation against a benchmark, and integrating validation into CI. Generate_definitions.py is completely rewritten to query artifacts; validate_definitions.py is introduced as a new validation layer; and Poe task definitions are restructured to coordinate the new commands.

Changes

Definitions Workflow Modernization

Layer / File(s)	Summary
Poe Task Configuration `pyproject.toml`	Task definitions restructured: `definitions`, `validate_definitions`, `generate_tx_models`, and `test` are now explicit Poe subtables with `cmd`, `help`, and argument wiring including `${BRANCH_OR_PR}` for definitions and `${FILE_OR_GITHUB_PATH}` for tx models.
Definitions Generation from CI Artifacts `tools/generate_definitions.py`	Script rewritten to download server-definitions artifacts from GitHub Actions instead of parsing C++ headers. Resolves PR/branch specifiers, queries artifacts by PR head SHA then branch, downloads to temp directory, extracts server_definitions.json, and writes to output path with explicit error handling for missing `gh` CLI, absent artifacts, and extraction failures.
Definitions Validation Against Benchmark `tools/validate_definitions.py`	New script downloads benchmark server_definitions.json from rippled develop branch via GitHub CLI, classifies JSON structure patterns (fields, simple_map, nested_map, format_array), validates required keys and per-key types, and compares local definitions entry-by-entry against the benchmark; prints detailed mismatch details and exits non-zero on failure.
CI Integration and Documentation `.github/workflows/unit_test.yml`, `CONTRIBUTING.md`, `.gitignore`	Added validation step to lint-and-type-check job that runs `poetry run poe validate_definitions` with `GH_TOKEN`; updated CONTRIBUTING.md to document definitions workflow now sourced from rippled CI artifacts with `pr:<id>` and branch specifier forms; added `.claude/worktrees/` to .gitignore.

Sequence Diagram(s)

sequenceDiagram
  participant CLI as Definitions Script
  participant Args as Argument Parser
  participant GH as GitHub CLI
  participant Artifact as Artifact Download
  participant JSON as JSON Extraction
  participant Disk as Output File
  
  CLI->>Args: Parse source (pr/branch/fork)
  Args->>GH: Query PR/artifact info
  GH-->>Args: PR head SHA / branch name
  Args->>GH: Find artifact by SHA or branch
  GH-->>Args: Artifact run_id (non-expired)
  Args->>Artifact: Download artifact to temp dir
  Artifact-->>Args: Extraction complete
  Args->>JSON: Load server_definitions.json
  JSON-->>Args: Parsed definitions
  Args->>Disk: Write definitions.json (pretty-print)
  Disk-->>Args: Output path confirmed

sequenceDiagram
  participant Main as Validation Script
  participant GH as GitHub CLI
  participant Local as Local definitions.json
  participant Bench as Benchmark Artifact
  participant Classify as Structure Classifier
  participant Validate as Structural Validator
  participant Compare as Entry Comparator
  
  Main->>GH: Verify gh CLI available
  Main->>Local: Load local definitions.json
  Main->>GH: Download benchmark from develop
  GH-->>Bench: Artifact fetched and parsed
  Main->>Classify: Classify benchmark structure
  Classify-->>Main: Category map (fields/simple_map/etc)
  Main->>Validate: Enforce REQUIRED_KEYS and types
  Validate-->>Main: Structure valid or errors
  Main->>Compare: Compare by classification category
  Compare-->>Main: Mismatches detected (if any)
  Main->>Main: Exit with status (0=pass, 1=fail)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

XRPLF/xrpl-py#772: Directly related refactor of the same definitions generation script and Poe task wiring.

Suggested reviewers

mvadari
ckeshava
achowdhry-ripple
khancode

🐰 Hop, skip, and a leap so grand,
Definitions flow from artifact land,
No more headers to parse with care,
GitHub's artifacts floating in air! ✨
Validation checks keep all things right,
The workflow now shines ever so bright! 🌟

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 56.25% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'feat: update definitions generation' clearly describes the main change: replacing regex-based definitions generation with artifact-based approach, which is the primary focus of the PR.
Description check	✅ Passed	The PR description covers the high-level overview, context, type of change, and CHANGELOG consideration. However, it lacks substantive detail in the Test Plan section (only states 'CI passes').
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch update-definitions-generation

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pyproject.toml`:
- Around line 94-97: The poe task definitions entry is passing BRANCH_OR_PR as a
variadic positional (multiple = true) causing argparse in
tools/generate_definitions.py (which expects a single optional positional arg,
nargs="?") to fail; change the task config for tool.poe.tasks.definitions so
that the BRANCH_OR_PR arg is not multiple (remove or set multiple = false) and
keep its default as "" so only zero or one positional value is forwarded to the
script, matching the script’s nargs="?" expectation.

In `@tools/generate_definitions.py`:
- Around line 27-31: The _exec function currently calls subprocess.run with
shell=True which allows command injection; change the signature of _exec(cmd:
list[str]) -> str (or similar) to accept an argv list and call
subprocess.run(cmd, shell=False, capture_output=True, text=True, check=True),
and update all callers that build gh api / gh run download commands (the call
sites that construct command strings for gh and the --jq filter) to pass a list
of arguments (separate "--jq" and its filter as its own list element) instead of
a single shell-quoted string so no shell interpolation occurs.

In `@tools/validate_definitions.py`:
- Around line 240-246: The loop over fields in the format_array validation
assumes each item is a dict and calls field.get(...), causing AttributeError for
non-dict entries; update the loop in the format_array validation (the for j,
field in enumerate(fields) block) to first check isinstance(field, dict) and if
not, append a validation error like '{key} "{type_name}"[{j}]: field must be an
object/dict' and continue, otherwise proceed to check field.get("name") and
field.get("optionality") as before.
- Around line 35-38: Replace the unsafe shell invocation in _exec: stop passing
a single command string with shell=True to subprocess.run and instead
accept/construct an argv-style list (or split the command safely) and call
subprocess.run(..., shell=False, args=[...], capture_output=True, text=True,
check=True) so commands are executed without a shell to prevent injection; and
in _validate_structure's format_array branch ensure you validate each item is a
dict before using field.get by adding an explicit type check (e.g., if not
isinstance(field, dict): record a structured validation error for that item) and
only then call field.get(...) so malformed inputs produce validation errors
rather than AttributeError.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 7de9c1c2-acf3-4074-ad15-c0625317fb46

📥 Commits

Reviewing files that changed from the base of the PR and between 69573b2 and c50b6b1.

📒 Files selected for processing (7)

.github/workflows/unit_test.yml
.gitignore
CONTRIBUTING.md
pyproject.toml
tools/generate_definitions.py
tools/validate_definitions.py
xrpl/core/binarycodec/definitions/definitions.json

coderabbitai · 2026-05-26T16:41:47Z

+[tool.poe.tasks.definitions]
+help = "Generate definitions.json from rippled CI artifacts"
+cmd = "python3 tools/generate_definitions.py ${BRANCH_OR_PR}"
+args = [{ name = "BRANCH_OR_PR", positional = true, default = "", multiple = true }]


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash set -euo pipefail # Locate and show the relevant Poe task config in pyproject.toml echo "=== pyproject.toml: tool.poe.tasks.definitions ===" rg -n --hidden --no-ignore -S "tool\.poe\.tasks\.definitions|\\[tool\\.poe\\.tasks\\.definitions\\]" pyproject.toml || true sed -n '80,120p' pyproject.toml # Locate the script and inspect its CLI/argparse handling echo echo "=== tools/generate_definitions.py argparse ===" ls -la tools/generate_definitions.py 2>/dev/null || true rg -n --hidden --no-ignore -S "argparse|add_argument|nargs=|sys\.argv" tools/generate_definitions.py || true python3 - <<'PY' import itertools from pathlib import Path p=Path("tools/generate_definitions.py") if not p.exists(): print("tools/generate_definitions.py not found") raise SystemExit(0) lines=p.read_text(encoding="utf-8").splitlines() for i in range(0,len(lines),1): if "ArgumentParser" in lines[i] or "add_argument" in lines[i] or "nargs" in lines[i]: start=max(0,i-5); end=min(len(lines), i+15) print("\n--- snippet around line", i+1, "---") for j in range(start,end): print(f"{j+1:4d}: {lines[j]}") PY # Check for any other parser entrypoints or wrapper scripts echo echo "=== usages of generate_definitions.py ===" rg -n --hidden --no-ignore -S "generate_definitions\.py" . || true

Repository: XRPLF/xrpl-py

Length of output: 5021

🏁 Script executed:

#!/bin/bash set -euo pipefail echo "=== Context around tool.poe.tasks.definitions in pyproject.toml ===" sed -n '60,120p' pyproject.toml echo echo "=== Search for BRANCH_OR_PR and multiple=true usage ===" rg -n --hidden --no-ignore -S "BRANCH_OR_PR|multiple\s*=\s*true" pyproject.toml

Repository: XRPLF/xrpl-py

Length of output: 2109

🌐 Web query:

Poe the poet multiple=true positional args expansion into command line

💡 Result:

In "poethepoet," you can handle multiple values for a positional argument by setting the "multiple" option to true in your task configuration [1][2]. When this is enabled, the task will accept one or more values for that argument [1]. Key rules and usage: 1. Positioning: Only the last positional argument in a task's argument list may have the "multiple" option set to true [1][2]. 2. Availability: For cmd tasks, the values are provided as a single string of whitespace-separated values, which can be injected into your command using templates (e.g., ${ARG_NAME}) [1][2]. For script tasks, the values are passed as a list of values to your Python function [1]. 3. Example Configuration: [tool.poe.tasks.save] cmd = "echo ${FILE_PATHS}" args = [{ name = "FILE_PATHS", positional = true, multiple = true }] When you run this task from the command line, extra arguments provided after the task name are captured into the variable [1][3][4]: poe save file1.txt file2.txt file3.txt # Results in: echo file1.txt file2.txt file3.txt Note that "multiple" is not compatible with arguments of type "boolean" [1][2]. If you need to handle additional arbitrary arguments that do not fit into a defined positional structure, recent versions of poethepoet also provide mechanisms for forwarding "free" arguments (e.g., via $POE_EXTRA_ARGS) [3][5].

Citations:

1: https://pypi.org/project/poethepoet/0.15.0/

2: http://poethepoet.natn.io/guides/args_guide.html

3: No way to pass free arguments to a specific place nat-n/poethepoet#155

4: https://github.com/nat-n/poethepoet?tab=readme-ov-file

5: feat: add support for forwarding free arguments via $POE_EXTRA_ARGS nat-n/poethepoet#380

Fix Poe positional argument cardinality for definitions.

tools/generate_definitions.py accepts exactly one optional positional argument (source, nargs="?"), but pyproject.toml configures ${BRANCH_OR_PR} with multiple = true, so supplying more than one value can forward extra positional args and make argparse fail (“unrecognized/too many positional arguments”).

Suggested fix

[tool.poe.tasks.definitions] help = "Generate definitions.json from rippled CI artifacts" cmd = "python3 tools/generate_definitions.py ${BRANCH_OR_PR}" -args = [{ name = "BRANCH_OR_PR", positional = true, default = "", multiple = true }] +args = [{ name = "BRANCH_OR_PR", positional = true, default = "" }]

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

[tool.poe.tasks.definitions]

help = "Generate definitions.json from rippled CI artifacts"

cmd = "python3 tools/generate_definitions.py ${BRANCH_OR_PR}"

args = [{ name = "BRANCH_OR_PR", positional = true, default = "", multiple = true }]

[tool.poe.tasks.definitions]

help = "Generate definitions.json from rippled CI artifacts"

cmd = "python3 tools/generate_definitions.py ${BRANCH_OR_PR}"

args = [{ name = "BRANCH_OR_PR", positional = true, default = "" }]

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pyproject.toml` around lines 94 - 97, The poe task definitions entry is passing BRANCH_OR_PR as a variadic positional (multiple = true) causing argparse in tools/generate_definitions.py (which expects a single optional positional arg, nargs="?") to fail; change the task config for tool.poe.tasks.definitions so that the BRANCH_OR_PR arg is not multiple (remove or set multiple = false) and keep its default as "" so only zero or one positional value is forwarded to the script, matching the script’s nargs="?" expectation.

coderabbitai · 2026-05-26T16:41:47Z

+def _exec(cmd: str) -> str:
+    """Run a shell command and return its stripped stdout."""
+    result = subprocess.run(
+        cmd, shell=True, capture_output=True, text=True, check=True
+    )


⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash set -euo pipefail # Show the relevant section of the file echo "=== tools/generate_definitions.py (lines 1-120) ===" nl -ba tools/generate_definitions.py | sed -n '1,120p' echo echo "=== usages of _exec in tools/generate_definitions.py ===" rg -n "_exec\\(" tools/generate_definitions.py echo echo "=== show call sites around each _exec usage ===" # Print ~20 lines around each usage line number rg -n "_exec\\(" tools/generate_definitions.py | cut -d: -f1 | while read -r ln; do start=$((ln-15)); end=$((ln+25)); nl -ba tools/generate_definitions.py | sed -n "${start},${end}p" | sed "s/^/ /" echo "---" done echo echo "=== find where CLI args parsed (likely argparse/typer/click) ===" rg -n "argparse|click|typer|ArgumentParser|add_argument\\(|sys\\.argv" tools/generate_definitions.py

Repository: XRPLF/xrpl-py

Length of output: 170

🏁 Script executed:

#!/bin/bash set -euo pipefail FILE="tools/generate_definitions.py" if [ ! -f "$FILE" ]; then echo "Missing $FILE" exit 1 fi echo "=== tools/generate_definitions.py (lines 1-140) with numbers ===" awk '{printf "%5d %s\n", NR, $0}' "$FILE" | sed -n '1,140p' echo echo "=== usages of _exec in tools/generate_definitions.py ===" rg -n "_exec\\(" "$FILE" || true echo echo "=== show call-site neighborhoods for _exec usages ===" rg -n "_exec\\(" "$FILE" | cut -d: -f1 | while read -r ln; do start=$((ln-20)); end=$((ln+40)); awk '{printf "%5d %s\n", NR, $0}' "$FILE" | sed -n "${start},${end}p" | sed 's/^/ /' echo "---" done echo echo "=== locate CLI parsing/inputs in this file ===" rg -n "argparse|click|typer|add_argument\\(|sys\\.argv" "$FILE" || true echo echo "=== locate any references to branch/owner/pr/fork (common variable names) ===" rg -n "branch|owner|pr\\b|fork|pull_request|github\\.com" "$FILE" || true

Repository: XRPLF/xrpl-py

Length of output: 24622

🏁 Script executed:

#!/bin/bash set -euo pipefail FILE="tools/generate_definitions.py" awk 'NR>=150 && NR<=310 {printf "%5d %s\n", NR, $0}' "$FILE"

Repository: XRPLF/xrpl-py

Length of output: 5243

Remove shell=True from _exec in tools/generate_definitions.py.

_exec (lines 27-32) runs subprocess.run(cmd, shell=True, ...) where cmd is a shell string composed from CLI source (parsed into pr_number, branch, and fork_owner) and interpolated into gh api / gh run download commands (e.g., call sites at ~53, 69, 90, 104, 120). This enables command injection if the input contains shell metacharacters/quotes. Refactor _exec to take list[str] and call subprocess.run(..., shell=False); update call sites to build argv lists end-to-end (including passing the --jq filter as its own argument instead of embedding shell-quoted text).

🧰 Tools

🪛 OpenGrep (1.22.0)

[ERROR] 29-31: Dynamic command passed to subprocess with shell=True. Use a command list without shell=True, or use shlex.quote() to sanitize input.

(coderabbit.command-injection.python-shell-true)

🪛 Ruff (0.15.14)

[error] 29-29: subprocess call with shell=True identified, security issue

(S602)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tools/generate_definitions.py` around lines 27 - 31, The _exec function currently calls subprocess.run with shell=True which allows command injection; change the signature of _exec(cmd: list[str]) -> str (or similar) to accept an argv list and call subprocess.run(cmd, shell=False, capture_output=True, text=True, check=True), and update all callers that build gh api / gh run download commands (the call sites that construct command strings for gh and the --jq filter) to pass a list of arguments (separate "--jq" and its filter as its own list element) instead of a single shell-quoted string so no shell interpolation occurs.

coderabbitai · 2026-05-26T16:41:47Z

+def _exec(cmd: str) -> str:
+    result = subprocess.run(
+        cmd, shell=True, capture_output=True, text=True, check=True
+    )


⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash set -euo pipefail # Locate and show the relevant sections of tools/validate_definitions.py FILE="tools/validate_definitions.py" echo "== _exec definition ==" nl -ba "$FILE" | sed -n '1,140p' | sed -n '30,90p' echo echo "== validate structure around format_array ==" nl -ba "$FILE" | sed -n '200,320p' echo echo "== Search for _validate_structure and field handling ==" rg -n "def _validate_structure|format_array|field\.get\(" "$FILE" || true

Repository: XRPLF/xrpl-py

Length of output: 141

🏁 Script executed:

#!/bin/bash set -euo pipefail FILE="tools/validate_definitions.py" echo "== _exec definition ==" cat -n "$FILE" | sed -n '30,90p' echo echo "== validate structure around format_array ==" cat -n "$FILE" | sed -n '200,320p' echo echo "== Search for _validate_structure and field handling ==" rg -n "def _validate_structure|format_array|field\.get\(" "$FILE" || true

Repository: XRPLF/xrpl-py

Length of output: 8865

Harden command execution and validation logic in tools/validate_definitions.py

Remove shell=True from _exec (subprocess.run(..., shell=True, ...), lines 35-38) and switch to argv-style invocation to avoid command-injection risks from dynamically built command strings.

In _validate_structure’s format_array branch, validate that each field is a dict before calling field.get(...) (lines 240-248) to prevent malformed input from raising AttributeError instead of producing structured validation errors.

🧰 Tools

🪛 OpenGrep (1.22.0)

[ERROR] 36-38: Dynamic command passed to subprocess with shell=True. Use a command list without shell=True, or use shlex.quote() to sanitize input.

(coderabbit.command-injection.python-shell-true)

🪛 Ruff (0.15.14)

[error] 36-36: subprocess call with shell=True identified, security issue

(S602)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tools/validate_definitions.py` around lines 35 - 38, Replace the unsafe shell invocation in _exec: stop passing a single command string with shell=True to subprocess.run and instead accept/construct an argv-style list (or split the command safely) and call subprocess.run(..., shell=False, args=[...], capture_output=True, text=True, check=True) so commands are executed without a shell to prevent injection; and in _validate_structure's format_array branch ensure you validate each item is a dict before using field.get by adding an explicit type check (e.g., if not isinstance(field, dict): record a structured validation error for that item) and only then call field.get(...) so malformed inputs produce validation errors rather than AttributeError.

coderabbitai · 2026-05-26T16:41:47Z

+                for j, field in enumerate(fields):
+                    if not isinstance(field.get("name"), str):
+                        errors.append(
+                            f'{key} "{type_name}"[{j}]: "name" must be a string'
+                        )
+                    if not isinstance(field.get("optionality"), int):
+                        errors.append(


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard non-dict entries before .get() in format_array validation.

If fields contains a non-dict entry, field.get(...) crashes with AttributeError instead of reporting a validation error.

Suggested fix

for j, field in enumerate(fields): + if not isinstance(field, dict): + errors.append( + f'{key} "{type_name}"[{j}]: entry must be an object' + ) + continue if not isinstance(field.get("name"), str): errors.append( f'{key} "{type_name}"[{j}]: "name" must be a string' )

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

for j, field in enumerate(fields):

if not isinstance(field.get("name"), str):

errors.append(

f'{key} "{type_name}"[{j}]: "name" must be a string'

)

if not isinstance(field.get("optionality"), int):

errors.append(

for j, field in enumerate(fields):

if not isinstance(field, dict):

errors.append(

f'{key} "{type_name}"[{j}]: entry must be an object'

)

continue

if not isinstance(field.get("name"), str):

errors.append(

f'{key} "{type_name}"[{j}]: "name" must be a string'

)

if not isinstance(field.get("optionality"), int):

errors.append(

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tools/validate_definitions.py` around lines 240 - 246, The loop over fields in the format_array validation assumes each item is a dict and calls field.get(...), causing AttributeError for non-dict entries; update the loop in the format_array validation (the for j, field in enumerate(fields) block) to first check isinstance(field, dict) and if not, append a validation error like '{key} "{type_name}"[{j}]: field must be an object/dict' and continue, otherwise proceed to check field.get("name") and field.get("optionality") as before.

depthfirst-app · 2026-05-26T16:47:53Z

+        raw = _exec(
+            f'gh api "repos/{repo}/actions/artifacts'
+            f"?name={ARTIFACT_NAME}&per_page=50\""
+            f" --jq '[.artifacts[] | select(.workflow_run.head_branch == \"{branch}\""


🟡 Severity: MEDIUM

The branch parameter is interpolated into a shell=True command via _exec. When called from the pr:NUMBER flow (line 236), branch is the GitHub API's headRefName. Git/GitHub allow single quotes in branch names, which break out of the jq argument's single-quoted shell context, enabling arbitrary command execution. Confirmed via PoC: a branch named test';touch /tmp/pwned;echo ' executes touch /tmp/pwned.
Helpful? Add 👍 / 👎

💡 Fix Suggestion

Suggestion: Refactor _find_artifact_by_branch (and similarly _find_artifact_by_sha and _find_pr_for_fork_branch) to avoid interpolating user-controlled values into shell commands. The recommended approach is to:

Remove the --jq flag from the gh api call and retrieve raw JSON instead.

Parse the JSON response in Python using json.loads().

Perform the filtering (e.g., matching workflow_run.head_branch) in Python code, which eliminates the shell injection vector entirely.

For example, replace _find_artifact_by_branch with something like:

def _find_artifact_by_branch(repo: str, branch: str) -> str | None: try: raw = _exec( f'gh api "repos/{repo}/actions/artifacts' f'?name={ARTIFACT_NAME}&per_page=50"' ) data = json.loads(raw) for artifact in data.get("artifacts", []): wr = artifact.get("workflow_run", {}) if wr.get("head_branch") == branch and not artifact.get("expired", True): return str(wr.get("id")) except (subprocess.CalledProcessError, json.JSONDecodeError): pass return None

Apply the same pattern to _find_artifact_by_sha (filter on head_sha instead) and _find_pr_for_fork_branch. Also consider refactoring _exec to accept a list of arguments and avoid shell=True entirely, or at minimum validate/sanitize repo parameter which is also interpolated.

Patel-Raj11 · 2026-05-29T14:27:51Z

@@ -1,376 +1,288 @@
-"""Script to generate the definitions.json file from rippled source code."""
+"""Script to generate the definitions.json file from rippled CI artifacts.


Hi Phu, Based on XRPLF/rippled#6858 and checking with Claude, the server-definitions artifact actually gets produced and retained for 3 days only when a cron job is run on each weekday (and not on every commit into develop branch, so many runs don't have those files). That means:

The pr:, owner:branch, and arbitrary-branch flows do exactly the right API queries — but no PR-triggered or branch-push-triggered run ever uploads the artifact, so they consistently come up empty. So do we need to fix the upstream rippled workflow to allow file generate on every commit (on develop and feature branches)?

Also the poetry run poe definitions won't give server-definitions for the latest commit on develop branch, but instead it would give the file generated based on the most recent run, due to same reason as upstream generates the files on a scheduled basis. And if there are no commits for > 3 days, the poetry run poe definitions starts failing.

pdp2121 added 4 commits April 29, 2026 16:05

add new definitions extraction

797f9b4

Merge branch 'main' into update-definitions-generation

9ead890

merge gitignore

8b556e6

add validation step

c50b6b1

coderabbitai Bot reviewed May 26, 2026

View reviewed changes

depthfirst-app Bot reviewed May 26, 2026

View reviewed changes

Merge branch 'main' into update-definitions-generation

14440a1

Patel-Raj11 reviewed May 29, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat: update definitions generation#1002

feat: update definitions generation#1002
pdp2121 wants to merge 5 commits into
mainfrom
update-definitions-generation

pdp2121 commented May 26, 2026

Uh oh!

coderabbitai Bot commented May 26, 2026 •

edited

Loading

Review limit reached

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot May 26, 2026

Uh oh!

coderabbitai Bot May 26, 2026

Uh oh!

coderabbitai Bot May 26, 2026

Uh oh!

coderabbitai Bot May 26, 2026

Uh oh!

depthfirst-app Bot May 26, 2026

Uh oh!

Patel-Raj11 May 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		@@ -1,376 +1,288 @@
		"""Script to generate the definitions.json file from rippled source code."""
		"""Script to generate the definitions.json file from rippled CI artifacts.

Uh oh!

Conversation

pdp2121 commented May 26, 2026

High Level Overview of Change

Context of Change

Type of Change

Did you update CHANGELOG.md?

Test Plan

Uh oh!

coderabbitai Bot commented May 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review limit reached

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested reviewers

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 26, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 26, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 26, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 26, 2026

Choose a reason for hiding this comment

Uh oh!

depthfirst-app Bot May 26, 2026

Choose a reason for hiding this comment

Uh oh!

Patel-Raj11 May 29, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

coderabbitai Bot commented May 26, 2026 •

edited

Loading