Skip to content

Security: Remediation of RCE via Insecure Pickle Deserialization in Model I/O#667

Open
JoshuaProvoste wants to merge 1 commit intogoogle:mainfrom
JoshuaProvoste:security/remediate-pickle-rce
Open

Security: Remediation of RCE via Insecure Pickle Deserialization in Model I/O#667
JoshuaProvoste wants to merge 1 commit intogoogle:mainfrom
JoshuaProvoste:security/remediate-pickle-rce

Conversation

@JoshuaProvoste
Copy link
Copy Markdown

Compliance with CONTRIBUTING.md:

Consistently with the project guidelines, this PR includes:

  • Unit Tests: Added brax/io/model_test.py to validate core security logic and prevent regressions.
  • Verified fix: Successfully reproduced the RCE vulnerability on localhost using a malicious __reduce__ payload and verified its neutralization by the new security gate.
  • Code Hygiene: Enforced Google-specific code style using isort and pyink across all modified files.

Description

This PR addresses a critical security vulnerability in the parameter I/O system (brax.io.model). The current implementation relies on pickle for deserialization, which is inherently insecure as it allows the execution of arbitrary Python code during the loading process (Remote Code Execution - RCE).

To remediate this, I have transitioned the framework to use Msgpack, a data-only binary serialization format. This move aligns Brax with modern industry standards for machine learning security—similar to the transition from Pickle to safetensors in the Transformers ecosystem—moving the project from a "vulnerable by design" state to a "secure by default" architecture.

Key Benefits:

  1. System Security: Passive data parsing prevents system-level takeover via model files.
  2. Explicit Opt-in: Legacy .pkl files are blocked by default, requiring user consent (allow_pickle=True) and triggering a SecurityWarning.
  3. Lossless Pytree Support: Native support for nested JAX arrays and Flax dataclasses (e.g., RunningStatisticsState).

Technical Implementation Details

  • brax/io/model.py:
    • Implemented _encode_pytree and _decode_pytree for recursive, type-preserving serialization.
    • Added a signature check (0x80) to the load_params buffer to identify and intercept legacy Pickle headers.
    • Established a SecurityWarning category for transparent but secure backward compatibility.
  • brax/training/agents/:
    • Refactored the testNetworkEncoding function in all training agent suites (ppo, sac, bc, apg, es, ars) to utilize the new secure model API.
  • brax/io/model_test.py:
    • New test suite verifying:
      1. Data integrity of complex nested structures.
      2. Default blocking of unauthorized .pkl files.
      3. Neutralization of active RCE payloads.

Verification Performed

  • Reproduced Vulnerability: Confirmed that a crafted .pkl file could execute system commands when loaded via the original load_params.
  • Validated Fix:
    • Verified that the new implementation rejects the exploit payload with a RuntimeError.
    • Confirmed that Pytrees are reconstructed with 100% precision (using np.testing.assert_allclose).
    • Verified that training agents can still save and load parameters seamlessly using the secure format.
  • Linting: Successfully ran isort and pyink on all 8 modified files.

Checklist

  • Followed project guidelines for code hygiene.
  • Includes automated tests for the new logic.
  • No changes to metadata XML files.
  • Verified backward compatibility with an explicit security override.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant