[pull] main from openai:main#58
Open
pull[bot] wants to merge 2709 commits into
## Summary
Clear inherited legacy `notify` from Guardian review session config, since we should not be passing auto review threads into `notify` targets. Keeps legacy notify payload and hook runtime behavior unchanged for normal user turns.
## Testing
- [x] add a Guardian config regression and dedicated Guardian integration test so review sessions cannot inherit parent notify hooks
This reverts commit 5381240. Gov cloud should not be supported.
## Why
The extracted goal runtime needs a host-callable path for turns that stop because the workspace usage limit is reached. In that case, any in-turn goal progress should be accounted before the goal becomes terminal, and active goal accounting must be cleared so later tool-finish or turn-stop handling does not keep charging usage to a stopped goal.
## What changed
- Adds `GoalRuntimeHandle::usage_limit_active_goal_for_turn`, which accounts current active-goal progress, marks the active or budget-limited thread goal as `UsageLimited`, records terminal metrics when the status changes, clears active goal accounting, and emits the updated goal event.
- Covers both active and budget-limited goals in `ext/goal/tests/goal_extension_backend.rs`, including the invariant that later token/tool events do not add usage after the goal has been usage-limited.
## Testing
- Added `usage_limit_active_goal_accounts_progress_and_clears_accounting`.
- Added `usage_limit_budget_limited_goal_accounts_remaining_progress`.
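
The accounting order described above can be illustrated with a small Python model. This is only a sketch under stated assumptions: the real code is Rust, and the class `GoalRuntimeSketch`, its fields, and the lowercase status strings are hypothetical stand-ins rather than the actual `GoalRuntimeHandle` API.

```python
from dataclasses import dataclass


@dataclass
class Goal:
    status: str = "active"  # hypothetical statuses: active, budget_limited, usage_limited
    tokens_used: int = 0


class GoalRuntimeSketch:
    """Hypothetical model of the accounting order, not the real runtime."""

    def __init__(self, goal: Goal) -> None:
        self.goal = goal
        self._pending_tokens = 0
        self._accounting_active = True

    def record_tokens(self, count: int) -> None:
        # Later token/tool events are ignored once accounting is cleared.
        if self._accounting_active:
            self._pending_tokens += count

    def usage_limit_active_goal_for_turn(self) -> bool:
        """Account progress, mark the goal usage-limited, clear accounting."""
        if self.goal.status not in ("active", "budget_limited"):
            return False
        # 1. Account in-turn progress before the goal becomes terminal.
        self.goal.tokens_used += self._pending_tokens
        self._pending_tokens = 0
        # 2. Mark the goal terminal.
        self.goal.status = "usage_limited"
        # 3. Clear active accounting so later events add no usage.
        self._accounting_active = False
        return True
```

The point of the ordering is that progress is flushed first, so nothing earned in the turn is lost, and accounting is cleared last, so nothing is charged afterwards.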
## Summary
- Add the missing `additional_context` field to the guardian review `Op::UserInput` test initializer.
## Test plan
- `just fmt`
- `just test -p codex-core guardian_review`
- `just test -p codex-core` (compiles, then fails on local environment issues: `sandbox-exec` Operation not permitted, missing `test_stdio_server` helper binary, and unrelated timeouts)
## Why
Extensions can currently observe thread start, resume, and stop, but they do not have a lifecycle point for the host to say that immediately pending thread work has drained. That makes idle follow-up behavior harder to express as extension-owned logic instead of host-specific plumbing.

This adds an explicit idle lifecycle hook so an extension can react when a thread becomes idle while the host keeps ownership of whether any submitted follow-up input starts a turn, is queued, or is ignored.
## What changed
- Added `ThreadIdleInput` with access to the session-scoped and thread-scoped extension stores.
- Added a default `on_thread_idle` method to `ThreadLifecycleContributor`.
- Re-exported `ThreadIdleInput` from the extension API surface.
## Testing
Not run; this only extends the extension API trait surface with a default hook and exported input type.
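
The "default hook" pattern can be sketched in Python as a base class whose method does nothing unless overridden. The real API is a Rust trait; the field names `session_store` and `thread_store` and the `FollowUpExtension` class below are hypothetical illustrations only.

```python
from dataclasses import dataclass, field


@dataclass
class ThreadIdleInput:
    # Hypothetical stand-ins for the session- and thread-scoped stores.
    session_store: dict = field(default_factory=dict)
    thread_store: dict = field(default_factory=dict)


class ThreadLifecycleContributor:
    def on_thread_idle(self, idle_input: ThreadIdleInput) -> None:
        """Default hook: existing contributors need no changes."""
        return None


class FollowUpExtension(ThreadLifecycleContributor):
    """Hypothetical extension that counts idle notifications per thread."""

    def on_thread_idle(self, idle_input: ThreadIdleInput) -> None:
        count = idle_input.thread_store.get("idle_count", 0)
        idle_input.thread_store["idle_count"] = count + 1
```

Because the base method is a no-op, adding the hook is backward compatible: only extensions that care about idleness override it.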
## Summary
- Change last-`n` fork truncation to start at the first fork-turn boundary instead of returning the full rollout when the fork history is shorter than the requested window.
- Add coverage for the startup-prefix case in both rollout truncation tests and agent control spawn behavior.
- Ensure bounded forked children still rebuild context after the cached prefix is truncated.
## Testing
- Added unit coverage for truncation behavior when the parent history is under the requested fork-turn limit.
- Added an agent control test covering bounded fork spawn behavior with startup context present.
- Not run (not requested).
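
A minimal sketch of the truncation rule, assuming the rollout is a flat list and that a caller-supplied predicate (here the hypothetical `is_fork_turn_start`) marks fork-turn boundaries. The function name and the empty-list result when no boundary exists are assumptions, not the actual Codex behavior.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def truncate_to_last_n_fork_turns(
    items: Sequence[T],
    n: int,
    is_fork_turn_start: Callable[[T], bool],
) -> List[T]:
    """Keep the last `n` fork turns, never the startup prefix."""
    if n <= 0:
        raise ValueError("n must be positive")
    boundaries = [i for i, item in enumerate(items) if is_fork_turn_start(item)]
    if not boundaries:
        # Assumption: with no fork turn there is nothing to keep.
        return []
    if len(boundaries) <= n:
        # History is shorter than the window: start at the first
        # boundary instead of returning the full rollout.
        start = boundaries[0]
    else:
        start = boundaries[-n]
    return list(items[start:])
```

With a rollout such as `["startup", "ctx", "T1", "a", "T2", "b"]` and a window of five turns, the result begins at `"T1"`, so the startup prefix is dropped even though fewer than five turns exist.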
## Why
Plugin and marketplace mutations are applied by the app server, but several TUI follow-up paths still refreshed state from the TUI host config. In remote workspace mode, that can leave plugin UI state tied to stale client-local `config.toml` after the server has already applied the mutation.
## What
- Stop reloading the TUI host config after app-server-owned plugin, marketplace, skill, and app mutations.
- Use the same app-server-owned refresh path for local and remote sessions: ask the app server to reload user config where the running session needs it, then refetch plugin list/detail state from the app server.
- Build plugin mention candidates from existing app-server `plugin/list` and `plugin/read` data in both local and remote sessions instead of TUI-host plugin config.
- Avoid the duplicate local config reload after `ReloadUserConfig` asks the app server to reload config.
## Verification
Manually launched a local WebSocket app-server with a temp server `CODEX_HOME`, launched the TUI with a separate temp host `CODEX_HOME` and `--remote`, installed a sample plugin from a temp local marketplace through `/plugins`, and confirmed the TUI refreshed to installed state while only the server config gained `[plugins."sample@debug"]`. Trace logs showed the TUI using app-server `plugin/list` and `plugin/read` for the refresh path.
## Why
The TUI Vim composer currently diverges from normal Vim editing in two common workflows: pressing `e` repeatedly can remain stuck at an existing word end, and normal mode does not support `C` for changing through the end of the line. The existing `D` behavior also removes the newline when the cursor is already at the line boundary, which makes the new `C` action and existing deletion action surprising in multiline prompts.

Closes #23926.
Closes #24238.
## What Changed
- Make normal-mode `e` advance from the current word end to the next word end, including for operator motions such as `de`.
- Add configurable Vim normal-mode `change_to_line_end` behavior, bound to `C` by default, which deletes to the end of the current line and enters Insert mode.
- Keep the newline intact when `D` or `C` is pressed at the end-of-line boundary.
- Add regression coverage for repeated `e`, `de`, `C`, and the multiline `C`/`D` boundary behavior.
- Regenerate the config schema and update the keymap picker snapshots for the new Vim action.
## How to Test
1. Run Codex with Vim composer mode enabled:
   ```bash
   cd codex-rs
   cargo run --bin codex -- -c tui.vim_mode_default=true
   ```
2. Enter `alpha beta gamma`, press `Esc`, `0`, then press `e` repeatedly. Confirm the cursor advances through the ends of `alpha`, `beta`, and `gamma`.
3. Enter `hello world`, press `Esc`, `0`, `w`, then `C`. Confirm `world` is deleted and the composer enters Insert mode.
4. Enter a multiline prompt with `hello` above `world`, press `Esc`, `k`, `$`, and then `D`. Confirm the newline is preserved and the two lines do not join.
5. At the same boundary, press `C` and type `!`. Confirm the composer enters Insert mode and yields `hello!` above `world`, preserving the newline.

Targeted automated verification:
- `just fix -p codex-tui`
- `just argument-comment-lint-from-source -p codex-tui -p codex-config`
- `cargo insta pending-snapshots` reports no pending snapshots.
- `just test -p codex-tui` validates the new Vim and keymap snapshot coverage, but the command remains red due to two reproducible unrelated failures in `app::tests::update_feature_flags_disabling_guardian_*`.
## Validation Note
The workspace-wide `just argument-comment-lint` form is currently blocked during Bazel analysis by the existing LLVM `compiler-rt` missing `include/sanitizer/*.h` failure; package-scoped source linting for the changed Rust crates passed.
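
The two behaviors in this change can be modeled in a few lines of Python. This is a simplified sketch, not the TUI implementation: words are treated as runs of non-whitespace characters (closer to Vim's `E` than to `e`, which also distinguishes punctuation), and the function names are hypothetical.

```python
def word_end_motion(text: str, cursor: int) -> int:
    """Return the index of the next word end strictly after `cursor`."""
    i = cursor + 1
    n = len(text)
    # Skip whitespace between the current position and the next word.
    while i < n and text[i].isspace():
        i += 1
    if i >= n:
        return cursor  # No further word: stay put.
    # Advance to the last character of that word.
    while i + 1 < n and not text[i + 1].isspace():
        i += 1
    return i


def delete_to_line_end(text: str, cursor: int) -> str:
    """Delete from `cursor` up to, but not including, the newline."""
    end = text.find("\n", cursor)
    if end == -1:
        end = len(text)
    # When cursor already sits on the newline, end == cursor and
    # nothing is removed, so the two lines never join.
    return text[:cursor] + text[end:]
```

Starting the search at `cursor + 1` is what keeps repeated `e` from getting stuck on the current word end, and stopping before `"\n"` is what preserves the line break for `D` and `C`.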
## Why
Codex stores thread, log, goal, and memory state in bundled SQLite databases through SQLx. We have a suspected SQLite WAL-reset corruption issue under heavy concurrent writer load, especially when multiple subagents are active.

The existing `sqlx 0.8.6` dependency kept us on an older `libsqlite3-sys` / bundled SQLite, so this PR moves the SQLx stack far enough forward to pick up the newer bundled SQLite library.
## What changed
- Bump the workspace `sqlx` dependency to `0.9.0`.
- Use the SQLx 0.9 feature names explicitly: `runtime-tokio`, `tls-rustls`, and `sqlite-bundled`.
- Update `Cargo.lock` so `sqlx-sqlite` resolves through `libsqlite3-sys 0.37.0`.
- Refresh `MODULE.bazel.lock` for the dependency changes.
- Adapt `codex-state` to SQLx 0.9:
  - build dynamic state queries with `QueryBuilder<Sqlite>` instead of passing dynamic `String`s to `sqlx::query`;
  - remove the old `QueryBuilder` lifetime parameter from helper signatures;
  - preserve SQLx's new `Migrator` fields when constructing runtime migrators.
## Verification
- `just test -p codex-state`
- `just bazel-lock-check`
- `cargo check -p codex-state --tests`
# Summary
The standalone update action currently downloads and runs the Codex installer as an interactive command. When an existing managed Codex install is present, accepting an update can therefore enter an installer prompt instead of completing the update.

This change runs the standalone installer with `CODEX_NON_INTERACTIVE=1` on macOS/Linux and Windows. The installer environment-variable support is introduced by the parent PR; this PR wires that behavior into the Codex CLI update action. The rendered Windows command remains shell-safe, and long update commands wrap within the update-notice card. The standard test target snapshots the standalone notice for both platforms.
# Stack
1. [#21567](#21567) - Adds environment-controlled release selection and noninteractive installer behavior.
2. [#24637](#24637) - Runs standalone updates with `CODEX_NON_INTERACTIVE=1`. (current)
3. [#24639](#24639) - Removes explicit release argument inputs in favor of `CODEX_RELEASE`.
# Evidence
Standalone updater-shaped macOS install with an existing npm-managed Codex on `PATH`:
https://github.com/user-attachments/assets/a27fe9e9-db3a-4c39-a514-24bd3d1f01e8
# Testing
Tests: targeted `codex-tui` update-action and update-notice snapshot tests, Rust formatting, benchmark smoke validation, macOS live-terminal standalone-update smoke testing, Windows ARM64 PowerShell standalone-update smoke testing through Parallels, and CI.
Move `DEV_WEBSITE_VERCEL_DEPLOY_HOOK_URL` to a repo environment secret. To keep the scope of use of that secret small, move the Vercel website redeploy to its own post-release job.
…ds (#23950)

This adds slash command completion behavior for argument-taking commands, where text after the partially typed command becomes inline arguments instead of being discarded. This addresses the workflow of drafting text first, moving to the start, and completing a slash command around that existing draft. Before this change, this workflow would remove all user-input text aside from the slash command, which can be frustrating if the user had just typed out a long and well-thought-out goal.
- Preserves the draft tail for inline-argument slash commands like `/goal` and `/review` when completing with `Tab` or `Enter`.
- Keeps popup filtering focused on the command fragment under the cursor rather than the full draft text.
- Leaves slash commands that do not support inline arguments unchanged, so completion still replaces the existing draft tail for those commands.
- Adds focused TUI tests under slash input covering preserved arguments, cursor edge cases, and the negative case for a command without inline args.

Follow-up simplification and test relocation from #24683 folded into this PR.

---------

Co-authored-by: Eric Traut <etraut@openai.com>
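
The completion rule described in this commit can be sketched as follows. This is a minimal illustration, assuming the command fragment is the first space-delimited token of the draft; the function name, the trailing-space convention, and the `example-no-args` command are hypothetical, while `goal` and `review` come from the description.

```python
# Commands named in the description as accepting inline arguments.
INLINE_ARG_COMMANDS = {"goal", "review"}


def complete_slash_command(draft: str, command: str) -> str:
    """Complete the leading `/fragment` of `draft` to `/command`."""
    fragment, _, tail = draft.partition(" ")
    if not fragment.startswith("/"):
        raise ValueError("draft does not start with a slash command fragment")
    completed = f"/{command}"
    tail = tail.strip()
    if command in INLINE_ARG_COMMANDS and tail:
        # Preserve the draft tail as inline arguments.
        return f"{completed} {tail}"
    # Commands without inline arguments still replace the tail.
    return f"{completed} "
```

For example, completing `/go fix the flaky test` to `goal` keeps `fix the flaky test` as the argument, whereas a command outside the inline-argument set discards it as before.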
## Context
`docs/tui-chat-composer.md` was removed by #20896 as part of removing local-only docs/specs from the repository. I checked the #20896 file list and the merge commit: the composer doc was deleted, not moved or copied, and current `main` does not contain a replacement composer narrative doc.

Current guidance should keep contributors and agents focused on the docs that still exist: the module docs in `chat_composer.rs` and `paste_burst.rs`.
## Summary
- Removes the scoped TUI bottom-pane AGENTS.md requirement to update `docs/tui-chat-composer.md`.
- Removes stale module-doc references to that deleted narrative doc from `chat_composer.rs` and `paste_burst.rs`.
## Validation
- Checked #20896 and the merge commit with rename/copy detection to confirm `docs/tui-chat-composer.md` was deleted rather than moved.
- Searched current `main` for a replacement composer narrative doc.
- Not run; documentation-only change.
## Summary
- Add `request_kind` values for foreground turn, startup prewarm,
compaction, and detached memory model requests.
- Attach compaction dispatch metadata to local Responses, legacy
`/v1/responses/compact`, and remote v2 compact requests.
- Add the existing logical context-window identifier as `window_id` on
turn-owned model request metadata.
- Keep identity fields optional for detached memory requests, while
still emitting `request_kind="memory"` in non-git/no-sandbox workspaces.
## Root Cause
`x-codex-turn-metadata` has more than one producer. Foreground turns and
compaction requests own a real turn and should carry that turn identity.
Detached memory stage-one requests do not own a foreground turn, so
absent identity fields are valid rather than missing data. Startup
websocket prewarm is also a model request, but it has `generate=false`
and must not be counted as a foreground turn.
`thread_source` or session source identifies where a thread came from
(for example review, guardian, or another subagent). `request_kind`
identifies what the current outbound model request is doing (`turn`,
`prewarm`, `compaction`, or `memory`). A review or guardian thread can
issue either a normal turn request or a compaction request, so source
cannot replace request kind.
## Behavior / Impact
- Ordinary foreground requests send `request_kind="turn"`, their real
identity fields, and `window_id="<thread_id>:<window_generation>"`.
- Startup websocket warmup requests send `request_kind="prewarm"` so
they are not counted as foreground turns.
- Compaction requests send `request_kind="compaction"`, their real
owning turn identity, the existing `window_id`, and
`compaction.{trigger,reason,implementation,phase,strategy}`.
- Detached memory stage-one requests send `request_kind="memory"`
without `session_id`, `thread_id`, `turn_id`, or `window_id`; when no
workspace metadata exists, the kind-only header is still emitted.
- `session_id`, `thread_id`, `turn_id`, and `window_id` remain optional
in the header schema because detached memory requests do not own a
foreground turn or context window.
- `window_id` is not a new ID system: it is copied from the already-sent
`x-codex-window-id` / WS client metadata value at model-request dispatch
time.
- Existing `x-codex-window-id` HTTP/WS emission, value format,
generation advancement, resume behavior, and fork reset behavior are
unchanged.
- `request_kind`, `window_id`, and upstream turn-owned identity fields
remain schema-owned; input `responsesapi_client_metadata` cannot replace
their canonical values.
- No table, DAG, export, app-server API, or MCP `_meta` schema changes
are included.
A compaction attempt stopped by a pre-compact hook issues no model
request and therefore has no request header; its outcome remains in
analytics events. Status, error, duration, and token deltas also remain
analytics fields rather than request-header fields.
Future detached-memory attribution using a real initiating turn ID as
`trigger_turn_id` is intentionally not part of this PR.
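
The per-kind rules above can be summarized in a small Python model. This is a sketch under assumptions, not the client's actual code: the function name, its parameters, and the reserved-key filtering are illustrative, and any compaction values used with it are placeholders.

```python
from typing import Optional

REQUEST_KINDS = {"turn", "prewarm", "compaction", "memory"}
COMPACTION_KEYS = ("trigger", "reason", "implementation", "phase", "strategy")
RESERVED = {"request_kind", "session_id", "thread_id", "turn_id",
            "window_id", "compaction"}


def build_turn_metadata(
    request_kind: str,
    *,
    session_id: Optional[str] = None,
    thread_id: Optional[str] = None,
    turn_id: Optional[str] = None,
    window_generation: Optional[int] = None,
    compaction: Optional[dict] = None,
    client_metadata: Optional[dict] = None,
) -> dict:
    """Hypothetical model of the header payload for one model request."""
    if request_kind not in REQUEST_KINDS:
        raise ValueError(f"unknown request_kind: {request_kind}")
    canonical = {"request_kind": request_kind}
    if request_kind != "memory":
        # Detached memory requests own no turn, so they carry no identity.
        identity = {"session_id": session_id, "thread_id": thread_id,
                    "turn_id": turn_id}
        canonical.update({k: v for k, v in identity.items() if v is not None})
        if thread_id is not None and window_generation is not None:
            canonical["window_id"] = f"{thread_id}:{window_generation}"
    if request_kind == "compaction" and compaction is not None:
        canonical["compaction"] = {
            k: compaction[k] for k in COMPACTION_KEYS if k in compaction
        }
    # Client-supplied metadata may add keys but never replace canonical ones.
    extras = {k: v for k, v in (client_metadata or {}).items()
              if k not in RESERVED}
    return {**extras, **canonical}
```

Filtering reserved keys out of client metadata is one way to keep the schema-owned fields canonical; the actual mechanism in the client may differ.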
## Sync With Main
- Final pushed head `716342e79` is rebased onto `origin/main@0d37db4`.
- The metadata conflict came from upstream `#24160`, which added
`forked_from_thread_id` on the same `turn_metadata` surface. Resolution
preserves that field and its protection from client metadata override
alongside this PR's request-kind, compaction, and window-id fields.
- While resolving the overlapping commits, I removed an accidental
recursive model-request overlay and a duplicate detached-memory header
builder before completing the rebase.
## Latency / User Experience Boundary
- Foreground turns perform no new filesystem, git, or network work. New
fields are inserted into metadata already serialized for outgoing
requests.
- Compaction issues the same model/HTTP requests with the same prompt,
model, service tier, and sampling settings; only metadata bytes change.
- Startup prewarm already sent metadata; it is now correctly classified
as `prewarm`.
- Non-git detached memory now sends a small kind-only metadata header
rather than no header.
- This client diff adds no user-visible latency mechanism beyond
negligible serialization and header bytes on already-existing requests.
## Validation
On conflict-resolved head `1d35c2cfb` based on `origin/main@4875217`:
- `just fmt` (passed)
- `just fix -p codex-core` (passed)
- `git diff --check origin/main...HEAD` (passed)
- `just test -p codex-core -E 'test(turn_metadata) |
test(websocket_first_turn_uses_startup_prewarm_and_create) |
test(responses_stream_includes_turn_metadata_header_for_git_workspace_e2e)
|
test(responses_websocket_forwards_turn_metadata_on_initial_and_incremental_create)
| test(remote_compact_v2_retries_failures_with_stream_retry_budget) |
test(window_id_advances_after_compact_persists_on_resume_and_resets_on_fork)'`
(`23 passed`; `bench-smoke` passed)
- `just test -p codex-app-server -E
'test(turn_start_forwards_client_metadata_to_responses_request_v2) |
test(turn_start_forwards_client_metadata_to_responses_websocket_request_body_v2)
| test(auto_compaction_remote_emits_started_and_completed_items)'` (`3
passed`; `bench-smoke` passed)
- `just test -p codex-memories-write` (`29 passed`; `bench-smoke`
passed)
## Why
The Python SDK currently exposes sandbox selection differently depending
on where it is used: thread lifecycle methods accept `SandboxMode`,
while turns accept the lower-level `SandboxPolicy` shape. For the common
case of choosing an access level, that leaks app-server wire details
into otherwise straightforward SDK usage.
This makes the common path explicit and discoverable: callers choose a
named sandbox preset once, using the same keyword on threads and turns.
The preset name `workspace_write` also makes the granted capability
clear at the callsite.
## What changed
- Added a root-level `Sandbox` enum with documented presets:
- `Sandbox.read_only`: read files without allowing writes.
- `Sandbox.workspace_write`: the normal default for projects with a
recorded trust decision; read files and write inside the workspace and
configured writable roots.
- `Sandbox.full_access`: run without filesystem access restrictions.
- Documented that omitting `sandbox=` delegates to app-server's
configured default, while explicit turn overrides remain sticky for
subsequent turns.
- Updated sync and async thread lifecycle and turn APIs to consistently
accept `sandbox=Sandbox...`, translating to the existing app-server
thread and turn representations internally.
- Updated the public API artifact generator so regenerated SDK wrappers
retain the friendly enum shape.
- Replaced low-level policy construction in Python docs, examples, and
the walkthrough notebook with the preset API.
- Added focused coverage for root exports, method signatures,
preset-to-wire mapping, and rejection of raw string sandbox inputs.
## API impact
High-level turn calls now use `sandbox=` instead of `sandbox_policy=`:
```python
from openai_codex import Codex, Sandbox

with Codex() as codex:
    thread = codex.thread_start(sandbox=Sandbox.workspace_write)
    result = thread.run("Review the diff only.", sandbox=Sandbox.read_only)
```
`thread_start(...)` already defaults to `ApprovalMode.auto_review`, so
normal writable usage is concise:
```python
from openai_codex import Codex, Sandbox

with Codex() as codex:
    thread = codex.thread_start(sandbox=Sandbox.workspace_write)
    thread.run("Update the files in this workspace.")
```
With that combination, edits inside `cwd` and configured writable roots
run within the workspace-write sandbox. Operations that require
approval, such as edits outside those roots, are routed through auto
review. When `sandbox=` is omitted, app-server resolves its configured
default. A sandbox supplied to `run(...)` or `turn(...)` applies to that
turn and subsequent turns.
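
The preset-to-wire translation and raw-string rejection can be sketched independently of the SDK. Everything here is hypothetical: `SandboxPreset` is a stand-in for the SDK's `Sandbox` enum, and the wire strings are placeholders rather than the real app-server representation.

```python
from enum import Enum
from typing import Optional


class SandboxPreset(Enum):
    """Stand-in for the SDK's `Sandbox` enum (illustrative only)."""
    read_only = "read_only"
    workspace_write = "workspace_write"
    full_access = "full_access"


# Placeholder wire values; the real app-server shapes are not shown here.
_WIRE = {
    SandboxPreset.read_only: "placeholder-read-only",
    SandboxPreset.workspace_write: "placeholder-workspace-write",
    SandboxPreset.full_access: "placeholder-full-access",
}


def to_wire(sandbox: Optional[SandboxPreset]) -> Optional[str]:
    """Translate a preset; None defers to the server's configured default."""
    if sandbox is None:
        return None
    if not isinstance(sandbox, SandboxPreset):
        # Raw strings are rejected so typos fail at the callsite.
        raise TypeError("sandbox must be a SandboxPreset member")
    return _WIRE[sandbox]
```

Rejecting raw strings keeps the set of accepted values discoverable through the enum instead of through documentation.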
## Test coverage
- `sdk/python/tests/test_public_api_signatures.py` covers the public
export and parameter names, including the default approval mode.
- `sdk/python/tests/test_public_api_runtime_behavior.py` covers preset
mappings to the existing wire types and raw string rejection.
## Why
Vim mode currently supports some normal-mode operators and motions, but common text-object combinations like `ciw`, `daw`, `di(`, and quote/bracket variants are still missing. That makes the composer feel incomplete for users who expect operator + text object editing to work inside prompts.

Closes #21383.
## What Changed
- Add Vim pending-state support for operator/text-object sequences.
- Add `c` as a normal-mode operator for text objects, so combinations like `ciw` delete the object and enter insert mode.
- Support word, WORD, delimiter, and quote text objects:
  - `iw`, `aw`, `iW`, `aW`
  - `i(`, `a(`, `i)`, `a)`, `ib`, `ab`
  - `i[`, `a[`, `i]`, `a]`
  - `i{`, `a{`, `i}`, `a}`, `iB`, `aB`
  - `i"`, `a"`, `i'`, `a'`, `i\``, `a\``
- Add configurable keymap entries and keymap picker coverage for the new Vim text-object context.
- Regenerate the config schema and update keymap picker snapshots.
## How to Test
Manual smoke test:
1. Start Codex with Vim composer mode enabled.
2. Type a draft such as:
   ```text
   alpha beta gamma
   call(foo[bar], {"x": "hello world"})
   say "one \"two\" three" now
   ```
3. Put the cursor on `beta`, press `ciw`, and confirm `beta` is removed and the composer enters insert mode.
4. Escape back to normal mode, put the cursor on `gamma`, press `daw`, and confirm `gamma` plus surrounding whitespace is removed.
5. Put the cursor inside `foo[bar]`, press `di[`, and confirm only `bar` is removed.
6. Put the cursor inside `call(...)`, press `da(`, and confirm the whole parenthesized section is removed.
7. Put the cursor inside the quoted text, press `ci"`, and confirm the quote contents are removed and insert mode starts.
8. Verify cancellation does not edit text: press `d` then `Esc`, and press `d` then `i` then `Esc`.

Targeted tests:
- `cargo test -p codex-tui --lib vim_`
- `cargo nextest run -p codex-tui keymap_setup::tests`

Additional local checks:
- `just write-config-schema`
- `just fmt`
- `just fix -p codex-tui`
- `git diff --check`
- `cargo insta pending-snapshots --manifest-path tui/Cargo.toml`

Local full-suite note: `just test -p codex-tui` ran to completion. The keymap snapshot failures were expected and accepted. Two unrelated guardian feature-flag tests still fail locally:
- `app::tests::update_feature_flags_disabling_guardian_clears_review_policy_and_restores_default`
- `app::tests::update_feature_flags_disabling_guardian_clears_manual_review_policy_without_history`

`just argument-comment-lint` is currently blocked locally by Bazel analysis before the lint runs because `compiler-rt` has an empty `include/sanitizer/*.h` glob in the local Bazel cache. The touched Rust diff was manually inspected for opaque positional literals.
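
How an operator combines with a text object can be sketched by computing a half-open `(start, end)` range and then deleting it. This is a simplified Python model, not the TUI code: it covers only WORD objects (`iW`, `aW`) and bracket-style delimiters, omits quote objects, and all function names are hypothetical.

```python
from typing import Optional, Tuple

Range = Tuple[int, int]  # half-open [start, end)


def inner_big_word(text: str, cursor: int) -> Range:
    """`iW`: the run of non-whitespace characters under the cursor."""
    if text[cursor].isspace():
        raise ValueError("cursor is on whitespace")
    start = cursor
    while start > 0 and not text[start - 1].isspace():
        start -= 1
    end = cursor + 1
    while end < len(text) and not text[end].isspace():
        end += 1
    return start, end


def around_big_word(text: str, cursor: int) -> Range:
    """`aW`: the WORD plus trailing spaces, else leading spaces."""
    start, end = inner_big_word(text, cursor)
    trailing = end
    while trailing < len(text) and text[trailing] == " ":
        trailing += 1
    if trailing > end:
        return start, trailing
    while start > 0 and text[start - 1] == " ":
        start -= 1
    return start, end


def delimited(text: str, cursor: int, open_ch: str, close_ch: str,
              around: bool) -> Optional[Range]:
    """`i(`/`a(` style objects for distinct open/close characters."""
    depth, start = 0, None
    for i in range(cursor, -1, -1):  # scan left for the enclosing opener
        if text[i] == close_ch and i != cursor:
            depth += 1
        elif text[i] == open_ch:
            if depth == 0:
                start = i
                break
            depth -= 1
    if start is None:
        return None
    depth = 0
    for j in range(start + 1, len(text)):  # scan right for its closer
        if text[j] == open_ch:
            depth += 1
        elif text[j] == close_ch:
            if depth == 0:
                return (start, j + 1) if around else (start + 1, j)
            depth -= 1
    return None


def delete_range(text: str, rng: Range) -> str:
    """Apply the `d` operator to a text-object range."""
    return text[:rng[0]] + text[rng[1]:]
```

Under this model `c` would be `delete_range` followed by a switch to insert mode at `rng[0]`, and a cancelled sequence simply never reaches `delete_range`.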
## Why
Interrupting an active turn is currently fixed to `Esc`, which is easy to hit accidentally and cannot be customized through `/keymap`. This gives users a less accidental binding while preserving the existing default.
## What Changed
- Adds `tui.keymap.chat.interrupt_turn` to `/keymap`, defaulting to `esc` and supporting remapping or unbinding.
- Uses the configured interrupt binding for running-turn status, queued steer interruption, and `request_user_input`, including the visible hints.
- Preserves local `Esc` behavior for popups, Vim insert mode, and `/agent` editing while validating conflicts with fixed/backtrack and request-input navigation bindings.
- Adds behavior and snapshot coverage for remapped interruption paths.
## How to Test
1. Run Codex and open `/keymap`, then set **Interrupt Turn** to `f12`.
2. Start a turn and confirm `Esc` no longer interrupts it while `f12` does; the running hint should display `f12 to interrupt`.
3. Queue a steer while a turn is running and confirm the preview displays `f12`; pressing it should interrupt and submit the steer immediately.
4. Trigger a `request_user_input` prompt and confirm its footer uses `f12`; with notes open, `Esc` should still clear notes while `f12` interrupts the turn.
5. Clear the Interrupt Turn binding and confirm the key-specific interrupt hint is removed while `Ctrl+C` remains available.

Targeted validation:
- `just write-config-schema`
- `just fix -p codex-config`
- `just fix -p codex-tui`
- `just fmt`
- `just argument-comment-lint-from-source -p codex-config -p codex-tui`
- `just test -p codex-config`
- `cargo insta pending-snapshots --manifest-path tui/Cargo.toml`
- `just test -p codex-tui keymap_setup::tests`
- `just test -p codex-tui` (fails in two pre-existing guardian feature-flag tests unrelated to this diff; the intentional picker snapshot updates were reviewed and accepted)
## Stack
- **Current: #24489 [1 of 2]** - render markdown tables in app style.
- **Stacked follow-up: #24636 [2 of 2]** - render cramped markdown tables as key/value records.
## Why
Markdown tables currently render as boxed terminal grids, which gives ordinary assistant output a heavier visual treatment than surrounding rich text. This row-separated layout is the best match for how the App renders tables, while accented headers remain distinguishable even when a terminal font renders bold subtly.

<table>
<tr><td>
<p align="center">Codex CLI - Before</p>
<img width="1722" height="742" alt="CleanShot 2026-05-25 at 18 46 17" src="https://github.com/user-attachments/assets/f673d92a-ebd8-46e2-b414-3d985e41b6a4" />
</td></tr>
<tr><td>
<p align="center">Codex CLI - After</p>
<img width="1720" height="957" alt="image" src="https://github.com/user-attachments/assets/36a3d331-bea1-439b-b5be-e97b0731bd6f" />
</td></tr>
<tr><td>
<p align="center">Codex App</p>
<img width="979" height="1293" alt="CleanShot 2026-05-25 at 18 45 04" src="https://github.com/user-attachments/assets/7d97cae0-9256-4f6e-a4b3-8b8f22b0d901" />
</td></tr>
</table>

## What Changed
- Render markdown tables as padded, aligned rows without an enclosing box.
- Style table headers with the active syntax-theme accent plus bold text, while keeping separators low contrast and theme-aware.
- Use a segmented heavy header rule and thin body-row rules, preserving wrapping, narrow-width fallback, streaming parity, and rich-history rendering.
- Update focused assertions and snapshots for the final table layout.
## How to Test
1. Render a markdown table in the TUI with several rows and columns.
2. Confirm the header uses the active theme accent, rows use one-character interior padding, and the table has no enclosing box.
3. Confirm the header is followed by segmented `━` rules and multiple body rows are separated by muted segmented `─` rules.
4. Render the same table while streaming and in history/raw-mode toggles; the final rich layout should remain stable.
5. Render a narrow table with long content and verify wrapping or pipe fallback does not overflow.
## Validation
- `just test -p codex-tui table`
- `just test -p codex-tui streaming::controller::tests`
- `just argument-comment-lint-from-source -p codex-tui -- --all-targets`
- `just fix -p codex-tui`
- `just fmt`

`just test -p codex-tui` was also run after accepting the snapshots; it fails only in the unrelated existing guardian app tests `update_feature_flags_disabling_guardian_clears_review_policy_and_restores_default` and `update_feature_flags_disabling_guardian_clears_manual_review_policy_without_history`.
Client-side namespace tools are now supported by Bedrock. Enable `namespace_tools` for the Amazon Bedrock provider while continuing to disable unsupported hosted tools such as image generation and web search.
## Why
Interrupted `shell_command` calls can race with the outer tool-dispatch cancellation path. When that happens, the runtime future may be dropped before the spawned process gets a chance to run `SIGTERM` cleanup. For bwrapd-backed Linux sandbox commands, that can leave synthetic protected-path mount bookkeeping such as `.git/.codex` registrations under `/tmp` behind after a TUI interruption.

The relevant cancellation points are the outer dispatch race in [`core/src/tools/parallel.rs`](https://github.com/openai/codex/blob/bd184ba84703cc924921ed883f0cf17d3dba60ff/codex-rs/core/src/tools/parallel.rs#L91-L132) and the process shutdown logic in [`core/src/exec.rs`](https://github.com/openai/codex/blob/bd184ba84703cc924921ed883f0cf17d3dba60ff/codex-rs/core/src/exec.rs#L1367-L1393).
## What changed
- Keep `shell_command` dispatch alive long enough for the runtime to finish cancellation cleanup instead of immediately returning the synthetic aborted response.
- Fold shell-turn cancellation into the existing `ExecExpiration` path in [`core/src/tools/runtimes/shell.rs`](https://github.com/openai/codex/blob/bd184ba84703cc924921ed883f0cf17d3dba60ff/codex-rs/core/src/tools/runtimes/shell.rs#L267-L274), so cancellation and timeout behavior stay centralized.
- On cancellation, send `SIGTERM` first, wait briefly for cleanup to run, then hard-kill any remaining descendants in the original process group.
- Treat `ESRCH` as an already-gone process-group cleanup case in `codex-utils-pty`, which keeps best-effort teardown from surfacing a stale-process race as an error.
## Verification
- `cargo test -p codex-core cancellation`
- Added regression coverage for:
  - `shell_tool_cancellation_waits_for_runtime_cleanup`
  - `process_exec_tool_call_cancellation_allows_sigterm_cleanup`
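
The cancellation sequence (graceful signal, short grace period, hard kill, tolerate an already-gone group) can be sketched with the Python standard library on a POSIX system. The function name and the grace period are assumptions; the Rust implementation will differ in detail.

```python
import os
import signal
import subprocess
import time


def terminate_process_group(pgid: int, grace_seconds: float = 0.5) -> None:
    """SIGTERM the group, wait briefly, then SIGKILL whatever remains."""
    try:
        os.killpg(pgid, signal.SIGTERM)
    except ProcessLookupError:
        # ESRCH: the group is already gone, which is a successful teardown.
        return
    # Give cleanup handlers a chance to run before the hard kill.
    time.sleep(grace_seconds)
    try:
        os.killpg(pgid, signal.SIGKILL)
    except ProcessLookupError:
        # The group exited during the grace period.
        pass


if __name__ == "__main__":
    # start_new_session=True makes the child its own process-group leader,
    # so its pid doubles as the group id.
    child = subprocess.Popen(["sleep", "30"], start_new_session=True)
    terminate_process_group(child.pid, grace_seconds=0.1)
    child.wait(timeout=5)
```

In Python, `ProcessLookupError` is the exception raised for `ESRCH`, so catching it is the counterpart of treating `ESRCH` as a benign race rather than an error.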
## Why
Wrapped URLs in rich TUI output, especially URLs rendered inside Markdown tables, are split across terminal rows. In terminals that support OSC 8 hyperlinks, treating each visible fragment as part of the complete destination enables reliable open-link and copy-link actions even after table layout wraps the URL.

This addresses the semantic-link portion of #12200 and the behavior described in #12200 (comment). It does not change ordinary drag-selection across bordered table rows.
## What Changed
- Added shared TUI OSC 8 support that validates `http://` and `https://` destinations, sanitizes terminal payloads, and applies metadata separately from visible line width/layout.
- Added semantic web-link annotations to assistant and proposed-plan Markdown, including explicit web links and bare web URLs in prose and table cells while excluding code and non-web Markdown destinations.
- Preserved complete URL targets through table wrapping, narrow pipe fallback, streaming, transcript overlay rendering, history insertion, and resize replay.
- Routed intentional Codex-owned links in notices, status/setup/app-link, feedback, onboarding, MCP/plugin help, memories, and update surfaces through the shared hyperlink handling.
## How to Test
1. Run Codex in a terminal with OSC 8 link support, such as Ghostty, and request an assistant response containing a Markdown table whose last column contains a long `https://` URL.
2. Make the terminal narrow enough for the URL to wrap across multiple bordered table rows.
3. Use the terminal's open-link or copy-link action on more than one wrapped URL fragment and confirm each fragment resolves to the complete original URL.
4. Resize the terminal after the table is rendered and repeat the link action to confirm the destination survives scrollback replay.
5. Open the transcript overlay while rich output is present and confirm web links remain interactive there.
6. As a regression check, render inline/fenced code containing URL text and a Markdown link such as `[https://example.com](mailto:support@example.com)`; confirm these do not acquire a web OSC 8 destination.

Targeted automated coverage exercised Markdown links and exclusions, wrapped and pipe-fallback tables, streaming/transcript overlay propagation, status-link truncation, and rendered word-wrapping cell alignment. `just test -p codex-tui` was also run; it passed the hyperlink coverage and reproduced two unrelated existing guardian feature-flag test failures.
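
The core idea, that every wrapped fragment carries the complete destination, can be sketched with the standard OSC 8 escape sequence (`ESC ] 8 ; ; URL ST` to open, `ESC ] 8 ; ; ST` to close). The function names, the fixed-width wrapping, and the exact validation rules below are assumptions for illustration, not the TUI's implementation.

```python
from typing import List

ESC = "\x1b"
ST = ESC + "\\"  # string terminator


def is_safe_web_url(url: str) -> bool:
    """Accept only http(s) destinations without control characters."""
    if not url.startswith(("http://", "https://")):
        return False
    return all(0x20 <= ord(ch) != 0x7F for ch in url)


def hyperlink_fragments(url: str, width: int) -> List[str]:
    """Wrap `url` at `width` columns; each fragment links to the full URL."""
    fragments = [url[i:i + width] for i in range(0, len(url), width)]
    if not is_safe_web_url(url):
        # Non-web or unsafe destinations stay plain text.
        return fragments
    opener = f"{ESC}]8;;{url}{ST}"
    closer = f"{ESC}]8;;{ST}"
    # The escape sequences occupy no columns, so layout width is unchanged.
    return [f"{opener}{fragment}{closer}" for fragment in fragments]
```

Because the link metadata is applied per fragment and is invisible to layout, a terminal that supports OSC 8 can resolve any row of a wrapped URL to the full destination.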
…2] (#24636) ## Stack - **Base: #24489 [1 of 2]** - render markdown tables in app style. - **Current: #24636 [2 of 2]** - render cramped markdown tables as key/value records. Review this PR against `fcoury/app-style-markdown-tables`; it contains only the fallback behavior for cramped tables. ## Why The row-separated markdown table rendering in #24489 remains readable while columns have usable room. Once long links or multiple prose-heavy columns are compressed into narrow allocations, however, the grid can turn words and paths into tall vertical strips that are difficult to scan. In those cases the content matters more than preserving the grid shape. ## What Changed <table> <tr><td> <p align="center"><b> Normal </b></p> <img width="1722" height="619" alt="CleanShot 2026-05-27 at 14 32 57" src="https://github.com/user-attachments/assets/d04f5fbd-6064-4acd-91bd-072d19b983df" /> </td></tr> <tr><td> <p align="center"><b> Narrow </b></p> <img width="863" height="1013" alt="CleanShot 2026-05-27 at 14 33 12" src="https://github.com/user-attachments/assets/6a7d2968-0a68-48fd-ab5d-209b3dbaf03e" /> </td></tr> <tr><td> <p align="center"><b> Very narrow </b></p> <img width="435" height="746" alt="CleanShot 2026-05-27 at 14 33 47" src="https://github.com/user-attachments/assets/f6a59e30-b1d2-4063-9c05-43933abc77d6" /> </td></tr> </table> - Detect tables whose grid allocation causes systemic token fragmentation or starves multiple prose-heavy columns. - Render those tables as repeated key/value records instead of retaining an unreadable grid. - Use aligned label/value records when there is useful horizontal room, and switch to a stacked narrow-record layout where each label is followed by a full-width value when width is especially constrained. - Preserve the themed label color, rich inline formatting, links, and the existing grid presentation for tables that remain readable. 
- Add snapshot coverage for path-heavy narrow tables, prose-heavy issue tables, systemic compact fragmentation, and a control case that should continue to render as a grid. ## How to Test 1. Start Codex from this branch and render a normal multi-column markdown table at a comfortable terminal width. Confirm it still appears as the styled row-separated grid from #24489. 2. Render a table containing a long linked record identifier or file-like value, then narrow the terminal until the grid would split the value into vertical fragments. Confirm it switches to key/value records, with labels above values at very narrow widths. 3. Render a table with multiple prose-heavy columns, such as an issue summary table with `Issue`, `Activity`, `Complexity`, and `Why start`. Confirm a cramped width switches to records rather than wrapping several columns into hard-to-read strips. 4. Render a compact table where only one value wraps mildly. Confirm it stays in grid form rather than switching prematurely. ## Validation - Ran `just test -p codex-tui` while developing the fallback and reviewed/accepted the intended new markdown-render snapshots. The command still reports two unrelated existing guardian feature-flag test failures outside this diff. - Ran `just fix -p codex-tui` and `just fmt` after the Rust changes were complete. - `just argument-comment-lint` cannot reach source linting locally because Bazel fails while resolving LLVM sanitizer headers; touched positional literal callsites were inspected manually and annotated where needed.
## Overview Allow remote `codex exec-server` registration to use existing API-key auth while restricting where those credentials can be sent. - Accept `CodexAuth::ApiKey` for the normal `--remote` registration path. - Restrict API-key remote registration to HTTPS `openai.com` and `openai.org` hosts and subdomains, with explicit HTTP loopback support for local development. - Disable registry registration redirects so credentials cannot be forwarded to an unvalidated destination. - Retain `--use-agent-identity-auth` as the explicit Agent Identity path. - Document remote registration using `CODEX_API_KEY`. ## Big picture Callers can now provide an API key directly to `exec-server` registration without first establishing ChatGPT login state: ```sh CODEX_API_KEY="$OPENAI_API_KEY" \ codex exec-server \ --remote "https://<host>.openai.org/api" \ --environment-id "$ENVIRONMENT_ID" ``` ## Validation - `cargo fmt --all` (`just fmt` is not installed on this host) - `cargo test -p codex-cli -p codex-exec-server`
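The destination restriction above can be sketched as a small allowlist check. This is a minimal illustration, not the Rust implementation: the function names are hypothetical, and treating the literal host `localhost` as loopback is an assumption that the PR text does not confirm.

```python
import ipaddress
from urllib.parse import urlsplit

# Hosts named in the PR description; subdomains are matched by suffix.
ALLOWED_DOMAINS = ("openai.com", "openai.org")


def is_loopback_host(host: str) -> bool:
    """Return True for loopback IP literals; 'localhost' is an assumption here."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False


def api_key_registration_allowed(url: str) -> bool:
    """Hypothetical check: HTTPS to allowed domains, or HTTP to loopback only."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if parts.scheme == "https":
        return any(host == d or host.endswith("." + d) for d in ALLOWED_DOMAINS)
    if parts.scheme == "http":
        return is_loopback_host(host)
    return False
```

Matching on `"." + domain` rather than a bare suffix keeps look-alike hosts such as `evilopenai.com` out of the allowlist. Disabling redirects is a separate HTTP-client setting and is not shown here.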
Will make it easier to uprev when the new draft spec is supported. Also updates reqwest where needed for compatibility, but doesn't update it everywhere since this is already a large diff. The new version of rmcp handles certain kinds of authentication failures differently; this patch includes support for identifying the failing scope in a WWW-Authenticate header.
## Why The key/value markdown table renderer added in #24636 still operates on `Line` values, while table cells and rendered table output now carry `HyperlinkLine`. That mismatch breaks `codex-tui` compilation on `main` and would risk losing semantic web-link annotations if corrected by flattening the values. ## What changed - Make key/value record rendering wrap and emit `HyperlinkLine` values consistently with the existing grid renderer. - Remap wrapped hyperlink ranges and shift them when value content is prefixed by record-mode indentation or labels. - Add focused coverage verifying key/value fallback output preserves web-link destinations. ## Verification - `just test -p codex-tui -E 'test(key_value_table_keeps_web_annotations) | test(/table_renders_(key_value_records_when_compact_fragmentation_is_systemic_snapshot|stacked_key_value_records_when_path_column_becomes_too_narrow_snapshot|records_when_multiple_prose_columns_are_starved_snapshot)/)'`
## Why
`AppServerConfig` is exported as part of the ergonomic Python SDK
surface and passed to `Codex(...)` and `AsyncCodex(...)`. That name
exposes the underlying app-server transport at the same layer where
users are configuring the Codex client. `CodexConfig` makes the common
callsite read naturally and names the object it configures.
## What changed
- Renamed the public configuration dataclass from `AppServerConfig` to
`CodexConfig`.
- Updated `Codex`, `AsyncCodex`, and the transport clients to accept
`CodexConfig`.
- Updated binary-resolution messages, package exports, docs, examples,
and related coverage to use the new public name.
## API impact
```python
from openai_codex import Codex, CodexConfig
with Codex(config=CodexConfig(codex_bin="/path/to/codex")) as codex:
...
```
Callers should now import and construct `CodexConfig`; `AppServerConfig`
is no longer part of the Python SDK surface.
## Validation
- `uv run --frozen --extra dev ruff check src/openai_codex scripts
examples tests`
- Tests are deferred to online CI for this PR.
## Why Dynamic tools are defined at thread start and already stored in rollout `SessionMeta`, which restores resumed and forked sessions. Persisting the same tools through SQLite creates a second runtime persistence path that is unnecessary prework for the explicit namespace refactor. ## What changed - Restore missing thread-start dynamic tools directly from rollout history, including when SQLite is enabled. - Remove SQLite dynamic-tool reads, writes, backfill, and thread metadata patch plumbing. - Add SQLite-enabled resume integration coverage that verifies a rollout-defined dynamic tool is still sent after resume. ## Compatibility The existing `thread_dynamic_tools` table is intentionally not dropped even though it's now unused. Older Codex binaries are allowed to open databases migrated by newer binaries and still reference this table; dropping it would break that mixed-version path. See [here](https://github.com/openai/codex/blob/main/codex-rs/state/src/migrations.rs#L10-L11). ## Verification - `just test -p codex-state -p codex-rollout -p codex-thread-store` - `just test -p codex-core --test all resume_restores_dynamic_tools_from_rollout_with_sqlite_enabled`
## Summary
- Splits the monolithic `codex-cloud-config` implementation into focused modules.
- Keeps behavior unchanged from the preceding config bundle runtime switch.
## Details
This is the reviewability follow-up after the lineage-preserving migration PRs. The split separates backend transport, loader construction, cache handling, metrics, validation, service orchestration, and focused tests into named files.
Verification: `just fmt`; `just test -p codex-cloud-config`.
## Why `code_mode_only` moved ordinary runtime tools behind `exec`, but it also hid hosted Responses tools. Hosted `web_search` and `image_generation` do not have a nested `exec` runtime path, so code-only sessions lost those capabilities entirely even when their existing provider, auth, model, and configuration gates passed. ## What changed - Keep hosted Responses tools top-level in `code_mode_only` sessions after their existing gates pass. - Preserve the existing nested-tool behavior for ordinary runtimes and the direct-only behavior for multi-agent v2 tools. - Add planner coverage for `code_mode_only` with default multi-agent v2 settings, hosted live web search, and hosted image generation. ## Verification - Added focused regression coverage in `codex-rs/core/src/tools/spec_plan_tests.rs`. - Left execution to CI per repository workflow.
## Stack 1. #25850 - Key request-permission grants by environment: stores and applies sticky permission grants per environment id. 2. #25858 - Add `environmentId` to `request_permissions`: lets the model target a selected environment and resolves relative permission paths against it. 3. #25862 - Propagate permission approval environment id: carries the selected environment id through approval events, app-server requests, TUI prompts, and delegate forwarding. 4. This PR (#25867) - Add remote request permissions integration coverage: verifies the selected remote environment across request, approval, grant reuse, and exec. This PR is stacked on #25862 and should be reviewed after #25850, #25858, and #25862. ## Why The environment-scoped permission stack needs one end-to-end check that exercises the CCA-shaped path, not only unit-level parsing. This verifies that a model-sent `environmentId` on `request_permissions` reaches the approval event, stores the grant under the selected environment, and is reused by a later tool call in that same environment. ## What Changed - Adds a remote executor integration test for `request_permissions` with `environmentId: remote` and a relative write root. - Asserts the permission event reports the remote environment and cwd, and that the normalized grant resolves under the remote cwd. - Approves the grant, then runs a remote `exec_command` without explicit per-call permissions and verifies it completes without another exec approval and writes only in the remote filesystem. ## Verification - Not run locally per instruction. - `git diff --check`
## Why `profile_sandbox_mode` was left over from the old selected legacy profile path. Production now always derives permissions without that value, and legacy profile contents are ignored, so keeping a parameter that is always `None` makes `derive_permission_profile` look like it still supports a fallback that no longer exists. ## What Changed - Removed the `profile_sandbox_mode` argument from `ConfigToml::derive_permission_profile`. - Updated the production caller and legacy sandbox-policy test helper to match. - Dropped the stale unselected legacy-profile sandbox test that only protected the removed fallback shape. ## Verification - `just test -p codex-config` - `just test -p codex-core 'config::'` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/25943). * #25926 * __->__ #25943
## Why
Standalone image generation remained top-level-only in code-mode sessions.
## What changed
- Change imagegen exposure from `DirectModelOnly` to `Direct`.
- Keep direct-mode access while enabling nested code-mode access.
- Add a focused regression test for the exposure contract.
## Validation
- `just test -p codex-image-generation-extension`
## Summary - stop publishing Python runtime wheels as a side effect of Rust releases - publish runtime wheels from the Python SDK release workflow, either explicitly before updating the SDK pin or immediately before a `python-v*` SDK release - resolve the runtime release from the requested version or the SDK package's exact `openai-codex-cli-bin` pin - build two musllinux-tagged wheels from the Rust-release Linux package archives alongside the six existing runtime wheels - validate SDK beta tags before any PyPI write ## Release configuration - update the `openai-codex-cli-bin` PyPI trusted publisher to trust `.github/workflows/python-sdk-release.yml` and the `publish-python-runtime` job ## Pin update flow - run the `python-sdk-release` workflow manually with the new runtime version before opening or updating the SDK pin PR - after the pin lands, a `python-v*` SDK tag republishes with `skip-existing: true` before publishing the SDK package ## Validation - ran `just fmt` - validated the edited workflow YAML - validated the embedded `publish-python-runtime` Bash with `bash -n` - validated manual `0.136.0 -> rust-v0.136.0` mapping - validated tag-driven `python-v0.1.0b3 -> 0.132.0 -> rust-v0.132.0` mapping - validated rejection of an invalid SDK tag before publication - confirmed `rust-v0.136.0` contains the two required Linux package archives - CI will provide the full test signal
## Disclaimer
This is only here for iteration purposes. Do not make any code rely on it.
## Why
Skills still live behind `codex-core` discovery and injection paths, but the extension system needs an authority-aware home before that logic can move. This adds that boundary without changing current skills behavior, and keeps host, executor, and remote skills distinct so future list/read/search flows do not collapse back to ambient local paths.
## What changed
- Add the `codex-skills-extension` workspace/Bazel crate under `ext/skills`.
- Define the initial catalog, authority, provider, and turn-state types for authority-bound skill packages and resources.
- Register placeholder thread/config/prompt/turn lifecycle contributors plus host, executor, and remote provider aggregation points.
- Capture the remaining extraction work as TODOs, including the missing extension API hooks needed for per-turn catalog construction and typed skill injection.
- Keep plugins outside the runtime skills model: plugin-installed skills are treated as materialized host-owned skill sources once available.
## Verification
- Not run locally.
## Why #25156 moved Bazel CI launches into a shared Python wrapper. On Windows, launching Bazel with `os.execvp` can split the spaced `--test_env=PATH=...` argument and fail to propagate the eventual Bazel exit status, allowing jobs to pass without running tests. This reapplies the wrapper after #25909 with a Windows-safe launch path. ## What changed Use a waited `subprocess.run` launch on Windows while preserving `os.execvp` on Unix. Add a process-level regression test for spaced arguments and child exit status, and run it on Windows Bazel shard 1. ## Experiment To confirm Bazel was actually invoking tests, patch `87b61d0be6` temporarily added an intentionally failing `codex-core` unit test. Bazel failed on that sentinel on all three major platforms: - [Linux Bazel test](https://github.com/openai/codex/actions/runs/26841132773/job/79151062486) - [macOS Bazel test](https://github.com/openai/codex/actions/runs/26841132773/job/79151062362) - [Windows Bazel test shard 1/4](https://github.com/openai/codex/actions/runs/26841132773/job/79151062155) The sentinel was removed after collecting this evidence. Windows Bazel [clippy](https://github.com/openai/codex/actions/runs/26841132773/job/79151062914) and [release verification](https://github.com/openai/codex/actions/runs/26841132773/job/79151062739) also passed. ## Validation After removing the sentinel, `just test -p codex-core` no longer reported it. The local run retained two unrelated environment-specific failures.
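The platform split described above can be sketched as follows. This is a simplified illustration, not the repository's wrapper: `run_and_wait` and `launch` are hypothetical names, and the real script may handle environment and signal forwarding differently.

```python
import os
import subprocess
import sys


def run_and_wait(argv: list[str]) -> int:
    """Start argv as a child process, wait for it, and return its exit status.

    Passing a list (no shell) hands each argument to the child as-is, so a
    value containing spaces stays a single argument.
    """
    return subprocess.run(argv, check=False).returncode


def launch(argv: list[str]) -> None:
    """Hypothetical entry point: waited child on Windows, exec on Unix."""
    if os.name == "nt":
        # Propagate the child's status so a failing Bazel run fails the job.
        sys.exit(run_and_wait(argv))
    # On Unix, replace the wrapper process with the target program.
    os.execvp(argv[0], argv)
```

A process-level check in the spirit of the regression test is to launch a child that exits with a known non-zero status only when it receives the spaced argument intact, then assert on the returned status.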
## Why
`PermissionProfile` is becoming the default way to represent Codex permissions, but the implicit default behavior should stay the same for now:
- trusted projects use `:workspace`
- untrusted projects also use `:workspace`
- roots without a trust decision use `:read-only`
- unsandboxed Windows falls back to `:read-only`

This keeps the existing sandbox semantics while making silent config defaults observable as built-in permission profiles instead of treating the legacy `SandboxPolicy` projection as the primary shape.
## What Changed
- Refactored legacy sandbox derivation to resolve the configured sandbox mode once, then apply the implicit project fallback only when no sandbox mode was configured.
- Preserved the existing trust-decision fallback: trusted and untrusted projects default to workspace-write where supported.
- Added empty-config coverage asserting that an untrusted project resolves to the built-in active permission profile (`:workspace` outside unsandboxed Windows).
## Verification
- `just fmt`
- `just test -p codex-core 'config::'`
- `just test -p codex-config`

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/25926).
* __->__ #25926
## Disclaimer
Do not use for now.
## Why
Extensions can already contribute prompt fragments and request same-turn item injection, but there was no host-owned hook for contributing structured `ResponseItem`s while Codex is assembling a new turn's initial model input. This change adds that seam so extensions can attach turn-local input that depends on the submitted user input and resolved turn environments without routing through prompt text or late injection.
## What changed
- add `TurnInputContributor` to `codex_extension_api` and export the new `TurnInputContext` / `TurnInputEnvironment` types it receives
- teach `ExtensionRegistry` to register and expose turn-input contributors alongside the existing extension hooks
- call registered turn-input contributors from `core/src/session/turn.rs` while building the initial injected input for a turn, then append their returned `ResponseItem`s after the skill and plugin injections
## Summary Allow EDU ChatGPT workspaces to fetch cloud config bundles. The existing cloud config eligibility gate only allowed business-like and enterprise plans, which meant EDU admins could configure managed policies in the UI but the Codex client would skip fetching them. This keeps individual/pro and team-like usage-based plans excluded, and adds service-level coverage for both `edu` and `education` plan aliases. ## Validation - `just fmt` - `just test -p codex-cloud-config` - Built the Codex app locally, created a new EDU ChatGPT workspace, and verified config bundles can be fetched and are properly applied.
## Why
Remote-control clients need to list and revoke controller-device grants
without enabling or enrolling the local relay. These are signed-in
account-management operations, so coupling them to websocket, pairing,
enrollment, or persisted relay state would prevent clients from managing
stale grants from the picker.
Related enhancement request: N/A. This adds the Codex app-server surface
for the planned upstream environment-scoped revoke endpoint.
## What Changed
- Added experimental app-server v2 RPCs:
- `remoteControl/client/list`
- `remoteControl/client/revoke`
- Added picker-oriented protocol types and standard generated schema
fixtures. The list response intentionally omits backend account id,
enrollment status, and location fields.
- Added `app-server-transport/src/transport/remote_control/clients.rs`
for environment-scoped GET and DELETE requests. It builds escaped URL
path segments, forwards optional pagination query fields, sends ChatGPT
auth plus `chatgpt-account-id`, converts RFC3339 `last_seen_at` values
to Unix seconds, accepts `204 No Content` revoke responses, and retries
once after a `401`.
- Extracted shared ChatGPT auth loading and recovery into
`app-server-transport/src/transport/remote_control/auth.rs` so
websocket, pairing, and client management use the same account-auth
boundary.
- Retained the configured remote-control base URL on
`RemoteControlHandle` and resolve management URLs lazily, preserving
deferred validation while relay startup is disabled.
- Registered list as `global_shared_read("remote-control-clients")` and
revoke as `global("remote-control-clients")`.
## Verification
- Added transport coverage proving list and revoke work while relay
state is disabled, IDs are escaped, picker-only fields are returned,
timestamps are converted, revoke accepts `204`, auth headers are
forwarded, `401` retries exactly once, `403` is not retried, and
malformed list payloads retain decode context.
- Added an app-server integration test proving both JSON-RPC methods
work before relay enablement and successful revoke returns `{}`.
- Regenerated and validated experimental and standard app-server schema
fixtures.
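Two of the transport behaviors listed above, timestamp conversion and the single retry after a `401`, can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, `send` and `refresh_auth` stand in for whatever the Rust transport uses, and status codes are modeled as plain integers.

```python
from datetime import datetime
from typing import Callable


def rfc3339_to_unix_seconds(value: str) -> int:
    """Convert an RFC3339 timestamp such as `last_seen_at` to Unix seconds."""
    # Normalize a trailing 'Z' so older Python versions can parse it.
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    return int(parsed.timestamp())


def send_with_one_auth_retry(
    send: Callable[[], int], refresh_auth: Callable[[], None]
) -> int:
    """Send a request; on 401, refresh auth and retry exactly once.

    Any other status, including 403, is returned without retrying.
    """
    status = send()
    if status == 401:
        refresh_auth()
        status = send()
    return status
```

Keeping the retry bounded to one attempt means a persistent `401` is surfaced to the caller instead of looping on auth recovery.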
## Why `codex-core` currently owns the generic contextual-fragment trait and several reusable fragment implementations. That makes it harder for other crates to share the same host-owned model-input abstraction without depending on all of `codex-core`. This change extracts the reusable fragment machinery into a small `codex-context-fragments` crate so future extension and skills work can depend on the fragment abstraction directly. ## What Changed - Added the `codex-context-fragments` crate with: - `ContextualUserFragment` - `FragmentRegistration` / `FragmentRegistrationProxy` - additional-context fragment types - Moved `SkillInstructions` into `codex-core-skills`, since skill-specific rendering belongs with skills rather than generic core context machinery. - Kept `codex-core` re-exporting the fragment types it still uses internally, so existing call sites keep the same shape. - Updated Cargo and Bazel workspace metadata for the new crate. ## Verification - `cargo metadata --locked --format-version 1 --no-deps` - `just bazel-lock-update` - `just bazel-lock-check`
## Why `close_agent` is a parent-owned coordination tool: a worker should return its result, then let its parent decide when to close it. Before this change, if an MAv2 worker targeted itself, the resolved target could flow through the normal close path and ask the agent control layer to close the current conversation. ## What changed - Reject `close_agent` when the resolved target is the current session's `conversation_id`, returning a model-visible error that tells the worker to return its result instead. - Keep the guard after target resolution so it covers both thread-id targets and task-path targets. - Add coverage for self-targeting by thread id and by task name in `multi_agents_tests.rs`. Relevant code: - [`handle_close_agent`](https://github.com/openai/codex/blob/7c24e6641b693a3eed933dd376ce8f424ab6ea5f/codex-rs/core/src/tools/handlers/multi_agents_v2/close_agent.rs#L39-L57) - [`multi_agent_v2_close_agent_rejects_self_target_by_id` / `multi_agent_v2_close_agent_rejects_self_target_by_task_name`](https://github.com/openai/codex/blob/7c24e6641b693a3eed933dd376ce8f424ab6ea5f/codex-rs/core/src/tools/handlers/multi_agents_tests.rs#L3936-L4070) ## Testing Not run locally.
## Why The skills extension needs the resolved turn environments to build a real per-turn `SkillListQuery`. The previous `TurnLifecycleContributor` hook only had a turn id, so it could only seed a placeholder query and never carry the executor authorities that executor-scoped skill routing will need. Moving catalog resolution onto `TurnInputContributor` puts the skills extension on the same turn-preparation path that already has the environment ids and working directories for the submitted turn, while keeping the actual prompt injection work for follow-up changes. ## What changed - switch `ext/skills` from `TurnLifecycleContributor` to `TurnInputContributor` - build `executor_authorities` from `TurnInputContext.environments` and pass them through `SkillListQuery` - keep storing the resolved catalog in `SkillsTurnState`, but drop the placeholder query helper that no longer matches the real data flow - update the extension TODOs to reflect that per-turn catalog resolution now happens in the turn-input contributor, and that prompt/context injection still needs to move later ## Testing - Not run locally.
## Why Goal progress accounting can be reached from multiple completion paths for the same thread. Each path takes a progress snapshot, writes the usage delta, and then marks that snapshot as accounted. When two tool-completion hooks run at the same time, they can both observe the same unaccounted delta and charge it twice. ## What changed - Added a per-thread progress-accounting permit to `GoalAccountingState`. - Held that permit across the snapshot/write/mark-accounted critical section for active-turn, idle, and tool-finish accounting. - Added regression coverage for parallel tool-finish hooks so a shared token delta is charged once and only one progress event is emitted. ## Testing - Not run locally. - Added `parallel_tool_finish_accounts_active_goal_progress_once`.
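The race and its fix can be illustrated with a lock held across the snapshot, write, and mark-accounted steps. The class below is a hypothetical stand-in written for this example; it is not `GoalAccountingState`, and the field names are assumptions.

```python
import threading


class ThreadGoalAccounting:
    """Hypothetical per-thread accounting state guarded by a single permit."""

    def __init__(self) -> None:
        self._permit = threading.Lock()
        self.observed_tokens = 0   # usage seen so far in the turn
        self.accounted_tokens = 0  # usage already charged to the goal
        self.charged_tokens = 0    # total written to the goal
        self.progress_events = 0   # number of progress events emitted

    def account_progress(self) -> int:
        """Charge the unaccounted delta once, even if called concurrently."""
        with self._permit:
            # Snapshot: compute the delta that has not been charged yet.
            delta = self.observed_tokens - self.accounted_tokens
            if delta <= 0:
                return 0
            # Write the usage delta, then mark the snapshot as accounted.
            self.charged_tokens += delta
            self.accounted_tokens += delta
            self.progress_events += 1
            return delta
```

Without the permit, two callers could both compute the same positive `delta` before either marks it as accounted; with it, the second caller observes a zero delta and emits nothing.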
Rename `Session::conversation_id` to `Session::thread_id` with an auto refactor in RustRover
## Why The skills extension needs a real turn-time path before host, executor, or remote skills can be routed through it. The previous code was mostly a placeholder catalog/provider sketch, so there was no bounded available-skills fragment, no source-owned `SKILL.md` read, and no place for warnings or per-turn selection state to live. This PR makes `ext/skills` the authority-preserving flow for listing candidate skills and injecting only explicitly selected main prompts, without adding more of that logic to `codex-core`. ## What changed - Expands catalog entries with `main_prompt`, display path, short description, dependency metadata, enabled/prompt visibility flags, and authority/package-aware read requests. - Replaces the placeholder `providers/*` modules with `SkillProviderSource` and `SkillProviders`, routing list/read/search calls by source kind and surfacing provider failures as warnings. - Adds bounded available-skills rendering and `SKILL.md` main-prompt truncation before the fragments enter model context. - Resolves explicit skill selections from structured `UserInput::Skill`, skill-file mentions, `skill://...` paths, and plain `$skill` text mentions, then reads selected prompts through their owning provider. - Stores mutable per-thread skills config and per-turn catalog/selection/warning state. - Adds `install_with_providers` so tests and future host wiring can supply concrete providers. ## Testing - Not run locally. - Added `codex-rs/ext/skills/tests/skills_extension.rs` coverage for available-catalog injection, selected prompt injection through the owning provider, and prompt-hidden skills that remain invokable.
## Why #23764 removed Windows resource stamping from `codex-windows-sandbox`, but it also removed the setup helper's UAC manifest. That manifest was doing more than cosmetic version metadata: Microsoft documents `requestedExecutionLevel level="asInvoker"` as the setting that makes an executable run at the same permission level as the process that started it: https://learn.microsoft.com/en-us/windows/win32/sbscs/application-manifests#trustinfo In the reported session, `codex-windows-sandbox-setup.exe` was launched for a non-elevated setup refresh and `CreateProcess` failed with `os error 740` (`The requested operation requires elevation`). Restoring an explicit `asInvoker` manifest records the helper's intended default launch contract: normal launches inherit the caller's token, and elevation only happens through the code paths that request it explicitly. The setup helper has two launch modes: - setup refresh uses a normal `Command::new(...)` spawn and should never trigger UAC - full setup explicitly uses `ShellExecuteExW` with the `runas` verb when elevation is required Restoring `asInvoker` keeps refresh non-elevated by default while preserving the explicit elevated path for full setup. ## What changed - Restored a minimal `codex-windows-sandbox-setup.manifest` containing only `requestedExecutionLevel level="asInvoker"`. - Added a small build script that passes setup-helper-scoped manifest linker args for MSVC and the Windows GNU/LLVM target used by Bazel. - Wired the manifest into Bazel build-script data. This does not restore `winres`, `FileDescription`, `ProductName`, or package-wide resource stamping, so other Codex binaries that link `codex-windows-sandbox` do not inherit metadata from this package. 
## Verification - `cargo fmt -p codex-windows-sandbox` - `cargo build -p codex-windows-sandbox --bin codex-windows-sandbox-setup` - `cargo build -p codex-windows-sandbox --bin codex-command-runner` - `cargo build -p codex-windows-sandbox --lib` - Build-script output simulation for `CARGO_CFG_TARGET_ENV=msvc` emits `/MANIFEST:EMBED` and `/MANIFESTINPUT:<manifest>`. - Build-script output simulation for `CARGO_CFG_TARGET_ENV=gnu` + `CARGO_CFG_TARGET_ABI=llvm` emits `-Wl,-Xlink=/manifest:embed` and `-Wl,-Xlink=/manifestinput:<manifest>`. - Inspected the built binaries and confirmed: - `codex-windows-sandbox-setup.exe` contains `requestedExecutionLevel` / `asInvoker` - `codex-command-runner.exe` does not contain those manifest strings - Windows `VersionInfo` remains blank for `FileDescription` / `ProductName` - `just test -p codex-windows-sandbox` ran through Nextest, with 114 passing, 2 skipped, and 1 existing Windows sandbox failure: `unified_exec::tests::legacy_non_tty_cmd_emits_output` fails with `CreateRestrictedToken failed: 87`.
Simple prompt change for MAv2 because of OOD compared to CBv9
Skip turn git metadata enrichment when a turn has remote or multiple executors, so we do not report the orchestrator checkout as executor workspace metadata. Test: `just test -p codex-core` (blocked by existing `Session::conversation_id` compile error in `close_agent.rs`).
Fixes #26025. ## Why `/goal edit` opens `CustomPromptView`, which did not use the paste-burst handling that protects the main composer when terminals deliver paste as rapid key events. On Windows terminals, the first pasted newline could be treated as Enter-to-submit, truncating the goal edit and leaving the rest of the paste behind. ## What This reuses `PasteBurst` in `CustomPromptView` as a lightweight Enter-suppression detector for paste-like key streams. Characters still insert directly, explicit paste still goes through the view paste path, and ordinary text entry still submits on Enter.
## Why #25450 attempts a broad `SandboxPolicy` removal across several unrelated surfaces, which makes it hard to review and still leaves new helper code moving legacy policies around. This PR is a narrower alternative: migrate only the exec-side Windows sandbox plumbing so the review can focus on one production path and one compatibility boundary. The goal is to stop threading `SandboxPolicy` through exec code without expanding the migration into app-server, protocol, telemetry, config, or session behavior. ## What changed - Removed `ExecRequest::compatibility_sandbox_policy()`. - Changed the Windows restricted-token and elevated filesystem override helpers to accept `PermissionProfile` plus the split filesystem/network policies instead of a `SandboxPolicy`. - Kept the remaining legacy projection local to the writable-root comparison that still needs to compare split policy behavior against the legacy Windows backend model. - Rejected restricted split filesystem policies that still grant full-disk writes before using the Windows restricted-token backend, preserving the previous clear-failure behavior for profiles that project to `ExternalSandbox`. - Updated the Windows sandbox override tests to exercise the new call shape and cover the full-write split-profile regression. ## Verification - `just test -p codex-core windows_restricted_token` - `just test -p codex-core windows_elevated`
## Why

Codex-created linked worktrees do not include ignored files from the main worktree. Bazel users who keep local overrides in `user.bazelrc` therefore lose those settings in every new worktree. The setup must also work on Windows and must not overwrite a file that already exists in the worktree.

## What changed

The checked-in Codex environment now invokes `.codex/environments/setup.py`. The script resolves the main worktree and current worktree, then uses `copy_from_main_worktree_to_worktree(repo_relative_path)` to copy ignored files into new worktrees without overwriting existing destinations.

`main()` currently copies `user.bazelrc`. Additional repository-relative paths can be added as further calls to the same helper.

## Validation

- Ran the setup script in a linked worktree and confirmed it handles a missing main-worktree `user.bazelrc`.
- Verified the helper copies a main-worktree file, preserves an existing worktree file, and creates parent directories for a nested path.
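The copy-without-overwrite behavior described above might look roughly like the following. This is a sketch under stated assumptions, not the contents of `setup.py`: the function `copy_if_missing` and its explicit `main_root` and `worktree_root` parameters are hypothetical, and resolving the two worktree roots is deliberately left out.

```python
import shutil
from pathlib import Path


def copy_if_missing(main_root: Path, worktree_root: Path, repo_relative_path: str) -> bool:
    """Copy one file from the main worktree unless the destination already exists.

    Hypothetical helper for illustration; returns True only when a copy happened.
    """
    source = main_root / repo_relative_path
    destination = worktree_root / repo_relative_path

    # A missing source (e.g. no user.bazelrc in the main worktree) is not an error.
    if not source.is_file():
        return False

    # Never overwrite a file the worktree already has.
    if destination.exists():
        return False

    # Nested paths need their parent directories created first.
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, destination)
    return True
```

Using `pathlib` and `shutil` keeps the sketch portable across Windows and POSIX systems, which matches the stated requirement that the setup work on Windows.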
## Summary

- pin the Python SDK runtime to `openai-codex-cli-bin==0.137.0a4`
- refresh generated protocol artifacts from `rust-v0.137.0-alpha.4`
- refresh `sdk/python/uv.lock` with all eight published runtime wheels

## Runtime publication

- published `openai-codex-cli-bin==0.137.0a4` through the `python-sdk-release` workflow
- includes macOS, manylinux, musllinux, and Windows wheels
- publication run: https://github.com/openai/codex/actions/runs/26905608531

## Validation

- ran `just fmt`
- generated artifacts from the `rust-v0.137.0-alpha.4` release wheel
- ran `uv lock --check --default-index https://pypi.org/simple`
- did not run tests locally, per request; CI provides the test signal
## Summary

- Read `default_prompts` from remote plugin release metadata.
- Prefer the plural prompt list over legacy `default_prompt`.
- Fall back to `default_prompt` as a single-item list for backward compatibility.

## Testing

- `just test -p codex-core-plugins`
- `just test -p codex-app-server`
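The preference order in the summary can be sketched as a small resolver. This is an illustration in Python rather than the actual implementation: the function name `resolve_default_prompts` and the dictionary-shaped metadata are hypothetical, and treating a present but empty `default_prompts` list as authoritative is an assumption the summary does not settle.

```python
from typing import Any, Mapping


def resolve_default_prompts(metadata: Mapping[str, Any]) -> list[str]:
    """Return the prompt list, preferring `default_prompts` over `default_prompt`.

    Hypothetical helper; the metadata shape is assumed for illustration.
    """
    prompts = metadata.get("default_prompts")
    if prompts is not None:
        # Assumption: a present plural list wins even when it is empty.
        return list(prompts)

    legacy = metadata.get("default_prompt")
    if legacy is not None:
        # Backward compatibility: wrap the single legacy prompt in a list.
        return [legacy]

    return []
```

Callers then handle a single list type regardless of which field the release metadata carried.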
See Commits and Changes for more details.
Created by
pull[bot] (v2.0.0-alpha.4)