Skip to content

Add ipynb formatting buffer#12462

Open
evelyn-with-warp wants to merge 12 commits into
masterfrom
evelyn/ipynb-parser-buffer
Open

Add ipynb formatting buffer#12462
evelyn-with-warp wants to merge 12 commits into
masterfrom
evelyn/ipynb-parser-buffer

Conversation

@evelyn-with-warp

@evelyn-with-warp evelyn-with-warp commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Description

Breakdown #12242 into

  1. Ipynb parser and integrate that into Buffer::from_ipynb to bootstrap a buffer from ipynb contents
  2. Wire existing app-level logic to support opening an .ipynb file in our file notebook view
    This PR is part 1

Comparison with #12242

This PR (#12462) is a strict subset of #12242. Breakdown of which changes each PR covers:

Change (high-level) #12242 #12462
New ipynb_parser crate: convert nbformat v4 notebooks → FormattedText (markdown/code/raw cells, saved outputs, images, ANSI stripping, size/length truncation)
Register ipynb_parser as workspace member + warp_editor dependency (Cargo.toml, Cargo.lock, crates/editor/Cargo.toml)
Editor buffer integration: ContentFormat::Ipynb, InitialBufferState::ipynb, Buffer::from_ipynb (raw-JSON fallback on parse failure), replace() match arm
Editor model API: RichTextEditorModel::reset_with_ipynb to bootstrap a buffer from notebook contents
.ipynb file detection helper is_jupyter_notebook_file + tests (crates/warp_util/src/file_type.rs)
New JupyterNotebookRendering feature flag + dogfood enablement (crates/warp_features, app/src/features.rs, app/Cargo.toml)
Treat .ipynb as an openable/notebook file type + tests (app/src/util/openable_file_type.rs)
File-tree / code-view wiring to open .ipynb in the notebook view (app/src/code/file_tree/view.rs, app/src/code/view.rs)
Notebook file model/view support for .ipynb (app/src/notebooks/file/mod.rs + tests, app/src/notebooks/editor/model.rs, editor/view.rs)
"Open in Warp" + URI routing for .ipynb (app/src/terminal/view.rs, terminal/view/open_in_warp.rs, app/src/uri/mod.rs + tests)
Tooltip + workspace view updates for opening notebooks (app/src/util/tooltips.rs, app/src/workspace/view.rs)

Linked Issue

  • The linked issue is labeled ready-to-spec or ready-to-implement.
  • Where appropriate, screenshots or a short video of the implementation are included below (especially for user-visible or UI changes).

Testing

  • I have manually tested my changes locally with ./script/run
    No user perceived changes in this PR

Screenshots / Videos

https://www.loom.com/share/f6be00f6ab824bb094d74685050675e4
since it changed markdown image rendering by limiting image size and handling base64 string, we tested loading md file in warpdev and warplocal: markdown with images (smaller than 16MB) could be successfully opened/rendered in both (before & after this change)

Agent Mode

  • Warp Agent Mode - This PR was created via Warp's AI Agent Mode

evelyn-with-warp and others added 2 commits June 10, 2026 10:50
Add a self-contained ipynb_parser crate that converts nbformat v4 Jupyter
notebooks directly into FormattedText, and integrate it into the editor:
ContentFormat::Ipynb, InitialBufferState::ipynb, Buffer::from_ipynb (with
raw-JSON fallback on parse failure), and RichTextEditorModel::reset_with_ipynb.

Co-Authored-By: Oz <oz-agent@warp.dev>
@cla-bot cla-bot Bot added the cla-signed label Jun 10, 2026
@oz-for-oss

oz-for-oss Bot commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

@evelyn-with-warp

I'm starting a first review of this pull request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

@oz-for-oss oz-for-oss Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overview

This PR adds an ipynb_parser crate and integrates a new Ipynb content format so notebook JSON can be converted into editor FormattedText with raw fallback handling on parse failures.

Concerns

  • Malformed notebooks that declare nbformat: 4 but omit the required cells field currently deserialize as an empty notebook, causing a blank render instead of the promised raw-content fallback.

Verdict

Found: 0 critical, 1 important, 0 suggestions

Request changes

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Comment thread crates/ipynb_parser/src/lib.rs Outdated
evelyn-with-warp and others added 2 commits June 10, 2026 11:02
Add the .ipynb file-detection helper and its tests to warp_util. The helper
is independent of the app-level wiring and is shared groundwork for opening
notebooks, so it lands with the parser/buffer foundation.

Co-Authored-By: Oz <oz-agent@warp.dev>
Co-authored-by: oz-for-oss[bot] <277970191+oz-for-oss[bot]@users.noreply.github.com>
@evelyn-with-warp

Copy link
Copy Markdown
Contributor Author

/oz-review

@oz-for-oss

oz-for-oss Bot commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

@evelyn-with-warp

I'm re-reviewing this pull request in response to a review request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

@oz-for-oss oz-for-oss Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overview

This PR adds an ipynb_parser crate and editor buffer/model integration for converting nbformat v4 notebooks into FormattedText, with tests covering cell rendering, saved outputs, embedded images, fallback behavior, and .ipynb filename detection.

Concerns

  • Stream and text/plain outputs still bypass the ANSI cleanup used for tracebacks, so notebooks that saved colored stdout/stderr can render raw control sequences. See the inline suggestion.
  • No approved spec context was available, and the security pass did not identify security findings.

Verdict

Found: 0 critical, 0 important, 1 suggestions

Approve with nits

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Comment thread crates/ipynb_parser/src/lib.rs Outdated
Co-authored-by: oz-for-oss[bot] <277970191+oz-for-oss[bot]@users.noreply.github.com>
@evelyn-with-warp

Copy link
Copy Markdown
Contributor Author

/oz-review

@oz-for-oss

oz-for-oss Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

@evelyn-with-warp

I'm re-reviewing this pull request in response to a review request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

@oz-for-oss oz-for-oss Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overview

This PR adds an ipynb_parser crate and editor buffer plumbing to render nbformat v4 notebooks as FormattedText, including markdown/code/raw cells and selected saved outputs.

Concerns

  • Notebook image outputs are emitted as data: URI image sources, but the existing editor image resolver treats non-HTTP(S), non-absolute sources as local file paths, so those outputs will fail to load instead of rendering.

Verdict

Found: 0 critical, 1 important, 0 suggestions

Request changes

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Comment thread crates/ipynb_parser/src/lib.rs
@evelyn-with-warp

Copy link
Copy Markdown
Contributor Author

/oz-review

@oz-for-oss

oz-for-oss Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

@evelyn-with-warp

I'm re-reviewing this pull request in response to a review request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

@oz-for-oss oz-for-oss Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overview

This PR adds an ipynb_parser crate, editor buffer/reset support for .ipynb content, a notebook file-type helper, and base64 data: URI asset loading for rendered images. No approved spec context was provided.

Concerns

  • data_uri_source accepts arbitrarily large base64 payloads from rendered Markdown/notebook content and decodes them without an asset-level size limit; see the inline security comment.
  • This diff changes user-visible rendered image behavior by teaching the Markdown image resolver to render data: URI images, but the PR does not include screenshots or a screen recording. For this user-facing change, please include screenshots or a screen recording demonstrating it working end to end.

Security

  • Unbounded base64 data URI decoding can cause avoidable memory pressure when rendering crafted Markdown/notebook content.

Verdict

Found: 0 critical, 2 important, 0 suggestions

Request changes

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz


// base64 payloads may contain embedded whitespace/newlines; strip it before
// decoding.
let payload: String = payload.chars().filter(|c| !c.is_whitespace()).collect();

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] [SECURITY] source comes from Markdown/notebook content, and this path clones then decodes an unbounded data: payload; reject payloads above a fixed size before collecting/decoding so a crafted image cannot spike memory usage.

@evelyn-with-warp

Copy link
Copy Markdown
Contributor Author

/oz-review

@oz-for-oss

oz-for-oss Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

This is your last /oz-review for the current 24-hour window. Your next slot opens in ~55m.

@oz-for-oss

oz-for-oss Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

@evelyn-with-warp

I'm re-reviewing this pull request in response to a review request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

@oz-for-oss oz-for-oss Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overview

This PR adds an ipynb_parser crate for rendering nbformat v4 notebooks into FormattedText, integrates .ipynb buffer initialization/reset paths in the editor, adds data-URI image handling for rendered notebook/markdown images, and adds file-type/parser coverage tests.

Concerns

  • No blocking correctness, security, spec-drift, or performance concerns found in the annotated diff.

Verdict

Found: 0 critical, 0 important, 0 suggestions

Approve

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz


/// Creates an [`AssetSource::Async`] that decodes an inline base64 `data:` URI
/// (e.g. `data:image/png;base64,<payload>`) into its raw bytes.
pub fn data_uri_source(source: &str) -> Option<AssetSource> {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we making changes in our asset cache code? I don't quite understand the motivation here

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The motivation comes from Oz comment, as jupyter notebook supports embedding images as base64 inline, which is different from markdown image rendering as an external reference

base_directory: Option<&Path>,
) -> AssetSource {
if source.starts_with("http://") || source.starts_with("https://") {
if let Some(data_uri_source) = asset_cache::data_uri_source(source) {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change on supporting inline data URL in editor seems orthogonal to the overall PR on ipynb format parsing?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is ipynb parser push_image that would produce inline data, and therefore we need to support this.
Please let me know if i misunderstood

let language = notebook.language();
let mut lines: Vec<FormattedTextLine> = Vec::new();

for cell in &notebook.cells {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Iterating over notebook cells, converting them into our format text, and then pushing it into a Vec seems not optimal for performance due to the incremental allocation we need to do as we iterate

Could we use std::iter::Map here to do the conversion without incrementally allocating space?

/// parseable notebook (malformed JSON, unsupported nbformat version, etc.). The
/// raw content is placed in a single code block so any Markdown/HTML inside it
/// is shown verbatim, never re-interpreted.
pub fn raw_fallback_formatted_text(content: &str) -> FormattedText {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we return an error and render that in a raw code editor (potentially with a user visible UX)?

I don't think silently falling back to a giant block of JSON is the right approach

return;
}
lines.push(FormattedTextLine::Image(FormattedImage {
alt_text: "output".to_string(),

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we hard-coding alt text here?

@evelyn-with-warp evelyn-with-warp Jun 11, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a place holder: if a cell generates image (calling plot libraries like seaborn, matplotlib..etc), it would carry no alt-text metadata and display the image in cell output, like

   "outputs": [
    {
     "data": {
      "image/png": some_base64
      "text/plain": [
       "<Figure size 1427.25x450 with 2 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }

Hence just picked the placeholder name semantically, as FormattedImage.alt_text is non_optional.
Hard to decide if put an empty string here is better.

Comment thread crates/ipynb_parser/src/lib.rs Outdated
return;
}
if base64.len() > MAX_IMAGE_DATA_CHARS {
push_text_output(lines, "[output image omitted: exceeds size limit]");

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the same default behavior we do for markdown images?

@evelyn-with-warp evelyn-with-warp Jun 11, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great call. This is only for jupyter notebook UI optimization (more strict than markdown, but maybe not necessary). i dropped it to be consistent with markdown too. 24a8907
We already have a size guard in asset cache that applied to both jupyter notebook and markdown.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a follow up: silent drop maybe not desirable; therefore attempted to display a hint when images are too big to display a2c7def

if text.is_empty() {
return;
}
if text.chars().count() > MAX_TEXT_OUTPUT_CHARS {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we have some arbitrary text length truncation here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this truncation only applies for cell output, not cell (code or markdown), to prevent cell output bloating the buffer;
For vscode, we will see something like this
image
Added a TODO here, if we also want to support a text truncation UI like vscode (for now i feel it's a little overhead on the scope of rendering a notebook)


/// Strip ANSI escape sequences (CSI/SGR colors, OSC, and simple escapes) so
/// tracebacks render as readable plain text.
fn strip_ansi(input: &str) -> String {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to strip ansi?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

called out by this Oz-comment
The push_text_output will be rendered as plan, unhighlighted code block instead of a terminal/ansi-aware surface. strip_ansi in all input paths (stream, text/plain, and error) of push_text_output in order to avoid garbage rendering

fn sanitize_language(raw: &str) -> String {
let trimmed = raw.trim();
let is_safe = !trimmed.is_empty()
&& trimmed.chars().count() <= MAX_LANGUAGE_TAG_CHARS

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't quite understand why we need this sanitization process as well. Also seems like the fallback is we just leave an empty string?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The language refers to the the notebook metadata metadata.language_info.name, which is attacker-controllable.
it's also a security guard called out by Oz

Leave an empty string means the code block will render as a plain, unhighlighted code block. Therefore, when detects the language is unknown/not safe

  1. push_code_block takes an empty string as param
  2. Flows to CodeBlockType, where lang is used to pick a grammar highlight

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants