Skip to content

[WIP] convert and optimize image in a single pass#289

Draft
cking100 wants to merge 1 commit intoopenzim:mainfrom
cking100:convert-optimize-single-pass
Draft

[WIP] convert and optimize image in a single pass#289
cking100 wants to merge 1 commit intoopenzim:mainfrom
cking100:convert-optimize-single-pass

Conversation

@cking100
Copy link
Copy Markdown

Fixes #284

Changes:

  • Enhanced optimize_xxx functions to accept a convert parameter
  • Modified optimize_image and optimize_gif to accept io.BytesIO as input
  • Added support for bytes in src
  • optimize_xxx now works on single pass

TODO

  • Write test cases

@cking100 cking100 changed the title convert and optimize image in a single pass [WIP] convert and optimize image in a single pass Mar 29, 2026
@benoit74 benoit74 marked this pull request as draft March 30, 2026 08:55
@benoit74
Copy link
Copy Markdown
Collaborator

I've converted PR to draft since it is not ready, please mark it as ready once ... ready 🤣

@benoit74 benoit74 marked this pull request as ready for review March 30, 2026 08:56
@benoit74 benoit74 marked this pull request as draft March 30, 2026 08:56
@codecov
Copy link
Copy Markdown

codecov bot commented Mar 30, 2026

Codecov Report

❌ Patch coverage is 58.99281% with 57 lines in your changes missing coverage. Please review.
✅ Project coverage is 97.32%. Comparing base (8c16fa6) to head (5e59e5c).

Files with missing lines Patch % Lines
src/zimscraperlib/image/optimization.py 58.99% 43 Missing and 14 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #289      +/-   ##
==========================================
- Coverage   99.52%   97.32%   -2.20%     
==========================================
  Files          41       41              
  Lines        2516     2584      +68     
  Branches      354      383      +29     
==========================================
+ Hits         2504     2515      +11     
- Misses          8       51      +43     
- Partials        4       18      +14     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@cking100 cking100 force-pushed the convert-optimize-single-pass branch 2 times, most recently from 7148ef8 to e5357ec Compare March 30, 2026 16:12
@cking100 cking100 force-pushed the convert-optimize-single-pass branch from e5357ec to c10b2eb Compare March 30, 2026 23:01
@cking100
Copy link
Copy Markdown
Author

@benoit74 I will add the test cases soon. But, before that, can I please get a review on the changes?

Copy link
Copy Markdown
Collaborator

@benoit74 benoit74 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few remarks. You are on the right path, thank you

*,
convert: bool = False,
) -> pathlib.Path | io.BytesIO:
"""method to optimize PNG files using a pure python external optimizer"""
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docstring is not accurate anymore

"""Optimize image, automatically selecting correct optimizer

Arguments:
fmt: format of the source image, needed when src is io.BytesIO or bytes
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it is convenient to have the option to pass this when we already know the format to save the addition format_for, but I don't get why it would be mandatory for io.BytesIO, can you explain? format_for accepts io.BytesIO AFAIK.

I would recommend to rename this to src_format for clarity.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. format_for does detect io.BytesIO and thus, fmt is not mandatory. This is a mistake, I will fix it and also rename it since we would need dst_format to accept io.BytesIO as dst.

def optimize_image(
src: pathlib.Path,
src: pathlib.Path | io.BytesIO | bytes,
dst: pathlib.Path,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like (if reasonably feasible) that we accept at least io.BytesIO as dst, bytes as well if feasible. This mandates to add dst_format as argument, but I'm fine with this.

We also need to add all meaningful signatures with @overloads so that it is clearer when a given argument is optional or mandatory at type checking time (in addition to the live checks the function performs afterwards).

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree we should accept io.BytesIO as dst, but I am not sure how we would be able to work with dst as bytes, as they are immutable. If you have any ideas lmk.

I will push a patch for dst to accept io.BytesIO soon.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, dst as bytes makes no sense, too early comment on my side 🤣

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

How to convert and optimize an image to webp in one step with default settings?

2 participants